1. Introduction

Conceived in the late 1980s as a teaching and scripting language, Python has since become an essential tool for many programmers, engineers, researchers, and data scientists across academia and industry. As an astronomer focused on building and promoting the free open tools for data-intensive science, I've found Python to be a near-perfect fit for the types of problems I face day to day, whether it's extracting meaning from large astronomical datasets, scraping and munging data sources from the Web, or automating day-to-day research tasks.

The appeal of Python is in its simplicity and beauty, as well as the convenience of the large ecosystem of domain-specific tools that have been built on top of it. For example, most of the Python code in scientific computing and data science is built around a group of mature and useful packages:

  • NumPy provides efficient storage and computation for multi-dimensional data arrays.
  • SciPy contains a wide array of numerical tools such as numerical integration and interpolation.
  • Pandas provides a DataFrame object along with a powerful set of methods to manipulate, filter, group, and transform data.
  • Matplotlib provides a useful interface for creation of publication-quality plots and figures.
  • Scikit-Learn provides a uniform toolkit for applying common machine learning algorithms to data.
  • IPython/Jupyter provides an enhanced terminal and an interactive notebook environment that is useful for exploratory analysis, as well as creation of interactive, executable documents. For example, the manuscript for this report was composed entirely in Jupyter notebooks.

No less important are the numerous other tools and packages which accompany these: if there is a scientific or data analysis task you want to perform, chances are someone has written a package that will do it for you.

Tapping into the power of this data science ecosystem, however, first requires familiarity with the Python language itself. I often encounter students and colleagues who have (sometimes extensive) backgrounds in computing in some language – MATLAB, IDL, R, Java, C++, etc. – and are looking for a brief but comprehensive tour of the Python language that respects their level of knowledge rather than starting from ground zero. This report seeks to fill that niche.

As such, this report in no way aims to be a comprehensive introduction to programming, or a full introduction to the Python language itself; if that is what you are looking for, you might check out one of the recommended references listed in Resources for Learning. Instead, this will provide a whirlwind tour of some of Python's essential syntax and semantics, built-in data types and structures, function definitions, control flow statements, and other aspects of the language. My aim is that readers will walk away with a solid foundation from which to explore the data science stack just outlined.

Using Code Examples

Supplemental material (code examples, exercises, etc.) is available for download at https://github.com/jakevdp/WhirlwindTourOfPython/. This book is here to help you get your job done. In general, if example code is offered with this book, you may use it in your programs and documentation. You do not need to contact us for permission unless you’re reproducing a significant portion of the code. For example, writing a program that uses several chunks of code from this book does not require permission. Selling or distributing a CD-ROM of examples from O’Reilly books does require permission. Answering a question by citing this book and quoting example code does not require permission. Incorporating a significant amount of example code from this book into your product’s documentation does require permission.

We appreciate, but do not require, attribution. An attribution usually includes the title, author, publisher, and ISBN. For example: "A Whirlwind Tour of Python by Jake VanderPlas (O’Reilly). Copyright 2016 O’Reilly Media, Inc., 978-1-491-96465-1."

If you feel your use of code examples falls outside fair use or the permission given above, feel free to contact us at permissions@oreilly.com.

Installation and Practical Considerations

Installing Python and the suite of libraries that enable scientific computing is straightforward whether you use Windows, Linux, or Mac OS X. This section will outline some of the considerations when setting up your computer.

Python 2 vs Python 3

This report uses the syntax of Python 3, which contains language enhancements that are not compatible with the 2.x series of Python. Though Python 3.0 was first released in 2008, adoption has been relatively slow, particularly in the scientific and web development communities. This is primarily because it took some time for many of the essential packages and toolkits to be made compatible with the new language internals. Since early 2014, however, stable releases of the most important tools in the data science ecosystem have been fully compatible with both Python 2 and 3, and so this book will use the newer Python 3 syntax. Even so, the vast majority of code snippets in this book will also work without modification in Python 2: in cases where a Py2-incompatible syntax is used, I will make every effort to note it explicitly.
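
As a minimal sketch of the kind of incompatibility meant here (these particular examples are illustrative and are not taken from the report itself), two well-known differences are that print is a function in Python 3, and that the / operator performs true division on integers:

```python
import sys

# Python 3: print is a function, so the parentheses are required.
print("Running Python", sys.version_info.major)

# Python 3: dividing two integers with / gives a float.
# In Python 2 the same expression would give the integer 1.
half = 3 / 2

# Floor division with // gives an integer result in both versions.
floor_half = 3 // 2

print(half, floor_half)
```

Writing // wherever integer division is intended is one way to keep such a snippet behaving the same under both versions.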

Installation with conda

Though there are various ways to install Python, the one I would suggest – particularly if you wish to eventually use the data science tools mentioned above – is via the cross-platform Anaconda distribution. There are two flavors of the Anaconda distribution:

  • Miniconda gives you the Python interpreter itself, along with a command-line tool called conda which operates as a cross-platform package manager geared toward Python packages, similar in spirit to the apt or yum tools that Linux users might be familiar with.
  • Anaconda includes both Python and conda, and additionally bundles a suite of other pre-installed packages geared toward scientific computing.

Any of the packages included with Anaconda can also be installed manually on top of Miniconda; for this reason I suggest starting with Miniconda.

To get started, download and install the Miniconda package – make sure to choose a version with Python 3 – and then install the IPython notebook package:

[~]$ conda install ipython-notebook

For more information on conda, including information about creating and using conda environments, refer to the Miniconda package documentation linked at the above page.
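
Once packages have been installed, it can be handy to confirm what the active Python interpreter can actually see. The following is a rough sketch using only the standard library; the helper name installed_packages and the list of import names are my own choices for illustration, not something conda provides:

```python
import importlib.util
import sys

# Import names of the packages mentioned earlier in this section.
# Note that Scikit-Learn is imported under the name "sklearn".
PACKAGES = ["numpy", "scipy", "pandas", "matplotlib", "sklearn", "IPython"]


def installed_packages(names):
    """Return a dict mapping each import name to True if it can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


print("Python version:", sys.version.split()[0])
for name, found in installed_packages(PACKAGES).items():
    print(f"{name:12s} {'installed' if found else 'missing'}")
```

This only checks that each package can be located by the current interpreter; it does not import the package or verify its version.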

The Zen of Python

Python aficionados are often quick to point out how "intuitive", "beautiful", or "fun" Python is. While I tend to agree, I also recognize that beauty, intuition, and fun often go hand in hand with familiarity, and so for those familiar with other languages such florid sentiments can come across as a bit smug. Nevertheless, I hope that if you give Python a chance, you'll see where such impressions might come from. And if you really want to dig into the programming philosophy that drives much of the coding practice of Python power-users, a nice little Easter egg exists in the Python interpreter: simply close your eyes, meditate for a few minutes, and import this:

In [1]:

import this
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!

With that, let's start our tour of the Python language.
