逸之

2022/01/04

DCIC: A Data-Centric Introduction to Computing, Part 1

1.1 What This Book is About

This book is an introduction to computer science. It will teach you to program, and do so in ways that are of practical value and importance. However, it will also go beyond programming to computer science, a rich, deep, fascinating, and beautiful intellectual discipline. You will learn many useful things that you can apply right away, but we will also show you some of what lies beneath and beyond.

Most of all, we want to give you ways of thinking about solving problems using computation. Some of these ways are technical methods, such as working from data and examples to construct solutions to problems. Others are scientific methods, such as ways of making sure that programs are reliable and do what they claim. Finally, some are social, thinking about the impacts that programs have on people.

1.2 The Values That Drive This Book

Our perspective is guided by our decades of experience as software developers, researchers, and educators. This has instilled in us the following beliefs:

  • Software is not written only to be run. It must also be written to be read and maintained by others. Often, that “other” person is you, six months later, who has forgotten what they did and why.

  • Programmers are responsible for their software meeting its desired goals and being reliable. This is reflected in a variety of disciplines inside computer science, such as testing and verification.

  • Programs ought to be amenable to prediction. We need to know, as much as possible, before a program runs, how it will behave. This behavior includes not only technical characteristics such as running time, space, power, and so on, but also social impacts, benefits, and harms. Programmers have been notoriously poor at thinking about the latter.

1.3 Our Perspective on Data

These concerns intersect with our belief about how computer science has evolved as a discipline. It is a truism that we live in a world awash with data, but what consequence does that have?

At a computational level, data have had a profound effect. Traditionally, the only way to make a program better was to improve the program directly, which often meant making it more complicated and impacting the values we discuss above. But there are classes of programs for which there is another method: simply give the same program more or better data, and the program can improve. These data-driven programs lie at the heart of many innovations we see around us.
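As a minimal sketch of this idea (not taken from the book), consider a program that labels a temperature by finding the closest labelled example. The function name `nearest_label`, the example data, and the choice of a nearest-neighbour rule are all assumptions made for illustration. The point is only that the function itself never changes; giving it more examples is what changes the answer.

```python
def nearest_label(examples, query):
    """Return the label of the example whose value is closest to query.

    examples is a list of (value, label) pairs; both the data and the
    nearest-neighbour rule are hypothetical, chosen only for illustration.
    """
    closest_value, closest_label = min(
        examples, key=lambda pair: abs(pair[0] - query)
    )
    return closest_label


# A small data set: only two labelled temperatures (degrees Celsius).
small = [(2, "cold"), (30, "hot")]

# The same data plus a few more labelled examples.
large = small + [(12, "cold"), (18, "mild"), (22, "mild")]

# The program is identical in both calls; only the data differ.
print(nearest_label(small, 19))  # prints: hot
print(nearest_label(large, 19))  # prints: mild
```

With only two examples, 19 degrees is closer to 30 than to 2, so the program answers "hot". With the richer data, the closest example is 18, and the same program answers "mild".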

In addition to this technical effect, data can have a profound pedagogic impact, too. Most introductory programming is plagued by artificial data that have no real meaning, interest, or consequence (and often, artificial problems to accompany them). With real data, learners can personalize their education, focusing on problems they find meaningful, enriching, or just plain fun—asking and answering questions they find worthwhile. Indeed, from this perspective, programs interrogate data: that is, programs are tools for answering questions. In turn, the emphasis on real data and real questions enables us to discuss the social impacts of computing.

These phenomena have given rise to whole new areas of study, typically called data science. However, typical data science curricula also have many limitations. They pay little attention to what we know about the difficulties of learning to program. They have little emphasis on software reliability. And they fail to recognize that their data are often quite limited in their structure. These limitations, where data science typically ends, are where computer science begins. In particular, the structure of data serves as a point of departure for thinking about and achieving some of the values above (performance, reliability, and predictability) using the many tools of computer science.

1.4 What Makes This Book Unique

First, we propose a new perspective on structuring computing curricula, which we call data centricity. (For more about this, read our essay.) We view a data-centric curriculum as

data centric = data science + data structures

in that order: we begin with ideas from data science, before shifting to classical ideas from data structures and the rest of computer science. This book lays out this vision concretely and in detail.

Second, computing education talks a great deal about notional machines—abstractions of program behavior meant to help students understand how programs work—but few curricula actually use one. We take notional machines seriously, developing a sequence of them and weaving them through the curriculum. This ties to our belief that programs are not only objects that run, but also objects that we reason about.

Third, we weave content on socially-responsible computing into the text. Unlike other efforts that focus on exposing students to ethics or the pitfalls of technology in general, we aim to show students how the constructs and concepts that they are turning into code right now can lead to adverse impacts unless used with care. In keeping with our focus on testing and concrete examples, we introduce several topics by getting students to think about assumptions at the level of concrete data. This material is called out explicitly throughout the book.

Finally, this book is deeply informed by recent and ongoing research results. Our choices of material, order of presentation, programming methods, and more are driven by what we know from the research literature. In many cases, we ourselves are the ones doing the research, so the curriculum and research live in a symbiotic relationship. You can find our papers (some with each other, others not) on our respective pages.

1.5 Who This Book is For

This book is written primarily for students who are in the early stages of computing education at the tertiary level (college or university). However, many—especially the earlier—parts of it are also suitable for secondary education (in the USA, for instance, roughly grades 6–12, or ages 12–18). Indeed, we see a natural continuum between secondary and tertiary education, and think this book can serve as a useful bridge between the two.

1.6 The Structure of This Book

Unlike some other textbooks, this one does not follow a top-down narrative. Rather it has the flow of a conversation, with backtracking. We will often build up programs incrementally, just as a pair of programmers would. We will include mistakes, not because we don’t know better, but because this is the best way for you to learn. Including mistakes makes it impossible for you to read passively: you must instead engage with the material, because you can never be sure of the veracity of what you’re reading.

At the end, you’ll always get to the right answer. This non-linear path is more frustrating in the short term (you will often be tempted to say, “Just tell me the answer, already!”), and it makes the book a poor reference guide (you can’t open up to a random page and be sure what it says is correct). However, that feeling of frustration is the sensation of learning. We don’t know of a way around it.

We use visual formatting to highlight some of these points. Thus, in several places you will encounter this:

Exercise

This is an exercise. Do try it.

This is a traditional textbook exercise. It’s something you need to do on your own. If you’re using this book as part of a course, this may very well have been assigned as homework. In contrast, you will also find exercise-like questions that look like this:

Do Now!

There’s an activity here! Do you see it?

When you get to one of these, stop. Read, think, and formulate an answer before you proceed. You must do this because this is actually an exercise, but the answer is already in the book—most often in the text immediately following (i.e., in the part you’re reading right now)—or is something you can determine for yourself by running a program. If you just read on, you’ll see the answer without having thought about it (or not see it at all, if the instructions are to run a program), so you will get to neither (a) test your knowledge, nor (b) improve your intuitions. In other words, these are additional, explicit attempts to encourage active learning. Ultimately, however, we can only encourage it; it’s up to you to practice it.

Specific strategies for program design and development get highlighted in boxes that look like this:

Strategy: How to ...

Here’s a summary of how to do something.

Finally, we also call out content on socially-responsible computing with visually distinctive regions like this:

Responsible Computing: Did you consider ...

Here are social pitfalls from using material naively.

1.7 Organization of the Material

This book contains four parts:

  1. Foundations: An introduction to programming for beginners that teaches programming and rudimentary data analysis. It introduces core programming concepts through composing images and processing tables, before covering lists, trees, and writing reactive programs, all through a data-centric lens. The notional machine throughout this section is based on substitution.

  2. Algorithms: Covers asymptotic complexity, recurrences, and fundamental graph algorithms.

  3. Programming with State: Covers working with mutable variables and mutable structured data, building up to understanding (and working with) mutable lists and hashtables. This section transitions to Python. It extends testing to cover the nuances of programs with mutation. The notional machine in this section separates the naming environment (here called the directory) from a heap of structured data values.

  4. Advanced Topics: Returns to algorithms topics that build on an understanding of state and stateful data structures.

These parts have been carefully crafted to make sure there are no dependencies from Algorithms to Programming with State. This allows flexibility in offering several different kinds of courses. For instance, we already offer two very different courses by remixing this material, which others could follow:

  • An introductory course can use Foundations and Programming with State (without Algorithms) to cover the data-centric view of computer science and leave students with basic skills in Python.

  • A more advanced course that assumes students already know some beginning functional programming (e.g., from the early parts of How to Design Programs) could start directly in Algorithms, perhaps with select sections of Foundations to cover missing material (such as working with tables). This course could continue into Programming with State, followed by Advanced Topics.

These correspond, respectively, to CSCI 0111 and CSCI 0190 at Brown University. The course pages archive all prior instances of the courses, which include all the assignments and related materials. Readers are welcome to use these in their own courses.

Many of these courses will have entering students who have programmed with state before (in Python, Java, Scratch, or other languages). In our experience, most of these students have been given either vastly incomplete, or outright misleading, explanations of and metaphors for state (e.g., “a variable is a box”). Thus, they have a poor understanding of it beyond the absolute basics, especially when they get to important topics like aliasing. As a result, many of these students have found it both novel and insightful to properly understand how state really works through our notional machine. For that reason, we recommend going through that material slowly and carefully.
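To make the aliasing point concrete, here is a small Python sketch. It is not from the book: the variable names are hypothetical, and it uses only standard Python list behavior. Under a "variable is a box" picture, one would expect `first` and `alias` to hold separate contents. In fact both names refer to one and the same list value, which is what a model that separates names from a heap of data values predicts.

```python
first = [1, 2, 3]        # one list value is created; the name first refers to it
alias = first            # a second name for the SAME list value (no copy is made)
snapshot = list(first)   # a new, separate list value with the same elements

alias.append(4)          # mutate the shared list through the second name

print(first)             # prints: [1, 2, 3, 4]  (the change is visible via first)
print(snapshot)          # prints: [1, 2, 3]     (the separate copy is unaffected)
print(alias is first)    # prints: True          (same value, two names)
print(snapshot is first) # prints: False         (different values, equal at creation)
```

The `is` operator compares identity rather than contents, so it is a direct way to check whether two names are aliases for the same value.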

We of course invite readers to create their own mashups of the chapters within the sections. We would love to hear about others’ designs.

1.8 Our Programming Language Choice

If we wanted to get rich, we’d have written this book entirely in Python. As of this writing, Python is enjoying its instructional-use heyday (just like Java before it, C++ before that, C before that, Pascal earlier, and so on). And there are, indeed, many attractive aspects of Python, not least its presence next to bullet points on job listings. However, we’ve been repeatedly frustrated by Python as an entrypoint into learning programming.

As a result, this book features two programming languages. It starts with a language, called Pyret, that we designed to address our needs and frustrations. It has been expressly designed for the style of programming in this book, so the two can grow in harmony. It draws on Python, but also on many other excellent programming languages. Beginning programmers can therefore rest in the knowledge they are being cared for, while programmers with past acquaintance of the language menagerie, from serpents to dromedaries, should find Pyret familiar and comfortable.

Then, recognizing the value of Python both as a standard language of communication and for its extensive libraries, the Programming with State part of this book explicitly covers Python. Rather than starting from scratch in Python, we present a systematic and gradual transition to it from the earlier material. We believe this will help you learn general programming better than if you had seen only one programming language. We also believe it will help you understand Python better: just as you learn to appreciate your own language, country, or culture better once you’ve stepped outside and been exposed to other ones.

2 Acknowledgments

This book has benefited from the attention of many.

Special thanks to the students at Brown University, who have been drafted into acting as a crucible for every iteration of this book. They have supported it with unusual grace, creating a welcoming and rewarding environment for pedagogic effort. Thanks also to our academic homes—Brown, Northeastern, and UC San Diego—for comfort and encouragement.

The following people have helpfully provided information on typos and other infelicities:

Abhabongse Janthong, Alex Kleiman, Athyuttam Eleti, Benjamin S. Shapiro, Cheng Xie, Danil Braun, Dave Lee, Doug Kearns, Ebube Chuba, Harrison Pincket, Igor Moreno Santos, Iuliu Balibanu, Jason Bennett, Jon Sailor, Josh Paley, Kelechi Ukadike, Kendrick Cole, Marc Smith, Michael Morehouse, Rafał Gwoździński, Raymond Plante, Samuel Ainsworth, Samuel Kortchmar, Noah Tye, frodokomodo (on github).

The following have done the same, but in much greater quantity or depth:

Dorai Sitaram, John Palmer, Kartik Singhal, Kenichi Asai, Lev Litichevskiy.

Even amongst the problem-spotters, one is hors catégorie:

Sorawee Porncharoenwase.

This book is completely dependent on Pyret, and thus on the many people who have created and sustained it.

We thank Matthew Butterick for his help with book styling (though the ultimate style is ours, so don’t blame him!).

Many, many years ago, Alejandro Schäffer introduced SK to the idea of nature as a fat-fingered typist. Alejandro’s fingerprints are over many parts of this book, even if he wouldn’t necessarily approve of what has come of his patient instruction.

We are deeply inspired by the work and ideas of Matthias Felleisen, Matthew Flatt, and Robby Findler. Matthias, in particular, inspired our ideas on program design. Even where we disagree, he continues to engage with and challenge our ideas in ways that force us to grow and improve. Our work is better than it would be in incalculable ways due to his influence.

The chapter on Interactive Games as Reactive Systems is translated from How to Design Worlds, and owes thanks to all the people acknowledged there.

This book is written in Scribble, the authoring tool of choice for the discerning programmer.

We thank cloudconvert for their free conversion tools.
