Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2020/05/29 07:55:47 UTC

[GitHub] [flink] klion26 commented on a change in pull request #12311: [FLINK-17269][docs] Translate new Training Overview to Chinese

klion26 commented on a change in pull request #12311:
URL: https://github.com/apache/flink/pull/12311#discussion_r431590877



##########
File path: docs/training/index.md
##########
@@ -130,7 +130,7 @@ Streams can transport data between two operators in a *one-to-one* (or
 ## Timely Stream Processing
 
 For most streaming applications it is very valuable to be able re-process historic data with the
-same code that is used to process live data -- and to produce deterministic, consistent results,
+same code that is used to process live data and to produce deterministic, consistent results,

Review comment:
       Please drop this change here; it can go in a separate hotfix PR. This PR should only cover translation-related changes.

##########
File path: docs/training/index.zh.md
##########
@@ -29,158 +29,90 @@ under the License.
 * This will be replaced by the TOC
 {:toc}
 
-## Goals and Scope of this Training
+## 本章教程的目标及涵盖范围
 
-This training presents an introduction to Apache Flink that includes just enough to get you started
-writing scalable streaming ETL, analytics, and event-driven applications, while leaving out a lot of
-(ultimately important) details. The focus is on providing straightforward introductions to Flink's
-APIs for managing state and time, with the expectation that having mastered these fundamentals,
-you'll be much better equipped to pick up the rest of what you need to know from the more detailed
-reference documentation. The links at the end of each section will lead you to where you
-can learn more.
+本章教程对 Apache Flink 的基本概念进行了介绍,虽然省略了许多重要细节,但是如果你掌握了本章内容,就足以实现可扩展并行度的 ETL、数据分析以及事件驱动的流式应用程序。本章重点对 Flink API 中的状态管理和时间进行了介绍,掌握了这些基础知识后,你将能更好地从其他详细参考文档中获取和掌握你所需要的知识。每小节结尾都有链接去引导你了解更多内容。
 
-Specifically, you will learn:
+具体来说,你将在本章学习到以下内容:
 
-- how to implement streaming data processing pipelines
-- how and why Flink manages state
-- how to use event time to consistently compute accurate analytics
-- how to build event-driven applications on continuous streams
-- how Flink is able to provide fault-tolerant, stateful stream processing with exactly-once semantics
+- 如何实现流数据处理管道(pipelines)
+- Flink 如何管理状态以及为何需要管理状态
+- 如何使用事件时间(event time)来一致并准确地进行计算分析
+- 如何在源源不断的数据流上构建事件驱动的应用程序
+- Flink 如何提供具有精确一次(exactly-once)计算语义的可容错、有状态流处理
 
-This training focuses on four critical concepts: continuous processing of streaming data, event
-time, stateful stream processing, and state snapshots. This page introduces these concepts.
+本章教程着重介绍四个概念:源源不断的流式数据处理、事件时间、有状态流处理和状态快照。基本概念介绍如下。
 
-{% info Note %} Accompanying this training is a set of hands-on exercises that will
-guide you through learning how to work with the concepts being presented. A link to the relevant
-exercise is provided at the end of each section.
+{% info Note %} 每小节教程都有实践练习部分去引导你如何在程序中使用其所述的概念,并在小节结尾都提供了相关实践练习的代码链接。
 
 {% top %}
 
-## Stream Processing
+## 流处理
 
-Streams are data's natural habitat. Whether it is events from web servers, trades from a stock
-exchange, or sensor readings from a machine on a factory floor, data is created as part of a stream.
-But when you analyze data, you can either organize your processing around _bounded_ or _unbounded_
-streams, and which of these paradigms you choose has profound consequences.
+在自然环境中,数据的产生原本就是流式的。无论是来自 Web 服务器的事件数据,证券交易所的交易数据,还是来自工厂车间机器上的传感器数据,其数据都是流式的进行生成。但是当你分析数据时,可以围绕 _有界流_(_bounded_)或 _无界流_(_unbounded_)两种模型来组织处理数据,当然,选择不同的模型,程序的执行和处理方式也都会不同。
 
 <img src="{{ site.baseurl }}/fig/bounded-unbounded.png" alt="Bounded and unbounded streams" class="offset" width="90%" />
 
-**Batch processing** is the paradigm at work when you process a bounded data stream. In this mode of
-operation you can choose to ingest the entire dataset before producing any results, which means that
-it is possible, for example, to sort the data, compute global statistics, or produce a final report
-that summarizes all of the input.
+**批处理**是有界数据流处理的范例。在这种模式下,你可以选择在计算结果输出之前输入整个数据集,这也就意味着你可以对整个数据集的数据进行排序、统计或汇总计算后再输出结果。
 
-**Stream processing**, on the other hand, involves unbounded data streams. Conceptually, at least,
-the input may never end, and so you are forced to continuously process the data as it arrives. 
+**流处理**正相反,其包括了无界数据流。至少理论上来说,它的数据输入永远不会结束,因此程序必须持续不断地对到达的数据进行处理。
 
-In Flink, applications are composed of **streaming dataflows** that may be transformed by
-user-defined **operators**. These dataflows form directed graphs that start with one or more
-**sources**, and end in one or more **sinks**.
+在 Flink 中,应用程序由用户自定义**算子**转换而来的**流式 dataflows** 所组成。这些流式 dataflows 形成了有向图,其可以以一个或多个**源**(source)开始,并以一个或多个**汇**(sink)结束。
 
 <img src="{{ site.baseurl }}/fig/program_dataflow.svg" alt="A DataStream program, and its dataflow." class="offset" width="80%" />
 
-Often there is a one-to-one correspondence between the transformations in the program and the
-operators in the dataflow. Sometimes, however, one transformation may consist of multiple operators.
+通常,程序代码中的 transformation 和 dataflow 中的算子(operator)之间是一一对应的。但有时也会出现一个 transformation 包含多个算子的情况,如上图所示。
 
-An application may consume real-time data from streaming sources such as message queues or
-distributed logs, like Apache Kafka or Kinesis. But flink can also consume bounded, historic data
-from a variety of data sources. Similarly, the streams of results being produced by a Flink
-application can be sent to a wide variety of systems that can be connected as sinks.
+Flink 应用程序可以消费来自消息队列或分布式日志这类流式数据源(例如 Apache Kafka 或 Kinesis)的实时数据,也可以从各种的数据源中消费有界的历史数据。同样,Flink 应用程序生成的结果流也可以发送到各种可以连接到程序中的数据汇中。
 
 <img src="{{ site.baseurl }}/fig/flink-application-sources-sinks.png" alt="Flink application with sources and sinks" class="offset" width="90%" />
 
-### Parallel Dataflows
+### 并行 Dataflows
 
-Programs in Flink are inherently parallel and distributed. During execution, a
-*stream* has one or more **stream partitions**, and each *operator* has one or
-more **operator subtasks**. The operator subtasks are independent of one
-another, and execute in different threads and possibly on different machines or
-containers.
+Flink 程序本质上是分布式并行程序。在程序执行期间,一个流有一个或多个**流分区**(Stream Partition),每个算子有一个或多个**算子子任务**(Operator Subtask)。算子子任务彼此独立,并在不同的线程中运行,或在不同的计算机或容器中运行。

Review comment:
       一个*流*有一个...
   每个*算子*有一个
   Would `算子子任务彼此独立` -> `每个子任务彼此独立` read better? 算子 is left out there because the stream has already been described just before.
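
For readers following the thread: the stream/operator/subtask terminology under discussion maps to code roughly as below. A minimal sketch (Flink DataStream API, Java; class name and values are invented, not from the PR):

```java
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class ParallelismSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setParallelism(4); // default number of subtasks per operator

        env.fromElements("a", "b", "c")        // a (non-parallel) source
           .map(String::toUpperCase)
           .setParallelism(2)                  // this operator runs as 2 independent subtasks
           .print()
           .setParallelism(1);                 // the sink runs as a single subtask

        env.execute("parallelism sketch");
    }
}
```

Each subtask executes in its own thread, possibly on a different machine or container, which is what the paragraph under review describes.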

##########
File path: docs/training/index.zh.md
##########
@@ -29,158 +29,90 @@ under the License.
[...]
-An application may consume real-time data from streaming sources such as message queues or
-distributed logs, like Apache Kafka or Kinesis. But flink can also consume bounded, historic data
-from a variety of data sources. Similarly, the streams of results being produced by a Flink
-application can be sent to a wide variety of systems that can be connected as sinks.
+Flink 应用程序可以消费来自消息队列或分布式日志这类流式数据源(例如 Apache Kafka 或 Kinesis)的实时数据,也可以从各种的数据源中消费有界的历史数据。同样,Flink 应用程序生成的结果流也可以发送到各种可以连接到程序中的数据汇中。

Review comment:
       Would `也可以发送到各种可以连接到程序中的数据汇中` -> `也可以发送到各种数据汇中` read better?
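
To make the sources/sinks sentence concrete: a sketch of a pipeline that reads from and writes to Kafka. This assumes the flink-connector-kafka dependency and its API at the time; topic names and addresses are made up:

```java
import java.util.Properties;

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer;

public class KafkaPipelineSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092");

        // source: real-time events from a Kafka topic
        DataStream<String> events = env.addSource(
                new FlinkKafkaConsumer<>("events", new SimpleStringSchema(), props));

        // sink: the result stream flows out to another topic
        events.addSink(
                new FlinkKafkaProducer<>("results", new SimpleStringSchema(), props));

        env.execute("kafka pipeline sketch");
    }
}
```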

##########
File path: docs/training/index.zh.md
##########
@@ -29,158 +29,90 @@ under the License.
[...]
-The number of operator subtasks is the **parallelism** of that particular
-operator.
-Different operators of the same program may have different levels of
-parallelism.
+算子子任务数就是其对应算子的**并行度**。在同一程序中,不同算子也可能具有不同的并行度。
 
 <img src="{{ site.baseurl }}/fig/parallel_dataflow.svg" alt="A parallel dataflow" class="offset" width="80%" />
 
-Streams can transport data between two operators in a *one-to-one* (or
-*forwarding*) pattern, or in a *redistributing* pattern:
-
-  - **One-to-one** streams (for example between the *Source* and the *map()*
-    operators in the figure above) preserve the partitioning and ordering of
-    the elements. That means that subtask[1] of the *map()* operator will see
-    the same elements in the same order as they were produced by subtask[1] of
-    the *Source* operator.
-
-  - **Redistributing** streams (as between *map()* and *keyBy/window* above, as
-    well as between *keyBy/window* and *Sink*) change the partitioning of
-    streams. Each *operator subtask* sends data to different target subtasks,
-    depending on the selected transformation. Examples are *keyBy()* (which
-    re-partitions by hashing the key), *broadcast()*, or *rebalance()* (which
-    re-partitions randomly). In a *redistributing* exchange the ordering among
-    the elements is only preserved within each pair of sending and receiving
-    subtasks (for example, subtask[1] of *map()* and subtask[2] of
-    *keyBy/window*). So, for example, the redistribution between the keyBy/window and
-    the Sink operators shown above introduces non-determinism regarding the 
-    order in which the aggregated results for different keys arrive at the Sink.
+Flink 算子之间可以通过*一对一*(*直传*)模式或*重新分发*模式传输数据:
+
+  - **一对一**模式(例如上图中的 *Source* 和 *map()* 算子之间)可以保留元素的分区和顺序信息。这意味着 *map()* 算子的 subtask[1] 输入的数据以及其顺序与 *Source* 算子的 subtask[1] 输出的数据和顺序完全相同。

Review comment:
       My understanding of `保留元素的分区和顺序信息` is that the downstream partitioning is guaranteed to match the upstream (data from a given partition goes to the same downstream partition), and the element order is preserved as well. In the current wording, preserving the order information reads fine, but is there a better way to phrase preserving the partition information?
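
For context, the two exchange patterns being translated look roughly like this in code. A minimal sketch (Flink DataStream API, Java; names and values invented, not from the PR):

```java
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class ExchangeSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        DataStream<Tuple2<String, Integer>> pairs = env
                .fromElements(Tuple2.of("a", 1), Tuple2.of("b", 1), Tuple2.of("a", 1));

        // One-to-one (forwarding): map() sees its input partition unchanged, in order.
        DataStream<Tuple2<String, Integer>> doubled = pairs
                .map(t -> Tuple2.of(t.f0, t.f1 * 2))
                .returns(Types.TUPLE(Types.STRING, Types.INT)); // lambdas need a type hint

        // Redistributing: keyBy() hash-partitions by key, so elements move between
        // subtasks; order is only preserved per sender/receiver subtask pair.
        doubled.keyBy(t -> t.f0).sum(1).print();

        env.execute("exchange pattern sketch");
    }
}
```

Swapping the `keyBy()` for `rebalance()` or `broadcast()` gives the other redistributing variants mentioned in the bullet.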

##########
File path: docs/training/index.zh.md
##########
@@ -29,158 +29,90 @@ under the License.
[...]
+  - **重新分发**模式(例如上图中的 *map()* 和 *keyBy/window* 之间,以及 *keyBy/window* 和 *Sink* 之间)则会更改数据所在的流分区。当你在程序中选择使用不同的 *transformation*,每个*算子子任务*也会根据不同的 transformation 将数据发送到不同的目标子任务。例如以下这几种 transformation 和其对应分发数据的模式:*keyBy()*(通过散列键重新分区)、*broadcast()*(广播)或 *rebalance()*(随机重新分发)。在*重新分发*数据的过程中,元素顺序信息只有在每对输出和输入子任务之间才会被保留(例如,*keyBy/window* 的 subtask[2] 接收到的 *map()* 的 subtask[1] 中的元素都是有序的)。因此,上图所示的 *keyBy/window* 和 *Sink* 算子之间数据的重新分发时,不同键(key)的聚合结果到达 Sink 的顺序是不确定的。
 
 {% top %}
 
-## Timely Stream Processing
+## 实时流处理
 
-For most streaming applications it is very valuable to be able re-process historic data with the
-same code that is used to process live data -- and to produce deterministic, consistent results,
-regardless.
+对于大多数流数据处理应用程序而言,能够使用处理实时数据的代码重新处理历史数据并产生确定并一致的结果非常有价值。
 
-It can also be crucial to pay attention to the order in which events occurred, rather than the order
-in which they are delivered for processing, and to be able to reason about when a set of events is
-(or should be) complete. For example, consider the set of events involved in an e-commerce
-transaction, or financial trade.
+在处理流式数据时,我们通常更需要关注事件本身发生的顺序而不是事件被传输以及处理的顺序,因为这能够帮助我们推理出一组事件(事件集合)是何时发生以及结束的。例如电子商务交易或金融交易中涉及到的事件集合。
 
-These requirements for timely stream processing can be met by using event time timestamps that are
-recorded in the data stream, rather than using the clocks of the machines processing the data.
+为了满足上述这类的实时流处理场景,我们通常会使用记录在数据流中的事件时间的时间戳,而不是处理数据的机器时钟的时间戳。
 
 {% top %}
 
-## Stateful Stream Processing
+## 有状态流处理
 
-Flink's operations can be stateful. This means that how one event is handled can depend on the
-accumulated effect of all the events that came before it. State may be used for something simple,
-such as counting events per minute to display on a dashboard, or for something more complex, such as
-computing features for a fraud detection model.
+Flink 中的算子可以是有状态的。这意味着如何处理一个事件可能取决于该事件之前所有事件数据的累积结果。Flink 中的状态不仅可以用于简单的场景(例如统计仪表板上每分钟显示的数据),也可以用于复杂的场景(例如训练作弊检测模型)。
 
-A Flink application is run in parallel on a distributed cluster. The various parallel instances of a
-given operator will execute independently, in separate threads, and in general will be running on
-different machines.
+Flink 应用程序可以在分布式群集上并行运行,其中每个算子的各个并行实例会在单独的线程中独立运行,并且通常情况下是会在不同的机器上运行。
 
-The set of parallel instances of a stateful operator is effectively a sharded key-value store. Each
-parallel instance is responsible for handling events for a specific group of keys, and the state for
-those keys is kept locally.
+有状态算子的并行实例组在存储其对应状态时通常是按照键(key)进行分片存储的。每个并行实例算子负责处理一组特定键的事件数据,并且这组键对应的状态会保存在本地。
 
-The diagram below shows a job running with a parallelism of two across the first three operators in
-the job graph, terminating in a sink that has a parallelism of one. The third operator is stateful,
-and you can see that a fully-connected network shuffle is occurring between the second and third
-operators. This is being done to partition the stream by some key, so that all of the events that
-need to be processed together, will be.
+如下图的 Flink 作业,其前三个算子的并行度为2,最后一个 sink 算子的并行度为1,其中第三个算子是有状态的,并且你可以看到第二个算子和第三个算子之间是全互联的(fully-connected),它们之间正在通过网络进行数据分发。通常情况下,实现这种类型的 Flink 程序是为了通过某些键对数据流进行分区,以便将需要一起处理的事件进行汇合,然后做统一计算处理。
 
 <img src="{{ site.baseurl }}/fig/parallel-job.png" alt="State is sharded" class="offset" width="65%" />
 
-State is always accessed locally, which helps Flink applications achieve high throughput and
-low-latency. You can choose to keep state on the JVM heap, or if it is too large, in efficiently
-organized on-disk data structures. 
+Flink 应用程序访问状态时始终访问的是本地状态,因为这有助于其提高吞吐量和降低延迟。通常情况下 Flink 应用程序都是将状态存储在 JVM 堆上,但如果状态太大,我们也可以选择将其以结构化数据格式存储在高速磁盘中。

Review comment:
       Read on its own, `Flink 应用程序访问状态时始终访问的是本地状态` could be taken to mean that Flink has both `本地状态` (local state) and `远程状态` (remote state) but only ever accesses the local one. Could you reword this? In Flink today, all state is kept locally.
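
To make the "state is sharded by key and kept locally" point concrete, a minimal sketch of a keyed-state counter (Flink keyed state API, Java; class and state names invented, not from the PR). The `ValueState` lives on whichever subtask owns the current key:

```java
import org.apache.flink.api.common.functions.RichFlatMapFunction;
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.util.Collector;

// Must run downstream of keyBy(); each parallel instance then handles a
// disjoint group of keys and keeps the state for those keys locally.
public class CountPerKey extends RichFlatMapFunction<String, Long> {
    private transient ValueState<Long> count;

    @Override
    public void open(Configuration parameters) {
        count = getRuntimeContext().getState(
                new ValueStateDescriptor<>("count", Types.LONG));
    }

    @Override
    public void flatMap(String event, Collector<Long> out) throws Exception {
        Long current = count.value();       // local read, scoped to the current key
        long next = (current == null) ? 1L : current + 1L;
        count.update(next);                 // local write, captured by state snapshots
        out.collect(next);
    }
}
```

Usage would be along the lines of `stream.keyBy(e -> e).flatMap(new CountPerKey())`.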

##########
File path: docs/training/index.zh.md
##########
@@ -29,158 +29,90 @@ under the License.
[...]
 <img src="{{ site.baseurl }}/fig/local-state.png" alt="State is local" class="offset" width="90%" />
 
 {% top %}
 
-## Fault Tolerance via State Snapshots
+## 通过状态快照实现的容错
 
-Flink is able to provide fault-tolerant, exactly-once semantics through a combination of state
-snapshots and stream replay. These snapshots capture the entire state of the distributed pipeline,
-recording offsets into the input queues as well as the state throughout the job graph that has
-resulted from having ingested the data up to that point. When a failure occurs, the sources are
-rewound, the state is restored, and processing is resumed. As depicted above, these state snapshots
-are captured asynchronously, without impeding the ongoing processing.
+通过状态快照和流重放两种方式的组合,Flink 能够提供可容错的,精确一次计算的语义。这些状态快照在执行时会获取并存储分布式 pipeline 中整体的状态,它会将数据源中消费数据的偏移量记录下来,并将整个 job graph 中算子获取到该数据(记录的偏移量对应的数据)时的状态记录并存储下来。当发生故障时,Flink 作业会恢复上次存储的状态,倒转数据源从状态中记录的上次消费的偏移量开始重新进行消费处理。而且状态快照在执行时会异步获取状态并存储,并不会阻塞正在进行的数据处理逻辑。

Review comment:
       Would `倒转数据源` -> `重置数据源` (or `恢复`) be better?
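
For context, the snapshot-and-replay behavior being translated is enabled like this. A minimal sketch (Flink DataStream API, Java; the interval is an invented example value):

```java
import org.apache.flink.streaming.api.CheckpointingMode;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class CheckpointSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Snapshot the distributed pipeline state (including source offsets)
        // asynchronously every 10 seconds; on failure, Flink restores the latest
        // snapshot and resets replayable sources to the recorded offsets.
        env.enableCheckpointing(10_000L, CheckpointingMode.EXACTLY_ONCE);
    }
}
```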

##########
File path: docs/training/index.zh.md
##########
@@ -29,158 +29,90 @@ under the License.
 * This will be replaced by the TOC
 {:toc}
 
-## Goals and Scope of this Training
+## 本章教程的目标及涵盖范围
 
-This training presents an introduction to Apache Flink that includes just enough to get you started
-writing scalable streaming ETL, analytics, and event-driven applications, while leaving out a lot of
-(ultimately important) details. The focus is on providing straightforward introductions to Flink's
-APIs for managing state and time, with the expectation that having mastered these fundamentals,
-you'll be much better equipped to pick up the rest of what you need to know from the more detailed
-reference documentation. The links at the end of each section will lead you to where you
-can learn more.
+本章教程对 Apache Flink 的基本概念进行了介绍,虽然省略了许多重要细节,但是如果你掌握了本章内容,就足以实现可扩展并行度的 ETL、数据分析以及事件驱动的流式应用程序。本章重点对 Flink API 中的状态管理和时间进行了介绍,掌握了这些基础知识后,你将能更好地从其他详细参考文档中获取和掌握你所需要的知识。每小节结尾都有链接去引导你了解更多内容。
 
-Specifically, you will learn:
+具体来说,你将在本章学习到以下内容:
 
-- how to implement streaming data processing pipelines
-- how and why Flink manages state
-- how to use event time to consistently compute accurate analytics
-- how to build event-driven applications on continuous streams
-- how Flink is able to provide fault-tolerant, stateful stream processing with exactly-once semantics
+- 如何实现流数据处理管道(pipelines)
+- Flink 如何管理状态以及为何需要管理状态
+- 如何使用事件时间(event time)来一致并准确地进行计算分析
+- 如何在源源不断的数据流上构建事件驱动的应用程序
+- Flink 如何提供具有精确一次(exactly-once)计算语义的可容错、有状态流处理
 
-This training focuses on four critical concepts: continuous processing of streaming data, event
-time, stateful stream processing, and state snapshots. This page introduces these concepts.
+本章教程着重介绍四个概念:源源不断的流式数据处理、事件时间、有状态流处理和状态快照。基本概念介绍如下。
 
-{% info Note %} Accompanying this training is a set of hands-on exercises that will
-guide you through learning how to work with the concepts being presented. A link to the relevant
-exercise is provided at the end of each section.
+{% info Note %} 每小节教程都有实践练习部分去引导你如何在程序中使用其所述的概念,并在小节结尾都提供了相关实践练习的代码链接。
 
 {% top %}
 
-## Stream Processing
+## 流处理
 
-Streams are data's natural habitat. Whether it is events from web servers, trades from a stock
-exchange, or sensor readings from a machine on a factory floor, data is created as part of a stream.
-But when you analyze data, you can either organize your processing around _bounded_ or _unbounded_
-streams, and which of these paradigms you choose has profound consequences.
+在自然环境中,数据的产生原本就是流式的。无论是来自 Web 服务器的事件数据,证券交易所的交易数据,还是来自工厂车间机器上的传感器数据,其数据都是流式的进行生成。但是当你分析数据时,可以围绕 _有界流_(_bounded_)或 _无界流_(_unbounded_)两种模型来组织处理数据,当然,选择不同的模型,程序的执行和处理方式也都会不同。

Review comment:
       Would `其数据都是流式的进行生成` -> `其数据都是流式的` read better?
   `也都会不同` -> `也会不同`

##########
File path: docs/concepts/index.zh.md
##########
@@ -27,66 +27,24 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-The [Hands-on Training]({% link training/index.zh.md %}) explains the basic concepts
-of stateful and timely stream processing that underlie Flink's APIs, and provides examples of how
-these mechanisms are used in applications. Stateful stream processing is introduced in the context
-of [Data Pipelines & ETL]({% link training/etl.zh.md %}#stateful-transformations)
-and is further developed in the section on [Fault Tolerance]({% link
-training/fault_tolerance.zh.md %}). Timely stream processing is introduced in the section on
-[Streaming Analytics]({% link training/streaming_analytics.zh.md %}).
+[实践练习]({% link training/index.zh.md %})章节介绍了作为 Flink API 根基的有状态实时流处理的基本概念,并且举例说明了如何在 Flink 应用中使用这些机制。其中 [Data Pipelines & ETL]({% link training/etl.zh.md %}#stateful-transformations) 小节介绍了有状态流处理的概念,并且在 [Fault Tolerance]({% link training/fault_tolerance.zh.md %}) 小节中进行了深入介绍。[Streaming Analytics]({% link training/streaming_analytics.zh.md %}) 小节介绍了实时流处理的概念。
 
-This _Concepts in Depth_ section provides a deeper understanding of how Flink's architecture and runtime 
-implement these concepts.
+本章将深入分析 Flink 分布式运行时架构是怎样实现的这些概念。

Review comment:
       `本章将深入分析 Flink 分布式运行时架构是怎样实现的这些概念` -> `本章将深入分析 Flink 分布式运行时架构如何实现这些概念`

##########
File path: docs/training/index.zh.md
##########
@@ -29,158 +29,90 @@ under the License.
[...]
-**Stream processing**, on the other hand, involves unbounded data streams. Conceptually, at least,
-the input may never end, and so you are forced to continuously process the data as it arrives. 
+**流处理**正相反,其包括了无界数据流。至少理论上来说,它的数据输入永远不会结束,因此程序必须持续不断地对到达的数据进行处理。

Review comment:
       Would `其包括了无界数据流` -> `涉及无界数据流` be better?
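
For context, the bounded/unbounded distinction being translated shows up directly in which sources a program uses. A minimal sketch (Flink DataStream API, Java; path, host, and port are made up):

```java
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class BoundedVsUnboundedSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Bounded: the file ends, so the job can finish and emit final results.
        DataStream<String> bounded = env.readTextFile("file:///tmp/input.txt");

        // Unbounded: a socket stream never ends; processing must be continuous.
        DataStream<String> unbounded = env.socketTextStream("localhost", 9999);

        bounded.print();
        unbounded.print();
        env.execute("bounded vs unbounded sketch");
    }
}
```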

##########
File path: docs/training/index.zh.md
##########
@@ -29,158 +29,90 @@ under the License.
[...]
-In Flink, applications are composed of **streaming dataflows** that may be transformed by
-user-defined **operators**. These dataflows form directed graphs that start with one or more
-**sources**, and end in one or more **sinks**.
+在 Flink 中,应用程序由用户自定义**算子**转换而来的**流式 dataflows** 所组成。这些流式 dataflows 形成了有向图,其可以以一个或多个**源**(source)开始,并以一个或多个**汇**(sink)结束。

Review comment:
       Should dataflow be translated here as well?
   `其可以以一个或多个**源**(source)开始` -> `以一个或多个**源**(source)开始`

##########
File path: docs/training/index.zh.md
##########
@@ -29,158 +29,90 @@ under the License.
[...]
-{% info Note %} Accompanying this training is a set of hands-on exercises that will
-guide you through learning how to work with the concepts being presented. A link to the relevant
-exercise is provided at the end of each section.
+{% info Note %} 每小节教程都有实践练习部分去引导你如何在程序中使用其所述的概念,并在小节结尾都提供了相关实践练习的代码链接。

Review comment:
       Would `每小节教程都有实践练习部分去引导` -> `每小节教程都有实践练习引导` read better?

-application can be sent to a wide variety of systems that can be connected as sinks.
+Flink 应用程序可以消费来自消息队列或分布式日志这类流式数据源(例如 Apache Kafka 或 Kinesis)的实时数据,也可以从各种的数据源中消费有界的历史数据。同样,Flink 应用程序生成的结果流也可以发送到各种可以连接到程序中的数据汇中。
 
 <img src="{{ site.baseurl }}/fig/flink-application-sources-sinks.png" alt="Flink application with sources and sinks" class="offset" width="90%" />
 
-### Parallel Dataflows
+### 并行 Dataflows
 
-Programs in Flink are inherently parallel and distributed. During execution, a
-*stream* has one or more **stream partitions**, and each *operator* has one or
-more **operator subtasks**. The operator subtasks are independent of one
-another, and execute in different threads and possibly on different machines or
-containers.
+Flink 程序本质上是分布式并行程序。在程序执行期间,一个流有一个或多个**流分区**(Stream Partition),每个算子有一个或多个**算子子任务**(Operator Subtask)。算子子任务彼此独立,并在不同的线程中运行,或在不同的计算机或容器中运行。
 
-The number of operator subtasks is the **parallelism** of that particular
-operator.
-Different operators of the same program may have different levels of
-parallelism.
+算子子任务数就是其对应算子的**并行度**。在同一程序中,不同算子也可能具有不同的并行度。
 
 <img src="{{ site.baseurl }}/fig/parallel_dataflow.svg" alt="A parallel dataflow" class="offset" width="80%" />
 
-Streams can transport data between two operators in a *one-to-one* (or
-*forwarding*) pattern, or in a *redistributing* pattern:
-
-  - **One-to-one** streams (for example between the *Source* and the *map()*
-    operators in the figure above) preserve the partitioning and ordering of
-    the elements. That means that subtask[1] of the *map()* operator will see
-    the same elements in the same order as they were produced by subtask[1] of
-    the *Source* operator.
-
-  - **Redistributing** streams (as between *map()* and *keyBy/window* above, as
-    well as between *keyBy/window* and *Sink*) change the partitioning of
-    streams. Each *operator subtask* sends data to different target subtasks,
-    depending on the selected transformation. Examples are *keyBy()* (which
-    re-partitions by hashing the key), *broadcast()*, or *rebalance()* (which
-    re-partitions randomly). In a *redistributing* exchange the ordering among
-    the elements is only preserved within each pair of sending and receiving
-    subtasks (for example, subtask[1] of *map()* and subtask[2] of
-    *keyBy/window*). So, for example, the redistribution between the keyBy/window and
-    the Sink operators shown above introduces non-determinism regarding the 
-    order in which the aggregated results for different keys arrive at the Sink.
+Flink 算子之间可以通过*一对一*(*直传*)模式或*重新分发*模式传输数据:
+
+  - **一对一**模式(例如上图中的 *Source* 和 *map()* 算子之间)可以保留元素的分区和顺序信息。这意味着 *map()* 算子的 subtask[1] 输入的数据以及其顺序与 *Source* 算子的 subtask[1] 输出的数据和顺序完全相同。
+
+  - **重新分发**模式(例如上图中的 *map()* 和 *keyBy/window* 之间,以及 *keyBy/window* 和 *Sink* 之间)则会更改数据所在的流分区。当你在程序中选择使用不同的 *transformation*,每个*算子子任务*也会根据不同的 transformation 将数据发送到不同的目标子任务。例如以下这几种 transformation 和其对应分发数据的模式:*keyBy()*(通过散列键重新分区)、*broadcast()*(广播)或 *rebalance()*(随机重新分发)。在*重新分发*数据的过程中,元素顺序信息只有在每对输出和输入子任务之间才会被保留(例如,*keyBy/window* 的 subtask[2] 接收到的 *map()* 的 subtask[1] 中的元素都是有序的)。因此,上图所示的 *keyBy/window* 和 *Sink* 算子之间数据的重新分发时,不同键(key)的聚合结果到达 Sink 的顺序是不确定的。

Review comment:
       Personally, in `元素顺序信息只有在每对输出和输入子任务之间才会被保留`, the wording `顺序信息被保留` reads as if `顺序信息` were an `attribute` being preserved. Could we find a better phrasing, e.g. `保证顺序信息`?
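
   To make the ordering point concrete, here is a small sketch of a redistributing exchange (class name and sample data are illustrative, not from the PR). With parallelism above one, `keyBy()` hash-partitions records so that all records for a key land on one downstream subtask, while the arrival order of results for *different* keys at the sink is not deterministic:

   ```java
   import org.apache.flink.api.java.tuple.Tuple2;
   import org.apache.flink.streaming.api.datastream.DataStream;
   import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

   public class RedistributionSketch {
       public static void main(String[] args) throws Exception {
           StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
           env.setParallelism(2); // two subtasks per operator

           DataStream<Tuple2<String, Integer>> events = env.fromElements(
                   Tuple2.of("a", 1), Tuple2.of("b", 1), Tuple2.of("a", 1));

           events
               // redistributing exchange: hash-partition by key, so every
               // record with the same key reaches the same downstream subtask
               .keyBy(t -> t.f0)
               .sum(1)
               // the order in which per-key results reach the sink is
               // non-deterministic, as the paragraph under review notes
               .print();

           env.execute("redistribution sketch");
       }
   }
   ```

   Running this twice may interleave the printed per-key sums differently, which is exactly the non-determinism the translated paragraph describes.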




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org