Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2020/07/03 13:11:48 UTC

[GitHub] [flink] klion26 commented on a change in pull request #12237: [FLINK-17290] [chinese-translation, Documentation / Training] Transla…

klion26 commented on a change in pull request #12237:
URL: https://github.com/apache/flink/pull/12237#discussion_r449562002



##########
File path: docs/learn-flink/streaming_analytics.zh.md
##########
@@ -29,123 +29,86 @@ under the License.
 
 ## Event Time and Watermarks
 
-### Introduction
+<a name="Introduction"></a>
+### 概要
 
-Flink explicitly supports three different notions of time:
+Flink 明确支持以下三种时间语义:
 
-* _event time:_ the time when an event occurred, as recorded by the device producing (or storing) the event
+* _事件时间(event time):_ 事件产生的时间,记录的是设备生产(或者存储)事件的时间
 
-* _ingestion time:_ a timestamp recorded by Flink at the moment it ingests the event
+* _摄取时间(ingestion time):_ Flink 读取事件时记录的时间
 
-* _processing time:_ the time when a specific operator in your pipeline is processing the event
+* _处理时间(processing time):_ Flink pipeline 中具体算子处理事件的时间
 
-For reproducible results, e.g., when computing the maximum price a stock reached during the first
-hour of trading on a given day, you should use event time. In this way the result won't depend on
-when the calculation is performed. This kind of real-time application is sometimes performed using
-processing time, but then the results are determined by the events that happen to be processed
-during that hour, rather than the events that occurred then. Computing analytics based on processing
-time causes inconsistencies, and makes it difficult to re-analyze historic data or test new
-implementations.
+为了获得可重现的结果,例如在计算过去的特定一天里第一个小时股票的最高价格时,我们应该使用事件时间。这样的话,无论什么时间去计算都不会影响输出结果。然而如果使用 processing time 的话,实时应用程序的结果是由程序运行的时间所决定。多次运行基于 processing time 的实时程序,可能得到的结果都不想同,也可能会导致再次分析历史数据或者测试新代码变得异常困难。

Review comment:
       ```suggestion
   为了获得可重现的结果,例如在计算过去的特定一天里第一个小时股票的最高价格时,我们应该使用事件时间。这样的话,无论什么时间去计算都不会影响输出结果。然而如果使用 processing time 的话,实时应用程序的结果是由程序运行的时间所决定。多次运行基于 processing time 的实时程序,可能得到的结果都不相同,也可能会导致再次分析历史数据或者测试新代码变得异常困难。
   ```

##########
File path: docs/learn-flink/streaming_analytics.zh.md
##########
@@ -29,123 +29,86 @@ under the License.
 
 ## Event Time and Watermarks
 
-### Introduction
+<a name="Introduction"></a>
+### 概要
 
-Flink explicitly supports three different notions of time:
+Flink 明确支持以下三种时间语义:
 
-* _event time:_ the time when an event occurred, as recorded by the device producing (or storing) the event
+* _事件时间(event time):_ 事件产生的时间,记录的是设备生产(或者存储)事件的时间
 
-* _ingestion time:_ a timestamp recorded by Flink at the moment it ingests the event
+* _摄取时间(ingestion time):_ Flink 读取事件时记录的时间
 
-* _processing time:_ the time when a specific operator in your pipeline is processing the event
+* _处理时间(processing time):_ Flink pipeline 中具体算子处理事件的时间
 
-For reproducible results, e.g., when computing the maximum price a stock reached during the first
-hour of trading on a given day, you should use event time. In this way the result won't depend on
-when the calculation is performed. This kind of real-time application is sometimes performed using
-processing time, but then the results are determined by the events that happen to be processed
-during that hour, rather than the events that occurred then. Computing analytics based on processing
-time causes inconsistencies, and makes it difficult to re-analyze historic data or test new
-implementations.
+为了获得可重现的结果,例如在计算过去的特定一天里第一个小时股票的最高价格时,我们应该使用事件时间。这样的话,无论什么时间去计算都不会影响输出结果。然而如果使用 processing time 的话,实时应用程序的结果是由程序运行的时间所决定。多次运行基于 processing time 的实时程序,可能得到的结果都不想同,也可能会导致再次分析历史数据或者测试新代码变得异常困难。
 
-### Working with Event Time
+<a name="Working-with-Event-Time"></a>
+### 使用 Event Time
 
-By default, Flink will use processing time. To change this, you can set the Time Characteristic:
+Flink 在默认情况下是使用处理时间。也可以通过下面配置来告诉 Flink 选择哪种时间语义:
 
 {% highlight java %}
 final StreamExecutionEnvironment env =
     StreamExecutionEnvironment.getExecutionEnvironment();
 env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);
 {% endhighlight %}
 
-If you want to use event time, you will also need to supply a Timestamp Extractor and Watermark
-Generator that Flink will use to track the progress of event time. This will be covered in the
-section below on [Working with Watermarks]({% link
-learn-flink/streaming_analytics.zh.md %}#working-with-watermarks), but first we should explain what
-watermarks are.
+如果想要使用事件时间,需要额外给 Flink 提供一个时间戳提取器和 Watermark 生成器,Flink 将使用它们来跟踪事件时间的进度。这将在选节[使用 Watermarks]({% link learn-flink/streaming_analytics.zh.md %}#Working-with-Watermarks) 中介绍,但是首先我们需要解释一下 watermarks 是什么。
 
 ### Watermarks
 
-Let's work through a simple example that will show why watermarks are needed, and how they work.
+让我们通过一个简单的示例来演示为什么需要 watermarks 及其工作方式。
 
-In this example you have a stream of timestamped events that arrive somewhat out of order, as shown
-below. The numbers shown are timestamps that indicate when these events actually occurred. The first
-event to arrive happened at time 4, and it is followed by an event that happened earlier, at time 2,
-and so on:
+在此示例中,我们将看到带有混乱时间戳的事件流,如下所示。显示的数字表达的是这些事件实际发生时间的时间戳。到达的第一个事件发生在时间 4,随后发生的事件发生在更早的时间 2,依此类推:
 
 <div class="text-center" style="font-size: x-large; word-spacing: 0.5em; margin: 1em 0em;">
 ··· 23 19 22 24 21 14 17 13 12 15 9 11 7 2 4 →
 </div>
 
-Now imagine that you are trying create a stream sorter. This is meant to be an application that
-processes each event from a stream as it arrives, and emits a new stream containing the same events,
-but ordered by their timestamps.
+假设我们要对数据流排序,我们想要达到的目的是:应用程序应该在数据流里的事件到达时就有一个算子(我们暂且称之为排序)开始处理事件,这个算子所输出的流是按照时间戳排序好的。
 
-Some observations:
+让我们重新审视这些数据:
 
-(1) The first element your stream sorter sees is the 4, but you can't just immediately release it as
-the first element of the sorted stream. It may have arrived out of order, and an earlier event might
-yet arrive. In fact, you have the benefit of some god-like knowledge of this stream's future, and
-you can see that your stream sorter should wait at least until the 2 arrives before producing any
-results.
+(1) 我们的排序器第一个看到的数据是 4,但是我们不能立即将其作为已排序流的第一个元素释放。因为我们并不能确定它是有序的,并且较早的事件有可能并未到达。事实上,如果站在上帝视角,我们知道,必须要等到 2 到来时,排序器才可以有事件输出。
 
-*Some buffering, and some delay, is necessary.*
+*需要一些缓冲,需要一些时间,但这都是值得的*
 
-(2) If you do this wrong, you could end up waiting forever. First the sorter saw an event from time
-4, and then an event from time 2. Will an event with a timestamp less than 2 ever arrive? Maybe.
-Maybe not. You could wait forever and never see a 1.
+(2) 接下来的这一步,如果我们选择的是固执的等待,我们永远不会有结果。首先,我们从时间 4 看到了一个事件,然后从时间 2 看到了一个事件。可是,时间戳小于 2 的事件接下来会不会到来呢?可能会,也可能不会。再次站在上帝视角,我们知道,我们永远不会看到 1。
 
-*Eventually you have to be courageous and emit the 2 as the start of the sorted stream.*
+*最终,我们必须勇于承担责任,并发出指令,把 2 作为已排序的事件流的开始*
 
-(3) What you need then is some sort of policy that defines when, for any given timestamped event, to
-stop waiting for the arrival of earlier events.
+(3)然后,我们需要一种策略,该策略定义:对于任何给定时间戳的事件,Flink何时停止等待较早事件的到来。
 
-*This is precisely what watermarks do* — they define when to stop waiting for earlier events.
+*这正是 watermarks 的作用* — 它们定义何时停止等待较早的事件。
 
-Event time processing in Flink depends on *watermark generators* that insert special timestamped
-elements into the stream, called *watermarks*. A watermark for time _t_ is an assertion that the
-stream is (probably) now complete up through time _t_.
+Flink 中事件时间的处理取决于 *watermark 生成器*,后者将带有时间戳的特殊元素插入流中形成 *watermarks*。事件时间 _t_ 的 watermark 代表 _t_ 之后(很可能)不会有新的元素到达。

Review comment:
       Would it be better to change `事件时间 _t_ 的 watermark 代表 _t_ 之后(很可能)不会有新的元素到达。` here to `事件时间 _t_ 的 watermark 代表 _t_ 之前(很可能)都已经到达。`?
   The intended meaning is that all data before that point in time has (probably) been received, which is exactly the semantics of a watermark.

##########
File path: docs/learn-flink/streaming_analytics.zh.md
##########
@@ -29,123 +29,86 @@ under the License.
 
 ## Event Time and Watermarks
 
-### Introduction
+<a name="Introduction"></a>
+### 概要
 
-Flink explicitly supports three different notions of time:
+Flink 明确支持以下三种时间语义:
 
-* _event time:_ the time when an event occurred, as recorded by the device producing (or storing) the event
+* _事件时间(event time):_ 事件产生的时间,记录的是设备生产(或者存储)事件的时间
 
-* _ingestion time:_ a timestamp recorded by Flink at the moment it ingests the event
+* _摄取时间(ingestion time):_ Flink 读取事件时记录的时间
 
-* _processing time:_ the time when a specific operator in your pipeline is processing the event
+* _处理时间(processing time):_ Flink pipeline 中具体算子处理事件的时间
 
-For reproducible results, e.g., when computing the maximum price a stock reached during the first
-hour of trading on a given day, you should use event time. In this way the result won't depend on
-when the calculation is performed. This kind of real-time application is sometimes performed using
-processing time, but then the results are determined by the events that happen to be processed
-during that hour, rather than the events that occurred then. Computing analytics based on processing
-time causes inconsistencies, and makes it difficult to re-analyze historic data or test new
-implementations.
+为了获得可重现的结果,例如在计算过去的特定一天里第一个小时股票的最高价格时,我们应该使用事件时间。这样的话,无论什么时间去计算都不会影响输出结果。然而如果使用 processing time 的话,实时应用程序的结果是由程序运行的时间所决定。多次运行基于 processing time 的实时程序,可能得到的结果都不想同,也可能会导致再次分析历史数据或者测试新代码变得异常困难。
 
-### Working with Event Time
+<a name="Working-with-Event-Time"></a>
+### 使用 Event Time
 
-By default, Flink will use processing time. To change this, you can set the Time Characteristic:
+Flink 在默认情况下是使用处理时间。也可以通过下面配置来告诉 Flink 选择哪种时间语义:
 
 {% highlight java %}
 final StreamExecutionEnvironment env =
     StreamExecutionEnvironment.getExecutionEnvironment();
 env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);
 {% endhighlight %}
 
-If you want to use event time, you will also need to supply a Timestamp Extractor and Watermark
-Generator that Flink will use to track the progress of event time. This will be covered in the
-section below on [Working with Watermarks]({% link
-learn-flink/streaming_analytics.zh.md %}#working-with-watermarks), but first we should explain what
-watermarks are.
+如果想要使用事件时间,需要额外给 Flink 提供一个时间戳提取器和 Watermark 生成器,Flink 将使用它们来跟踪事件时间的进度。这将在选节[使用 Watermarks]({% link learn-flink/streaming_analytics.zh.md %}#Working-with-Watermarks) 中介绍,但是首先我们需要解释一下 watermarks 是什么。
 
 ### Watermarks
 
-Let's work through a simple example that will show why watermarks are needed, and how they work.
+让我们通过一个简单的示例来演示为什么需要 watermarks 及其工作方式。
 
-In this example you have a stream of timestamped events that arrive somewhat out of order, as shown
-below. The numbers shown are timestamps that indicate when these events actually occurred. The first
-event to arrive happened at time 4, and it is followed by an event that happened earlier, at time 2,
-and so on:
+在此示例中,我们将看到带有混乱时间戳的事件流,如下所示。显示的数字表达的是这些事件实际发生时间的时间戳。到达的第一个事件发生在时间 4,随后发生的事件发生在更早的时间 2,依此类推:
 
 <div class="text-center" style="font-size: x-large; word-spacing: 0.5em; margin: 1em 0em;">
 ··· 23 19 22 24 21 14 17 13 12 15 9 11 7 2 4 →
 </div>
 
-Now imagine that you are trying create a stream sorter. This is meant to be an application that
-processes each event from a stream as it arrives, and emits a new stream containing the same events,
-but ordered by their timestamps.
+假设我们要对数据流排序,我们想要达到的目的是:应用程序应该在数据流里的事件到达时就有一个算子(我们暂且称之为排序)开始处理事件,这个算子所输出的流是按照时间戳排序好的。
 
-Some observations:
+让我们重新审视这些数据:
 
-(1) The first element your stream sorter sees is the 4, but you can't just immediately release it as
-the first element of the sorted stream. It may have arrived out of order, and an earlier event might
-yet arrive. In fact, you have the benefit of some god-like knowledge of this stream's future, and
-you can see that your stream sorter should wait at least until the 2 arrives before producing any
-results.
+(1) 我们的排序器第一个看到的数据是 4,但是我们不能立即将其作为已排序流的第一个元素释放。因为我们并不能确定它是有序的,并且较早的事件有可能并未到达。事实上,如果站在上帝视角,我们知道,必须要等到 2 到来时,排序器才可以有事件输出。
 
-*Some buffering, and some delay, is necessary.*
+*需要一些缓冲,需要一些时间,但这都是值得的*
 
-(2) If you do this wrong, you could end up waiting forever. First the sorter saw an event from time
-4, and then an event from time 2. Will an event with a timestamp less than 2 ever arrive? Maybe.
-Maybe not. You could wait forever and never see a 1.
+(2) 接下来的这一步,如果我们选择的是固执的等待,我们永远不会有结果。首先,我们从时间 4 看到了一个事件,然后从时间 2 看到了一个事件。可是,时间戳小于 2 的事件接下来会不会到来呢?可能会,也可能不会。再次站在上帝视角,我们知道,我们永远不会看到 1。

Review comment:
       Could `我们从时间 4 看到了一个事件,然后从时间 2 看到了一个事件。` be polished a bit more?
   The intended meaning is that we see an event "with timestamp 4" and then an event "with timestamp 2"; that is hard to tell from the current translation.

##########
File path: docs/learn-flink/streaming_analytics.zh.md
##########
@@ -29,123 +29,86 @@ under the License.
 
 ## Event Time and Watermarks
 
-### Introduction
+<a name="Introduction"></a>
+### 概要
 
-Flink explicitly supports three different notions of time:
+Flink 明确支持以下三种时间语义:
 
-* _event time:_ the time when an event occurred, as recorded by the device producing (or storing) the event
+* _事件时间(event time):_ 事件产生的时间,记录的是设备生产(或者存储)事件的时间
 
-* _ingestion time:_ a timestamp recorded by Flink at the moment it ingests the event
+* _摄取时间(ingestion time):_ Flink 读取事件时记录的时间
 
-* _processing time:_ the time when a specific operator in your pipeline is processing the event
+* _处理时间(processing time):_ Flink pipeline 中具体算子处理事件的时间
 
-For reproducible results, e.g., when computing the maximum price a stock reached during the first
-hour of trading on a given day, you should use event time. In this way the result won't depend on
-when the calculation is performed. This kind of real-time application is sometimes performed using
-processing time, but then the results are determined by the events that happen to be processed
-during that hour, rather than the events that occurred then. Computing analytics based on processing
-time causes inconsistencies, and makes it difficult to re-analyze historic data or test new
-implementations.
+为了获得可重现的结果,例如在计算过去的特定一天里第一个小时股票的最高价格时,我们应该使用事件时间。这样的话,无论什么时间去计算都不会影响输出结果。然而如果使用 processing time 的话,实时应用程序的结果是由程序运行的时间所决定。多次运行基于 processing time 的实时程序,可能得到的结果都不想同,也可能会导致再次分析历史数据或者测试新代码变得异常困难。
 
-### Working with Event Time
+<a name="Working-with-Event-Time"></a>
+### 使用 Event Time
 
-By default, Flink will use processing time. To change this, you can set the Time Characteristic:
+Flink 在默认情况下是使用处理时间。也可以通过下面配置来告诉 Flink 选择哪种时间语义:
 
 {% highlight java %}
 final StreamExecutionEnvironment env =
     StreamExecutionEnvironment.getExecutionEnvironment();
 env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);
 {% endhighlight %}
 
-If you want to use event time, you will also need to supply a Timestamp Extractor and Watermark
-Generator that Flink will use to track the progress of event time. This will be covered in the
-section below on [Working with Watermarks]({% link
-learn-flink/streaming_analytics.zh.md %}#working-with-watermarks), but first we should explain what
-watermarks are.
+如果想要使用事件时间,需要额外给 Flink 提供一个时间戳提取器和 Watermark 生成器,Flink 将使用它们来跟踪事件时间的进度。这将在选节[使用 Watermarks]({% link learn-flink/streaming_analytics.zh.md %}#Working-with-Watermarks) 中介绍,但是首先我们需要解释一下 watermarks 是什么。
 
 ### Watermarks
 
-Let's work through a simple example that will show why watermarks are needed, and how they work.
+让我们通过一个简单的示例来演示为什么需要 watermarks 及其工作方式。
 
-In this example you have a stream of timestamped events that arrive somewhat out of order, as shown
-below. The numbers shown are timestamps that indicate when these events actually occurred. The first
-event to arrive happened at time 4, and it is followed by an event that happened earlier, at time 2,
-and so on:
+在此示例中,我们将看到带有混乱时间戳的事件流,如下所示。显示的数字表达的是这些事件实际发生时间的时间戳。到达的第一个事件发生在时间 4,随后发生的事件发生在更早的时间 2,依此类推:
 
 <div class="text-center" style="font-size: x-large; word-spacing: 0.5em; margin: 1em 0em;">
 ··· 23 19 22 24 21 14 17 13 12 15 9 11 7 2 4 →
 </div>
 
-Now imagine that you are trying create a stream sorter. This is meant to be an application that
-processes each event from a stream as it arrives, and emits a new stream containing the same events,
-but ordered by their timestamps.
+假设我们要对数据流排序,我们想要达到的目的是:应用程序应该在数据流里的事件到达时就有一个算子(我们暂且称之为排序)开始处理事件,这个算子所输出的流是按照时间戳排序好的。
 
-Some observations:
+让我们重新审视这些数据:
 
-(1) The first element your stream sorter sees is the 4, but you can't just immediately release it as
-the first element of the sorted stream. It may have arrived out of order, and an earlier event might
-yet arrive. In fact, you have the benefit of some god-like knowledge of this stream's future, and
-you can see that your stream sorter should wait at least until the 2 arrives before producing any
-results.
+(1) 我们的排序器第一个看到的数据是 4,但是我们不能立即将其作为已排序流的第一个元素释放。因为我们并不能确定它是有序的,并且较早的事件有可能并未到达。事实上,如果站在上帝视角,我们知道,必须要等到 2 到来时,排序器才可以有事件输出。
 
-*Some buffering, and some delay, is necessary.*
+*需要一些缓冲,需要一些时间,但这都是值得的*
 
-(2) If you do this wrong, you could end up waiting forever. First the sorter saw an event from time
-4, and then an event from time 2. Will an event with a timestamp less than 2 ever arrive? Maybe.
-Maybe not. You could wait forever and never see a 1.
+(2) 接下来的这一步,如果我们选择的是固执的等待,我们永远不会有结果。首先,我们从时间 4 看到了一个事件,然后从时间 2 看到了一个事件。可是,时间戳小于 2 的事件接下来会不会到来呢?可能会,也可能不会。再次站在上帝视角,我们知道,我们永远不会看到 1。
 
-*Eventually you have to be courageous and emit the 2 as the start of the sorted stream.*
+*最终,我们必须勇于承担责任,并发出指令,把 2 作为已排序的事件流的开始*
 
-(3) What you need then is some sort of policy that defines when, for any given timestamped event, to
-stop waiting for the arrival of earlier events.
+(3)然后,我们需要一种策略,该策略定义:对于任何给定时间戳的事件,Flink何时停止等待较早事件的到来。
 
-*This is precisely what watermarks do* — they define when to stop waiting for earlier events.
+*这正是 watermarks 的作用* — 它们定义何时停止等待较早的事件。
 
-Event time processing in Flink depends on *watermark generators* that insert special timestamped
-elements into the stream, called *watermarks*. A watermark for time _t_ is an assertion that the
-stream is (probably) now complete up through time _t_.
+Flink 中事件时间的处理取决于 *watermark 生成器*,后者将带有时间戳的特殊元素插入流中形成 *watermarks*。事件时间 _t_ 的 watermark 代表 _t_ 之后(很可能)不会有新的元素到达。
 
-When should this stream sorter stop waiting, and push out the 2 to start the sorted stream? When a
-watermark arrives with a timestamp of 2, or greater.
+当 watermark 以 2 或更大的时间戳到达时,事件流的排序器应何时停止等待,并输出 2 作为已经排序好的流。
 
-(4) You might imagine different policies for deciding how to generate watermarks.
+(4) 我们可能会思考,如何决定 watermarks 的不同生成策略
 
-Each event arrives after some delay, and these delays vary, so some events are delayed more than
-others. One simple approach is to assume that these delays are bounded by some maximum delay. Flink
-refers to this strategy as *bounded-out-of-orderness* watermarking. It is easy to imagine more
-complex approaches to watermarking, but for most applications a fixed delay works well enough.
+每个事件都会延迟一段时间后到达,然而这些延迟有所不同,有些事件可能其他事件延迟得更多。一种简单的方法是假定这些延迟受某个最大延迟的限制。Flink 将此策略称为 *最大无序边界(bounded-out-of-orderness)* watermark。当然,我们可以想像出更好的生成 watermark 的方法,但是对于大多数应用而言,固定延迟策略已经足够了。
 
-### Latency vs. Completeness
+<a name="Latency-vs.-Completeness"></a>
+### 延迟 VS 正确性
 
-Another way to think about watermarks is that they give you, the developer of a streaming
-application, control over the tradeoff between latency and completeness. Unlike in batch processing,
-where one has the luxury of being able to have complete knowledge of the input before producing any
-results, with streaming you must eventually stop waiting to see more of the input, and produce some
-sort of result.
+watermarks 给了开发者流处理的一种选择,它们使开发人员在开发应用程序是可以控制延迟和完整性之间的权衡。与批处理不同,批处理中的奢侈之处在于可以在产生任何结果之前完全了解输入,而使用流式传输,我们不被允许等待所有的时间都产生了,才输出排序好的数据,这与流相违背。

Review comment:
       ```suggestion
   watermarks 给了开发者流处理的一种选择,它们使开发人员在开发应用程序时可以控制延迟和完整性之间的权衡。与批处理不同,批处理中的奢侈之处在于可以在产生任何结果之前完全了解输入,而使用流式传输,我们不被允许等待所有的时间都产生了,才输出排序好的数据,这与流相违背。
   ```
   Also, could `奢侈` here be replaced with a word more like `优势`?

##########
File path: docs/learn-flink/streaming_analytics.zh.md
##########
@@ -207,67 +166,62 @@ stream.
     .reduce|aggregate|process(<window function>)
 {% endhighlight %}
 
-You can also use windowing with non-keyed streams, but keep in mind that in this case, the
-processing will _not_ be done in parallel:
+您不是必须使用键控事件流(keyed stream),但是值得注意的是,如果不使用键控事件流,我们的程序就不能 _并行_ 处理。
 
 {% highlight java %}
 stream.
     .windowAll(<window assigner>)
     .reduce|aggregate|process(<window function>)
 {% endhighlight %}
 
-### Window Assigners
+<a name="Window-Assigners"></a>
+### 窗口分配器
 
-Flink has several built-in types of window assigners, which are illustrated below:
+Flink 有一些内置的窗口分配器,如下所示:
 
 <img src="{{ site.baseurl }}/fig/window-assigners.svg" alt="Window assigners" class="center" width="80%" />

Review comment:
       Could this link be changed to the `{% link %}` form? Concretely, it should be `{% link /fig/window-assigners.svg %}`.
   The community recently [discussed and decided](http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Reminder-Prefer-link-tag-in-documentation-td42362.html) to switch documentation links to this form.

##########
File path: docs/learn-flink/streaming_analytics.zh.md
##########
@@ -207,67 +166,62 @@ stream.
     .reduce|aggregate|process(<window function>)
 {% endhighlight %}
 
-You can also use windowing with non-keyed streams, but keep in mind that in this case, the
-processing will _not_ be done in parallel:
+您不是必须使用键控事件流(keyed stream),但是值得注意的是,如果不使用键控事件流,我们的程序就不能 _并行_ 处理。
 
 {% highlight java %}
 stream.
     .windowAll(<window assigner>)
     .reduce|aggregate|process(<window function>)
 {% endhighlight %}
 
-### Window Assigners
+<a name="Window-Assigners"></a>
+### 窗口分配器
 
-Flink has several built-in types of window assigners, which are illustrated below:
+Flink 有一些内置的窗口分配器,如下所示:
 
 <img src="{{ site.baseurl }}/fig/window-assigners.svg" alt="Window assigners" class="center" width="80%" />
 
-Some examples of what these window assigners might be used for, and how to specify them:
+通过一些示例来展示关于这些窗口如何使用,或者如何区分它们:
 
-* Tumbling time windows
-  * _page views per minute_
+* 滚动时间窗口
+  * _每分钟页面浏览量_
   * `TumblingEventTimeWindows.of(Time.minutes(1))`
-* Sliding time windows
-  * _page views per minute computed every 10 seconds_
+* 滑动时间窗口
+  * _每10秒钟计算前1分钟的页面浏览量_
   * `SlidingEventTimeWindows.of(Time.minutes(1), Time.seconds(10))`
-* Session windows 
-  * _page views per session, where sessions are defined by a gap of at least 30 minutes between sessions_
+* 会话窗口
+  * _每个会话的网页浏览量,其中会话之间的间隔至少为30分钟_
   * `EventTimeSessionWindows.withGap(Time.minutes(30))`
 
-Durations can be specified using one of `Time.milliseconds(n)`, `Time.seconds(n)`, `Time.minutes(n)`, `Time.hours(n)`, and `Time.days(n)`.
+以下都是一些可以使用的间隔时间 `Time.milliseconds(n)`, `Time.seconds(n)`, `Time.minutes(n)`,
+ `Time.hours(n)`, 和 `Time.days(n)`。
 
-The time-based window assigners (including session windows) come in both event time and processing
-time flavors. There are significant tradeoffs between these two types of time windows. With
-processing time windowing you have to accept these limitations:
+基于时间的窗口分配器(包括会话时间)既可以处理 `事件时间`,也可以处理 `处理时间`。这两种基于时间的处理没有哪一个更好,我们必须折衷。使用 `处理时间`,我们必须接受以下限制:
 
-* can not correctly process historic data,
-* can not correctly handle out-of-order data,
-* results will be non-deterministic,
+* 无法正确处理历史数据,
+* 无法正确处理超过最大无序边界的数据,
+* 结果将是不确定的,
 
-but with the advantage of lower latency. 
+但是有自己的优势,较低的延迟。
 
-When working with count-based windows, keep in mind that these windows will not fire until a batch
-is complete. There's no option to time-out and process a partial window, though you could implement
-that behavior yourself with a custom Trigger.
+使用基于计数的窗口时,请记住,只有窗口内的事件数量到达窗口要求的数值时,这些窗口才会触发计算。尽管可以使用自定义触发器自己实现该行为,但无法应对超时和处理部分窗口。
 
-A global window assigner assigns every event (with the same key) to the same global window. This is
-only useful if you are going to do your own custom windowing, with a custom Trigger. In many cases
-where this might seem useful you will be better off using a `ProcessFunction` as described
-[in another section]({% link learn-flink/event_driven.zh.md %}#process-functions).
+我们可能在有些场景下,想使用全局 window assigner 将每个事件(相同的 key)都分配给某一个指定的全局窗口。 很多情况下,一个比较好的建议是使用 `ProcessFunction`,具体介绍在[这里]({% link learn-flink/event_driven.md %}#process-functions)。

Review comment:
       ```suggestion
   我们可能在有些场景下,想使用全局 window assigner 将每个事件(相同的 key)都分配给某一个指定的全局窗口。 很多情况下,一个比较好的建议是使用 `ProcessFunction`,具体介绍在[这里]({% link learn-flink/event_driven.zh.md %}#process-functions)。
   ```
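
For readers of this thread, a minimal sketch of one of the window assigners listed in the hunk above ("page views per minute" with a tumbling event-time window); the `views` stream of `(page, 1L)` tuples and its pre-assigned timestamps/watermarks are assumptions for illustration, not part of this PR:

```java
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

// Per-page view counts over 1-minute event-time windows ("page views per minute").
// Assumes `views` is a DataStream<Tuple2<String, Long>> of (page, 1L) records
// with timestamps and watermarks already assigned.
DataStream<Tuple2<String, Long>> viewsPerMinute = views
    .keyBy(v -> v.f0)                                      // key by page
    .window(TumblingEventTimeWindows.of(Time.minutes(1)))  // tumbling 1-minute windows
    .sum(1);                                               // sum the 1s = count per window
```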

##########
File path: docs/learn-flink/streaming_analytics.zh.md
##########
@@ -437,37 +376,29 @@ stream
     .reduce(<same reduce function>)
 {% endhighlight %}
 
-You might expect Flink's runtime to be smart enough to do this parallel pre-aggregation for you
-(provided you are using a ReduceFunction or AggregateFunction), but it's not.
+可能我们会猜测以 Flink 的能力,想要做到这样看起来是可行的(前提是你使用的是 ReduceFunction 或 AggregateFunction ),但不是。
 
-The reason why this works is that the events produced by a time window are assigned timestamps
-based on the time at the end of the window. So, for example, all of the events produced
-by an hour-long window will have timestamps marking the end of an hour. Any subsequent window
-consuming those events should have a duration that is the same as, or a multiple of, the
-previous window.
+之所以可行,是因为时间窗口产生的事件是根据窗口结束时的时间分配时间戳的。例如,一个小时小时的窗口所产生的所有事件都将带有标记一个小时结束的时间戳。后面的窗口内的数据消费和前面的流产生的数据是一致的。
 
-#### No Results for Empty TimeWindows
+<a name="No-Results-for-Empty-TimeWindows"></a>
+#### 空的时间窗口不会输出结果
 
-Windows are only created when events are assigned to them. So if there are no events in a given time
-frame, no results will be reported.
+事件会触发窗口的创建。换句话说,如果在特定的窗口内没有事件,就不会有窗口,就不会有输出结果。
 
 #### Late Events Can Cause Late Merges
 
-Session windows are based on an abstraction of windows that can _merge_. Each element is initially
-assigned to a new window, after which windows are merged whenever the gap between them is small
-enough. In this way, a late event can bridge the gap separating two previously separate sessions,
-producing a late merge.
+会话窗口的实现是基于窗口的一个抽象能力,窗口可以 _聚合_。会话窗口中的每个数据在初始被消费时,都会被分配一个新的窗口,但是如果窗口之间的间隔足够小,多个窗口就会被聚合。延迟事件可以弥合两个先前分开的会话间隔,从而产生一个虽然有延迟但是更加准确地结果。
 
 {% top %}
 
-## Hands-on
+## 动手练习
 
-The hands-on exercise that goes with this section is the [Hourly Tips
+本节附带的动手练习是[Hourly Tips

Review comment:
       ```suggestion
   本节附带的动手练习是 [Hourly Tips
   ```

##########
File path: docs/learn-flink/streaming_analytics.zh.md
##########
@@ -397,36 +342,30 @@ stream.
     .process(...);
 {% endhighlight %}
 
-When the allowed lateness is greater than zero, only those events that are so late that they would
-be dropped are sent to the side output (if it has been configured).
+当允许的延迟大于零时,只有那些超过最大无序边界以至于会被丢弃的事件才会被发送到侧输出流(如果已配置)。
 
-### Surprises
+<a name="Surprises"></a>
+### 给用户的一些惊喜

Review comment:
       These surprises are not intentionally given to users (the current translation reads a bit as if they were). Could we phrase this differently?
   Or perhaps not call them "惊喜" at all, since the point is simply that the behavior can be unexpected.
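
For readers of this thread, a minimal sketch (hypothetical stream, key field, and window function) of the allowed-lateness setting that this "surprises" section is about:

```java
import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

// With a non-zero allowed lateness, events that arrive after the watermark but within
// the extra 5 minutes re-trigger the window; only events later than that are dropped
// (or sent to a side output, if one is configured).
events
    .keyBy(e -> e.key)
    .window(TumblingEventTimeWindows.of(Time.minutes(1)))
    .allowedLateness(Time.minutes(5))
    .process(new MyWindowFunction());
```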

##########
File path: docs/learn-flink/streaming_analytics.zh.md
##########
@@ -29,123 +29,86 @@ under the License.
 
 ## Event Time and Watermarks
 
-### Introduction
+<a name="Introduction"></a>
+### 概要
 
-Flink explicitly supports three different notions of time:
+Flink 明确支持以下三种时间语义:
 
-* _event time:_ the time when an event occurred, as recorded by the device producing (or storing) the event
+* _事件时间(event time):_ 事件产生的时间,记录的是设备生产(或者存储)事件的时间
 
-* _ingestion time:_ a timestamp recorded by Flink at the moment it ingests the event
+* _摄取时间(ingestion time):_ Flink 读取事件时记录的时间
 
-* _processing time:_ the time when a specific operator in your pipeline is processing the event
+* _处理时间(processing time):_ Flink pipeline 中具体算子处理事件的时间
 
-For reproducible results, e.g., when computing the maximum price a stock reached during the first
-hour of trading on a given day, you should use event time. In this way the result won't depend on
-when the calculation is performed. This kind of real-time application is sometimes performed using
-processing time, but then the results are determined by the events that happen to be processed
-during that hour, rather than the events that occurred then. Computing analytics based on processing
-time causes inconsistencies, and makes it difficult to re-analyze historic data or test new
-implementations.
+为了获得可重现的结果,例如在计算过去的特定一天里第一个小时股票的最高价格时,我们应该使用事件时间。这样的话,无论什么时间去计算都不会影响输出结果。然而如果使用 processing time 的话,实时应用程序的结果是由程序运行的时间所决定。多次运行基于 processing time 的实时程序,可能得到的结果都不想同,也可能会导致再次分析历史数据或者测试新代码变得异常困难。
 
-### Working with Event Time
+<a name="Working-with-Event-Time"></a>
+### 使用 Event Time
 
-By default, Flink will use processing time. To change this, you can set the Time Characteristic:
+Flink 在默认情况下是使用处理时间。也可以通过下面配置来告诉 Flink 选择哪种时间语义:
 
 {% highlight java %}
 final StreamExecutionEnvironment env =
     StreamExecutionEnvironment.getExecutionEnvironment();
 env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);
 {% endhighlight %}
 
-If you want to use event time, you will also need to supply a Timestamp Extractor and Watermark
-Generator that Flink will use to track the progress of event time. This will be covered in the
-section below on [Working with Watermarks]({% link
-learn-flink/streaming_analytics.zh.md %}#working-with-watermarks), but first we should explain what
-watermarks are.
+如果想要使用事件时间,需要额外给 Flink 提供一个时间戳提取器和 Watermark 生成器,Flink 将使用它们来跟踪事件时间的进度。这将在选节[使用 Watermarks]({% link learn-flink/streaming_analytics.zh.md %}#Working-with-Watermarks) 中介绍,但是首先我们需要解释一下 watermarks 是什么。
 
 ### Watermarks
 
-Let's work through a simple example that will show why watermarks are needed, and how they work.
+让我们通过一个简单的示例来演示为什么需要 watermarks 及其工作方式。
 
-In this example you have a stream of timestamped events that arrive somewhat out of order, as shown
-below. The numbers shown are timestamps that indicate when these events actually occurred. The first
-event to arrive happened at time 4, and it is followed by an event that happened earlier, at time 2,
-and so on:
+在此示例中,我们将看到带有混乱时间戳的事件流,如下所示。显示的数字表达的是这些事件实际发生时间的时间戳。到达的第一个事件发生在时间 4,随后发生的事件发生在更早的时间 2,依此类推:
 
 <div class="text-center" style="font-size: x-large; word-spacing: 0.5em; margin: 1em 0em;">
 ··· 23 19 22 24 21 14 17 13 12 15 9 11 7 2 4 →
 </div>
 
-Now imagine that you are trying create a stream sorter. This is meant to be an application that
-processes each event from a stream as it arrives, and emits a new stream containing the same events,
-but ordered by their timestamps.
+假设我们要对数据流排序,我们想要达到的目的是:应用程序应该在数据流里的事件到达时就有一个算子(我们暂且称之为排序)开始处理事件,这个算子所输出的流是按照时间戳排序好的。
 
-Some observations:
+让我们重新审视这些数据:
 
-(1) The first element your stream sorter sees is the 4, but you can't just immediately release it as
-the first element of the sorted stream. It may have arrived out of order, and an earlier event might
-yet arrive. In fact, you have the benefit of some god-like knowledge of this stream's future, and
-you can see that your stream sorter should wait at least until the 2 arrives before producing any
-results.
+(1) 我们的排序器第一个看到的数据是 4,但是我们不能立即将其作为已排序流的第一个元素释放。因为我们并不能确定它是有序的,并且较早的事件有可能并未到达。事实上,如果站在上帝视角,我们知道,必须要等到 2 到来时,排序器才可以有事件输出。
 
-*Some buffering, and some delay, is necessary.*
+*需要一些缓冲,需要一些时间,但这都是值得的*
 
-(2) If you do this wrong, you could end up waiting forever. First the sorter saw an event from time
-4, and then an event from time 2. Will an event with a timestamp less than 2 ever arrive? Maybe.
-Maybe not. You could wait forever and never see a 1.
+(2) 接下来的这一步,如果我们选择的是固执的等待,我们永远不会有结果。首先,我们从时间 4 看到了一个事件,然后从时间 2 看到了一个事件。可是,时间戳小于 2 的事件接下来会不会到来呢?可能会,也可能不会。再次站在上帝视角,我们知道,我们永远不会看到 1。
 
-*Eventually you have to be courageous and emit the 2 as the start of the sorted stream.*
+*最终,我们必须勇于承担责任,并发出指令,把 2 作为已排序的事件流的开始*
 
-(3) What you need then is some sort of policy that defines when, for any given timestamped event, to
-stop waiting for the arrival of earlier events.
+(3)然后,我们需要一种策略,该策略定义:对于任何给定时间戳的事件,Flink何时停止等待较早事件的到来。
 
-*This is precisely what watermarks do* — they define when to stop waiting for earlier events.
+*这正是 watermarks 的作用* — 它们定义何时停止等待较早的事件。
 
-Event time processing in Flink depends on *watermark generators* that insert special timestamped
-elements into the stream, called *watermarks*. A watermark for time _t_ is an assertion that the
-stream is (probably) now complete up through time _t_.
+Flink 中事件时间的处理取决于 *watermark 生成器*,后者将带有时间戳的特殊元素插入流中形成 *watermarks*。事件时间 _t_ 的 watermark 代表 _t_ 之后(很可能)不会有新的元素到达。
 
-When should this stream sorter stop waiting, and push out the 2 to start the sorted stream? When a
-watermark arrives with a timestamp of 2, or greater.
+当 watermark 以 2 或更大的时间戳到达时,事件流的排序器应何时停止等待,并输出 2 作为已经排序好的流。
 
-(4) You might imagine different policies for deciding how to generate watermarks.
+(4) 我们可能会思考,如何决定 watermarks 的不同生成策略
 
-Each event arrives after some delay, and these delays vary, so some events are delayed more than
-others. One simple approach is to assume that these delays are bounded by some maximum delay. Flink
-refers to this strategy as *bounded-out-of-orderness* watermarking. It is easy to imagine more
-complex approaches to watermarking, but for most applications a fixed delay works well enough.
+每个事件都会延迟一段时间后到达,然而这些延迟有所不同,有些事件可能其他事件延迟得更多。一种简单的方法是假定这些延迟受某个最大延迟的限制。Flink 将此策略称为 *最大无序边界(bounded-out-of-orderness)* watermark。当然,我们可以想像出更好的生成 watermark 的方法,但是对于大多数应用而言,固定延迟策略已经足够了。

Review comment:
       Suggest changing `有些事件可能其他事件延迟得更多` to `有些事件可能比其他事件延迟得更多` (the `比` is missing).

##########
File path: docs/learn-flink/streaming_analytics.zh.md
##########
@@ -167,38 +130,34 @@ public static class TimestampsAndWatermarks
 }
 {% endhighlight %}
 
-Note that the constructor for `BoundedOutOfOrdernessTimestampExtractor` takes a parameter which
-specifies the maximum expected out-of-orderness (10 seconds, in this example).
+请仔细观察, `BoundedOutOfOrdernessTimestampExtractor` 的构造函数使用了一个常量用来指定最大无序边界程度(在此示例中为10秒)。

Review comment:
       ```suggestion
   请仔细观察, `BoundedOutOfOrdernessTimestampExtractor` 的构造函数使用了一个常量用来指定最大无序边界程度(在此示例中为 10 秒)。
   ```
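
For reference, a minimal sketch of such an extractor; the `SensorReading` type is taken from the examples on this page, while its millisecond `timestamp` field and the class body are assumptions, not the PR's actual code:

```java
import org.apache.flink.streaming.api.functions.timestamps.BoundedOutOfOrdernessTimestampExtractor;
import org.apache.flink.streaming.api.windowing.time.Time;

// Extracts event-time timestamps and emits watermarks that tolerate
// up to 10 seconds of out-of-orderness.
public class TimestampsAndWatermarks
        extends BoundedOutOfOrdernessTimestampExtractor<SensorReading> {

    public TimestampsAndWatermarks() {
        super(Time.seconds(10)); // the maximum expected out-of-orderness
    }

    @Override
    public long extractTimestamp(SensorReading reading) {
        return reading.timestamp; // assumed to be epoch milliseconds
    }
}
```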

##########
File path: docs/learn-flink/streaming_analytics.zh.md
##########
@@ -29,123 +29,86 @@ under the License.
 
 ## Event Time and Watermarks
 
-### Introduction
+<a name="Introduction"></a>
+### 概要
 
-Flink explicitly supports three different notions of time:
+Flink 明确支持以下三种时间语义:
 
-* _event time:_ the time when an event occurred, as recorded by the device producing (or storing) the event
+* _事件时间(event time):_ 事件产生的时间,记录的是设备生产(或者存储)事件的时间
 
-* _ingestion time:_ a timestamp recorded by Flink at the moment it ingests the event
+* _摄取时间(ingestion time):_ Flink 读取事件时记录的时间
 
-* _processing time:_ the time when a specific operator in your pipeline is processing the event
+* _处理时间(processing time):_ Flink pipeline 中具体算子处理事件的时间
 
-For reproducible results, e.g., when computing the maximum price a stock reached during the first
-hour of trading on a given day, you should use event time. In this way the result won't depend on
-when the calculation is performed. This kind of real-time application is sometimes performed using
-processing time, but then the results are determined by the events that happen to be processed
-during that hour, rather than the events that occurred then. Computing analytics based on processing
-time causes inconsistencies, and makes it difficult to re-analyze historic data or test new
-implementations.
+为了获得可重现的结果,例如在计算过去的特定一天里第一个小时股票的最高价格时,我们应该使用事件时间。这样的话,无论什么时间去计算都不会影响输出结果。然而如果使用 processing time 的话,实时应用程序的结果是由程序运行的时间所决定。多次运行基于 processing time 的实时程序,可能得到的结果都不想同,也可能会导致再次分析历史数据或者测试新代码变得异常困难。
 
-### Working with Event Time
+<a name="Working-with-Event-Time"></a>
+### 使用 Event Time
 
-By default, Flink will use processing time. To change this, you can set the Time Characteristic:
+Flink 在默认情况下是使用处理时间。也可以通过下面配置来告诉 Flink 选择哪种时间语义:
 
 {% highlight java %}
 final StreamExecutionEnvironment env =
     StreamExecutionEnvironment.getExecutionEnvironment();
 env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);
 {% endhighlight %}
 
-If you want to use event time, you will also need to supply a Timestamp Extractor and Watermark
-Generator that Flink will use to track the progress of event time. This will be covered in the
-section below on [Working with Watermarks]({% link
-learn-flink/streaming_analytics.zh.md %}#working-with-watermarks), but first we should explain what
-watermarks are.
+如果想要使用事件时间,需要额外给 Flink 提供一个时间戳提取器和 Watermark 生成器,Flink 将使用它们来跟踪事件时间的进度。这将在选节[使用 Watermarks]({% link learn-flink/streaming_analytics.zh.md %}#Working-with-Watermarks) 中介绍,但是首先我们需要解释一下 watermarks 是什么。
 
 ### Watermarks
 
-Let's work through a simple example that will show why watermarks are needed, and how they work.
+让我们通过一个简单的示例来演示为什么需要 watermarks 及其工作方式。
 
-In this example you have a stream of timestamped events that arrive somewhat out of order, as shown
-below. The numbers shown are timestamps that indicate when these events actually occurred. The first
-event to arrive happened at time 4, and it is followed by an event that happened earlier, at time 2,
-and so on:
+在此示例中,我们将看到带有混乱时间戳的事件流,如下所示。显示的数字表达的是这些事件实际发生时间的时间戳。到达的第一个事件发生在时间 4,随后发生的事件发生在更早的时间 2,依此类推:
 
 <div class="text-center" style="font-size: x-large; word-spacing: 0.5em; margin: 1em 0em;">
 ··· 23 19 22 24 21 14 17 13 12 15 9 11 7 2 4 →
 </div>
 
-Now imagine that you are trying create a stream sorter. This is meant to be an application that
-processes each event from a stream as it arrives, and emits a new stream containing the same events,
-but ordered by their timestamps.
+假设我们要对数据流排序,我们想要达到的目的是:应用程序应该在数据流里的事件到达时就有一个算子(我们暂且称之为排序)开始处理事件,这个算子所输出的流是按照时间戳排序好的。
 
-Some observations:
+让我们重新审视这些数据:
 
-(1) The first element your stream sorter sees is the 4, but you can't just immediately release it as
-the first element of the sorted stream. It may have arrived out of order, and an earlier event might
-yet arrive. In fact, you have the benefit of some god-like knowledge of this stream's future, and
-you can see that your stream sorter should wait at least until the 2 arrives before producing any
-results.
+(1) 我们的排序器第一个看到的数据是 4,但是我们不能立即将其作为已排序流的第一个元素释放。因为我们并不能确定它是有序的,并且较早的事件有可能并未到达。事实上,如果站在上帝视角,我们知道,必须要等到 2 到来时,排序器才可以有事件输出。
 
-*Some buffering, and some delay, is necessary.*
+*需要一些缓冲,需要一些时间,但这都是值得的*
 
-(2) If you do this wrong, you could end up waiting forever. First the sorter saw an event from time
-4, and then an event from time 2. Will an event with a timestamp less than 2 ever arrive? Maybe.
-Maybe not. You could wait forever and never see a 1.
+(2) 接下来的这一步,如果我们选择的是固执的等待,我们永远不会有结果。首先,我们从时间 4 看到了一个事件,然后从时间 2 看到了一个事件。可是,时间戳小于 2 的事件接下来会不会到来呢?可能会,也可能不会。再次站在上帝视角,我们知道,我们永远不会看到 1。
 
-*Eventually you have to be courageous and emit the 2 as the start of the sorted stream.*
+*最终,我们必须勇于承担责任,并发出指令,把 2 作为已排序的事件流的开始*
 
-(3) What you need then is some sort of policy that defines when, for any given timestamped event, to
-stop waiting for the arrival of earlier events.
+(3)然后,我们需要一种策略,该策略定义:对于任何给定时间戳的事件,Flink何时停止等待较早事件的到来。
 
-*This is precisely what watermarks do* — they define when to stop waiting for earlier events.
+*这正是 watermarks 的作用* — 它们定义何时停止等待较早的事件。
 
-Event time processing in Flink depends on *watermark generators* that insert special timestamped
-elements into the stream, called *watermarks*. A watermark for time _t_ is an assertion that the
-stream is (probably) now complete up through time _t_.
+Flink 中事件时间的处理取决于 *watermark 生成器*,后者将带有时间戳的特殊元素插入流中形成 *watermarks*。事件时间 _t_ 的 watermark 代表 _t_ 之后(很可能)不会有新的元素到达。
 
-When should this stream sorter stop waiting, and push out the 2 to start the sorted stream? When a
-watermark arrives with a timestamp of 2, or greater.
+当 watermark 以 2 或更大的时间戳到达时,事件流的排序器应何时停止等待,并输出 2 作为已经排序好的流。

Review comment:
       Would changing `事件流的排序器应何时停止等待` to `事件流的排序器应停止等待` read better?

##########
File path: docs/learn-flink/streaming_analytics.zh.md
##########
@@ -355,18 +306,15 @@ private static class MyWindowFunction extends ProcessWindowFunction<
 }
 {% endhighlight %}
 
-Notice that the `Iterable<SensorReading>`
-will contain exactly one reading -- the pre-aggregated maximum computed by `MyReducingMax`.
+请注意 `Iterable<SensorReading>`将只包含一个读数 -- `MyReducingMax` 计算出的预先汇总的最大值。

Review comment:
       ```suggestion
   请注意 `Iterable<SensorReading>` 将只包含一个读数 -- `MyReducingMax` 计算出的预先汇总的最大值。
   ```
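
For context, a minimal sketch of the pre-aggregation pattern described above; `readings`, the `sensorId` key field, `MyReducingMax`, and `MyWindowFunction` are either the names used on this page or assumed for illustration:

```java
import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

// MyReducingMax (a ReduceFunction) pre-aggregates the maximum inside each window,
// so MyWindowFunction (a ProcessWindowFunction) receives an Iterable<SensorReading>
// containing exactly one element.
readings
    .keyBy(reading -> reading.sensorId)                    // assumed key field
    .window(TumblingEventTimeWindows.of(Time.minutes(1)))
    .reduce(new MyReducingMax(), new MyWindowFunction());
```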

##########
File path: docs/learn-flink/streaming_analytics.zh.md
##########
@@ -29,123 +29,86 @@ under the License.
 
 ## Event Time and Watermarks
 
-### Introduction
+<a name="Introduction"></a>
+### 概要
 
-Flink explicitly supports three different notions of time:
+Flink 明确支持以下三种时间语义:
 
-* _event time:_ the time when an event occurred, as recorded by the device producing (or storing) the event
+* _事件时间(event time):_ 事件产生的时间,记录的是设备生产(或者存储)事件的时间
 
-* _ingestion time:_ a timestamp recorded by Flink at the moment it ingests the event
+* _摄取时间(ingestion time):_ Flink 读取事件时记录的时间
 
-* _processing time:_ the time when a specific operator in your pipeline is processing the event
+* _处理时间(processing time):_ Flink pipeline 中具体算子处理事件的时间
 
-For reproducible results, e.g., when computing the maximum price a stock reached during the first
-hour of trading on a given day, you should use event time. In this way the result won't depend on
-when the calculation is performed. This kind of real-time application is sometimes performed using
-processing time, but then the results are determined by the events that happen to be processed
-during that hour, rather than the events that occurred then. Computing analytics based on processing
-time causes inconsistencies, and makes it difficult to re-analyze historic data or test new
-implementations.
+为了获得可重现的结果,例如在计算过去的特定一天里第一个小时股票的最高价格时,我们应该使用事件时间。这样的话,无论什么时间去计算都不会影响输出结果。然而如果使用 processing time 的话,实时应用程序的结果是由程序运行的时间所决定。多次运行基于 processing time 的实时程序,可能得到的结果都不想同,也可能会导致再次分析历史数据或者测试新代码变得异常困难。
 
-### Working with Event Time
+<a name="Working-with-Event-Time"></a>
+### 使用 Event Time
 
-By default, Flink will use processing time. To change this, you can set the Time Characteristic:
+Flink 在默认情况下是使用处理时间。也可以通过下面配置来告诉 Flink 选择哪种时间语义:
 
 {% highlight java %}
 final StreamExecutionEnvironment env =
     StreamExecutionEnvironment.getExecutionEnvironment();
 env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);
 {% endhighlight %}
 
-If you want to use event time, you will also need to supply a Timestamp Extractor and Watermark
-Generator that Flink will use to track the progress of event time. This will be covered in the
-section below on [Working with Watermarks]({% link
-learn-flink/streaming_analytics.zh.md %}#working-with-watermarks), but first we should explain what
-watermarks are.
+如果想要使用事件时间,需要额外给 Flink 提供一个时间戳提取器和 Watermark 生成器,Flink 将使用它们来跟踪事件时间的进度。这将在选节[使用 Watermarks]({% link learn-flink/streaming_analytics.zh.md %}#Working-with-Watermarks) 中介绍,但是首先我们需要解释一下 watermarks 是什么。
 
 ### Watermarks
 
-Let's work through a simple example that will show why watermarks are needed, and how they work.
+让我们通过一个简单的示例来演示为什么需要 watermarks 及其工作方式。
 
-In this example you have a stream of timestamped events that arrive somewhat out of order, as shown
-below. The numbers shown are timestamps that indicate when these events actually occurred. The first
-event to arrive happened at time 4, and it is followed by an event that happened earlier, at time 2,
-and so on:
+在此示例中,我们将看到带有混乱时间戳的事件流,如下所示。显示的数字表达的是这些事件实际发生时间的时间戳。到达的第一个事件发生在时间 4,随后发生的事件发生在更早的时间 2,依此类推:
 
 <div class="text-center" style="font-size: x-large; word-spacing: 0.5em; margin: 1em 0em;">
 ··· 23 19 22 24 21 14 17 13 12 15 9 11 7 2 4 →
 </div>
 
-Now imagine that you are trying create a stream sorter. This is meant to be an application that
-processes each event from a stream as it arrives, and emits a new stream containing the same events,
-but ordered by their timestamps.
+假设我们要对数据流排序,我们想要达到的目的是:应用程序应该在数据流里的事件到达时就有一个算子(我们暂且称之为排序)开始处理事件,这个算子所输出的流是按照时间戳排序好的。
 
-Some observations:
+让我们重新审视这些数据:
 
-(1) The first element your stream sorter sees is the 4, but you can't just immediately release it as
-the first element of the sorted stream. It may have arrived out of order, and an earlier event might
-yet arrive. In fact, you have the benefit of some god-like knowledge of this stream's future, and
-you can see that your stream sorter should wait at least until the 2 arrives before producing any
-results.
+(1) 我们的排序器第一个看到的数据是 4,但是我们不能立即将其作为已排序流的第一个元素释放。因为我们并不能确定它是有序的,并且较早的事件有可能并未到达。事实上,如果站在上帝视角,我们知道,必须要等到 2 到来时,排序器才可以有事件输出。
 
-*Some buffering, and some delay, is necessary.*
+*需要一些缓冲,需要一些时间,但这都是值得的*
 
-(2) If you do this wrong, you could end up waiting forever. First the sorter saw an event from time
-4, and then an event from time 2. Will an event with a timestamp less than 2 ever arrive? Maybe.
-Maybe not. You could wait forever and never see a 1.
+(2) 接下来的这一步,如果我们选择的是固执的等待,我们永远不会有结果。首先,我们从时间 4 看到了一个事件,然后从时间 2 看到了一个事件。可是,时间戳小于 2 的事件接下来会不会到来呢?可能会,也可能不会。再次站在上帝视角,我们知道,我们永远不会看到 1。
 
-*Eventually you have to be courageous and emit the 2 as the start of the sorted stream.*
+*最终,我们必须勇于承担责任,并发出指令,把 2 作为已排序的事件流的开始*
 
-(3) What you need then is some sort of policy that defines when, for any given timestamped event, to
-stop waiting for the arrival of earlier events.
+(3)然后,我们需要一种策略,该策略定义:对于任何给定时间戳的事件,Flink何时停止等待较早事件的到来。

Review comment:
       ```suggestion
   (3)然后,我们需要一种策略,该策略定义:对于任何给定时间戳的事件,Flink 何时停止等待较早事件的到来。
   ```

##########
File path: docs/learn-flink/streaming_analytics.zh.md
##########
@@ -29,123 +29,86 @@ under the License.
 
 ## Event Time and Watermarks
 
-### Introduction
+<a name="Introduction"></a>
+### 概要
 
-Flink explicitly supports three different notions of time:
+Flink 明确支持以下三种时间语义:
 
-* _event time:_ the time when an event occurred, as recorded by the device producing (or storing) the event
+* _事件时间(event time):_ 事件产生的时间,记录的是设备生产(或者存储)事件的时间
 
-* _ingestion time:_ a timestamp recorded by Flink at the moment it ingests the event
+* _摄取时间(ingestion time):_ Flink 读取事件时记录的时间
 
-* _processing time:_ the time when a specific operator in your pipeline is processing the event
+* _处理时间(processing time):_ Flink pipeline 中具体算子处理事件的时间
 
-For reproducible results, e.g., when computing the maximum price a stock reached during the first
-hour of trading on a given day, you should use event time. In this way the result won't depend on
-when the calculation is performed. This kind of real-time application is sometimes performed using
-processing time, but then the results are determined by the events that happen to be processed
-during that hour, rather than the events that occurred then. Computing analytics based on processing
-time causes inconsistencies, and makes it difficult to re-analyze historic data or test new
-implementations.
+为了获得可重现的结果,例如在计算过去的特定一天里第一个小时股票的最高价格时,我们应该使用事件时间。这样的话,无论什么时间去计算都不会影响输出结果。然而如果使用 processing time 的话,实时应用程序的结果是由程序运行的时间所决定。多次运行基于 processing time 的实时程序,可能得到的结果都不想同,也可能会导致再次分析历史数据或者测试新代码变得异常困难。
 
-### Working with Event Time
+<a name="Working-with-Event-Time"></a>
+### 使用 Event Time
 
-By default, Flink will use processing time. To change this, you can set the Time Characteristic:
+Flink 在默认情况下是使用处理时间。也可以通过下面配置来告诉 Flink 选择哪种时间语义:
 
 {% highlight java %}
 final StreamExecutionEnvironment env =
     StreamExecutionEnvironment.getExecutionEnvironment();
 env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);
 {% endhighlight %}
 
-If you want to use event time, you will also need to supply a Timestamp Extractor and Watermark
-Generator that Flink will use to track the progress of event time. This will be covered in the
-section below on [Working with Watermarks]({% link
-learn-flink/streaming_analytics.zh.md %}#working-with-watermarks), but first we should explain what
-watermarks are.
+如果想要使用事件时间,需要额外给 Flink 提供一个时间戳提取器和 Watermark 生成器,Flink 将使用它们来跟踪事件时间的进度。这将在选节[使用 Watermarks]({% link learn-flink/streaming_analytics.zh.md %}#Working-with-Watermarks) 中介绍,但是首先我们需要解释一下 watermarks 是什么。
 
 ### Watermarks
 
-Let's work through a simple example that will show why watermarks are needed, and how they work.
+让我们通过一个简单的示例来演示为什么需要 watermarks 及其工作方式。
 
-In this example you have a stream of timestamped events that arrive somewhat out of order, as shown
-below. The numbers shown are timestamps that indicate when these events actually occurred. The first
-event to arrive happened at time 4, and it is followed by an event that happened earlier, at time 2,
-and so on:
+在此示例中,我们将看到带有混乱时间戳的事件流,如下所示。显示的数字表达的是这些事件实际发生时间的时间戳。到达的第一个事件发生在时间 4,随后发生的事件发生在更早的时间 2,依此类推:
 
 <div class="text-center" style="font-size: x-large; word-spacing: 0.5em; margin: 1em 0em;">
 ··· 23 19 22 24 21 14 17 13 12 15 9 11 7 2 4 →
 </div>
 
-Now imagine that you are trying create a stream sorter. This is meant to be an application that
-processes each event from a stream as it arrives, and emits a new stream containing the same events,
-but ordered by their timestamps.
+假设我们要对数据流排序,我们想要达到的目的是:应用程序应该在数据流里的事件到达时就有一个算子(我们暂且称之为排序)开始处理事件,这个算子所输出的流是按照时间戳排序好的。
 
-Some observations:
+让我们重新审视这些数据:
 
-(1) The first element your stream sorter sees is the 4, but you can't just immediately release it as
-the first element of the sorted stream. It may have arrived out of order, and an earlier event might
-yet arrive. In fact, you have the benefit of some god-like knowledge of this stream's future, and
-you can see that your stream sorter should wait at least until the 2 arrives before producing any
-results.
+(1) 我们的排序器第一个看到的数据是 4,但是我们不能立即将其作为已排序流的第一个元素释放。因为我们并不能确定它是有序的,并且较早的事件有可能并未到达。事实上,如果站在上帝视角,我们知道,必须要等到 2 到来时,排序器才可以有事件输出。
 
-*Some buffering, and some delay, is necessary.*
+*需要一些缓冲,需要一些时间,但这都是值得的*
 
-(2) If you do this wrong, you could end up waiting forever. First the sorter saw an event from time
-4, and then an event from time 2. Will an event with a timestamp less than 2 ever arrive? Maybe.
-Maybe not. You could wait forever and never see a 1.
+(2) 接下来的这一步,如果我们选择的是固执的等待,我们永远不会有结果。首先,我们从时间 4 看到了一个事件,然后从时间 2 看到了一个事件。可是,时间戳小于 2 的事件接下来会不会到来呢?可能会,也可能不会。再次站在上帝视角,我们知道,我们永远不会看到 1。
 
-*Eventually you have to be courageous and emit the 2 as the start of the sorted stream.*
+*最终,我们必须勇于承担责任,并发出指令,把 2 作为已排序的事件流的开始*
 
-(3) What you need then is some sort of policy that defines when, for any given timestamped event, to
-stop waiting for the arrival of earlier events.
+(3)然后,我们需要一种策略,该策略定义:对于任何给定时间戳的事件,Flink何时停止等待较早事件的到来。
 
-*This is precisely what watermarks do* — they define when to stop waiting for earlier events.
+*这正是 watermarks 的作用* — 它们定义何时停止等待较早的事件。
 
-Event time processing in Flink depends on *watermark generators* that insert special timestamped
-elements into the stream, called *watermarks*. A watermark for time _t_ is an assertion that the
-stream is (probably) now complete up through time _t_.
+Flink 中事件时间的处理取决于 *watermark 生成器*,后者将带有时间戳的特殊元素插入流中形成 *watermarks*。事件时间 _t_ 的 watermark 代表 _t_ 之后(很可能)不会有新的元素到达。
 
-When should this stream sorter stop waiting, and push out the 2 to start the sorted stream? When a
-watermark arrives with a timestamp of 2, or greater.
+当 watermark 以 2 或更大的时间戳到达时,事件流的排序器应何时停止等待,并输出 2 作为已经排序好的流。
 
-(4) You might imagine different policies for deciding how to generate watermarks.
+(4) 我们可能会思考,如何决定 watermarks 的不同生成策略
 
-Each event arrives after some delay, and these delays vary, so some events are delayed more than
-others. One simple approach is to assume that these delays are bounded by some maximum delay. Flink
-refers to this strategy as *bounded-out-of-orderness* watermarking. It is easy to imagine more
-complex approaches to watermarking, but for most applications a fixed delay works well enough.
+每个事件都会延迟一段时间后到达,然而这些延迟有所不同,有些事件可能其他事件延迟得更多。一种简单的方法是假定这些延迟受某个最大延迟的限制。Flink 将此策略称为 *最大无序边界(bounded-out-of-orderness)* watermark。当然,我们可以想像出更好的生成 watermark 的方法,但是对于大多数应用而言,固定延迟策略已经足够了。
 
-### Latency vs. Completeness
+<a name="Latency-vs.-Completeness"></a>
+### 延迟 VS 正确性
 
-Another way to think about watermarks is that they give you, the developer of a streaming
-application, control over the tradeoff between latency and completeness. Unlike in batch processing,
-where one has the luxury of being able to have complete knowledge of the input before producing any
-results, with streaming you must eventually stop waiting to see more of the input, and produce some
-sort of result.
+watermarks 给了开发者流处理的一种选择,它们使开发人员在开发应用程序是可以控制延迟和完整性之间的权衡。与批处理不同,批处理中的奢侈之处在于可以在产生任何结果之前完全了解输入,而使用流式传输,我们不被允许等待所有的时间都产生了,才输出排序好的数据,这与流相违背。
 
-You can either configure your watermarking aggressively, with a short bounded delay, and thereby
-take the risk of producing results with rather incomplete knowledge of the input -- i.e., a possibly
-wrong result, produced quickly. Or you can wait longer, and produce results that take advantage of
-having more complete knowledge of the input stream(s).
+我们可以把 watermarks 的边界时间配置的相对较短,从而冒着在输入了解不完全的情况下产生结果的风险-即可能会很快产生错误结果。或者,你可以等待更长的时间,并利用对输入流的更全面的了解来产生结果。
 
-It is also possible to implement hybrid solutions that produce initial results quickly, and then
-supply updates to those results as additional (late) data is processed. This is a good approach for
-some applications.
+当然也可以实施混合解决方案,先快速产生初步结果,然后在处理其他(最新)数据时向这些结果提供更新。对于有一些对延迟的容忍程度很低,但是又对结果有很严格的要求的场景下,或许是一个福音。
 
-### Lateness
+<a name="Latency"></a>
+### 延迟
 
-Lateness is defined relative to the watermarks. A `Watermark(t)` asserts that the stream is complete
-up through time _t_; any event following this watermark whose timestamp is &le; _t_ is late.
+延迟是相对于 watermarks 定义的。`Watermark(t)` 判定事件流的时间已经到达了 _t_; watermark 之后的时间戳为 &le; _t_ 的任何事件都被称之为延迟事件。
 
-### Working with Watermarks
+<a name="Working-with-Watermarks"></a>
+### 使用 Watermarks
 
-In order to perform event-time-based event processing, Flink needs to know the time associated with
-each event, and it also needs the stream to include watermarks.
+如果想要使用基于带有事件时间戳的事件流,Flink 需要知道与每个事件相关的时间戳,这个时间戳就是 watermarks。

Review comment:
       这里的意思是说 `流必须包含 watermark` 吧,而不是 `这个时间戳就是 watermark` 对于事件时间,watermark 也可以和 eventtime 不一样的。

##########
File path: docs/learn-flink/streaming_analytics.zh.md
##########
@@ -355,18 +306,15 @@ private static class MyWindowFunction extends ProcessWindowFunction<
 }
 {% endhighlight %}
 
-Notice that the `Iterable<SensorReading>`
-will contain exactly one reading -- the pre-aggregated maximum computed by `MyReducingMax`.
+请注意 `Iterable<SensorReading>`将只包含一个读数 -- `MyReducingMax` 计算出的预先汇总的最大值。
 
-### Late Events
+<a name="Late-Events"></a>
+### 晚到的事件
 
-By default, when using event time windows, late events are dropped. There are two optional parts of
-the window API that give you more control over this.
+默认场景下,超过最大无序边界的事件会被删除,但是 Flink 给了我们两个选择去控制这些事件。
+
+您可以使用一种称为[旁路输出]({% link learn-flink/event_driven.zh.md %}#side-outputs) 的机制来安排将要删除的事件收集到侧输出流中,这里是一个示例:

Review comment:
       You could open a hotfix PR against `event_driven.zh.md` to fix this link (add an <a name="xxx"></a> before the side-outputs section); the current link is broken (that is not a problem introduced here, the earlier doc already had it).
   
   See the [contribution guide](https://flink.apache.org/contributing/contribute-code.html) for how to submit a hotfix.
   
   Please ping me once it is opened.
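
Since this comment concerns the side-output link for late events, here is a minimal sketch (hypothetical `Event`, `Result`, key field, and window function) of the pattern that paragraph describes, collecting too-late events into a side output:

```java
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.util.OutputTag;

// Events too late for their already-fired window are routed to `lateTag`
// instead of being silently dropped.
OutputTag<Event> lateTag = new OutputTag<Event>("late") {};

SingleOutputStreamOperator<Result> results = events
    .keyBy(e -> e.key)
    .window(TumblingEventTimeWindows.of(Time.minutes(1)))
    .sideOutputLateData(lateTag)
    .process(new MyWindowFunction());

DataStream<Event> lateEvents = results.getSideOutput(lateTag);
```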




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org