Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2020/05/19 16:51:33 UTC

[GitHub] [flink] XBaith commented on a change in pull request #12237: [FLINK-17290] [chinese-translation, Documentation / Training] Transla…

XBaith commented on a change in pull request #12237:
URL: https://github.com/apache/flink/pull/12237#discussion_r427437580



##########
File path: docs/training/streaming_analytics.zh.md
##########
@@ -27,125 +27,101 @@ under the License.
 * This will be replaced by the TOC
 {:toc}
 
-## Event Time and Watermarks
+## 事件时间和水印
 
-### Introduction
+### 简介
 
-Flink explicitly supports three different notions of time:
+Flink 明确的支持以下三种事件时间:
 
-* _event time:_ the time when an event occurred, as recorded by the device producing (or storing) the event
+* _事件时间:_ 事件产生的时间,记录的是设备生产(或者存储)事件的时间
 
-* _ingestion time:_ a timestamp recorded by Flink at the moment it ingests the event
+* _摄取时间:_ Flink 提取事件时记录的时间戳
 
-* _processing time:_ the time when a specific operator in your pipeline is processing the event
+* _处理时间:_ Flink 中通过特定的操作处理事件的时间
 
-For reproducible results, e.g., when computing the maximum price a stock reached during the first
-hour of trading on a given day, you should use event time. In this way the result won't depend on
-when the calculation is performed. This kind of real-time application is sometimes performed using
-processing time, but then the results are determined by the events that happen to be processed
-during that hour, rather than the events that occurred then. Computing analytics based on processing
-time causes inconsistencies, and makes it difficult to re-analyze historic data or test new
-implementations.
+为了获得可重现的结果,例如在计算过去的特定一天里第一个小时股票的最高价格时,我们应该使用事件时间。这样的话,无论
+什么时间去计算都不会影响输出结果。然而有些人,在实时计算应用时使用处理时间,这样的话,输出结果就会被处理时间点所决
+定,而不是事件的生成时间。基于处理时间会导致多次计算的结果不一致,也可能会导致重新分析历史数据和测试变得异常困难。
 
-### Working with Event Time
+### 使用事件时间
 
-By default, Flink will use processing time. To change this, you can set the Time Characteristic:
+Flink 在默认情况下使用处理时间。也可以通过如下配置来告诉 Flink 选择哪种事件时间:
 
 {% highlight java %}
 final StreamExecutionEnvironment env =
     StreamExecutionEnvironment.getExecutionEnvironment();
 env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);
 {% endhighlight %}
 
-If you want to use event time, you will also need to supply a Timestamp Extractor and Watermark
-Generator that Flink will use to track the progress of event time. This will be covered in the
-section below on [Working with Watermarks]({% link
-training/streaming_analytics.zh.md %}#working-with-watermarks), but first we should explain what
-watermarks are.
+如果想要使用事件时间,则需要额外给 Flink 提供一个时间戳的提取器和水印,Flink 将使用它们来跟踪事件时间的进度。这
+将在选节[使用水印]({% linktutorials/streaming_analytics.md %}#使用水印)中介绍,但是首先我们需要解释一下
+水印是什么。
 
-### Watermarks
+### 水印
 
-Let's work through a simple example that will show why watermarks are needed, and how they work.
+让我们通过一个简单的示例来演示,该示例将说明为什么需要水印及其工作方式。
 
-In this example you have a stream of timestamped events that arrive somewhat out of order, as shown
-below. The numbers shown are timestamps that indicate when these events actually occurred. The first
-event to arrive happened at time 4, and it is followed by an event that happened earlier, at time 2,
-and so on:
+在此示例中,我们将看到带有混乱时间戳的事件流,如下所示。显示的数字表达的是这些事件实际发生时间的时间戳。到达的
+第一个事件发生在时间4,随后发生的事件发生在更早的时间2,依此类推:
 
 <div class="text-center" style="font-size: x-large; word-spacing: 0.5em; margin: 1em 0em;">
 ··· 23 19 22 24 21 14 17 13 12 15 9 11 7 2 4 →
 </div>
 
-Now imagine that you are trying create a stream sorter. This is meant to be an application that
-processes each event from a stream as it arrives, and emits a new stream containing the same events,
-but ordered by their timestamps.
+假设我们要对数据流排序,我们想要达到的目的是:应用程序应该在数据流里的事件到达时就处理每个事件,并发出包含相同
+事件但按其时间戳排序的新流。
 
-Some observations:
+让我们重新审视这些数据:
 
-(1) The first element your stream sorter sees is the 4, but you can't just immediately release it as
-the first element of the sorted stream. It may have arrived out of order, and an earlier event might
-yet arrive. In fact, you have the benefit of some god-like knowledge of this stream's future, and
-you can see that your stream sorter should wait at least until the 2 arrives before producing any
-results.
+(1) 我们的排序器第一个看到的数据是4,但是我们不能立即将其作为已排序流的第一个元素释放。因为我们并不能确定它是
+有序的,并且较早的事件有可能并未到达。事实上,如果站在上帝视角,我们知道,必须要等到2到来时,排序器才可以有事件输出。
 
-*Some buffering, and some delay, is necessary.*
+*需要一些缓冲,需要一些时间,但这都是值得的*
 
-(2) If you do this wrong, you could end up waiting forever. First the sorter saw an event from time
-4, and then an event from time 2. Will an event with a timestamp less than 2 ever arrive? Maybe.
-Maybe not. You could wait forever and never see a 1.
+(2) 接下来的这一步,如果我们选择的是固执的等待,我们永远不会有结果。首先,我们从时间4看到了一个事件,然后从时
+间2看到了一个事件。可是,时间戳小于2的事件接下来会不会到来呢?可能会,也可能不会。再次站在上帝视角,我们知道,我
+们永远不会看到1。
 
-*Eventually you have to be courageous and emit the 2 as the start of the sorted stream.*
+*最终,我们必须勇于承担责任,并发出指令,把2作为已排序的事件流的开始*
 
-(3) What you need then is some sort of policy that defines when, for any given timestamped event, to
-stop waiting for the arrival of earlier events.
+(3)然后,我们需要一种策略,该策略定义:对于任何给定时间戳的事件,Flink何时停止等待较早事件的到来。
 
-*This is precisely what watermarks do* — they define when to stop waiting for earlier events.
+*这正是水印的作用* — 它们定义何时停止等待较早的事件。
 
-Event time processing in Flink depends on *watermark generators* that insert special timestamped
-elements into the stream, called *watermarks*. A watermark for time _t_ is an assertion that the
-stream is (probably) now complete up through time _t_.
+Flink中事件时间的处理取决于 *水印生成器*,后者将带有时间戳的特殊元素插入流中,称为 *水印*。时间 _t_ 的水印是
+断言该事件流现在到时间 _t_ 已(可能)完成。

Review comment:
       ```suggestion
   流在 _t_ 之前(很可能)已经处理完成了。
   ```
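   The literal translation here still reads as unclear to me. I suggest translating for meaning instead, for example: “事件时间 _t_ 的 watermark 代表 _t_ 之后(很可能)没有新的元素到达。” (roughly: "a watermark for event time _t_ indicates that, most likely, no new elements will arrive after _t_").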
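
   To make the intended meaning concrete for whoever settles the wording: the English original in the hunk above says a watermark for time _t_ asserts that the stream is (probably) complete up through _t_. The following is a minimal plain-Java sketch of the stream sorter from the example, not Flink's API. The class name `WatermarkSorterSketch`, the method `sort`, and the parameter `maxOutOfOrderness` are hypothetical names chosen for illustration, and the watermark rule (largest timestamp seen minus a fixed bound) is an assumed policy.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

public class WatermarkSorterSketch {

    /**
     * Sorts an out-of-order stream of timestamps. The watermark trails the
     * largest timestamp seen so far by maxOutOfOrderness (an assumed policy).
     * Events older than the current watermark are treated as late and dropped.
     */
    static List<Long> sort(long[] arrivals, long maxOutOfOrderness) {
        PriorityQueue<Long> buffer = new PriorityQueue<>(); // smallest timestamp first
        List<Long> emitted = new ArrayList<>();
        long maxSeen = Long.MIN_VALUE;
        long watermark = Long.MIN_VALUE;

        for (long ts : arrivals) {
            if (ts < watermark) {
                continue; // late: the stream was already declared complete past ts
            }
            buffer.add(ts);
            maxSeen = Math.max(maxSeen, ts);
            watermark = maxSeen - maxOutOfOrderness;
            // Stop waiting for earlier events: release everything up to the watermark.
            while (!buffer.isEmpty() && buffer.peek() <= watermark) {
                emitted.add(buffer.poll());
            }
        }
        // End of input: nothing earlier can arrive, so flush the rest in order.
        while (!buffer.isEmpty()) {
            emitted.add(buffer.poll());
        }
        return emitted;
    }

    public static void main(String[] args) {
        // Arrival order of the example stream (read right to left in the diagram).
        long[] arrivals = {4, 2, 7, 11, 9, 15, 12, 13, 17, 14, 21, 24, 22, 19, 23};
        System.out.println(sort(arrivals, 5));
    }
}
```

   With a bound of 5 no event in the example is late, so every event is emitted in timestamp order. With a bound of 0 the sorter never waits, and each event that arrives behind a larger timestamp is dropped. That is the trade-off between delay and completeness that the "(probably)" in the original hints at.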

##########
File path: docs/training/streaming_analytics.zh.md
##########
@@ -27,125 +27,101 @@ under the License.
 * This will be replaced by the TOC
 {:toc}
 
-## Event Time and Watermarks
+## 事件时间和水印
 
-### Introduction
+### 简介
 
-Flink explicitly supports three different notions of time:
+Flink 明确的支持以下三种事件时间:
 
-* _event time:_ the time when an event occurred, as recorded by the device producing (or storing) the event
+* _事件时间:_ 事件产生的时间,记录的是设备生产(或者存储)事件的时间
 
-* _ingestion time:_ a timestamp recorded by Flink at the moment it ingests the event
+* _摄取时间:_ Flink 提取事件时记录的时间戳
 
-* _processing time:_ the time when a specific operator in your pipeline is processing the event
+* _处理时间:_ Flink 中通过特定的操作处理事件的时间
 
-For reproducible results, e.g., when computing the maximum price a stock reached during the first
-hour of trading on a given day, you should use event time. In this way the result won't depend on
-when the calculation is performed. This kind of real-time application is sometimes performed using
-processing time, but then the results are determined by the events that happen to be processed
-during that hour, rather than the events that occurred then. Computing analytics based on processing
-time causes inconsistencies, and makes it difficult to re-analyze historic data or test new
-implementations.
+为了获得可重现的结果,例如在计算过去的特定一天里第一个小时股票的最高价格时,我们应该使用事件时间。这样的话,无论
+什么时间去计算都不会影响输出结果。然而有些人,在实时计算应用时使用处理时间,这样的话,输出结果就会被处理时间点所决
+定,而不是事件的生成时间。基于处理时间会导致多次计算的结果不一致,也可能会导致重新分析历史数据和测试变得异常困难。
 
-### Working with Event Time
+### 使用事件时间
 
-By default, Flink will use processing time. To change this, you can set the Time Characteristic:
+Flink 在默认情况下使用处理时间。也可以通过如下配置来告诉 Flink 选择哪种事件时间:
 
 {% highlight java %}
 final StreamExecutionEnvironment env =
     StreamExecutionEnvironment.getExecutionEnvironment();
 env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);
 {% endhighlight %}
 
-If you want to use event time, you will also need to supply a Timestamp Extractor and Watermark
-Generator that Flink will use to track the progress of event time. This will be covered in the
-section below on [Working with Watermarks]({% link
-training/streaming_analytics.zh.md %}#working-with-watermarks), but first we should explain what
-watermarks are.
+如果想要使用事件时间,则需要额外给 Flink 提供一个时间戳的提取器和水印,Flink 将使用它们来跟踪事件时间的进度。这
+将在选节[使用水印]({% linktutorials/streaming_analytics.md %}#使用水印)中介绍,但是首先我们需要解释一下
+水印是什么。
 
-### Watermarks
+### 水印
 
-Let's work through a simple example that will show why watermarks are needed, and how they work.
+让我们通过一个简单的示例来演示,该示例将说明为什么需要水印及其工作方式。
 
-In this example you have a stream of timestamped events that arrive somewhat out of order, as shown
-below. The numbers shown are timestamps that indicate when these events actually occurred. The first
-event to arrive happened at time 4, and it is followed by an event that happened earlier, at time 2,
-and so on:
+在此示例中,我们将看到带有混乱时间戳的事件流,如下所示。显示的数字表达的是这些事件实际发生时间的时间戳。到达的
+第一个事件发生在时间4,随后发生的事件发生在更早的时间2,依此类推:
 
 <div class="text-center" style="font-size: x-large; word-spacing: 0.5em; margin: 1em 0em;">
 ··· 23 19 22 24 21 14 17 13 12 15 9 11 7 2 4 →
 </div>
 
-Now imagine that you are trying create a stream sorter. This is meant to be an application that
-processes each event from a stream as it arrives, and emits a new stream containing the same events,
-but ordered by their timestamps.
+假设我们要对数据流排序,我们想要达到的目的是:应用程序应该在数据流里的事件到达时就处理每个事件,并发出包含相同
+事件但按其时间戳排序的新流。
 
-Some observations:
+让我们重新审视这些数据:
 
-(1) The first element your stream sorter sees is the 4, but you can't just immediately release it as
-the first element of the sorted stream. It may have arrived out of order, and an earlier event might
-yet arrive. In fact, you have the benefit of some god-like knowledge of this stream's future, and
-you can see that your stream sorter should wait at least until the 2 arrives before producing any
-results.
+(1) 我们的排序器第一个看到的数据是4,但是我们不能立即将其作为已排序流的第一个元素释放。因为我们并不能确定它是
+有序的,并且较早的事件有可能并未到达。事实上,如果站在上帝视角,我们知道,必须要等到2到来时,排序器才可以有事件输出。
 
-*Some buffering, and some delay, is necessary.*
+*需要一些缓冲,需要一些时间,但这都是值得的*
 
-(2) If you do this wrong, you could end up waiting forever. First the sorter saw an event from time
-4, and then an event from time 2. Will an event with a timestamp less than 2 ever arrive? Maybe.
-Maybe not. You could wait forever and never see a 1.
+(2) 接下来的这一步,如果我们选择的是固执的等待,我们永远不会有结果。首先,我们从时间4看到了一个事件,然后从时
+间2看到了一个事件。可是,时间戳小于2的事件接下来会不会到来呢?可能会,也可能不会。再次站在上帝视角,我们知道,我
+们永远不会看到1。
 
-*Eventually you have to be courageous and emit the 2 as the start of the sorted stream.*
+*最终,我们必须勇于承担责任,并发出指令,把2作为已排序的事件流的开始*
 
-(3) What you need then is some sort of policy that defines when, for any given timestamped event, to
-stop waiting for the arrival of earlier events.
+(3)然后,我们需要一种策略,该策略定义:对于任何给定时间戳的事件,Flink何时停止等待较早事件的到来。
 
-*This is precisely what watermarks do* — they define when to stop waiting for earlier events.
+*这正是水印的作用* — 它们定义何时停止等待较早的事件。

Review comment:
       ```suggestion
   *这正是 watermark 的作用* — 定义何时停止等待之前的事件。
   ```

##########
File path: docs/training/streaming_analytics.zh.md
##########
@@ -437,39 +394,32 @@ stream
     .reduce(<same reduce function>)
 {% endhighlight %}
 
-You might expect Flink's runtime to be smart enough to do this parallel pre-aggregation for you
-(provided you are using a ReduceFunction or AggregateFunction), but it's not.
+可能我们会猜测以 Flink 的能力,想要做到这样看起来是可行的(前提是您使用的是ReduceFunction或AggregateFunction),但不是。
 
-The reason why this works is that the events produced by a time window are assigned timestamps
-based on the time at the end of the window. So, for example, all of the events produced
-by an hour-long window will have timestamps marking the end of an hour. Any subsequent window
-consuming those events should have a duration that is the same as, or a multiple of, the
-previous window.
+之所以可行,是因为时间窗口产生的事件是根据窗口结束时的时间分配时间戳的。例如,一个小时小时的窗口所产生的所有事
+件都将带有标记一个小时结束的时间戳。后面的窗口内的数据消费和前面的流产生的数据是一致的。
 
-#### No Results for Empty TimeWindows
+#### 空的窗口不会产出结果
 
-Windows are only created when events are assigned to them. So if there are no events in a given time
-frame, no results will be reported.
+事件会触发窗口的创建。换句话说,如果在特定的窗口内没有事件,就不会有窗口,就不会有输出结果。
 
 #### Late Events Can Cause Late Merges
 
-Session windows are based on an abstraction of windows that can _merge_. Each element is initially
-assigned to a new window, after which windows are merged whenever the gap between them is small
-enough. In this way, a late event can bridge the gap separating two previously separate sessions,
-producing a late merge.
+会话窗口的实现是基于窗口的一个抽象能力,窗口可以_聚合_。会话窗口中的每个数据在初始被消费时,都会被分配一个新的
+窗口,但是如果窗口之间的间隔足够小,多个窗口就会被聚合。延迟事件可以弥合分隔两个先前分开的会话的间隔,从而产生

Review comment:
       ```suggestion
   窗口,但是如果窗口之间的间隔足够小,多个窗口就会被聚合。延迟事件可以弥合两个先前分开的会话间隔,从而产生
   ```

##########
File path: docs/training/streaming_analytics.zh.md
##########
@@ -27,125 +27,101 @@ under the License.
 * This will be replaced by the TOC
 {:toc}
 
-## Event Time and Watermarks
+## 事件时间和水印
 
-### Introduction
+### 简介
 
-Flink explicitly supports three different notions of time:
+Flink 明确的支持以下三种事件时间:
 
-* _event time:_ the time when an event occurred, as recorded by the device producing (or storing) the event
+* _事件时间:_ 事件产生的时间,记录的是设备生产(或者存储)事件的时间
 
-* _ingestion time:_ a timestamp recorded by Flink at the moment it ingests the event
+* _摄取时间:_ Flink 提取事件时记录的时间戳
 
-* _processing time:_ the time when a specific operator in your pipeline is processing the event
+* _处理时间:_ Flink 中通过特定的操作处理事件的时间
 
-For reproducible results, e.g., when computing the maximum price a stock reached during the first
-hour of trading on a given day, you should use event time. In this way the result won't depend on
-when the calculation is performed. This kind of real-time application is sometimes performed using
-processing time, but then the results are determined by the events that happen to be processed
-during that hour, rather than the events that occurred then. Computing analytics based on processing
-time causes inconsistencies, and makes it difficult to re-analyze historic data or test new
-implementations.
+为了获得可重现的结果,例如在计算过去的特定一天里第一个小时股票的最高价格时,我们应该使用事件时间。这样的话,无论
+什么时间去计算都不会影响输出结果。然而有些人,在实时计算应用时使用处理时间,这样的话,输出结果就会被处理时间点所决
+定,而不是事件的生成时间。基于处理时间会导致多次计算的结果不一致,也可能会导致重新分析历史数据和测试变得异常困难。
 
-### Working with Event Time
+### 使用事件时间
 
-By default, Flink will use processing time. To change this, you can set the Time Characteristic:
+Flink 在默认情况下使用处理时间。也可以通过如下配置来告诉 Flink 选择哪种事件时间:
 
 {% highlight java %}
 final StreamExecutionEnvironment env =
     StreamExecutionEnvironment.getExecutionEnvironment();
 env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);
 {% endhighlight %}
 
-If you want to use event time, you will also need to supply a Timestamp Extractor and Watermark
-Generator that Flink will use to track the progress of event time. This will be covered in the
-section below on [Working with Watermarks]({% link
-training/streaming_analytics.zh.md %}#working-with-watermarks), but first we should explain what
-watermarks are.
+如果想要使用事件时间,则需要额外给 Flink 提供一个时间戳的提取器和水印,Flink 将使用它们来跟踪事件时间的进度。这
+将在选节[使用水印]({% linktutorials/streaming_analytics.md %}#使用水印)中介绍,但是首先我们需要解释一下
+水印是什么。
 
-### Watermarks
+### 水印
 
-Let's work through a simple example that will show why watermarks are needed, and how they work.
+让我们通过一个简单的示例来演示,该示例将说明为什么需要水印及其工作方式。
 
-In this example you have a stream of timestamped events that arrive somewhat out of order, as shown
-below. The numbers shown are timestamps that indicate when these events actually occurred. The first
-event to arrive happened at time 4, and it is followed by an event that happened earlier, at time 2,
-and so on:
+在此示例中,我们将看到带有混乱时间戳的事件流,如下所示。显示的数字表达的是这些事件实际发生时间的时间戳。到达的
+第一个事件发生在时间4,随后发生的事件发生在更早的时间2,依此类推:
 
 <div class="text-center" style="font-size: x-large; word-spacing: 0.5em; margin: 1em 0em;">
 ··· 23 19 22 24 21 14 17 13 12 15 9 11 7 2 4 →
 </div>
 
-Now imagine that you are trying create a stream sorter. This is meant to be an application that
-processes each event from a stream as it arrives, and emits a new stream containing the same events,
-but ordered by their timestamps.
+假设我们要对数据流排序,我们想要达到的目的是:应用程序应该在数据流里的事件到达时就处理每个事件,并发出包含相同
+事件但按其时间戳排序的新流。
 
-Some observations:
+让我们重新审视这些数据:
 
-(1) The first element your stream sorter sees is the 4, but you can't just immediately release it as
-the first element of the sorted stream. It may have arrived out of order, and an earlier event might
-yet arrive. In fact, you have the benefit of some god-like knowledge of this stream's future, and
-you can see that your stream sorter should wait at least until the 2 arrives before producing any
-results.
+(1) 我们的排序器第一个看到的数据是4,但是我们不能立即将其作为已排序流的第一个元素释放。因为我们并不能确定它是
+有序的,并且较早的事件有可能并未到达。事实上,如果站在上帝视角,我们知道,必须要等到2到来时,排序器才可以有事件输出。
 
-*Some buffering, and some delay, is necessary.*
+*需要一些缓冲,需要一些时间,但这都是值得的*
 
-(2) If you do this wrong, you could end up waiting forever. First the sorter saw an event from time
-4, and then an event from time 2. Will an event with a timestamp less than 2 ever arrive? Maybe.
-Maybe not. You could wait forever and never see a 1.
+(2) 接下来的这一步,如果我们选择的是固执的等待,我们永远不会有结果。首先,我们从时间4看到了一个事件,然后从时
+间2看到了一个事件。可是,时间戳小于2的事件接下来会不会到来呢?可能会,也可能不会。再次站在上帝视角,我们知道,我
+们永远不会看到1。
 
-*Eventually you have to be courageous and emit the 2 as the start of the sorted stream.*
+*最终,我们必须勇于承担责任,并发出指令,把2作为已排序的事件流的开始*
 
-(3) What you need then is some sort of policy that defines when, for any given timestamped event, to
-stop waiting for the arrival of earlier events.
+(3)然后,我们需要一种策略,该策略定义:对于任何给定时间戳的事件,Flink何时停止等待较早事件的到来。
 
-*This is precisely what watermarks do* — they define when to stop waiting for earlier events.
+*这正是水印的作用* — 它们定义何时停止等待较早的事件。
 
-Event time processing in Flink depends on *watermark generators* that insert special timestamped
-elements into the stream, called *watermarks*. A watermark for time _t_ is an assertion that the
-stream is (probably) now complete up through time _t_.
+Flink中事件时间的处理取决于 *水印生成器*,后者将带有时间戳的特殊元素插入流中,称为 *水印*。时间 _t_ 的水印是

Review comment:
       ```suggestion
   Flink 中事件时间的处理取决于 *watermark 生成器*,后者将带有时间戳的特殊元素插入流中形成 *watermark*。时间 _t_ 的 watermark 表示
   ```

##########
File path: docs/training/streaming_analytics.zh.md
##########
@@ -207,67 +180,64 @@ stream.
     .reduce|aggregate|process(<window function>)
 {% endhighlight %}
 
-You can also use windowing with non-keyed streams, but keep in mind that in this case, the
-processing will _not_ be done in parallel:
+您不是必须使用键控事件流,但是值得注意的是,如果不使用键控事件流,我们的程序就不能 _并行_ 处理。
 
 {% highlight java %}
 stream.
     .windowAll(<window assigner>)
     .reduce|aggregate|process(<window function>)
 {% endhighlight %}
 
-### Window Assigners
+### 窗口分配

Review comment:
       ```suggestion
   ### 窗口分配器
   ```

##########
File path: docs/training/streaming_analytics.zh.md
##########
@@ -207,67 +180,64 @@ stream.
     .reduce|aggregate|process(<window function>)
 {% endhighlight %}
 
-You can also use windowing with non-keyed streams, but keep in mind that in this case, the
-processing will _not_ be done in parallel:
+您不是必须使用键控事件流,但是值得注意的是,如果不使用键控事件流,我们的程序就不能 _并行_ 处理。
 
 {% highlight java %}
 stream.
     .windowAll(<window assigner>)
     .reduce|aggregate|process(<window function>)
 {% endhighlight %}
 
-### Window Assigners
+### 窗口分配
 
-Flink has several built-in types of window assigners, which are illustrated below:
+Flink 有一些内置的窗口分配器,如下所示:
 
 <img src="{{ site.baseurl }}/fig/window-assigners.svg" alt="Window assigners" class="center" width="80%" />
 
-Some examples of what these window assigners might be used for, and how to specify them:
+通过一些示例来展示关于这些窗口如何使用,或者如何区分它们:
 
-* Tumbling time windows
-  * _page views per minute_
+* 滚动时间窗口
+  * _每分钟页面浏览量_
   * `TumblingEventTimeWindows.of(Time.minutes(1))`
-* Sliding time windows
-  * _page views per minute computed every 10 seconds_
+* 滑动时间窗口
+  * _每10秒钟计算前1分钟的页面浏览量_
   * `SlidingEventTimeWindows.of(Time.minutes(1), Time.seconds(10))`
-* Session windows 
-  * _page views per session, where sessions are defined by a gap of at least 30 minutes between sessions_
+* 会话窗口
+  * _每个会话的网页浏览量,其中会话之间的间隔至少为30分钟_
   * `EventTimeSessionWindows.withGap(Time.minutes(30))`
 
-Durations can be specified using one of `Time.milliseconds(n)`, `Time.seconds(n)`, `Time.minutes(n)`, `Time.hours(n)`, and `Time.days(n)`.
+以下都是一些可以使用的间隔时间 `Time.milliseconds(n)`, `Time.seconds(n)`, `Time.minutes(n)`,
+ `Time.hours(n)`, and `Time.days(n)`。
 
-The time-based window assigners (including session windows) come in both event time and processing
-time flavors. There are significant tradeoffs between these two types of time windows. With
-processing time windowing you have to accept these limitations:
+基于时间的窗口分配器(包括会话时间)既可以处理 `事件时间`,也可以处理 `处理时间`。这两种基于时间的处理没有
+哪一个更好,我们必须折衷。使用 `处理时间`,我们必须接受以下限制:
 
-* can not correctly process historic data,
-* can not correctly handle out-of-order data,
-* results will be non-deterministic,
+* 无法正确处理历史数据,
+* 无法正确处理超过最大乱序边界的数据,
+* 结果将是不确定的,
 
-but with the advantage of lower latency. 
+但是有自己的优势,较低的延迟。
 
-When working with count-based windows, keep in mind that these windows will not fire until a batch
-is complete. There's no option to time-out and process a partial window, though you could implement
-that behavior yourself with a custom Trigger.
+使用基于计数的窗口时,请记住,直到完成批处理后,这些窗口才会触发。尽管可以使用自定义触发器自己实现该行为,但无法
+应对超时和处理部分窗口。
 
-A global window assigner assigns every event (with the same key) to the same global window. This is
-only useful if you are going to do your own custom windowing, with a custom Trigger. In many cases
-where this might seem useful you will be better off using a `ProcessFunction` as described
-[in another section]({% link training/event_driven.zh.md %}#process-functions).
+仅当您要使用自定义触发器进行自定义窗口时,全局窗口分配器将每个事件(具有相同的键)分配给相同的全局窗口。 很多情
+况下,一个比较好的建议是使用 `ProcessFunction`,具体介绍在[这里]({% link tutorials/event_driven.md %}#process-functions)。
 
-### Window Functions
+### 窗口方法

Review comment:
       ```suggestion
   ### 窗口应用函数
   ```
   This follows the translation used in 《基于 Apache Flink 的流处理》 (the Chinese edition of "Stream Processing with Apache Flink").
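
   Since the wording of the window assigner bullets in this hunk depends on what the assigners actually do, here is a rough plain-Java sketch of how a timestamp could be mapped to tumbling and sliding windows. It is an illustration under stated assumptions (non-negative millisecond timestamps, no window offset), not Flink's implementation. `WindowAssignmentSketch` and both method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class WindowAssignmentSketch {

    /** Start of the tumbling window of the given size that contains ts (assumes ts >= 0). */
    static long tumblingWindowStart(long ts, long size) {
        return ts - (ts % size);
    }

    /**
     * Starts of all sliding windows [start, start + size) that contain ts,
     * latest window first (assumes ts >= 0; starts may be negative for small ts).
     */
    static List<Long> slidingWindowStarts(long ts, long size, long slide) {
        List<Long> starts = new ArrayList<>();
        long lastStart = ts - (ts % slide);
        for (long start = lastStart; start > ts - size; start -= slide) {
            starts.add(start);
        }
        return starts;
    }

    public static void main(String[] args) {
        long ts = 75_000L; // an event 75 seconds after time zero

        // "page views per minute": exactly one one-minute window contains the event.
        System.out.println(tumblingWindowStart(ts, 60_000L));

        // "page views per minute computed every 10 seconds": the event falls
        // into size / slide overlapping windows.
        System.out.println(slidingWindowStarts(ts, 60_000L, 10_000L));
    }
}
```

   The point for the translation is that a sliding window assigns each event to several overlapping windows, whereas a tumbling window assigns it to exactly one.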

##########
File path: docs/training/streaming_analytics.zh.md
##########
@@ -437,39 +394,32 @@ stream
     .reduce(<same reduce function>)
 {% endhighlight %}
 
-You might expect Flink's runtime to be smart enough to do this parallel pre-aggregation for you
-(provided you are using a ReduceFunction or AggregateFunction), but it's not.
+可能我们会猜测以 Flink 的能力,想要做到这样看起来是可行的(前提是您使用的是ReduceFunction或AggregateFunction),但不是。
 
-The reason why this works is that the events produced by a time window are assigned timestamps
-based on the time at the end of the window. So, for example, all of the events produced
-by an hour-long window will have timestamps marking the end of an hour. Any subsequent window
-consuming those events should have a duration that is the same as, or a multiple of, the
-previous window.
+之所以可行,是因为时间窗口产生的事件是根据窗口结束时的时间分配时间戳的。例如,一个小时小时的窗口所产生的所有事
+件都将带有标记一个小时结束的时间戳。后面的窗口内的数据消费和前面的流产生的数据是一致的。
 
-#### No Results for Empty TimeWindows
+#### 空的窗口不会产出结果
 
-Windows are only created when events are assigned to them. So if there are no events in a given time
-frame, no results will be reported.
+事件会触发窗口的创建。换句话说,如果在特定的窗口内没有事件,就不会有窗口,就不会有输出结果。
 
 #### Late Events Can Cause Late Merges
 
-Session windows are based on an abstraction of windows that can _merge_. Each element is initially
-assigned to a new window, after which windows are merged whenever the gap between them is small
-enough. In this way, a late event can bridge the gap separating two previously separate sessions,
-producing a late merge.
+会话窗口的实现是基于窗口的一个抽象能力,窗口可以_聚合_。会话窗口中的每个数据在初始被消费时,都会被分配一个新的
+窗口,但是如果窗口之间的间隔足够小,多个窗口就会被聚合。延迟事件可以弥合分隔两个先前分开的会话的间隔,从而产生
+一个虽然延迟但是更加准确地结果。

Review comment:
       ```suggestion
   一个虽然有延迟但是更加准确地结果。
   ```



