Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2020/04/24 05:29:22 UTC

[GitHub] [flink] chaojianok opened a new pull request #11897: [FLINK-16104] Translate "Streaming Aggregation" page of "Table API & SQL" into Chinese

chaojianok opened a new pull request #11897:
URL: https://github.com/apache/flink/pull/11897


   [FLINK-16104] Translate "Streaming Aggregation" page of "Table API & SQL" into Chinese.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] flinkbot edited a comment on pull request #11897: [FLINK-16104] Translate "Streaming Aggregation" page of "Table API & SQL" into Chinese

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #11897:
URL: https://github.com/apache/flink/pull/11897#issuecomment-618812841


   ## CI report:
   
   * 67027fc17a69ef4edc8300e4b82a9e1233b016ee UNKNOWN
   * da909ac1e2831251bf60375a080cc192e4e4a0c8 Travis: [SUCCESS](https://travis-ci.com/github/flink-ci/flink/builds/162152427) Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=283) 
   * 836c6bce67b366939d524f928f0796a18abbf3ec UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>





[GitHub] [flink] chaojianok commented on pull request #11897: [FLINK-16104] Translate "Streaming Aggregation" page of "Table API & SQL" into Chinese

Posted by GitBox <gi...@apache.org>.
chaojianok commented on pull request #11897:
URL: https://github.com/apache/flink/pull/11897#issuecomment-625589762


   @leonardBang Thanks a lot for your review. I've revised the translation according to your suggestions; please take another look.





[GitHub] [flink] wuchong commented on pull request #11897: [FLINK-16104] Translate "Streaming Aggregation" page of "Table API & SQL" into Chinese

Posted by GitBox <gi...@apache.org>.
wuchong commented on pull request #11897:
URL: https://github.com/apache/flink/pull/11897#issuecomment-619371554


   @leonardBang Could you help review this?





[GitHub] [flink] leonardBang commented on a change in pull request #11897: [FLINK-16104] Translate "Streaming Aggregation" page of "Table API & SQL" into Chinese

Posted by GitBox <gi...@apache.org>.
leonardBang commented on a change in pull request #11897:
URL: https://github.com/apache/flink/pull/11897#discussion_r415291928



##########
File path: docs/dev/table/tuning/streaming_aggregation_optimization.zh.md
##########
@@ -22,33 +23,34 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-SQL is the most widely used language for data analytics. Flink's Table API and SQL enables users to define efficient stream analytics applications in less time and effort. Moreover, Flink Table API and SQL is effectively optimized, it integrates a lot of query optimizations and tuned operator implementations. But not all of the optimizations are enabled by default, so for some workloads, it is possible to improve performance by turning on some options.
+SQL 是数据分析中使用最广泛的语言。Flink Table API 和 SQL 使用户能够以更少的时间和精力定义高效的流分析应用程序。而且,Flink Table API 和 SQL 是有效优化过的,它集成了许多查询优化和算子优化。但并不是所有的优化都是默认开启的,因此对于某些工作负载,可以通过打开某些选项来提高性能。
 
-In this page, we will introduce some useful optimization options and the internals of streaming aggregation which will bring great improvement in some cases.
+在这一页,我们将介绍一些实用的优化选项以及流式聚合的内部原理,它们在某些情况下能带来很大的提升。
 
-<span class="label label-danger">Attention</span> Currently, the optimization options mentioned in this page are only supported in the Blink planner.
+<span class="label label-danger">注意</span> 目前,这一页提到的优化选项仅支持 Blink planner。
 
-<span class="label label-danger">Attention</span> Currently, the streaming aggregations optimization are only supported for [unbounded-aggregations]({{ site.baseurl }}/dev/table/sql/queries.html#aggregations). Optimizations for [window aggregations]({{ site.baseurl }}/dev/table/sql/queries.html#group-windows) will be supported in the future.
+<span class="label label-danger">注意</span> 目前,流聚合优化仅支持 [无界聚合]({{ site.baseurl }}/zh/dev/table/sql/queries.html#aggregations)。[窗口聚合]({{ site.baseurl }}/zh/dev/table/sql/queries.html#group-windows) 优化将在未来支持。
 
 * This will be replaced by the TOC
 {:toc}
 
-By default, the unbounded aggregation operator processes input records one by one, i.e., (1) read accumulator from state, (2) accumulate/retract record to accumulator, (3) write accumulator back to state, (4) the next record will do the process again from (1). This processing pattern may increase the overhead of StateBackend (especially for RocksDB StateBackend).
-Besides, data skew which is very common in production will worsen the problem and make it easy for the jobs to be under backpressure situations.
+默认情况下,无界聚合算子是一个一个的处理输入的记录,也就是说,(1)从状态读取累加器,(2)累积/撤回记录至累积器,(3)将累加器写回状态,(4)下一条记录将再次从(1)开始处理。 这种处理模式可能会增加 StateBackend 开销(尤其是对于 RocksDB StateBackend)。此外,生产中非常常见的数据倾斜会使这个问题恶化,并使 job 容易承受反压的情况。

Review comment:
       无界聚合算子是逐条处理输入的记录,即:
   (1)从状态中读取累加器
   (2)累加/撤回记录至累加器
   > (尤其是对于 RocksDB StateBackend)
   
   (尤其是对于 RocksDB StateBackend )  English words need a space on both sides
   
   > 并使 job 容易承受反压的情况。
   
   并且容易导致 job 发生反压。
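
   For context, the four-step pattern this paragraph (and the suggested rewording) describes can be sketched roughly as follows. This is only an illustration of the access pattern, using a plain Java HashMap as a stand-in for the keyed StateBackend and a running sum as the accumulator; it is not Flink's actual operator code.

import java.util.HashMap;
import java.util.Map;

// Rough illustration of the per-record pattern described in the paragraph under review.
// A HashMap stands in for the keyed StateBackend; the accumulator is just a running sum.
public class PerRecordAggregationSketch {
    private final Map<String, Long> state = new HashMap<>();

    public void processRecord(String key, long value) {
        long acc = state.getOrDefault(key, 0L); // (1) read the accumulator from state
        acc += value;                           // (2) accumulate the record into it
        state.put(key, acc);                    // (3) write the accumulator back to state
    }                                           // (4) the next record starts again from (1)
}

   Every incoming record pays one state read and one state write, which is where the StateBackend overhead and the back-pressure risk mentioned above come from.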

##########
File path: docs/dev/table/tuning/streaming_aggregation_optimization.zh.md
##########
@@ -22,33 +23,34 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-SQL is the most widely used language for data analytics. Flink's Table API and SQL enables users to define efficient stream analytics applications in less time and effort. Moreover, Flink Table API and SQL is effectively optimized, it integrates a lot of query optimizations and tuned operator implementations. But not all of the optimizations are enabled by default, so for some workloads, it is possible to improve performance by turning on some options.
+SQL 是数据分析最广泛使用的语言。Flink Table API 和 SQL 使用户能够以更少的时间和精力定义高效的流分析应用程序。而且,Flink Table API 和 SQL 是有效优化过的,它集成了许多查询优化和算子实现。但并不是所有的优化都是默认开启的,因此对于某些工作负载,可以通过打开某些选项来提高性能。
 
-In this page, we will introduce some useful optimization options and the internals of streaming aggregation which will bring great improvement in some cases.
+这里我们将介绍一些实用的优化选项以及流式聚合的内部原理,它们在某些情况下能带来很大改进。
 
-<span class="label label-danger">Attention</span> Currently, the optimization options mentioned in this page are only supported in the Blink planner.
+<span class="label label-danger">注意</span> 目前,这里提到的优化选项仅支持 Blink 计划器。
 
-<span class="label label-danger">Attention</span> Currently, the streaming aggregations optimization are only supported for [unbounded-aggregations]({{ site.baseurl }}/dev/table/sql/queries.html#aggregations). Optimizations for [window aggregations]({{ site.baseurl }}/dev/table/sql/queries.html#group-windows) will be supported in the future.
+<span class="label label-danger">注意</span> 目前,流聚合优化仅支持 [无界聚合]({{ site.baseurl }}/zh/dev/table/sql/queries.html#aggregations)。[窗口聚合]({{ site.baseurl }}/zh/dev/table/sql/queries.html#group-windows) 优化将在未来支持。
 
 * This will be replaced by the TOC
 {:toc}
 
-By default, the unbounded aggregation operator processes input records one by one, i.e., (1) read accumulator from state, (2) accumulate/retract record to accumulator, (3) write accumulator back to state, (4) the next record will do the process again from (1). This processing pattern may increase the overhead of StateBackend (especially for RocksDB StateBackend).
-Besides, data skew which is very common in production will worsen the problem and make it easy for the jobs to be under backpressure situations.
+默认情况下,无界聚合算子是一个一个的处理输入的记录,也就是说,(1)从状态读取累加器,(2)累积/撤回记录至累积器,(3)将累加器写回状态,(4)下一条记录将再次从(1)开始处理。 这种处理模式可能会增加 StateBackend 开销(尤其是对于 RocksDB StateBackend)。此外,生产中非常常见的数据倾斜会使这个问题恶化,并使 job 容易承受反压的情况。
 
-## MiniBatch Aggregation
+<a name="minibatch-aggregation"></a>

Review comment:
       Yes, I'd suggest not adding it, because we need to keep the default format consistent with the English documentation.

##########
File path: docs/dev/table/tuning/streaming_aggregation_optimization.zh.md
##########
@@ -22,33 +23,34 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-SQL is the most widely used language for data analytics. Flink's Table API and SQL enables users to define efficient stream analytics applications in less time and effort. Moreover, Flink Table API and SQL is effectively optimized, it integrates a lot of query optimizations and tuned operator implementations. But not all of the optimizations are enabled by default, so for some workloads, it is possible to improve performance by turning on some options.
+SQL 是数据分析中使用最广泛的语言。Flink Table API 和 SQL 使用户能够以更少的时间和精力定义高效的流分析应用程序。而且,Flink Table API 和 SQL 是有效优化过的,它集成了许多查询优化和算子优化。但并不是所有的优化都是默认开启的,因此对于某些工作负载,可以通过打开某些选项来提高性能。
 
-In this page, we will introduce some useful optimization options and the internals of streaming aggregation which will bring great improvement in some cases.
+在这一页,我们将介绍一些实用的优化选项以及流式聚合的内部原理,它们在某些情况下能带来很大的提升。
 
-<span class="label label-danger">Attention</span> Currently, the optimization options mentioned in this page are only supported in the Blink planner.
+<span class="label label-danger">注意</span> 目前,这一页提到的优化选项仅支持 Blink planner。
 
-<span class="label label-danger">Attention</span> Currently, the streaming aggregations optimization are only supported for [unbounded-aggregations]({{ site.baseurl }}/dev/table/sql/queries.html#aggregations). Optimizations for [window aggregations]({{ site.baseurl }}/dev/table/sql/queries.html#group-windows) will be supported in the future.
+<span class="label label-danger">注意</span> 目前,流聚合优化仅支持 [无界聚合]({{ site.baseurl }}/zh/dev/table/sql/queries.html#aggregations)。[窗口聚合]({{ site.baseurl }}/zh/dev/table/sql/queries.html#group-windows) 优化将在未来支持。
 
 * This will be replaced by the TOC
 {:toc}
 
-By default, the unbounded aggregation operator processes input records one by one, i.e., (1) read accumulator from state, (2) accumulate/retract record to accumulator, (3) write accumulator back to state, (4) the next record will do the process again from (1). This processing pattern may increase the overhead of StateBackend (especially for RocksDB StateBackend).
-Besides, data skew which is very common in production will worsen the problem and make it easy for the jobs to be under backpressure situations.
+默认情况下,无界聚合算子是一个一个的处理输入的记录,也就是说,(1)从状态读取累加器,(2)累积/撤回记录至累积器,(3)将累加器写回状态,(4)下一条记录将再次从(1)开始处理。 这种处理模式可能会增加 StateBackend 开销(尤其是对于 RocksDB StateBackend)。此外,生产中非常常见的数据倾斜会使这个问题恶化,并使 job 容易承受反压的情况。
 
-## MiniBatch Aggregation
+<a name="minibatch-aggregation"></a>
 
-The core idea of mini-batch aggregation is caching a bundle of inputs in a buffer inside of the aggregation operator. When the bundle of inputs is triggered to process, only one operation per key to access state is needed. This can significantly reduce the state overhead and get a better throughput. However, this may increase some latency because it buffers some records instead of processing them in an instant. This is a trade-off between throughput and latency.
+## MiniBatch 聚合
 
-The following figure explains how the mini-batch aggregation reduces state operations.
+MiniBatch 聚合的核心思想是将一组输入的数据缓存在聚合算子内部的缓冲区中。当输入的数据被触发处理时,每个键只需一个操作即可访问状态。这样可以大大减少状态开销并获得更好的吞吐量。但是,这可能会增加一些延迟,因为它会缓冲一些记录而不是立即处理它们。这是吞吐量和延迟之间的权衡。
+
+下图说明了 mini-batch 聚合如何减少状态操作。
 
 <div style="text-align: center">
   <img src="{{ site.baseurl }}/fig/table-streaming/minibatch_agg.png" width="50%" height="50%" />
 </div>
 
-MiniBatch optimization is disabled by default. In order to enable this optimization, you should set options `table.exec.mini-batch.enabled`, `table.exec.mini-batch.allow-latency` and `table.exec.mini-batch.size`. Please see [configuration]({{ site.baseurl }}/dev/table/config.html#execution-options) page for more details.
+默认情况下 mini-batch 优化是被禁用的。开启这项优化,需要设置选项 `table.exec.mini-batch.enabled`、`table.exec.mini-batch.allow-latency` 和 `table.exec.mini-batch.size`。更多详细信息请参见[配置]({{site.baseurl}}/zh/dev/table/config.html#execution-options)页面。
 
-The following examples show how to enable these options.
+以下示例显示如何启用这些选项。

Review comment:
       下面的例子显示如何启用这些选项。
   (using 示例 immediately followed by 显示 reads a bit repetitive)
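
   A minimal sketch of setting these three options programmatically, assuming the Blink planner (which the page notes is required) and purely illustrative values:

import org.apache.flink.configuration.Configuration;
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class MiniBatchOptionsSketch {
    public static void main(String[] args) {
        // Blink planner in streaming mode, per the page's note that only Blink supports these options.
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build());

        Configuration conf = tEnv.getConfig().getConfiguration();
        conf.setString("table.exec.mini-batch.enabled", "true");       // turn mini-batch on
        conf.setString("table.exec.mini-batch.allow-latency", "5 s");  // flush at least every 5 seconds (illustrative)
        conf.setString("table.exec.mini-batch.size", "5000");          // or once 5000 records are buffered (illustrative)
    }
}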

##########
File path: docs/dev/table/tuning/streaming_aggregation_optimization.zh.md
##########
@@ -22,33 +23,34 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-SQL is the most widely used language for data analytics. Flink's Table API and SQL enables users to define efficient stream analytics applications in less time and effort. Moreover, Flink Table API and SQL is effectively optimized, it integrates a lot of query optimizations and tuned operator implementations. But not all of the optimizations are enabled by default, so for some workloads, it is possible to improve performance by turning on some options.
+SQL 是数据分析中使用最广泛的语言。Flink Table API 和 SQL 使用户能够以更少的时间和精力定义高效的流分析应用程序。而且,Flink Table API 和 SQL 是有效优化过的,它集成了许多查询优化和算子优化。但并不是所有的优化都是默认开启的,因此对于某些工作负载,可以通过打开某些选项来提高性能。
 
-In this page, we will introduce some useful optimization options and the internals of streaming aggregation which will bring great improvement in some cases.
+在这一页,我们将介绍一些实用的优化选项以及流式聚合的内部原理,它们在某些情况下能带来很大的提升。
 
-<span class="label label-danger">Attention</span> Currently, the optimization options mentioned in this page are only supported in the Blink planner.
+<span class="label label-danger">注意</span> 目前,这一页提到的优化选项仅支持 Blink planner。
 
-<span class="label label-danger">Attention</span> Currently, the streaming aggregations optimization are only supported for [unbounded-aggregations]({{ site.baseurl }}/dev/table/sql/queries.html#aggregations). Optimizations for [window aggregations]({{ site.baseurl }}/dev/table/sql/queries.html#group-windows) will be supported in the future.
+<span class="label label-danger">注意</span> 目前,流聚合优化仅支持 [无界聚合]({{ site.baseurl }}/zh/dev/table/sql/queries.html#aggregations)。[窗口聚合]({{ site.baseurl }}/zh/dev/table/sql/queries.html#group-windows) 优化将在未来支持。
 
 * This will be replaced by the TOC
 {:toc}
 
-By default, the unbounded aggregation operator processes input records one by one, i.e., (1) read accumulator from state, (2) accumulate/retract record to accumulator, (3) write accumulator back to state, (4) the next record will do the process again from (1). This processing pattern may increase the overhead of StateBackend (especially for RocksDB StateBackend).
-Besides, data skew which is very common in production will worsen the problem and make it easy for the jobs to be under backpressure situations.
+默认情况下,无界聚合算子是一个一个的处理输入的记录,也就是说,(1)从状态读取累加器,(2)累积/撤回记录至累积器,(3)将累加器写回状态,(4)下一条记录将再次从(1)开始处理。 这种处理模式可能会增加 StateBackend 开销(尤其是对于 RocksDB StateBackend)。此外,生产中非常常见的数据倾斜会使这个问题恶化,并使 job 容易承受反压的情况。
 
-## MiniBatch Aggregation
+<a name="minibatch-aggregation"></a>
 
-The core idea of mini-batch aggregation is caching a bundle of inputs in a buffer inside of the aggregation operator. When the bundle of inputs is triggered to process, only one operation per key to access state is needed. This can significantly reduce the state overhead and get a better throughput. However, this may increase some latency because it buffers some records instead of processing them in an instant. This is a trade-off between throughput and latency.
+## MiniBatch 聚合
 
-The following figure explains how the mini-batch aggregation reduces state operations.
+MiniBatch 聚合的核心思想是将一组输入的数据缓存在聚合算子内部的缓冲区中。当输入的数据被触发处理时,每个键只需一个操作即可访问状态。这样可以大大减少状态开销并获得更好的吞吐量。但是,这可能会增加一些延迟,因为它会缓冲一些记录而不是立即处理它们。这是吞吐量和延迟之间的权衡。
+
+下图说明了 mini-batch 聚合如何减少状态操作。
 
 <div style="text-align: center">
   <img src="{{ site.baseurl }}/fig/table-streaming/minibatch_agg.png" width="50%" height="50%" />
 </div>
 
-MiniBatch optimization is disabled by default. In order to enable this optimization, you should set options `table.exec.mini-batch.enabled`, `table.exec.mini-batch.allow-latency` and `table.exec.mini-batch.size`. Please see [configuration]({{ site.baseurl }}/dev/table/config.html#execution-options) page for more details.
+默认情况下 mini-batch 优化是被禁用的。开启这项优化,需要设置选项 `table.exec.mini-batch.enabled`、`table.exec.mini-batch.allow-latency` 和 `table.exec.mini-batch.size`。更多详细信息请参见[配置]({{site.baseurl}}/zh/dev/table/config.html#execution-options)页面。
 

Review comment:
       please do not change the default format, these formatting adjustments (extra or missing spaces) are not needed
   > ({{site.baseurl}}/zh/dev/table/config.html#execution-options)页面。
   ({{ site.baseurl }}/dev/table/config.html#execution-options)页面。

##########
File path: docs/dev/table/tuning/streaming_aggregation_optimization.zh.md
##########
@@ -94,28 +96,28 @@ configuration.set_string("table.exec.mini-batch.size", "5000"); # the maximum nu
 </div>
 </div>
 
-## Local-Global Aggregation
+<a name="local-global-aggregation"></a>

Review comment:
       as above

##########
File path: docs/dev/table/tuning/streaming_aggregation_optimization.zh.md
##########
@@ -94,28 +96,28 @@ configuration.set_string("table.exec.mini-batch.size", "5000"); # the maximum nu
 </div>
 </div>
 
-## Local-Global Aggregation
+<a name="local-global-aggregation"></a>
+
+## Local-Global 聚合
 
-Local-Global is proposed to solve data skew problem by dividing a group aggregation into two stages, that is doing local aggregation in upstream firstly, and followed by global aggregation in downstream, which is similar to Combine + Reduce pattern in MapReduce. For example, considering the following SQL:
+Local-Global 聚合是为解决数据倾斜问题提出的,通过将一组聚合分为两个阶段,首先在上游进行本地聚合,然后在下游进行全局聚合,类似于 MapReduce 中的 Combine + Reduce 模式。例如,就以下 SQL 而言:
 
 {% highlight sql %}
 SELECT color, sum(id)
 FROM T
 GROUP BY color
 {% endhighlight %}
 
-It is possible that the records in the data stream are skewed, thus some instances of aggregation operator have to process much more records than others, which leads to hotspot.
-The local aggregation can help to accumulate a certain amount of inputs which have the same key into a single accumulator. The global aggregation will only receive the reduced accumulators instead of large number of raw inputs.
-This can significantly reduce the network shuffle and the cost of state access. The number of inputs accumulated by local aggregation every time is based on mini-batch interval. It means local-global aggregation depends on mini-batch optimization is enabled.
+数据流中的记录可能会倾斜,因此某些聚合算子的实例必须比其他实例处理更多的记录,这会导致 hotspot。本地聚合可以将一定数量具有相同 key 的输入数据累加到单个累加器中。全局聚合将仅接收 reduce 后的累加器,而不是大量的原始输入数据。这可以大大减少网络 shuffle 和状态访问的成本。每次本地聚合累积的输入数据量基于 mini-batch 间隔。这意味着 local-global 聚合依赖于启用了 mini-batch 优化。
 
-The following figure shows how the local-global aggregation improve performance.
+下图显示了 local-global 聚合如何提高性能。
 
 <div style="text-align: center">
   <img src="{{ site.baseurl }}/fig/table-streaming/local_agg.png" width="70%" height="70%" />
 </div>
 
 
-The following examples show how to enable the local-global aggregation.
+以下示例显示如何启用 local-global 聚合。

Review comment:
       as above, 下面的例子显示xxx
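
   Since the diff context further down in this thread shows `table.optimizer.agg-phase-strategy` being set to "TWO_PHASE", a hedged sketch of enabling local-global aggregation could look like the following; it reuses the illustrative mini-batch values, because the text above notes that local-global aggregation depends on mini-batch being enabled.

import org.apache.flink.configuration.Configuration;
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class LocalGlobalAggSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build());

        Configuration conf = tEnv.getConfig().getConfiguration();
        // Local-global aggregation builds on mini-batch, so mini-batch has to be enabled too.
        conf.setString("table.exec.mini-batch.enabled", "true");
        conf.setString("table.exec.mini-batch.allow-latency", "5 s");
        conf.setString("table.exec.mini-batch.size", "5000");
        // Two-phase (local + global) aggregation, matching the diff context further down.
        conf.setString("table.optimizer.agg-phase-strategy", "TWO_PHASE");
    }
}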

##########
File path: docs/dev/table/tuning/streaming_aggregation_optimization.zh.md
##########
@@ -164,25 +166,25 @@ configuration.set_string("table.optimizer.agg-phase-strategy", "TWO_PHASE"); # e
 </div>
 </div>
 
-## Split Distinct Aggregation
+<a name="split-distinct-aggregation"></a>
 
-Local-Global optimization is effective to eliminate data skew for general aggregation, such as SUM, COUNT, MAX, MIN, AVG. But its performance is not satisfactory when dealing with distinct aggregation.
+## 拆分 distinct 聚合
 
-For example, if we want to analyse how many unique users logined today. We may have the following query:
+Local-Global 优化可有效消除常规聚合的数据倾斜,例如 SUM、COUNT、MAX、MIN、AVG。但是在处理 distinct 聚合时,其性能并不令人满意。
+
+例如,如果我们要分析今天有多少唯一用户登录。我们可能有以下查询:
 
 {% highlight sql %}
 SELECT day, COUNT(DISTINCT user_id)
 FROM T
 GROUP BY day
 {% endhighlight %}
 
-COUNT DISTINCT is not good at reducing records if the value of distinct key (i.e. user_id) is sparse. Even if local-global optimization is enabled, it doesn't help much. Because the accumulator still contain almost all the raw records, and the global aggregation will be the bottleneck (most of the heavy accumulators are processed by one task, i.e. on the same day).
+如果唯一键(即 user_id)的值稀疏,则 COUNT DISTINCT 不适合 reduce 操作。即使启用了 local-global 优化也没有太大帮助。因为累加器仍然包含几乎所有原始记录,并且全局聚合将成为瓶颈(大多数繁重的累加器由一个任务处理,即同一天)。
 
-The idea of this optimization is splitting distinct aggregation (e.g. `COUNT(DISTINCT col)`) into two levels. The first aggregation is shuffled by group key and an additional bucket key. The bucket key is calculated using `HASH_CODE(distinct_key) % BUCKET_NUM`. `BUCKET_NUM` is 1024 by default, and can be configured by `table.optimizer.distinct-agg.split.bucket-num` option.
-The second aggregation is shuffled by the original group key, and use `SUM` to aggregate COUNT DISTINCT values from different buckets. Because the same distinct key will only be calculated in the same bucket, so the transformation is equivalent.
-The bucket key plays the role of an additional group key to share the burden of hotspot in group key. The bucket key makes the job to be scalability to solve data-skew/hotspot in distinct aggregations.
+这个优化的想法是将不同的聚合(例如 `COUNT(DISTINCT col)`)分为两个级别。第一次聚合由 group key 和额外的 bucket key 进行 shuffle。bucket key 是使用 `HASH_CODE(distinct_key) % BUCKET_NUM` 计算的。`BUCKET_NUM` 默认为1024,可以通过 `table.optimizer.distinct-agg.split.bucket-num` 选项进行配置。第二次聚合是由原始 group key 进行 shuffle,并使用 `SUM` 聚合来自不同 buckets 的 COUNT DISTINCT 值。由于相同的唯一键将仅在同一 bucket 中计算,因此转换是等效的。bucket key 充当附加 group key 的角色,以分担 group key 中热点的负担。bucket key 使 job 具有可伸缩性来解决不同聚合中的数据倾斜/热点。
 
-After split distinct aggregate, the above query will be rewritten into the following query automatically:
+拆分 distinct 聚合后,以上查询将被自动重写为以下查询:

Review comment:
       重写 -> 改写, since "rewritten query" is something of a term of art and is usually rendered in Chinese as 改写 query
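
   As a rough illustration of the rewrite being discussed: the split is driven by an optimizer option, and conceptually the inner aggregation is keyed by the group key plus MOD(HASH_CODE(distinct_key), BUCKET_NUM). The sketch below assumes the option key `table.optimizer.distinct-agg.split.enabled` (not quoted in the diff above, so treat the key name as an assumption) and that a table T(day, user_id) is already registered.

import org.apache.flink.configuration.Configuration;
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.TableEnvironment;

public class SplitDistinctAggSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build());

        Configuration conf = tEnv.getConfig().getConfiguration();
        // Assumed option key: lets the optimizer split COUNT(DISTINCT ...) into two levels.
        conf.setString("table.optimizer.distinct-agg.split.enabled", "true");
        // Bucket count used in HASH_CODE(distinct_key) % BUCKET_NUM (1024 is the documented default).
        conf.setString("table.optimizer.distinct-agg.split.bucket-num", "1024");

        // The original query from the page; with the option on, the planner rewrites ("改写") it into
        // an inner per-bucket COUNT(DISTINCT user_id) grouped by (day, bucket) plus an outer SUM over
        // the per-bucket counts. Assumes a table T(day, user_id) is registered in the catalog.
        Table result = tEnv.sqlQuery("SELECT day, COUNT(DISTINCT user_id) FROM T GROUP BY day");
    }
}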

##########
File path: docs/dev/table/tuning/streaming_aggregation_optimization.zh.md
##########
@@ -22,33 +23,34 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-SQL is the most widely used language for data analytics. Flink's Table API and SQL enables users to define efficient stream analytics applications in less time and effort. Moreover, Flink Table API and SQL is effectively optimized, it integrates a lot of query optimizations and tuned operator implementations. But not all of the optimizations are enabled by default, so for some workloads, it is possible to improve performance by turning on some options.
+SQL 是数据分析中使用最广泛的语言。Flink Table API 和 SQL 使用户能够以更少的时间和精力定义高效的流分析应用程序。而且,Flink Table API 和 SQL 是有效优化过的,它集成了许多查询优化和算子优化。但并不是所有的优化都是默认开启的,因此对于某些工作负载,可以通过打开某些选项来提高性能。
 
-In this page, we will introduce some useful optimization options and the internals of streaming aggregation which will bring great improvement in some cases.
+在这一页,我们将介绍一些实用的优化选项以及流式聚合的内部原理,它们在某些情况下能带来很大的提升。
 
-<span class="label label-danger">Attention</span> Currently, the optimization options mentioned in this page are only supported in the Blink planner.
+<span class="label label-danger">注意</span> 目前,这一页提到的优化选项仅支持 Blink planner。
 
-<span class="label label-danger">Attention</span> Currently, the streaming aggregations optimization are only supported for [unbounded-aggregations]({{ site.baseurl }}/dev/table/sql/queries.html#aggregations). Optimizations for [window aggregations]({{ site.baseurl }}/dev/table/sql/queries.html#group-windows) will be supported in the future.
+<span class="label label-danger">注意</span> 目前,流聚合优化仅支持 [无界聚合]({{ site.baseurl }}/zh/dev/table/sql/queries.html#aggregations)。[窗口聚合]({{ site.baseurl }}/zh/dev/table/sql/queries.html#group-windows) 优化将在未来支持。
 
 * This will be replaced by the TOC
 {:toc}
 
-By default, the unbounded aggregation operator processes input records one by one, i.e., (1) read accumulator from state, (2) accumulate/retract record to accumulator, (3) write accumulator back to state, (4) the next record will do the process again from (1). This processing pattern may increase the overhead of StateBackend (especially for RocksDB StateBackend).
-Besides, data skew which is very common in production will worsen the problem and make it easy for the jobs to be under backpressure situations.
+默认情况下,无界聚合算子是一个一个的处理输入的记录,也就是说,(1)从状态读取累加器,(2)累积/撤回记录至累积器,(3)将累加器写回状态,(4)下一条记录将再次从(1)开始处理。 这种处理模式可能会增加 StateBackend 开销(尤其是对于 RocksDB StateBackend)。此外,生产中非常常见的数据倾斜会使这个问题恶化,并使 job 容易承受反压的情况。
 
-## MiniBatch Aggregation
+<a name="minibatch-aggregation"></a>
 
-The core idea of mini-batch aggregation is caching a bundle of inputs in a buffer inside of the aggregation operator. When the bundle of inputs is triggered to process, only one operation per key to access state is needed. This can significantly reduce the state overhead and get a better throughput. However, this may increase some latency because it buffers some records instead of processing them in an instant. This is a trade-off between throughput and latency.
+## MiniBatch 聚合
 
-The following figure explains how the mini-batch aggregation reduces state operations.
+MiniBatch 聚合的核心思想是将一组输入的数据缓存在聚合算子内部的缓冲区中。当输入的数据被触发处理时,每个键只需一个操作即可访问状态。这样可以大大减少状态开销并获得更好的吞吐量。但是,这可能会增加一些延迟,因为它会缓冲一些记录而不是立即处理它们。这是吞吐量和延迟之间的权衡。

Review comment:
       每个键只 -> 每个 key 只
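
   To make the "one state operation per key" point concrete, here is a rough sketch of the buffering idea, again with a HashMap standing in for keyed state. The trigger mirrors the allow-latency/size options, but this is not Flink's real MiniBatch operator.

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Conceptual illustration of mini-batch buffering: records are collected per key, and
// state is touched only once per key when the bundle is flushed.
public class MiniBatchAggregationSketch {
    private final Map<String, Long> state = new HashMap<>();          // stand-in for keyed state
    private final Map<String, List<Long>> bundle = new HashMap<>();   // in-memory buffer

    public void processRecord(String key, long value) {
        bundle.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
    }

    // Triggered by the allow-latency timer or when the bundle reaches its size limit.
    public void flushBundle() {
        for (Map.Entry<String, List<Long>> e : bundle.entrySet()) {
            long acc = state.getOrDefault(e.getKey(), 0L);  // one state read per key
            for (long v : e.getValue()) {
                acc += v;                                    // accumulate the whole bundle
            }
            state.put(e.getKey(), acc);                      // one state write per key
        }
        bundle.clear();
    }
}

   Compared with the per-record sketch earlier in the thread, state is read and written once per key and bundle rather than once per record, trading a little latency for throughput.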

##########
File path: docs/dev/table/tuning/streaming_aggregation_optimization.zh.md
##########
@@ -164,25 +166,25 @@ configuration.set_string("table.optimizer.agg-phase-strategy", "TWO_PHASE"); # e
 </div>
 </div>
 
-## Split Distinct Aggregation
+<a name="split-distinct-aggregation"></a>
 
-Local-Global optimization is effective to eliminate data skew for general aggregation, such as SUM, COUNT, MAX, MIN, AVG. But its performance is not satisfactory when dealing with distinct aggregation.
+## 拆分 distinct 聚合
 
-For example, if we want to analyse how many unique users logined today. We may have the following query:
+Local-Global 优化可有效消除常规聚合的数据倾斜,例如 SUM、COUNT、MAX、MIN、AVG。但是在处理 distinct 聚合时,其性能并不令人满意。
+
+例如,如果我们要分析今天有多少唯一用户登录。我们可能有以下查询:
 
 {% highlight sql %}
 SELECT day, COUNT(DISTINCT user_id)
 FROM T
 GROUP BY day
 {% endhighlight %}
 
-COUNT DISTINCT is not good at reducing records if the value of distinct key (i.e. user_id) is sparse. Even if local-global optimization is enabled, it doesn't help much. Because the accumulator still contain almost all the raw records, and the global aggregation will be the bottleneck (most of the heavy accumulators are processed by one task, i.e. on the same day).
+如果唯一键(即 user_id)的值稀疏,则 COUNT DISTINCT 不适合 reduce 操作。即使启用了 local-global 优化也没有太大帮助。因为累加器仍然包含几乎所有原始记录,并且全局聚合将成为瓶颈(大多数繁重的累加器由一个任务处理,即同一天)。
 
-The idea of this optimization is splitting distinct aggregation (e.g. `COUNT(DISTINCT col)`) into two levels. The first aggregation is shuffled by group key and an additional bucket key. The bucket key is calculated using `HASH_CODE(distinct_key) % BUCKET_NUM`. `BUCKET_NUM` is 1024 by default, and can be configured by `table.optimizer.distinct-agg.split.bucket-num` option.
-The second aggregation is shuffled by the original group key, and use `SUM` to aggregate COUNT DISTINCT values from different buckets. Because the same distinct key will only be calculated in the same bucket, so the transformation is equivalent.
-The bucket key plays the role of an additional group key to share the burden of hotspot in group key. The bucket key makes the job to be scalability to solve data-skew/hotspot in distinct aggregations.
+这个优化的想法是将不同的聚合(例如 `COUNT(DISTINCT col)`)分为两个级别。第一次聚合由 group key 和额外的 bucket key 进行 shuffle。bucket key 是使用 `HASH_CODE(distinct_key) % BUCKET_NUM` 计算的。`BUCKET_NUM` 默认为1024,可以通过 `table.optimizer.distinct-agg.split.bucket-num` 选项进行配置。第二次聚合是由原始 group key 进行 shuffle,并使用 `SUM` 聚合来自不同 buckets 的 COUNT DISTINCT 值。由于相同的唯一键将仅在同一 bucket 中计算,因此转换是等效的。bucket key 充当附加 group key 的角色,以分担 group key 中热点的负担。bucket key 使 job 具有可伸缩性来解决不同聚合中的数据倾斜/热点。

Review comment:
       由于相同的唯一键将 -> 由于相同的 distinct key 将

##########
File path: docs/dev/table/tuning/streaming_aggregation_optimization.zh.md
##########
@@ -164,25 +166,25 @@ configuration.set_string("table.optimizer.agg-phase-strategy", "TWO_PHASE"); # e
 </div>
 </div>
 
-## Split Distinct Aggregation
+<a name="split-distinct-aggregation"></a>
 
-Local-Global optimization is effective to eliminate data skew for general aggregation, such as SUM, COUNT, MAX, MIN, AVG. But its performance is not satisfactory when dealing with distinct aggregation.
+## 拆分 distinct 聚合
 
-For example, if we want to analyse how many unique users logined today. We may have the following query:
+Local-Global 优化可有效消除常规聚合的数据倾斜,例如 SUM、COUNT、MAX、MIN、AVG。但是在处理 distinct 聚合时,其性能并不令人满意。
+

Review comment:
       Redundant blank line here.

##########
File path: docs/dev/table/tuning/streaming_aggregation_optimization.zh.md
##########
@@ -164,25 +166,25 @@ configuration.set_string("table.optimizer.agg-phase-strategy", "TWO_PHASE"); # e
 </div>
 </div>
 
-## Split Distinct Aggregation
+<a name="split-distinct-aggregation"></a>
 
-Local-Global optimization is effective to eliminate data skew for general aggregation, such as SUM, COUNT, MAX, MIN, AVG. But its performance is not satisfactory when dealing with distinct aggregation.
+## 拆分 distinct 聚合
 
-For example, if we want to analyse how many unique users logined today. We may have the following query:
+Local-Global 优化可有效消除常规聚合的数据倾斜,例如 SUM、COUNT、MAX、MIN、AVG。但是在处理 distinct 聚合时,其性能并不令人满意。
+
+例如,如果我们要分析今天有多少唯一用户登录。我们可能有以下查询:
 
 {% highlight sql %}
 SELECT day, COUNT(DISTINCT user_id)
 FROM T
 GROUP BY day
 {% endhighlight %}
 
-COUNT DISTINCT is not good at reducing records if the value of distinct key (i.e. user_id) is sparse. Even if local-global optimization is enabled, it doesn't help much. Because the accumulator still contain almost all the raw records, and the global aggregation will be the bottleneck (most of the heavy accumulators are processed by one task, i.e. on the same day).
+如果唯一键(即 user_id)的值稀疏,则 COUNT DISTINCT 不适合 reduce 操作。即使启用了 local-global 优化也没有太大帮助。因为累加器仍然包含几乎所有原始记录,并且全局聚合将成为瓶颈(大多数繁重的累加器由一个任务处理,即同一天)。

Review comment:
       >如果唯一键(即 user_id)的值稀疏,则 COUNT DISTINCT 不适合 reduce 操作
   如果 distinct key(即 user_id)的值分布稀疏,则COUNT DISTINCT 不适合减少数据。

##########
File path: docs/dev/table/tuning/streaming_aggregation_optimization.zh.md
##########
@@ -242,10 +244,11 @@ t_env.get_config()        # access high-level configuration
 </div>
 </div>
 
-## Use FILTER Modifier on Distinct Aggregates
+<a name="use-filter-modifier-on-distinct-aggregates"></a>
+
+## 在 distinct 聚合上使用 FILTER 修改器

Review comment:
       修改器 -> 修饰符

##########
File path: docs/dev/table/tuning/streaming_aggregation_optimization.zh.md
##########
@@ -195,17 +197,17 @@ GROUP BY day
 {% endhighlight %}
 
 
-The following figure shows how the split distinct aggregation improve performance (assuming color represents days, and letter represents user_id).
+下图显示了拆分 distinct 聚合如何提高性能(假设颜色表示 days,字母表示 user_id)。
 
 <div style="text-align: center">
   <img src="{{ site.baseurl }}/fig/table-streaming/distinct_split.png" width="70%" height="70%" />
 </div>
 
-NOTE: Above is the simplest example which can benefit from this optimization. Besides that, Flink supports to split more complex aggregation queries, for example, more than one distinct aggregates with different distinct key (e.g. `COUNT(DISTINCT a), SUM(DISTINCT b)`), works with other non-distinct aggregates (e.g. `SUM`, `MAX`, `MIN`, `COUNT`).
+注意:上面是可以从这个优化中受益的最简单的示例。除此之外,Flink 还支持拆分更复杂的聚合查询,例如,多个具有不同唯一键(例如 `COUNT(DISTINCT a), SUM(DISTINCT b)` )的不同聚合,可以与其他非明显聚合(例如 `SUM`、`MAX`、`MIN`、`COUNT` )一起使用。

Review comment:
       >多个具有不同唯一键(例如 `COUNT(DISTINCT a), SUM(DISTINCT b)` )的不同聚合,可以与其他非明显聚合(例如 `SUM`、`MAX`、`MIN`、`COUNT` )一起使用。
   
   多个具有不同 distinct key(例如 `COUNT(DISTINCT a), SUM(DISTINCT b)` )的 distinct 聚合,可以与其他非 distinct 聚合(例如 `SUM`、`MAX`、`MIN`、`COUNT` )一起使用。

##########
File path: docs/dev/table/tuning/streaming_aggregation_optimization.zh.md
##########
@@ -270,8 +272,7 @@ FROM T
 GROUP BY day
 {% endhighlight %}
 
-Flink SQL optimizer can recognize the different filter arguments on the same distinct key. For example, in the above example, all the three COUNT DISTINCT are on `user_id` column.
-Then Flink can use just one shared state instance instead of three state instances to reduce state access and state size. In some workloads, this can get significant performance improvements.
+Flink SQL 优化器可以识别相同唯一键上的不同过滤器参数。例如,在上面的示例中,三个 COUNT DISTINCT 都在 `user_id` 一列上。Flink 可以只使用一个共享状态实例,而不是三个状态实例,以减少状态访问和状态大小。在某些工作负载下,可以获得显著的性能提升。

Review comment:
       相同的 distinct key 上的
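
   To illustrate the shared-state point above, a hedged example of the FILTER form might look like the following; the table layout and the `flag` column with its values are illustrative assumptions, not quoted from the page.

import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.TableEnvironment;

public class FilterModifierSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build());

        // Both COUNT DISTINCTs are on the same distinct key (user_id); only the FILTER condition
        // differs, so the planner can share a single state instance among them.
        // Assumes a registered table T(day, user_id, flag); column names and values are illustrative.
        Table uv = tEnv.sqlQuery(
                "SELECT day, "
                + "  COUNT(DISTINCT user_id) AS total_uv, "
                + "  COUNT(DISTINCT user_id) FILTER (WHERE flag IN ('android', 'iphone')) AS app_uv "
                + "FROM T GROUP BY day");
    }
}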







[GitHub] [flink] leonardBang commented on a change in pull request #11897: [FLINK-16104] Translate "Streaming Aggregation" page of "Table API & SQL" into Chinese

Posted by GitBox <gi...@apache.org>.
leonardBang commented on a change in pull request #11897:
URL: https://github.com/apache/flink/pull/11897#discussion_r419283115



##########
File path: docs/dev/table/tuning/streaming_aggregation_optimization.zh.md
##########
@@ -22,33 +23,32 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-SQL is the most widely used language for data analytics. Flink's Table API and SQL enables users to define efficient stream analytics applications in less time and effort. Moreover, Flink Table API and SQL is effectively optimized, it integrates a lot of query optimizations and tuned operator implementations. But not all of the optimizations are enabled by default, so for some workloads, it is possible to improve performance by turning on some options.
+SQL 是数据分析中使用最广泛的语言。Flink Table API 和 SQL 使用户能够以更少的时间和精力定义高效的流分析应用程序。而且,Flink Table API 和 SQL 是有效优化过的,它集成了许多查询优化和算子优化。但并不是所有的优化都是默认开启的,因此对于某些工作负载,可以通过打开某些选项来提高性能。

Review comment:
       此外,Flink Table API 和 SQL 是高效优化过的

##########
File path: docs/dev/table/tuning/streaming_aggregation_optimization.zh.md
##########
@@ -22,33 +23,32 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-SQL is the most widely used language for data analytics. Flink's Table API and SQL enables users to define efficient stream analytics applications in less time and effort. Moreover, Flink Table API and SQL is effectively optimized, it integrates a lot of query optimizations and tuned operator implementations. But not all of the optimizations are enabled by default, so for some workloads, it is possible to improve performance by turning on some options.
+SQL 是数据分析中使用最广泛的语言。Flink Table API 和 SQL 使用户能够以更少的时间和精力定义高效的流分析应用程序。而且,Flink Table API 和 SQL 是有效优化过的,它集成了许多查询优化和算子优化。但并不是所有的优化都是默认开启的,因此对于某些工作负载,可以通过打开某些选项来提高性能。
 
-In this page, we will introduce some useful optimization options and the internals of streaming aggregation which will bring great improvement in some cases.
+在这一页,我们将介绍一些实用的优化选项以及流式聚合的内部原理,它们在某些情况下能带来很大的提升。
 
-<span class="label label-danger">Attention</span> Currently, the optimization options mentioned in this page are only supported in the Blink planner.
+<span class="label label-danger">注意</span> 目前,这一页提到的优化选项仅支持 Blink planner。
 
-<span class="label label-danger">Attention</span> Currently, the streaming aggregations optimization are only supported for [unbounded-aggregations]({{ site.baseurl }}/dev/table/sql/queries.html#aggregations). Optimizations for [window aggregations]({{ site.baseurl }}/dev/table/sql/queries.html#group-windows) will be supported in the future.
+<span class="label label-danger">注意</span> 目前,流聚合优化仅支持 [无界聚合]({{ site.baseurl }}/zh/dev/table/sql/queries.html#aggregations)。[窗口聚合]({{ site.baseurl }}/zh/dev/table/sql/queries.html#group-windows) 优化将在未来支持。

Review comment:
       {{ site.baseurl }}/zh/dev/table/sql/queries.html#aggregations -> {{ site.baseurl }}/zh/dev/table/sql/queries.html#聚合)
   {{ site.baseurl }}/zh/dev/table/sql/queries.html#group-windows -> {{ site.baseurl }}/zh/dev/table/sql/queries.html#分组窗口
   
   These link anchors need to be switched to the corresponding anchors on the Chinese page, otherwise clicking them will not jump to the right position.

##########
File path: docs/dev/table/tuning/streaming_aggregation_optimization.zh.md
##########
@@ -94,28 +94,26 @@ configuration.set_string("table.exec.mini-batch.size", "5000"); # the maximum nu
 </div>
 </div>
 
-## Local-Global Aggregation
+## Local-Global 聚合
 
-Local-Global is proposed to solve data skew problem by dividing a group aggregation into two stages, that is doing local aggregation in upstream firstly, and followed by global aggregation in downstream, which is similar to Combine + Reduce pattern in MapReduce. For example, considering the following SQL:
+Local-Global 聚合是为解决数据倾斜问题提出的,通过将一组聚合分为两个阶段,首先在上游进行本地聚合,然后在下游进行全局聚合,类似于 MapReduce 中的 Combine + Reduce 模式。例如,就以下 SQL 而言:
 
 {% highlight sql %}
 SELECT color, sum(id)
 FROM T
 GROUP BY color
 {% endhighlight %}
 
-It is possible that the records in the data stream are skewed, thus some instances of aggregation operator have to process much more records than others, which leads to hotspot.
-The local aggregation can help to accumulate a certain amount of inputs which have the same key into a single accumulator. The global aggregation will only receive the reduced accumulators instead of large number of raw inputs.
-This can significantly reduce the network shuffle and the cost of state access. The number of inputs accumulated by local aggregation every time is based on mini-batch interval. It means local-global aggregation depends on mini-batch optimization is enabled.
+数据流中的记录可能会倾斜,因此某些聚合算子的实例必须比其他实例处理更多的记录,这会导致 hotspot。本地聚合可以将一定数量具有相同 key 的输入数据累加到单个累加器中。全局聚合将仅接收 reduce 后的累加器,而不是大量的原始输入数据。这可以大大减少网络 shuffle 和状态访问的成本。每次本地聚合累积的输入数据量基于 mini-batch 间隔。这意味着 local-global 聚合依赖于启用了 mini-batch 优化。

Review comment:
       这会产生热点问题

##########
File path: docs/dev/table/tuning/streaming_aggregation_optimization.zh.md
##########
@@ -195,17 +191,17 @@ GROUP BY day
 {% endhighlight %}
 
 
-The following figure shows how the split distinct aggregation improve performance (assuming color represents days, and letter represents user_id).
+下图显示了拆分 distinct 聚合如何提高性能(假设颜色表示 days,字母表示 user_id)。
 
 <div style="text-align: center">
   <img src="{{ site.baseurl }}/fig/table-streaming/distinct_split.png" width="70%" height="70%" />
 </div>
 
-NOTE: Above is the simplest example which can benefit from this optimization. Besides that, Flink supports to split more complex aggregation queries, for example, more than one distinct aggregates with different distinct key (e.g. `COUNT(DISTINCT a), SUM(DISTINCT b)`), works with other non-distinct aggregates (e.g. `SUM`, `MAX`, `MIN`, `COUNT`).
+注意:上面是可以从这个优化中受益的最简单的示例。除此之外,Flink 还支持拆分更复杂的聚合查询,例如,多个具有不同 distinct key (例如 `COUNT(DISTINCT a), SUM(DISTINCT b)` )的 distinct 聚合,可以与其他非 distinct 聚合(例如 `SUM`、`MAX`、`MIN`、`COUNT` )一起使用。
 
-<span class="label label-danger">Attention</span> However, currently, the split optimization doesn't support aggregations which contains user defined AggregateFunction.
+<span class="label label-danger">注意</span> 但是,当前,拆分优化不支持包含用户定义的 AggregateFunction 聚合。

Review comment:
       但是,当前 -> 当前
   Let's use a freer translation here; the literal rendering reads a bit awkward.







[GitHub] [flink] chaojianok commented on pull request #11897: [FLINK-16104] Translate "Streaming Aggregation" page of "Table API & SQL" into Chinese

Posted by GitBox <gi...@apache.org>.
chaojianok commented on pull request #11897:
URL: https://github.com/apache/flink/pull/11897#issuecomment-619842515


   @leonardBang I've revised it according to your suggestions; please take another look. Thanks a lot.





[GitHub] [flink] flinkbot edited a comment on pull request #11897: [FLINK-16104] Translate "Streaming Aggregation" page of "Table API & SQL" into Chinese

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #11897:
URL: https://github.com/apache/flink/pull/11897#issuecomment-618812841


   ## CI report:
   
   * 67027fc17a69ef4edc8300e4b82a9e1233b016ee UNKNOWN
   * 836c6bce67b366939d524f928f0796a18abbf3ec Travis: [PENDING](https://travis-ci.com/github/flink-ci/flink/builds/164601384) Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=777) 
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>





[GitHub] [flink] flinkbot edited a comment on pull request #11897: [FLINK-16104] Translate "Streaming Aggregation" page of "Table API & SQL" into Chinese

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #11897:
URL: https://github.com/apache/flink/pull/11897#issuecomment-618812841


   ## CI report:
   
   * 7969f07cca4581f4ae7cbef7bd06787b3e90d248 Travis: [SUCCESS](https://travis-ci.com/github/flink-ci/flink/builds/161766709) Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=168) 
   * 1cd0f11272e1d4013f2f09a328c8777ae32e5470 Travis: [PENDING](https://travis-ci.com/github/flink-ci/flink/builds/162046977) Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=260) 
   * 67027fc17a69ef4edc8300e4b82a9e1233b016ee UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>





[GitHub] [flink] wuchong commented on pull request #11897: [FLINK-16104] Translate "Streaming Aggregation" page of "Table API & SQL" into Chinese

Posted by GitBox <gi...@apache.org>.
wuchong commented on pull request #11897:
URL: https://github.com/apache/flink/pull/11897#issuecomment-619371585


   @leonardBang Could you help review this?





[GitHub] [flink] leonardBang commented on pull request #11897: [FLINK-16104] Translate "Streaming Aggregation" page of "Table API & SQL" into Chinese

Posted by GitBox <gi...@apache.org>.
leonardBang commented on pull request #11897:
URL: https://github.com/apache/flink/pull/11897#issuecomment-625639050


   @chadnickbok Thanks very much for your contribution, LGTM now
   +1 to merge cc @wuchong 





[GitHub] [flink] leonardBang commented on a change in pull request #11897: [FLINK-16104] Translate "Streaming Aggregation" page of "Table API & SQL" into Chinese

Posted by GitBox <gi...@apache.org>.
leonardBang commented on a change in pull request #11897:
URL: https://github.com/apache/flink/pull/11897#discussion_r415193918



##########
File path: docs/dev/table/tuning/streaming_aggregation_optimization.zh.md
##########
@@ -22,33 +23,34 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-SQL is the most widely used language for data analytics. Flink's Table API and SQL enables users to define efficient stream analytics applications in less time and effort. Moreover, Flink Table API and SQL is effectively optimized, it integrates a lot of query optimizations and tuned operator implementations. But not all of the optimizations are enabled by default, so for some workloads, it is possible to improve performance by turning on some options.
+SQL 是数据分析最广泛使用的语言。Flink Table API 和 SQL 使用户能够以更少的时间和精力定义高效的流分析应用程序。而且,Flink Table API 和 SQL 是有效优化过的,它集成了许多查询优化和算子实现。但并不是所有的优化都是默认开启的,因此对于某些工作负载,可以通过打开某些选项来提高性能。

Review comment:
       (1)"SQL 是数据分析最广泛使用的语言" -> "SQL 是数据分析中使用最广泛的语言"
   (2)"它集成了许多查询优化和算子实现" -> "它集成了许多查询优化和算子优化"

##########
File path: docs/dev/table/tuning/streaming_aggregation_optimization.zh.md
##########
@@ -22,33 +23,34 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-SQL is the most widely used language for data analytics. Flink's Table API and SQL enables users to define efficient stream analytics applications in less time and effort. Moreover, Flink Table API and SQL is effectively optimized, it integrates a lot of query optimizations and tuned operator implementations. But not all of the optimizations are enabled by default, so for some workloads, it is possible to improve performance by turning on some options.
+SQL 是数据分析最广泛使用的语言。Flink Table API 和 SQL 使用户能够以更少的时间和精力定义高效的流分析应用程序。而且,Flink Table API 和 SQL 是有效优化过的,它集成了许多查询优化和算子实现。但并不是所有的优化都是默认开启的,因此对于某些工作负载,可以通过打开某些选项来提高性能。
 
-In this page, we will introduce some useful optimization options and the internals of streaming aggregation which will bring great improvement in some cases.
+这里我们将介绍一些实用的优化选项以及流式聚合的内部原理,它们在某些情况下能带来很大改进。

Review comment:
       (1)“这里我们将”-> “在这一页,我们将” 或 “在这一节中,我们将”
   (2)很大改进->很大的提升

##########
File path: docs/dev/table/tuning/streaming_aggregation_optimization.zh.md
##########
@@ -22,33 +23,34 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-SQL is the most widely used language for data analytics. Flink's Table API and SQL enables users to define efficient stream analytics applications in less time and effort. Moreover, Flink Table API and SQL is effectively optimized, it integrates a lot of query optimizations and tuned operator implementations. But not all of the optimizations are enabled by default, so for some workloads, it is possible to improve performance by turning on some options.
+SQL 是数据分析最广泛使用的语言。Flink Table API 和 SQL 使用户能够以更少的时间和精力定义高效的流分析应用程序。而且,Flink Table API 和 SQL 是有效优化过的,它集成了许多查询优化和算子实现。但并不是所有的优化都是默认开启的,因此对于某些工作负载,可以通过打开某些选项来提高性能。
 
-In this page, we will introduce some useful optimization options and the internals of streaming aggregation which will bring great improvement in some cases.
+这里我们将介绍一些实用的优化选项以及流式聚合的内部原理,它们在某些情况下能带来很大改进。
 
-<span class="label label-danger">Attention</span> Currently, the optimization options mentioned in this page are only supported in the Blink planner.
+<span class="label label-danger">注意</span> 目前,这里提到的优化选项仅支持 Blink 计划器。

Review comment:
       (1)“这里提到的” -> “这一页提到的” 或 “这一节提到的”
      Terms can stay in English; translating the less common ones into Chinese actually reads a bit odd. What do you think?
   (2)Blink 计划器 -> Blink planner

##########
File path: docs/dev/table/tuning/streaming_aggregation_optimization.zh.md
##########
@@ -94,28 +96,28 @@ configuration.set_string("table.exec.mini-batch.size", "5000"); # the maximum nu
 </div>
 </div>
 
-## Local-Global Aggregation
+<a name="local-global-aggregation"></a>
+
+## 本地全局聚合
 
-Local-Global is proposed to solve data skew problem by dividing a group aggregation into two stages, that is doing local aggregation in upstream firstly, and followed by global aggregation in downstream, which is similar to Combine + Reduce pattern in MapReduce. For example, considering the following SQL:
+本地全局聚合是为解决数据倾斜问题提出的,通过将一组聚合分为两个阶段,首先在上游进行本地聚合,然后在下游进行全局聚合,类似于 MapReduce 中的 Combine + Reduce 模式。例如,就以下 SQL 而言:

Review comment:
        本地全局聚合 -> "Local-Global 聚合是为解决"

##########
File path: docs/dev/table/tuning/streaming_aggregation_optimization.zh.md
##########
@@ -94,28 +96,28 @@ configuration.set_string("table.exec.mini-batch.size", "5000"); # the maximum nu
 </div>
 </div>
 
-## Local-Global Aggregation
+<a name="local-global-aggregation"></a>
+
+## 本地全局聚合

Review comment:
       Shall we keep "Local-Global" here as well? -> "Local-Global 聚合"

##########
File path: docs/dev/table/tuning/streaming_aggregation_optimization.zh.md
##########
@@ -22,33 +23,34 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-SQL is the most widely used language for data analytics. Flink's Table API and SQL enables users to define efficient stream analytics applications in less time and effort. Moreover, Flink Table API and SQL is effectively optimized, it integrates a lot of query optimizations and tuned operator implementations. But not all of the optimizations are enabled by default, so for some workloads, it is possible to improve performance by turning on some options.
+SQL 是数据分析最广泛使用的语言。Flink Table API 和 SQL 使用户能够以更少的时间和精力定义高效的流分析应用程序。而且,Flink Table API 和 SQL 是有效优化过的,它集成了许多查询优化和算子实现。但并不是所有的优化都是默认开启的,因此对于某些工作负载,可以通过打开某些选项来提高性能。
 
-In this page, we will introduce some useful optimization options and the internals of streaming aggregation which will bring great improvement in some cases.
+这里我们将介绍一些实用的优化选项以及流式聚合的内部原理,它们在某些情况下能带来很大改进。
 
-<span class="label label-danger">Attention</span> Currently, the optimization options mentioned in this page are only supported in the Blink planner.
+<span class="label label-danger">注意</span> 目前,这里提到的优化选项仅支持 Blink 计划器。
 
-<span class="label label-danger">Attention</span> Currently, the streaming aggregations optimization are only supported for [unbounded-aggregations]({{ site.baseurl }}/dev/table/sql/queries.html#aggregations). Optimizations for [window aggregations]({{ site.baseurl }}/dev/table/sql/queries.html#group-windows) will be supported in the future.
+<span class="label label-danger">注意</span> 目前,流聚合优化仅支持 [无界聚合]({{ site.baseurl }}/zh/dev/table/sql/queries.html#aggregations)。[窗口聚合]({{ site.baseurl }}/zh/dev/table/sql/queries.html#group-windows) 优化将在未来支持。
 
 * This will be replaced by the TOC
 {:toc}
 
-By default, the unbounded aggregation operator processes input records one by one, i.e., (1) read accumulator from state, (2) accumulate/retract record to accumulator, (3) write accumulator back to state, (4) the next record will do the process again from (1). This processing pattern may increase the overhead of StateBackend (especially for RocksDB StateBackend).
-Besides, data skew which is very common in production will worsen the problem and make it easy for the jobs to be under backpressure situations.
+默认情况下,无界聚合算子是一个一个的处理输入的记录,也就是说,(1)从状态读取累加器,(2)累积/撤回记录至累积器,(3)将累加器写回状态,(4)下一条记录将再次从(1)开始处理。 这种处理模式可能会增加 StateBackend 开销(尤其是对于 RocksDB StateBackend)。此外,生产中非常常见的数据倾斜会使这个问题恶化,并使 job 容易承受反压的情况。
 
-## MiniBatch Aggregation
+<a name="minibatch-aggregation"></a>
 
-The core idea of mini-batch aggregation is caching a bundle of inputs in a buffer inside of the aggregation operator. When the bundle of inputs is triggered to process, only one operation per key to access state is needed. This can significantly reduce the state overhead and get a better throughput. However, this may increase some latency because it buffers some records instead of processing them in an instant. This is a trade-off between throughput and latency.
+## 微批聚合
 
-The following figure explains how the mini-batch aggregation reduces state operations.
+微批聚合的核心思想是将一组输入的数据缓存在聚合算子内部的缓冲区中。当输入的数据被触发处理时,每个键只需一个操作即可访问状态。这样可以大大减少状态开销并获得更好的吞吐量。但是,这可能会增加一些延迟,因为它会缓冲一些记录而不是立即处理它们。这是吞吐量和延迟之间的权衡。

Review comment:
    “微批聚合的核心思想” -> “MiniBatch 聚合的核心思想”

##########
File path: docs/dev/table/tuning/streaming_aggregation_optimization.zh.md
##########
@@ -22,33 +23,34 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-SQL is the most widely used language for data analytics. Flink's Table API and SQL enables users to define efficient stream analytics applications in less time and effort. Moreover, Flink Table API and SQL is effectively optimized, it integrates a lot of query optimizations and tuned operator implementations. But not all of the optimizations are enabled by default, so for some workloads, it is possible to improve performance by turning on some options.
+SQL 是数据分析最广泛使用的语言。Flink Table API 和 SQL 使用户能够以更少的时间和精力定义高效的流分析应用程序。而且,Flink Table API 和 SQL 是有效优化过的,它集成了许多查询优化和算子实现。但并不是所有的优化都是默认开启的,因此对于某些工作负载,可以通过打开某些选项来提高性能。
 
-In this page, we will introduce some useful optimization options and the internals of streaming aggregation which will bring great improvement in some cases.
+这里我们将介绍一些实用的优化选项以及流式聚合的内部原理,它们在某些情况下能带来很大改进。
 
-<span class="label label-danger">Attention</span> Currently, the optimization options mentioned in this page are only supported in the Blink planner.
+<span class="label label-danger">注意</span> 目前,这里提到的优化选项仅支持 Blink 计划器。
 
-<span class="label label-danger">Attention</span> Currently, the streaming aggregations optimization are only supported for [unbounded-aggregations]({{ site.baseurl }}/dev/table/sql/queries.html#aggregations). Optimizations for [window aggregations]({{ site.baseurl }}/dev/table/sql/queries.html#group-windows) will be supported in the future.
+<span class="label label-danger">注意</span> 目前,流聚合优化仅支持 [无界聚合]({{ site.baseurl }}/zh/dev/table/sql/queries.html#aggregations)。[窗口聚合]({{ site.baseurl }}/zh/dev/table/sql/queries.html#group-windows) 优化将在未来支持。
 
 * This will be replaced by the TOC
 {:toc}
 
-By default, the unbounded aggregation operator processes input records one by one, i.e., (1) read accumulator from state, (2) accumulate/retract record to accumulator, (3) write accumulator back to state, (4) the next record will do the process again from (1). This processing pattern may increase the overhead of StateBackend (especially for RocksDB StateBackend).
-Besides, data skew which is very common in production will worsen the problem and make it easy for the jobs to be under backpressure situations.
+默认情况下,无界聚合算子是一个一个的处理输入的记录,也就是说,(1)从状态读取累加器,(2)累积/撤回记录至累积器,(3)将累加器写回状态,(4)下一条记录将再次从(1)开始处理。 这种处理模式可能会增加 StateBackend 开销(尤其是对于 RocksDB StateBackend)。此外,生产中非常常见的数据倾斜会使这个问题恶化,并使 job 容易承受反压的情况。
 
-## MiniBatch Aggregation
+<a name="minibatch-aggregation"></a>
 
-The core idea of mini-batch aggregation is caching a bundle of inputs in a buffer inside of the aggregation operator. When the bundle of inputs is triggered to process, only one operation per key to access state is needed. This can significantly reduce the state overhead and get a better throughput. However, this may increase some latency because it buffers some records instead of processing them in an instant. This is a trade-off between throughput and latency.
+## 微批聚合

Review comment:
    Would it be better to keep the original term here? 微批聚合 -> “MiniBatch 聚合”
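
    For context on the MiniBatch options quoted in the hunk above, here is a minimal Java sketch of how they are typically switched on through the Table API. It is illustrative only: the 5 s latency and 5000 batch size are carried over from the surrounding examples and are not tuning advice, and the class name is made up for the sketch.

        import org.apache.flink.configuration.Configuration;
        import org.apache.flink.table.api.EnvironmentSettings;
        import org.apache.flink.table.api.TableEnvironment;

        public class MiniBatchConfigSketch {
            public static void main(String[] args) {
                // Blink planner in streaming mode (these optimizations apply only to this planner).
                EnvironmentSettings settings = EnvironmentSettings.newInstance()
                        .useBlinkPlanner()
                        .inStreamingMode()
                        .build();
                TableEnvironment tEnv = TableEnvironment.create(settings);

                // Access the low-level key-value options and enable MiniBatch buffering.
                Configuration conf = tEnv.getConfig().getConfiguration();
                conf.setString("table.exec.mini-batch.enabled", "true");       // buffer input records instead of processing one by one
                conf.setString("table.exec.mini-batch.allow-latency", "5 s");  // flush the buffer at least every 5 seconds
                conf.setString("table.exec.mini-batch.size", "5000");          // or once 5000 records are buffered per aggregate task
            }
        }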

##########
File path: docs/dev/table/tuning/streaming_aggregation_optimization.zh.md
##########
@@ -94,28 +96,28 @@ configuration.set_string("table.exec.mini-batch.size", "5000"); # the maximum nu
 </div>
 </div>
 
-## Local-Global Aggregation
+<a name="local-global-aggregation"></a>
+
+## 本地全局聚合
 
-Local-Global is proposed to solve data skew problem by dividing a group aggregation into two stages, that is doing local aggregation in upstream firstly, and followed by global aggregation in downstream, which is similar to Combine + Reduce pattern in MapReduce. For example, considering the following SQL:
+本地全局聚合是为解决数据倾斜问题提出的,通过将一组聚合分为两个阶段,首先在上游进行本地聚合,然后在下游进行全局聚合,类似于 MapReduce 中的 Combine + Reduce 模式。例如,就以下 SQL 而言:
 
 {% highlight sql %}
 SELECT color, sum(id)
 FROM T
 GROUP BY color
 {% endhighlight %}
 
-It is possible that the records in the data stream are skewed, thus some instances of aggregation operator have to process much more records than others, which leads to hotspot.
-The local aggregation can help to accumulate a certain amount of inputs which have the same key into a single accumulator. The global aggregation will only receive the reduced accumulators instead of large number of raw inputs.
-This can significantly reduce the network shuffle and the cost of state access. The number of inputs accumulated by local aggregation every time is based on mini-batch interval. It means local-global aggregation depends on mini-batch optimization is enabled.
+数据流中的记录可能会倾斜,因此某些聚合算子的实例必须比其他实例处理更多的记录,这会导致 hotspot。本地聚合可以将一定数量具有相同 key 的输入数据累加到单个累加器中。全局聚合将仅接收 reduce 后的累加器,而不是大量的原始输入数据。这可以大大减少网络 shuffle 和状态访问的成本。每次本地聚合累积的输入数据量基于微批间隔。这意味着本地全局聚合依赖于启用了微批优化。
 
-The following figure shows how the local-global aggregation improve performance.
+下图显示了本地全局聚合如何提高性能。

Review comment:
    Let's keep "local-global" untranslated throughout.

##########
File path: docs/dev/table/tuning/streaming_aggregation_optimization.zh.md
##########
@@ -94,28 +96,28 @@ configuration.set_string("table.exec.mini-batch.size", "5000"); # the maximum nu
 </div>
 </div>
 
-## Local-Global Aggregation
+<a name="local-global-aggregation"></a>
+
+## 本地全局聚合
 
-Local-Global is proposed to solve data skew problem by dividing a group aggregation into two stages, that is doing local aggregation in upstream firstly, and followed by global aggregation in downstream, which is similar to Combine + Reduce pattern in MapReduce. For example, considering the following SQL:
+本地全局聚合是为解决数据倾斜问题提出的,通过将一组聚合分为两个阶段,首先在上游进行本地聚合,然后在下游进行全局聚合,类似于 MapReduce 中的 Combine + Reduce 模式。例如,就以下 SQL 而言:
 
 {% highlight sql %}
 SELECT color, sum(id)
 FROM T
 GROUP BY color
 {% endhighlight %}
 
-It is possible that the records in the data stream are skewed, thus some instances of aggregation operator have to process much more records than others, which leads to hotspot.
-The local aggregation can help to accumulate a certain amount of inputs which have the same key into a single accumulator. The global aggregation will only receive the reduced accumulators instead of large number of raw inputs.
-This can significantly reduce the network shuffle and the cost of state access. The number of inputs accumulated by local aggregation every time is based on mini-batch interval. It means local-global aggregation depends on mini-batch optimization is enabled.
+数据流中的记录可能会倾斜,因此某些聚合算子的实例必须比其他实例处理更多的记录,这会导致 hotspot。本地聚合可以将一定数量具有相同 key 的输入数据累加到单个累加器中。全局聚合将仅接收 reduce 后的累加器,而不是大量的原始输入数据。这可以大大减少网络 shuffle 和状态访问的成本。每次本地聚合累积的输入数据量基于微批间隔。这意味着本地全局聚合依赖于启用了微批优化。

Review comment:
    “基于微批间隔。这意味着本地全局聚合依赖于启用了微批优化。” ->
    “基于 mini-batch 的间隔时间。这意味着 local-global 聚合依赖于启用了 mini-batch 优化。”

##########
File path: docs/dev/table/tuning/streaming_aggregation_optimization.zh.md
##########
@@ -94,28 +96,28 @@ configuration.set_string("table.exec.mini-batch.size", "5000"); # the maximum nu
 </div>
 </div>
 
-## Local-Global Aggregation
+<a name="local-global-aggregation"></a>
+
+## 本地全局聚合
 
-Local-Global is proposed to solve data skew problem by dividing a group aggregation into two stages, that is doing local aggregation in upstream firstly, and followed by global aggregation in downstream, which is similar to Combine + Reduce pattern in MapReduce. For example, considering the following SQL:
+本地全局聚合是为解决数据倾斜问题提出的,通过将一组聚合分为两个阶段,首先在上游进行本地聚合,然后在下游进行全局聚合,类似于 MapReduce 中的 Combine + Reduce 模式。例如,就以下 SQL 而言:
 
 {% highlight sql %}
 SELECT color, sum(id)
 FROM T
 GROUP BY color
 {% endhighlight %}
 
-It is possible that the records in the data stream are skewed, thus some instances of aggregation operator have to process much more records than others, which leads to hotspot.
-The local aggregation can help to accumulate a certain amount of inputs which have the same key into a single accumulator. The global aggregation will only receive the reduced accumulators instead of large number of raw inputs.
-This can significantly reduce the network shuffle and the cost of state access. The number of inputs accumulated by local aggregation every time is based on mini-batch interval. It means local-global aggregation depends on mini-batch optimization is enabled.
+数据流中的记录可能会倾斜,因此某些聚合算子的实例必须比其他实例处理更多的记录,这会导致 hotspot。本地聚合可以将一定数量具有相同 key 的输入数据累加到单个累加器中。全局聚合将仅接收 reduce 后的累加器,而不是大量的原始输入数据。这可以大大减少网络 shuffle 和状态访问的成本。每次本地聚合累积的输入数据量基于微批间隔。这意味着本地全局聚合依赖于启用了微批优化。
 
-The following figure shows how the local-global aggregation improve performance.
+下图显示了本地全局聚合如何提高性能。
 
 <div style="text-align: center">
   <img src="{{ site.baseurl }}/fig/table-streaming/local_agg.png" width="70%" height="70%" />
 </div>
 
 
-The following examples show how to enable the local-global aggregation.
+以下示例显示如何启用本地全局聚合。

Review comment:
       启用本地全局聚合 -> "启用 local-global 聚合"
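
    A companion sketch for the local-global section, assuming the same `tEnv`/`conf` setup as in the earlier MiniBatch sketch. As the quoted paragraph notes, local-global aggregation builds on MiniBatch being enabled; the "TWO_PHASE" strategy value is taken from the hunk header above.

        // Continuing from the MiniBatch sketch: conf = tEnv.getConfig().getConfiguration()
        conf.setString("table.exec.mini-batch.enabled", "true");           // local-global relies on MiniBatch
        conf.setString("table.exec.mini-batch.allow-latency", "5 s");
        conf.setString("table.exec.mini-batch.size", "5000");
        conf.setString("table.optimizer.agg-phase-strategy", "TWO_PHASE"); // enable two-phase (local + global) aggregation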

##########
File path: docs/dev/table/tuning/streaming_aggregation_optimization.zh.md
##########
@@ -164,25 +166,25 @@ configuration.set_string("table.optimizer.agg-phase-strategy", "TWO_PHASE"); # e
 </div>
 </div>
 
-## Split Distinct Aggregation
+<a name="split-distinct-aggregation"></a>
 
-Local-Global optimization is effective to eliminate data skew for general aggregation, such as SUM, COUNT, MAX, MIN, AVG. But its performance is not satisfactory when dealing with distinct aggregation.
+## 拆分不同的聚合
 
-For example, if we want to analyse how many unique users logined today. We may have the following query:
+本地全局优化可有效消除常规聚合的数据倾斜,例如 SUM、COUNT、MAX、MIN、AVG。但是在处理不同的聚合时,其性能并不令人满意。
+
+例如,如果我们要分析今天有多少唯一用户登录。我们可能有以下查询:
 
 {% highlight sql %}
 SELECT day, COUNT(DISTINCT user_id)
 FROM T
 GROUP BY day
 {% endhighlight %}
 
-COUNT DISTINCT is not good at reducing records if the value of distinct key (i.e. user_id) is sparse. Even if local-global optimization is enabled, it doesn't help much. Because the accumulator still contain almost all the raw records, and the global aggregation will be the bottleneck (most of the heavy accumulators are processed by one task, i.e. on the same day).
+如果唯一键(即 user_id)的值稀疏,则 COUNT DISTINCT 不适合 reduce 操作。即使启用了本地全局优化也没有太大帮助。因为累加器仍然包含几乎所有原始记录,并且全局聚合将成为瓶颈(大多数繁重的累加器由一个任务处理,即同一天)。

Review comment:
       "启用了本地全局优化" -> "启用了 local-global 优化"

##########
File path: docs/dev/table/tuning/streaming_aggregation_optimization.zh.md
##########
@@ -164,25 +166,25 @@ configuration.set_string("table.optimizer.agg-phase-strategy", "TWO_PHASE"); # e
 </div>
 </div>
 
-## Split Distinct Aggregation
+<a name="split-distinct-aggregation"></a>
 
-Local-Global optimization is effective to eliminate data skew for general aggregation, such as SUM, COUNT, MAX, MIN, AVG. But its performance is not satisfactory when dealing with distinct aggregation.
+## 拆分不同的聚合

Review comment:
    Treat "distinct" as a term and keep it in English: "拆分 distinct 聚合"

##########
File path: docs/dev/table/tuning/streaming_aggregation_optimization.zh.md
##########
@@ -22,33 +23,34 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-SQL is the most widely used language for data analytics. Flink's Table API and SQL enables users to define efficient stream analytics applications in less time and effort. Moreover, Flink Table API and SQL is effectively optimized, it integrates a lot of query optimizations and tuned operator implementations. But not all of the optimizations are enabled by default, so for some workloads, it is possible to improve performance by turning on some options.
+SQL 是数据分析最广泛使用的语言。Flink Table API 和 SQL 使用户能够以更少的时间和精力定义高效的流分析应用程序。而且,Flink Table API 和 SQL 是有效优化过的,它集成了许多查询优化和算子实现。但并不是所有的优化都是默认开启的,因此对于某些工作负载,可以通过打开某些选项来提高性能。
 
-In this page, we will introduce some useful optimization options and the internals of streaming aggregation which will bring great improvement in some cases.
+这里我们将介绍一些实用的优化选项以及流式聚合的内部原理,它们在某些情况下能带来很大改进。
 
-<span class="label label-danger">Attention</span> Currently, the optimization options mentioned in this page are only supported in the Blink planner.
+<span class="label label-danger">注意</span> 目前,这里提到的优化选项仅支持 Blink 计划器。
 
-<span class="label label-danger">Attention</span> Currently, the streaming aggregations optimization are only supported for [unbounded-aggregations]({{ site.baseurl }}/dev/table/sql/queries.html#aggregations). Optimizations for [window aggregations]({{ site.baseurl }}/dev/table/sql/queries.html#group-windows) will be supported in the future.
+<span class="label label-danger">注意</span> 目前,流聚合优化仅支持 [无界聚合]({{ site.baseurl }}/zh/dev/table/sql/queries.html#aggregations)。[窗口聚合]({{ site.baseurl }}/zh/dev/table/sql/queries.html#group-windows) 优化将在未来支持。
 
 * This will be replaced by the TOC
 {:toc}
 
-By default, the unbounded aggregation operator processes input records one by one, i.e., (1) read accumulator from state, (2) accumulate/retract record to accumulator, (3) write accumulator back to state, (4) the next record will do the process again from (1). This processing pattern may increase the overhead of StateBackend (especially for RocksDB StateBackend).
-Besides, data skew which is very common in production will worsen the problem and make it easy for the jobs to be under backpressure situations.
+默认情况下,无界聚合算子是一个一个的处理输入的记录,也就是说,(1)从状态读取累加器,(2)累积/撤回记录至累积器,(3)将累加器写回状态,(4)下一条记录将再次从(1)开始处理。 这种处理模式可能会增加 StateBackend 开销(尤其是对于 RocksDB StateBackend)。此外,生产中非常常见的数据倾斜会使这个问题恶化,并使 job 容易承受反压的情况。
 
-## MiniBatch Aggregation
+<a name="minibatch-aggregation"></a>

Review comment:
       why add this line?

##########
File path: docs/dev/table/tuning/streaming_aggregation_optimization.zh.md
##########
@@ -242,10 +244,11 @@ t_env.get_config()        # access high-level configuration
 </div>
 </div>
 
-## Use FILTER Modifier on Distinct Aggregates
+<a name="use-filter-modifier-on-distinct-aggregates"></a>
+
+## 在不同的聚合上使用 FILTER 修改器
 
-In some cases, user may need to calculate the number of UV (unique visitor) from different dimensions, e.g. UV from Android, UV from iPhone, UV from Web and the total UV.
-Many users will choose `CASE WHEN` to support this, for example:
+在某些情况下,用户可能需要从不同维度计算 UV(unique visitor)的数量,例如来自 Android 的 UV、iPhone 的 UV、Web 的 UV 和总 UV。很多人会选择 `CASE WHEN`,例如:

Review comment:
    "unique visitor" can be translated into Chinese as 独立访客.

##########
File path: docs/dev/table/tuning/streaming_aggregation_optimization.zh.md
##########
@@ -195,17 +197,17 @@ GROUP BY day
 {% endhighlight %}
 
 
-The following figure shows how the split distinct aggregation improve performance (assuming color represents days, and letter represents user_id).
+下图显示了拆分不同聚合如何提高性能(假设颜色表示 days,字母表示 user_id)。

Review comment:
       拆分不同的聚合 -> "拆分 distinct 聚合"
   
   

##########
File path: docs/dev/table/tuning/streaming_aggregation_optimization.zh.md
##########
@@ -164,25 +166,25 @@ configuration.set_string("table.optimizer.agg-phase-strategy", "TWO_PHASE"); # e
 </div>
 </div>
 
-## Split Distinct Aggregation
+<a name="split-distinct-aggregation"></a>
 
-Local-Global optimization is effective to eliminate data skew for general aggregation, such as SUM, COUNT, MAX, MIN, AVG. But its performance is not satisfactory when dealing with distinct aggregation.
+## 拆分不同的聚合
 
-For example, if we want to analyse how many unique users logined today. We may have the following query:
+本地全局优化可有效消除常规聚合的数据倾斜,例如 SUM、COUNT、MAX、MIN、AVG。但是在处理不同的聚合时,其性能并不令人满意。

Review comment:
        处理 distinct 聚合

##########
File path: docs/dev/table/tuning/streaming_aggregation_optimization.zh.md
##########
@@ -242,10 +244,11 @@ t_env.get_config()        # access high-level configuration
 </div>
 </div>
 
-## Use FILTER Modifier on Distinct Aggregates
+<a name="use-filter-modifier-on-distinct-aggregates"></a>
+
+## 在不同的聚合上使用 FILTER 修改器

Review comment:
       在 distinct 聚合上使用 FILTER 修饰符

##########
File path: docs/dev/table/tuning/streaming_aggregation_optimization.zh.md
##########
@@ -195,17 +197,17 @@ GROUP BY day
 {% endhighlight %}
 
 
-The following figure shows how the split distinct aggregation improve performance (assuming color represents days, and letter represents user_id).
+下图显示了拆分不同聚合如何提高性能(假设颜色表示 days,字母表示 user_id)。
 
 <div style="text-align: center">
   <img src="{{ site.baseurl }}/fig/table-streaming/distinct_split.png" width="70%" height="70%" />
 </div>
 
-NOTE: Above is the simplest example which can benefit from this optimization. Besides that, Flink supports to split more complex aggregation queries, for example, more than one distinct aggregates with different distinct key (e.g. `COUNT(DISTINCT a), SUM(DISTINCT b)`), works with other non-distinct aggregates (e.g. `SUM`, `MAX`, `MIN`, `COUNT`).
+注意:上面是可以从这个优化中受益的最简单的示例。除此之外,Flink 还支持拆分更复杂的聚合查询,例如,多个具有不同唯一键(例如 `COUNT(DISTINCT a), SUM(DISTINCT b)` )的不同聚合,可以与其他非明显聚合(例如 `SUM`、`MAX`、`MIN`、`COUNT` )一起使用。
 
-<span class="label label-danger">Attention</span> However, currently, the split optimization doesn't support aggregations which contains user defined AggregateFunction.
+<span class="label label-danger">注意</span> 但是,当前,拆分优化不支持包含用户定义的 AggregateFunction 聚合。
 
-The following examples show how to enable the split distinct aggregation optimization.
+以下示例显示了如何启用拆分非重复聚合优化。

Review comment:
       拆分非重复聚合 -> "拆分 distinct 聚合"

##########
File path: docs/dev/table/tuning/streaming_aggregation_optimization.zh.md
##########
@@ -164,25 +166,25 @@ configuration.set_string("table.optimizer.agg-phase-strategy", "TWO_PHASE"); # e
 </div>
 </div>
 
-## Split Distinct Aggregation
+<a name="split-distinct-aggregation"></a>
 
-Local-Global optimization is effective to eliminate data skew for general aggregation, such as SUM, COUNT, MAX, MIN, AVG. But its performance is not satisfactory when dealing with distinct aggregation.
+## 拆分不同的聚合
 
-For example, if we want to analyse how many unique users logined today. We may have the following query:
+本地全局优化可有效消除常规聚合的数据倾斜,例如 SUM、COUNT、MAX、MIN、AVG。但是在处理不同的聚合时,其性能并不令人满意。
+
+例如,如果我们要分析今天有多少唯一用户登录。我们可能有以下查询:
 
 {% highlight sql %}
 SELECT day, COUNT(DISTINCT user_id)
 FROM T
 GROUP BY day
 {% endhighlight %}
 
-COUNT DISTINCT is not good at reducing records if the value of distinct key (i.e. user_id) is sparse. Even if local-global optimization is enabled, it doesn't help much. Because the accumulator still contain almost all the raw records, and the global aggregation will be the bottleneck (most of the heavy accumulators are processed by one task, i.e. on the same day).
+如果唯一键(即 user_id)的值稀疏,则 COUNT DISTINCT 不适合 reduce 操作。即使启用了本地全局优化也没有太大帮助。因为累加器仍然包含几乎所有原始记录,并且全局聚合将成为瓶颈(大多数繁重的累加器由一个任务处理,即同一天)。
 
-The idea of this optimization is splitting distinct aggregation (e.g. `COUNT(DISTINCT col)`) into two levels. The first aggregation is shuffled by group key and an additional bucket key. The bucket key is calculated using `HASH_CODE(distinct_key) % BUCKET_NUM`. `BUCKET_NUM` is 1024 by default, and can be configured by `table.optimizer.distinct-agg.split.bucket-num` option.
-The second aggregation is shuffled by the original group key, and use `SUM` to aggregate COUNT DISTINCT values from different buckets. Because the same distinct key will only be calculated in the same bucket, so the transformation is equivalent.
-The bucket key plays the role of an additional group key to share the burden of hotspot in group key. The bucket key makes the job to be scalability to solve data-skew/hotspot in distinct aggregations.
+这个优化的想法是将不同的聚合(例如 `COUNT(DISTINCT col)`)分为两个级别。第一次聚合由 group key 和额外的 bucket key 进行 shuffle。bucket key 是使用 `HASH_CODE(distinct_key) % BUCKET_NUM` 计算的。`BUCKET_NUM` 默认为1024,可以通过 `table.optimizer.distinct-agg.split.bucket-num` 选项进行配置。第二次聚合是由原始 group key 进行 shuffle,并使用 `SUM` 聚合来自不同 buckets 的 COUNT DISTINCT 值。由于相同的唯一键将仅在同一 bucket 中计算,因此转换是等效的。bucket key 充当附加 group key 的角色,以分担 group key 中 hotspot 的负担。bucket key 使 job 具有可伸缩性来解决不同聚合中的数据倾斜/hotspot。

Review comment:
    This passage is translated well; only "hotspot" could be rendered in Chinese, since 热点 is easy to understand:
    (1) 以分担 group key 中 hotspot 的负担 -> 以分担 group key 中的热点
    (2) 数据倾斜/hotspot -> 数据倾斜/数据热点
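
    A short sketch of enabling the split-distinct rewrite described in the quoted paragraph, again assuming the `conf` object from the earlier sketches. The `table.optimizer.distinct-agg.split.enabled` switch is the companion of the `table.optimizer.distinct-agg.split.bucket-num` key quoted above; the bucket number shown is just the documented default.

        // Continuing from the earlier sketches: conf = tEnv.getConfig().getConfiguration()
        conf.setString("table.optimizer.distinct-agg.split.enabled", "true");    // split COUNT(DISTINCT ...) into two levels
        conf.setString("table.optimizer.distinct-agg.split.bucket-num", "1024"); // default bucket count, shown for clarity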

##########
File path: docs/dev/table/tuning/streaming_aggregation_optimization.zh.md
##########
@@ -164,25 +166,25 @@ configuration.set_string("table.optimizer.agg-phase-strategy", "TWO_PHASE"); # e
 </div>
 </div>
 
-## Split Distinct Aggregation
+<a name="split-distinct-aggregation"></a>
 
-Local-Global optimization is effective to eliminate data skew for general aggregation, such as SUM, COUNT, MAX, MIN, AVG. But its performance is not satisfactory when dealing with distinct aggregation.
+## 拆分不同的聚合
 
-For example, if we want to analyse how many unique users logined today. We may have the following query:
+本地全局优化可有效消除常规聚合的数据倾斜,例如 SUM、COUNT、MAX、MIN、AVG。但是在处理不同的聚合时,其性能并不令人满意。
+
+例如,如果我们要分析今天有多少唯一用户登录。我们可能有以下查询:
 
 {% highlight sql %}
 SELECT day, COUNT(DISTINCT user_id)
 FROM T
 GROUP BY day
 {% endhighlight %}
 
-COUNT DISTINCT is not good at reducing records if the value of distinct key (i.e. user_id) is sparse. Even if local-global optimization is enabled, it doesn't help much. Because the accumulator still contain almost all the raw records, and the global aggregation will be the bottleneck (most of the heavy accumulators are processed by one task, i.e. on the same day).
+如果唯一键(即 user_id)的值稀疏,则 COUNT DISTINCT 不适合 reduce 操作。即使启用了本地全局优化也没有太大帮助。因为累加器仍然包含几乎所有原始记录,并且全局聚合将成为瓶颈(大多数繁重的累加器由一个任务处理,即同一天)。
 
-The idea of this optimization is splitting distinct aggregation (e.g. `COUNT(DISTINCT col)`) into two levels. The first aggregation is shuffled by group key and an additional bucket key. The bucket key is calculated using `HASH_CODE(distinct_key) % BUCKET_NUM`. `BUCKET_NUM` is 1024 by default, and can be configured by `table.optimizer.distinct-agg.split.bucket-num` option.
-The second aggregation is shuffled by the original group key, and use `SUM` to aggregate COUNT DISTINCT values from different buckets. Because the same distinct key will only be calculated in the same bucket, so the transformation is equivalent.
-The bucket key plays the role of an additional group key to share the burden of hotspot in group key. The bucket key makes the job to be scalability to solve data-skew/hotspot in distinct aggregations.
+这个优化的想法是将不同的聚合(例如 `COUNT(DISTINCT col)`)分为两个级别。第一次聚合由 group key 和额外的 bucket key 进行 shuffle。bucket key 是使用 `HASH_CODE(distinct_key) % BUCKET_NUM` 计算的。`BUCKET_NUM` 默认为1024,可以通过 `table.optimizer.distinct-agg.split.bucket-num` 选项进行配置。第二次聚合是由原始 group key 进行 shuffle,并使用 `SUM` 聚合来自不同 buckets 的 COUNT DISTINCT 值。由于相同的唯一键将仅在同一 bucket 中计算,因此转换是等效的。bucket key 充当附加 group key 的角色,以分担 group key 中 hotspot 的负担。bucket key 使 job 具有可伸缩性来解决不同聚合中的数据倾斜/hotspot。
 
-After split distinct aggregate, the above query will be rewritten into the following query automatically:
+拆分不同的聚合后,以上查询将被自动重写为以下查询:

Review comment:
       拆分不同的聚合后 -> "拆分 distinct 聚合后"
   

##########
File path: docs/dev/table/tuning/streaming_aggregation_optimization.zh.md
##########
@@ -257,8 +260,7 @@ FROM T
 GROUP BY day
 {% endhighlight %}
 
-However, it is recommended to use `FILTER` syntax instead of CASE WHEN in this case. Because `FILTER` is more compliant with the SQL standard and will get much more performance improvement.
-`FILTER` is a modifier used on an aggregate function to limit the values used in an aggregation. Replace the above example with `FILTER` modifier as following:
+但是,在这种情况下,建议使用 `FILTER` 语法而不是 CASE WHEN。因为 `FILTER` 更符合 SQL 标准,并且能获得更多的性能改进。`FILTER` 是用于聚合函数的修饰符,用于限制聚合中使用的值。将上面的示例替换为 `FILTER` 修饰符,如下所示:

Review comment:
       更多的性能改进 -> 更多的性能提升
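
    To make the FILTER rewrite concrete, a hedged sketch of the kind of query the paragraph describes, submitted through the same `tEnv` as the earlier sketches. The `day`, `user_id`, and table `T` names follow the quoted examples; the `flag` column and its dimension values are illustrative assumptions, not taken from the page under review.

        // All three distinct aggregates are on user_id, so one shared distinct state can serve them.
        org.apache.flink.table.api.Table uv = tEnv.sqlQuery(
            "SELECT day," +
            "       COUNT(DISTINCT user_id) AS total_uv," +
            "       COUNT(DISTINCT user_id) FILTER (WHERE flag IN ('android', 'iphone')) AS app_uv," +
            "       COUNT(DISTINCT user_id) FILTER (WHERE flag IN ('wap', 'other')) AS web_uv " +
            "FROM T GROUP BY day");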

##########
File path: docs/dev/table/tuning/streaming_aggregation_optimization.zh.md
##########
@@ -270,8 +272,7 @@ FROM T
 GROUP BY day
 {% endhighlight %}
 
-Flink SQL optimizer can recognize the different filter arguments on the same distinct key. For example, in the above example, all the three COUNT DISTINCT are on `user_id` column.
-Then Flink can use just one shared state instance instead of three state instances to reduce state access and state size. In some workloads, this can get significant performance improvements.
+Flink SQL 优化器可以识别相同唯一键上的不同过滤器参数。例如,在上面的示例中,三个 COUNT DISTINCT 都在 `user_id` 一列上。Flink 可以只使用一个共享状态实例,而不是三个状态实例,以减少状态访问和状态大小。在某些工作负载中,这可以显着提高性能。

Review comment:
       > 在某些工作负载中,这可以显着提高性能。
   
   `在某些工作负载下,可以获得显著的性能提升`




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] flinkbot edited a comment on pull request #11897: [FLINK-16104] Translate "Streaming Aggregation" page of "Table API & SQL" into Chinese

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #11897:
URL: https://github.com/apache/flink/pull/11897#issuecomment-618812841


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "7969f07cca4581f4ae7cbef7bd06787b3e90d248",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=168",
       "triggerID" : "7969f07cca4581f4ae7cbef7bd06787b3e90d248",
       "triggerType" : "PUSH"
     }, {
       "hash" : "7969f07cca4581f4ae7cbef7bd06787b3e90d248",
       "status" : "DELETED",
       "url" : "https://travis-ci.com/github/flink-ci/flink/builds/161766709",
       "triggerID" : "7969f07cca4581f4ae7cbef7bd06787b3e90d248",
       "triggerType" : "PUSH"
     }, {
       "hash" : "1cd0f11272e1d4013f2f09a328c8777ae32e5470",
       "status" : "CANCELED",
       "url" : "https://travis-ci.com/github/flink-ci/flink/builds/162046977",
       "triggerID" : "1cd0f11272e1d4013f2f09a328c8777ae32e5470",
       "triggerType" : "PUSH"
     }, {
       "hash" : "1cd0f11272e1d4013f2f09a328c8777ae32e5470",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=260",
       "triggerID" : "1cd0f11272e1d4013f2f09a328c8777ae32e5470",
       "triggerType" : "PUSH"
     }, {
       "hash" : "67027fc17a69ef4edc8300e4b82a9e1233b016ee",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "67027fc17a69ef4edc8300e4b82a9e1233b016ee",
       "triggerType" : "PUSH"
     }, {
       "hash" : "da909ac1e2831251bf60375a080cc192e4e4a0c8",
       "status" : "PENDING",
       "url" : "https://travis-ci.com/github/flink-ci/flink/builds/162152427",
       "triggerID" : "da909ac1e2831251bf60375a080cc192e4e4a0c8",
       "triggerType" : "PUSH"
     }, {
       "hash" : "da909ac1e2831251bf60375a080cc192e4e4a0c8",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=283",
       "triggerID" : "da909ac1e2831251bf60375a080cc192e4e4a0c8",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 1cd0f11272e1d4013f2f09a328c8777ae32e5470 Travis: [CANCELED](https://travis-ci.com/github/flink-ci/flink/builds/162046977) Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=260) 
   * 67027fc17a69ef4edc8300e4b82a9e1233b016ee UNKNOWN
   * da909ac1e2831251bf60375a080cc192e4e4a0c8 Travis: [PENDING](https://travis-ci.com/github/flink-ci/flink/builds/162152427) Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=283) 
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] flinkbot edited a comment on pull request #11897: [FLINK-16104] Translate "Streaming Aggregation" page of "Table API & SQL" into Chinese

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #11897:
URL: https://github.com/apache/flink/pull/11897#issuecomment-618812841


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "7969f07cca4581f4ae7cbef7bd06787b3e90d248",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=168",
       "triggerID" : "7969f07cca4581f4ae7cbef7bd06787b3e90d248",
       "triggerType" : "PUSH"
     }, {
       "hash" : "7969f07cca4581f4ae7cbef7bd06787b3e90d248",
       "status" : "PENDING",
       "url" : "https://travis-ci.com/github/flink-ci/flink/builds/161766709",
       "triggerID" : "7969f07cca4581f4ae7cbef7bd06787b3e90d248",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 7969f07cca4581f4ae7cbef7bd06787b3e90d248 Travis: [PENDING](https://travis-ci.com/github/flink-ci/flink/builds/161766709) Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=168) 
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] flinkbot edited a comment on pull request #11897: [FLINK-16104] Translate "Streaming Aggregation" page of "Table API & SQL" into Chinese

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #11897:
URL: https://github.com/apache/flink/pull/11897#issuecomment-618812841


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "7969f07cca4581f4ae7cbef7bd06787b3e90d248",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=168",
       "triggerID" : "7969f07cca4581f4ae7cbef7bd06787b3e90d248",
       "triggerType" : "PUSH"
     }, {
       "hash" : "7969f07cca4581f4ae7cbef7bd06787b3e90d248",
       "status" : "DELETED",
       "url" : "https://travis-ci.com/github/flink-ci/flink/builds/161766709",
       "triggerID" : "7969f07cca4581f4ae7cbef7bd06787b3e90d248",
       "triggerType" : "PUSH"
     }, {
       "hash" : "1cd0f11272e1d4013f2f09a328c8777ae32e5470",
       "status" : "CANCELED",
       "url" : "https://travis-ci.com/github/flink-ci/flink/builds/162046977",
       "triggerID" : "1cd0f11272e1d4013f2f09a328c8777ae32e5470",
       "triggerType" : "PUSH"
     }, {
       "hash" : "1cd0f11272e1d4013f2f09a328c8777ae32e5470",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=260",
       "triggerID" : "1cd0f11272e1d4013f2f09a328c8777ae32e5470",
       "triggerType" : "PUSH"
     }, {
       "hash" : "67027fc17a69ef4edc8300e4b82a9e1233b016ee",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "67027fc17a69ef4edc8300e4b82a9e1233b016ee",
       "triggerType" : "PUSH"
     }, {
       "hash" : "da909ac1e2831251bf60375a080cc192e4e4a0c8",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "da909ac1e2831251bf60375a080cc192e4e4a0c8",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 1cd0f11272e1d4013f2f09a328c8777ae32e5470 Travis: [CANCELED](https://travis-ci.com/github/flink-ci/flink/builds/162046977) Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=260) 
   * 67027fc17a69ef4edc8300e4b82a9e1233b016ee UNKNOWN
   * da909ac1e2831251bf60375a080cc192e4e4a0c8 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] chaojianok commented on pull request #11897: [FLINK-16104] Translate "Streaming Aggregation" page of "Table API & SQL" into Chinese

Posted by GitBox <gi...@apache.org>.
chaojianok commented on pull request #11897:
URL: https://github.com/apache/flink/pull/11897#issuecomment-619489756


   @leonardBang Thanks a lot! I'll optimize according to your suggestion.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] flinkbot edited a comment on pull request #11897: [FLINK-16104] Translate "Streaming Aggregation" page of "Table API & SQL" into Chinese

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #11897:
URL: https://github.com/apache/flink/pull/11897#issuecomment-618812841


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "7969f07cca4581f4ae7cbef7bd06787b3e90d248",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=168",
       "triggerID" : "7969f07cca4581f4ae7cbef7bd06787b3e90d248",
       "triggerType" : "PUSH"
     }, {
       "hash" : "7969f07cca4581f4ae7cbef7bd06787b3e90d248",
       "status" : "SUCCESS",
       "url" : "https://travis-ci.com/github/flink-ci/flink/builds/161766709",
       "triggerID" : "7969f07cca4581f4ae7cbef7bd06787b3e90d248",
       "triggerType" : "PUSH"
     }, {
       "hash" : "1cd0f11272e1d4013f2f09a328c8777ae32e5470",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "1cd0f11272e1d4013f2f09a328c8777ae32e5470",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 7969f07cca4581f4ae7cbef7bd06787b3e90d248 Travis: [SUCCESS](https://travis-ci.com/github/flink-ci/flink/builds/161766709) Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=168) 
   * 1cd0f11272e1d4013f2f09a328c8777ae32e5470 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] flinkbot edited a comment on pull request #11897: [FLINK-16104] Translate "Streaming Aggregation" page of "Table API & SQL" into Chinese

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #11897:
URL: https://github.com/apache/flink/pull/11897#issuecomment-618812841


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "7969f07cca4581f4ae7cbef7bd06787b3e90d248",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=168",
       "triggerID" : "7969f07cca4581f4ae7cbef7bd06787b3e90d248",
       "triggerType" : "PUSH"
     }, {
       "hash" : "7969f07cca4581f4ae7cbef7bd06787b3e90d248",
       "status" : "SUCCESS",
       "url" : "https://travis-ci.com/github/flink-ci/flink/builds/161766709",
       "triggerID" : "7969f07cca4581f4ae7cbef7bd06787b3e90d248",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 7969f07cca4581f4ae7cbef7bd06787b3e90d248 Travis: [SUCCESS](https://travis-ci.com/github/flink-ci/flink/builds/161766709) Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=168) 
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] flinkbot edited a comment on pull request #11897: [FLINK-16104] Translate "Streaming Aggregation" page of "Table API & SQL" into Chinese

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #11897:
URL: https://github.com/apache/flink/pull/11897#issuecomment-618812841


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "7969f07cca4581f4ae7cbef7bd06787b3e90d248",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=168",
       "triggerID" : "7969f07cca4581f4ae7cbef7bd06787b3e90d248",
       "triggerType" : "PUSH"
     }, {
       "hash" : "7969f07cca4581f4ae7cbef7bd06787b3e90d248",
       "status" : "DELETED",
       "url" : "https://travis-ci.com/github/flink-ci/flink/builds/161766709",
       "triggerID" : "7969f07cca4581f4ae7cbef7bd06787b3e90d248",
       "triggerType" : "PUSH"
     }, {
       "hash" : "1cd0f11272e1d4013f2f09a328c8777ae32e5470",
       "status" : "DELETED",
       "url" : "https://travis-ci.com/github/flink-ci/flink/builds/162046977",
       "triggerID" : "1cd0f11272e1d4013f2f09a328c8777ae32e5470",
       "triggerType" : "PUSH"
     }, {
       "hash" : "1cd0f11272e1d4013f2f09a328c8777ae32e5470",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=260",
       "triggerID" : "1cd0f11272e1d4013f2f09a328c8777ae32e5470",
       "triggerType" : "PUSH"
     }, {
       "hash" : "67027fc17a69ef4edc8300e4b82a9e1233b016ee",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "67027fc17a69ef4edc8300e4b82a9e1233b016ee",
       "triggerType" : "PUSH"
     }, {
       "hash" : "da909ac1e2831251bf60375a080cc192e4e4a0c8",
       "status" : "SUCCESS",
       "url" : "https://travis-ci.com/github/flink-ci/flink/builds/162152427",
       "triggerID" : "da909ac1e2831251bf60375a080cc192e4e4a0c8",
       "triggerType" : "PUSH"
     }, {
       "hash" : "da909ac1e2831251bf60375a080cc192e4e4a0c8",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=283",
       "triggerID" : "da909ac1e2831251bf60375a080cc192e4e4a0c8",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 67027fc17a69ef4edc8300e4b82a9e1233b016ee UNKNOWN
   * da909ac1e2831251bf60375a080cc192e4e4a0c8 Travis: [SUCCESS](https://travis-ci.com/github/flink-ci/flink/builds/162152427) Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=283) 
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] flinkbot edited a comment on pull request #11897: [FLINK-16104] Translate "Streaming Aggregation" page of "Table API & SQL" into Chinese

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #11897:
URL: https://github.com/apache/flink/pull/11897#issuecomment-618812841


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "7969f07cca4581f4ae7cbef7bd06787b3e90d248",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=168",
       "triggerID" : "7969f07cca4581f4ae7cbef7bd06787b3e90d248",
       "triggerType" : "PUSH"
     }, {
       "hash" : "7969f07cca4581f4ae7cbef7bd06787b3e90d248",
       "status" : "DELETED",
       "url" : "https://travis-ci.com/github/flink-ci/flink/builds/161766709",
       "triggerID" : "7969f07cca4581f4ae7cbef7bd06787b3e90d248",
       "triggerType" : "PUSH"
     }, {
       "hash" : "1cd0f11272e1d4013f2f09a328c8777ae32e5470",
       "status" : "CANCELED",
       "url" : "https://travis-ci.com/github/flink-ci/flink/builds/162046977",
       "triggerID" : "1cd0f11272e1d4013f2f09a328c8777ae32e5470",
       "triggerType" : "PUSH"
     }, {
       "hash" : "1cd0f11272e1d4013f2f09a328c8777ae32e5470",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=260",
       "triggerID" : "1cd0f11272e1d4013f2f09a328c8777ae32e5470",
       "triggerType" : "PUSH"
     }, {
       "hash" : "67027fc17a69ef4edc8300e4b82a9e1233b016ee",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "67027fc17a69ef4edc8300e4b82a9e1233b016ee",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 1cd0f11272e1d4013f2f09a328c8777ae32e5470 Travis: [CANCELED](https://travis-ci.com/github/flink-ci/flink/builds/162046977) Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=260) 
   * 67027fc17a69ef4edc8300e4b82a9e1233b016ee UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] flinkbot edited a comment on pull request #11897: [FLINK-16104] Translate "Streaming Aggregation" page of "Table API & SQL" into Chinese

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #11897:
URL: https://github.com/apache/flink/pull/11897#issuecomment-618812841


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "7969f07cca4581f4ae7cbef7bd06787b3e90d248",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=168",
       "triggerID" : "7969f07cca4581f4ae7cbef7bd06787b3e90d248",
       "triggerType" : "PUSH"
     }, {
       "hash" : "7969f07cca4581f4ae7cbef7bd06787b3e90d248",
       "status" : "DELETED",
       "url" : "https://travis-ci.com/github/flink-ci/flink/builds/161766709",
       "triggerID" : "7969f07cca4581f4ae7cbef7bd06787b3e90d248",
       "triggerType" : "PUSH"
     }, {
       "hash" : "1cd0f11272e1d4013f2f09a328c8777ae32e5470",
       "status" : "DELETED",
       "url" : "https://travis-ci.com/github/flink-ci/flink/builds/162046977",
       "triggerID" : "1cd0f11272e1d4013f2f09a328c8777ae32e5470",
       "triggerType" : "PUSH"
     }, {
       "hash" : "1cd0f11272e1d4013f2f09a328c8777ae32e5470",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=260",
       "triggerID" : "1cd0f11272e1d4013f2f09a328c8777ae32e5470",
       "triggerType" : "PUSH"
     }, {
       "hash" : "67027fc17a69ef4edc8300e4b82a9e1233b016ee",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "67027fc17a69ef4edc8300e4b82a9e1233b016ee",
       "triggerType" : "PUSH"
     }, {
       "hash" : "da909ac1e2831251bf60375a080cc192e4e4a0c8",
       "status" : "DELETED",
       "url" : "https://travis-ci.com/github/flink-ci/flink/builds/162152427",
       "triggerID" : "da909ac1e2831251bf60375a080cc192e4e4a0c8",
       "triggerType" : "PUSH"
     }, {
       "hash" : "da909ac1e2831251bf60375a080cc192e4e4a0c8",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=283",
       "triggerID" : "da909ac1e2831251bf60375a080cc192e4e4a0c8",
       "triggerType" : "PUSH"
     }, {
       "hash" : "836c6bce67b366939d524f928f0796a18abbf3ec",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=777",
       "triggerID" : "836c6bce67b366939d524f928f0796a18abbf3ec",
       "triggerType" : "PUSH"
     }, {
       "hash" : "836c6bce67b366939d524f928f0796a18abbf3ec",
       "status" : "SUCCESS",
       "url" : "https://travis-ci.com/github/flink-ci/flink/builds/164601384",
       "triggerID" : "836c6bce67b366939d524f928f0796a18abbf3ec",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 67027fc17a69ef4edc8300e4b82a9e1233b016ee UNKNOWN
   * 836c6bce67b366939d524f928f0796a18abbf3ec Travis: [SUCCESS](https://travis-ci.com/github/flink-ci/flink/builds/164601384) Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=777) 
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] flinkbot commented on pull request #11897: [FLINK-16104] Translate "Streaming Aggregation" page of "Table API & SQL" into Chinese

Posted by GitBox <gi...@apache.org>.
flinkbot commented on pull request #11897:
URL: https://github.com/apache/flink/pull/11897#issuecomment-618812841


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "7969f07cca4581f4ae7cbef7bd06787b3e90d248",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "7969f07cca4581f4ae7cbef7bd06787b3e90d248",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 7969f07cca4581f4ae7cbef7bd06787b3e90d248 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] flinkbot edited a comment on pull request #11897: [FLINK-16104] Translate "Streaming Aggregation" page of "Table API & SQL" into Chinese

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #11897:
URL: https://github.com/apache/flink/pull/11897#issuecomment-618812841


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "7969f07cca4581f4ae7cbef7bd06787b3e90d248",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=168",
       "triggerID" : "7969f07cca4581f4ae7cbef7bd06787b3e90d248",
       "triggerType" : "PUSH"
     }, {
       "hash" : "7969f07cca4581f4ae7cbef7bd06787b3e90d248",
       "status" : "DELETED",
       "url" : "https://travis-ci.com/github/flink-ci/flink/builds/161766709",
       "triggerID" : "7969f07cca4581f4ae7cbef7bd06787b3e90d248",
       "triggerType" : "PUSH"
     }, {
       "hash" : "1cd0f11272e1d4013f2f09a328c8777ae32e5470",
       "status" : "DELETED",
       "url" : "https://travis-ci.com/github/flink-ci/flink/builds/162046977",
       "triggerID" : "1cd0f11272e1d4013f2f09a328c8777ae32e5470",
       "triggerType" : "PUSH"
     }, {
       "hash" : "1cd0f11272e1d4013f2f09a328c8777ae32e5470",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=260",
       "triggerID" : "1cd0f11272e1d4013f2f09a328c8777ae32e5470",
       "triggerType" : "PUSH"
     }, {
       "hash" : "67027fc17a69ef4edc8300e4b82a9e1233b016ee",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "67027fc17a69ef4edc8300e4b82a9e1233b016ee",
       "triggerType" : "PUSH"
     }, {
       "hash" : "da909ac1e2831251bf60375a080cc192e4e4a0c8",
       "status" : "SUCCESS",
       "url" : "https://travis-ci.com/github/flink-ci/flink/builds/162152427",
       "triggerID" : "da909ac1e2831251bf60375a080cc192e4e4a0c8",
       "triggerType" : "PUSH"
     }, {
       "hash" : "da909ac1e2831251bf60375a080cc192e4e4a0c8",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=283",
       "triggerID" : "da909ac1e2831251bf60375a080cc192e4e4a0c8",
       "triggerType" : "PUSH"
     }, {
       "hash" : "836c6bce67b366939d524f928f0796a18abbf3ec",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=777",
       "triggerID" : "836c6bce67b366939d524f928f0796a18abbf3ec",
       "triggerType" : "PUSH"
     }, {
       "hash" : "836c6bce67b366939d524f928f0796a18abbf3ec",
       "status" : "PENDING",
       "url" : "https://travis-ci.com/github/flink-ci/flink/builds/164601384",
       "triggerID" : "836c6bce67b366939d524f928f0796a18abbf3ec",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 67027fc17a69ef4edc8300e4b82a9e1233b016ee UNKNOWN
   * da909ac1e2831251bf60375a080cc192e4e4a0c8 Travis: [SUCCESS](https://travis-ci.com/github/flink-ci/flink/builds/162152427) Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=283) 
   * 836c6bce67b366939d524f928f0796a18abbf3ec Travis: [PENDING](https://travis-ci.com/github/flink-ci/flink/builds/164601384) Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=777) 
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] chaojianok commented on pull request #11897: [FLINK-16104] Translate "Streaming Aggregation" page of "Table API & SQL" into Chinese

Posted by GitBox <gi...@apache.org>.
chaojianok commented on pull request #11897:
URL: https://github.com/apache/flink/pull/11897#issuecomment-619515431


   @leonardBang I have optimized it according to your suggestions. Please review it again.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] leonardBang commented on pull request #11897: [FLINK-16104] Translate "Streaming Aggregation" page of "Table API & SQL" into Chinese

Posted by GitBox <gi...@apache.org>.
leonardBang commented on pull request #11897:
URL: https://github.com/apache/flink/pull/11897#issuecomment-619468950


   > @leonardBang Could you help to review on this?
   My pleasure.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] chaojianok commented on a change in pull request #11897: [FLINK-16104] Translate "Streaming Aggregation" page of "Table API & SQL" into Chinese

Posted by GitBox <gi...@apache.org>.
chaojianok commented on a change in pull request #11897:
URL: https://github.com/apache/flink/pull/11897#discussion_r415230158



##########
File path: docs/dev/table/tuning/streaming_aggregation_optimization.zh.md
##########
@@ -22,33 +23,34 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-SQL is the most widely used language for data analytics. Flink's Table API and SQL enables users to define efficient stream analytics applications in less time and effort. Moreover, Flink Table API and SQL is effectively optimized, it integrates a lot of query optimizations and tuned operator implementations. But not all of the optimizations are enabled by default, so for some workloads, it is possible to improve performance by turning on some options.
+SQL 是数据分析最广泛使用的语言。Flink Table API 和 SQL 使用户能够以更少的时间和精力定义高效的流分析应用程序。而且,Flink Table API 和 SQL 是有效优化过的,它集成了许多查询优化和算子实现。但并不是所有的优化都是默认开启的,因此对于某些工作负载,可以通过打开某些选项来提高性能。
 
-In this page, we will introduce some useful optimization options and the internals of streaming aggregation which will bring great improvement in some cases.
+这里我们将介绍一些实用的优化选项以及流式聚合的内部原理,它们在某些情况下能带来很大改进。
 
-<span class="label label-danger">Attention</span> Currently, the optimization options mentioned in this page are only supported in the Blink planner.
+<span class="label label-danger">注意</span> 目前,这里提到的优化选项仅支持 Blink 计划器。
 
-<span class="label label-danger">Attention</span> Currently, the streaming aggregations optimization are only supported for [unbounded-aggregations]({{ site.baseurl }}/dev/table/sql/queries.html#aggregations). Optimizations for [window aggregations]({{ site.baseurl }}/dev/table/sql/queries.html#group-windows) will be supported in the future.
+<span class="label label-danger">注意</span> 目前,流聚合优化仅支持 [无界聚合]({{ site.baseurl }}/zh/dev/table/sql/queries.html#aggregations)。[窗口聚合]({{ site.baseurl }}/zh/dev/table/sql/queries.html#group-windows) 优化将在未来支持。
 
 * This will be replaced by the TOC
 {:toc}
 
-By default, the unbounded aggregation operator processes input records one by one, i.e., (1) read accumulator from state, (2) accumulate/retract record to accumulator, (3) write accumulator back to state, (4) the next record will do the process again from (1). This processing pattern may increase the overhead of StateBackend (especially for RocksDB StateBackend).
-Besides, data skew which is very common in production will worsen the problem and make it easy for the jobs to be under backpressure situations.
+默认情况下,无界聚合算子是一个一个的处理输入的记录,也就是说,(1)从状态读取累加器,(2)累积/撤回记录至累积器,(3)将累加器写回状态,(4)下一条记录将再次从(1)开始处理。 这种处理模式可能会增加 StateBackend 开销(尤其是对于 RocksDB StateBackend)。此外,生产中非常常见的数据倾斜会使这个问题恶化,并使 job 容易承受反压的情况。
 
-## MiniBatch Aggregation
+<a name="minibatch-aggregation"></a>

Review comment:
       Do you mean line 39? It's there for the anchor.
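
   As an aside for readers of this thread: the mini-batch aggregation described in the translated paragraph above is enabled purely through table configuration, not through changes to the query itself. Below is a minimal sketch in Java, assuming the `table.exec.mini-batch.*` options documented on the page under translation and a Blink-planner streaming environment (the class name is illustrative only, not part of the Flink codebase):

   ```java
   import org.apache.flink.configuration.Configuration;
   import org.apache.flink.table.api.EnvironmentSettings;
   import org.apache.flink.table.api.TableEnvironment;

   // Illustrative class name; not part of the Flink codebase.
   public class MiniBatchConfigSketch {
       public static void main(String[] args) {
           // Streaming TableEnvironment on the Blink planner, which is the
           // only planner that supports these optimization options.
           EnvironmentSettings settings = EnvironmentSettings.newInstance()
                   .useBlinkPlanner()
                   .inStreamingMode()
                   .build();
           TableEnvironment tEnv = TableEnvironment.create(settings);

           // Buffer input records and flush at most every 5 s, or as soon as
           // 5000 records are buffered, so each key's accumulator is read from
           // and written back to state once per batch rather than once per record.
           Configuration conf = tEnv.getConfig().getConfiguration();
           conf.setString("table.exec.mini-batch.enabled", "true");
           conf.setString("table.exec.mini-batch.allow-latency", "5 s");
           conf.setString("table.exec.mini-batch.size", "5000");
       }
   }
   ```

   The trade-off is latency for throughput: records may wait up to the configured allow-latency before the aggregation fires, in exchange for far fewer StateBackend accesses.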




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] flinkbot commented on pull request #11897: [FLINK-16104] Translate "Streaming Aggregation" page of "Table API & SQL" into Chinese

Posted by GitBox <gi...@apache.org>.
flinkbot commented on pull request #11897:
URL: https://github.com/apache/flink/pull/11897#issuecomment-618810155


   Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress of the review.
   
   
   ## Automated Checks
   Last check on commit 7969f07cca4581f4ae7cbef7bd06787b3e90d248 (Fri Apr 24 05:31:01 UTC 2020)
   
    ✅ no warnings
   
   <sub>Mention the bot in a comment to re-run the automated checks.</sub>
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The bot tracks the review progress through labels, which are applied according to the order of the review items; for consensus, approval by a Flink committer or PMC member is required.
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`)
    - `@flinkbot approve all` to approve all aspects
    - `@flinkbot approve-until architecture` to approve everything until `architecture`
    - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention
    - `@flinkbot disapprove architecture` to remove an approval you gave earlier
   </details>


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] flinkbot edited a comment on pull request #11897: [FLINK-16104] Translate "Streaming Aggregation" page of "Table API & SQL" into Chinese

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #11897:
URL: https://github.com/apache/flink/pull/11897#issuecomment-618812841


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "7969f07cca4581f4ae7cbef7bd06787b3e90d248",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=168",
       "triggerID" : "7969f07cca4581f4ae7cbef7bd06787b3e90d248",
       "triggerType" : "PUSH"
     }, {
       "hash" : "7969f07cca4581f4ae7cbef7bd06787b3e90d248",
       "status" : "DELETED",
       "url" : "https://travis-ci.com/github/flink-ci/flink/builds/161766709",
       "triggerID" : "7969f07cca4581f4ae7cbef7bd06787b3e90d248",
       "triggerType" : "PUSH"
     }, {
       "hash" : "1cd0f11272e1d4013f2f09a328c8777ae32e5470",
       "status" : "CANCELED",
       "url" : "https://travis-ci.com/github/flink-ci/flink/builds/162046977",
       "triggerID" : "1cd0f11272e1d4013f2f09a328c8777ae32e5470",
       "triggerType" : "PUSH"
     }, {
       "hash" : "1cd0f11272e1d4013f2f09a328c8777ae32e5470",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=260",
       "triggerID" : "1cd0f11272e1d4013f2f09a328c8777ae32e5470",
       "triggerType" : "PUSH"
     }, {
       "hash" : "67027fc17a69ef4edc8300e4b82a9e1233b016ee",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "67027fc17a69ef4edc8300e4b82a9e1233b016ee",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 1cd0f11272e1d4013f2f09a328c8777ae32e5470 Travis: [CANCELED](https://travis-ci.com/github/flink-ci/flink/builds/162046977) Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=260) 
   * 67027fc17a69ef4edc8300e4b82a9e1233b016ee UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] flinkbot edited a comment on pull request #11897: [FLINK-16104] Translate "Streaming Aggregation" page of "Table API & SQL" into Chinese

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #11897:
URL: https://github.com/apache/flink/pull/11897#issuecomment-618812841


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "7969f07cca4581f4ae7cbef7bd06787b3e90d248",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=168",
       "triggerID" : "7969f07cca4581f4ae7cbef7bd06787b3e90d248",
       "triggerType" : "PUSH"
     }, {
       "hash" : "7969f07cca4581f4ae7cbef7bd06787b3e90d248",
       "status" : "DELETED",
       "url" : "https://travis-ci.com/github/flink-ci/flink/builds/161766709",
       "triggerID" : "7969f07cca4581f4ae7cbef7bd06787b3e90d248",
       "triggerType" : "PUSH"
     }, {
       "hash" : "1cd0f11272e1d4013f2f09a328c8777ae32e5470",
       "status" : "DELETED",
       "url" : "https://travis-ci.com/github/flink-ci/flink/builds/162046977",
       "triggerID" : "1cd0f11272e1d4013f2f09a328c8777ae32e5470",
       "triggerType" : "PUSH"
     }, {
       "hash" : "1cd0f11272e1d4013f2f09a328c8777ae32e5470",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=260",
       "triggerID" : "1cd0f11272e1d4013f2f09a328c8777ae32e5470",
       "triggerType" : "PUSH"
     }, {
       "hash" : "67027fc17a69ef4edc8300e4b82a9e1233b016ee",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "67027fc17a69ef4edc8300e4b82a9e1233b016ee",
       "triggerType" : "PUSH"
     }, {
       "hash" : "da909ac1e2831251bf60375a080cc192e4e4a0c8",
       "status" : "SUCCESS",
       "url" : "https://travis-ci.com/github/flink-ci/flink/builds/162152427",
       "triggerID" : "da909ac1e2831251bf60375a080cc192e4e4a0c8",
       "triggerType" : "PUSH"
     }, {
       "hash" : "da909ac1e2831251bf60375a080cc192e4e4a0c8",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=283",
       "triggerID" : "da909ac1e2831251bf60375a080cc192e4e4a0c8",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 67027fc17a69ef4edc8300e4b82a9e1233b016ee UNKNOWN
   * da909ac1e2831251bf60375a080cc192e4e4a0c8 Travis: [SUCCESS](https://travis-ci.com/github/flink-ci/flink/builds/162152427) Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=283) 
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org