Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2020/07/03 12:57:53 UTC

[GitHub] [flink-web] tillrohrmann commented on a change in pull request #352: Add 1.11 Release announcement.

tillrohrmann commented on a change in pull request #352:
URL: https://github.com/apache/flink-web/pull/352#discussion_r449563682



##########
File path: _posts/2020-07-06-release-1.11.0.md
##########
@@ -0,0 +1,307 @@
+---
+layout: post 
+title:  "Apache Flink 1.11.0 Release Announcement" 
+date: 2020-07-06T08:00:00.000Z
+categories: news
+authors:
+- morsapaes:
+  name: "Marta Paes"
+  twitter: "morsapaes"
+
+excerpt: The Apache Flink community is proud to announce the release of Flink 1.11.0! More than 200 contributors worked on over 1.3k issues to bring significant improvements to usability as well as new features to Flink users across the whole API stack. We're particularly excited about unaligned checkpoints to cope with high backpressure scenarios, a new source API that simplifies and unifies the implementation of (custom) sources, and support for Change Data Capture (CDC) and other common use cases in the Table API/SQL. Read on for all major new features and improvements, important changes to be aware of and what to expect moving forward!
+---
+
+The Apache Flink community is proud to announce the release of Flink 1.11.0! More than 200 contributors worked on over 1.3k issues to bring significant improvements to usability as well as new features to Flink users across the whole API stack. Some highlights that we're particularly excited about are:
+
+* The core engine is introducing **unaligned checkpoints**, a major change to Flink's fault tolerance mechanism that improves checkpointing performance under heavy backpressure.
+
+* A **new Source API** that simplifies the implementation of (custom) sources by unifying batch and streaming execution, as well as offloading internals such as event-time handling, watermark generation or idleness detection to Flink.
+
+* Flink SQL is introducing **support for Change Data Capture (CDC)** to easily consume and interpret database changelogs from tools like Debezium. The renewed **FileSystem Connector** also expands the set of use cases and formats supported in the Table API/SQL, enabling scenarios like streaming data directly from Kafka to Hive.
+
+* Multiple performance optimizations to PyFlink, including support for **vectorized User-defined Functions (Pandas UDFs)**. This improves interoperability with libraries like Pandas and NumPy, making Flink more powerful for data science and ML workloads.
+
+Read on for all major new features and improvements, important changes to be aware of and what to expect moving forward!
+
+{% toc %}
+
+The binary distribution and source artifacts are now available on the updated [Downloads page]({{ site.baseurl }}/downloads.html) of the Flink website, and the most recent distribution of PyFlink is available on [PyPI](https://pypi.org/project/apache-flink/). Please review the [release notes]({{ site.DOCS_BASE_URL }}flink-docs-release-1.11/release-notes/flink-1.11.html) carefully, and check the complete [release changelog](https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12346364&styleName=Html&projectId=12315522) and [updated documentation]({{ site.DOCS_BASE_URL }}flink-docs-release-1.11/) for more details.
+
+We encourage you to download the release and share your feedback with the community through the [Flink mailing lists](https://flink.apache.org/community.html#mailing-lists) or [JIRA](https://issues.apache.org/jira/projects/FLINK/summary).
+
+## New Features and Improvements
+
+### Unaligned Checkpoints (Beta)
+
+Triggering a checkpoint in Flink will cause a [checkpoint barrier]({{ site.DOCS_BASE_URL }}flink-docs-release-1.11/internals/stream_checkpointing.html#barriers) to flow from the sources of your topology all the way towards the sinks. For operators that receive more than one input stream, the barriers flowing through each channel need to be aligned before the operator can snapshot its state and forward the checkpoint barrier. Typically, this alignment takes just a few milliseconds to complete, but it can become a bottleneck in backpressured pipelines because:
+
+ * Checkpoint barriers will flow much more slowly through backpressured channels, effectively blocking the remaining channels and their upstream operators during checkpointing;
+
+ * Slow checkpoint barrier propagation leads to longer checkpointing times and can, in the worst case, result in little to no progress in the application.
+
+To improve the performance of checkpointing under backpressure scenarios, the community is rolling out the first iteration of unaligned checkpoints ([FLIP-76](https://cwiki.apache.org/confluence/display/FLINK/FLIP-76%3A+Unaligned+Checkpoints)) with Flink 1.11. Compared to the original checkpointing mechanism (Fig. 1), this approach doesn’t wait for barrier alignment across input channels, instead allowing barriers to overtake in-flight records (i.e., data stored in buffers) and forwarding them downstream before the synchronous part of the checkpoint takes place (Fig. 2).
+
+<div style="line-height:60%;">
+    <br>
+</div>
+
+<div class="row">
+  <div class="col-lg-6">
+    <div class="text-center">
+      <figure>
+		<img src="{{ site.baseurl }}/img/blog/2020-07-06-release-1.11.0/image1.gif" width="600px" alt="Aligned Checkpoints"/>
+		<br/><br/>
+		<figcaption><i><b>Fig.1:</b> Aligned Checkpoints</i></figcaption>
+	  </figure>
+    </div>
+  </div>
+  <div class="col-lg-6">
+    <div class="text-center">
+      <figure>
+		<img src="{{ site.baseurl }}/img/blog/2020-07-06-release-1.11.0/image2.png" width="600px" alt="Unaligned Checkpoints"/>
+		<br/><br/>
+		<figcaption><i><b>Fig.2:</b> Unaligned Checkpoints</i></figcaption>
+	  </figure>
+    </div>
+  </div>
+</div>
+
+<div style="line-height:150%;">
+    <br>
+</div>
+
+Because in-flight records have to be persisted as part of the snapshot, unaligned checkpoints will lead to increased checkpoint sizes. On the upside, **checkpointing times are heavily reduced**, so users will see more progress (even in unstable environments) as more up-to-date checkpoints will lighten the recovery process. You can learn more about the current limitations of unaligned checkpoints in the [documentation]({{ site.DOCS_BASE_URL }}flink-docs-release-1.11/ops/state/checkpoints.html#unaligned-checkpoints), and track the improvement work planned for this feature in [FLINK-14551](https://issues.apache.org/jira/browse/FLINK-14551).
+
+As with any beta feature, we appreciate early feedback that you might want to share with the community after giving unaligned checkpoints a try!
+
+<span class="label label-info">Info</span> To enable this feature, you need to configure the [``enableUnalignedCheckpoints``]({{ site.DOCS_BASE_URL }}flink-docs-release-1.11/api/java/org/apache/flink/streaming/api/environment/CheckpointConfig.html) option in your [checkpoint config]({{ site.DOCS_BASE_URL }}flink-docs-release-1.11/dev/stream/state/checkpointing.html#enabling-and-configuring-checkpointing). Please note that unaligned checkpoints can only be enabled if ``checkpointingMode`` is set to ``CheckpointingMode.EXACTLY_ONCE``.
+
+### Unified Watermark Generators
+
+So far, watermark generation (previously also called _assignment_) relied on two different interfaces, [``AssignerWithPunctuatedWatermarks``]({{ site.DOCS_BASE_URL }}flink-docs-release-1.11/api/java/org/apache/flink/streaming/api/functions/AssignerWithPunctuatedWatermarks.html) and [``AssignerWithPeriodicWatermarks``]({{ site.DOCS_BASE_URL }}flink-docs-release-1.11/api/java/org/apache/flink/streaming/api/functions/AssignerWithPeriodicWatermarks.html), that were closely intertwined with timestamp extraction. This made it difficult to implement long-requested features like support for idleness detection, and also led to code duplication and a higher maintenance burden. With [FLIP-126](https://cwiki.apache.org/confluence/display/FLINK/FLIP-126%3A+Unify+%28and+separate%29+Watermark+Assigners), the legacy watermark assigners are unified into a single interface, the [``WatermarkGenerator``]({{ site.DOCS_BASE_URL }}flink-docs-release-1.11/api/java/org/apache/flink/api/common/eventtime/WatermarkGenerator.html), which is detached from the [``TimestampAssigner``]({{ site.DOCS_BASE_URL }}flink-docs-release-1.11/api/java/org/apache/flink/api/common/eventtime/TimestampAssigner.html).
+
+This gives users more control over watermark emission and simplifies the implementation of new connectors that need to support watermark assignment and timestamp extraction at the source (see _[New Data Source API](#new-data-source-api-beta)_). Multiple [strategies for watermarking]({{ site.DOCS_BASE_URL }}flink-docs-release-1.11/dev/event_timestamps_watermarks.html#introduction-to-watermark-strategies) are available out-of-the-box as convenience methods in Flink 1.11 (e.g. [``forBoundedOutOfOrderness``]({{ site.DOCS_BASE_URL }}flink-docs-release-1.11/api/java/org/apache/flink/api/common/eventtime/WatermarkStrategy.html#forBoundedOutOfOrderness-java.time.Duration-), [``forMonotonousTimestamps``]({{ site.DOCS_BASE_URL }}flink-docs-release-1.11/api/java/org/apache/flink/api/common/eventtime/WatermarkStrategy.html#forMonotonousTimestamps--)), though you can also choose to customize your own.
+
+**Support for Watermark Idleness Detection**
+
+The [``WatermarkStrategy.withIdleness()``]({{ site.DOCS_BASE_URL }}flink-docs-release-1.11/api/java/org/apache/flink/api/common/eventtime/WatermarkStrategy.html#withIdleness-java.time.Duration-) method allows you to mark a stream as idle if no events arrive within a configured time (i.e. a timeout duration), which in turn allows handling event time skew properly and preventing idle partitions from holding back the event time progress of the entire application. Users can already benefit from **per-partition idleness detection** in the Kafka connector, which has been adapted to use the new interfaces ([FLINK-17669](https://issues.apache.org/jira/browse/FLINK-17669)).
+
+<span class="label label-info">Note</span> This FLIP introduces no breaking changes. We recommend that users give preference to the new ``WatermarkGenerator`` interface moving forward, in preparation for the deprecation of the legacy watermark assigners in future releases.

Review comment:
       ```suggestion
   <span class="label label-info">Note</span> This [FLIP](url to flip) introduces no breaking changes. We recommend that users give preference to the new ``WatermarkGenerator`` interface moving forward, in preparation for the deprecation of the legacy watermark assigners in future releases.
   ```
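
The quoted section describes bounded out-of-orderness watermarks and per-partition idleness detection only in prose, so a small illustration may help readers. The following is a minimal conceptual sketch in plain Java. It deliberately does not use Flink's ``WatermarkGenerator`` or ``WatermarkStrategy`` classes: `WatermarkSketch`, `PartitionTracker` and `combinedWatermark` are hypothetical names, and the watermark formula (largest timestamp seen minus the allowed out-of-orderness) is a simplifying assumption, not necessarily what Flink computes internally.

```java
import java.time.Duration;
import java.util.Arrays;
import java.util.List;
import java.util.OptionalLong;

/** Conceptual sketch only: plain Java, not Flink's watermark API. */
public class WatermarkSketch {

    /** Tracks event-time progress and idleness for one partition (hypothetical helper). */
    static final class PartitionTracker {
        private final long maxOutOfOrdernessMillis;
        private final long idleTimeoutMillis;
        private long maxTimestamp = Long.MIN_VALUE;
        private long lastEventWallClockMillis;

        PartitionTracker(Duration maxOutOfOrderness, Duration idleTimeout, long nowMillis) {
            this.maxOutOfOrdernessMillis = maxOutOfOrderness.toMillis();
            this.idleTimeoutMillis = idleTimeout.toMillis();
            this.lastEventWallClockMillis = nowMillis;
        }

        /** Records an event: remember the largest timestamp and when the event arrived. */
        void onEvent(long eventTimestampMillis, long nowMillis) {
            maxTimestamp = Math.max(maxTimestamp, eventTimestampMillis);
            lastEventWallClockMillis = nowMillis;
        }

        /** Assumed formula: largest timestamp seen minus the out-of-orderness bound. */
        long currentWatermark() {
            if (maxTimestamp == Long.MIN_VALUE) {
                return Long.MIN_VALUE; // no events seen yet
            }
            return maxTimestamp - maxOutOfOrdernessMillis;
        }

        /** A partition counts as idle once no event has arrived within the timeout. */
        boolean isIdle(long nowMillis) {
            return nowMillis - lastEventWallClockMillis >= idleTimeoutMillis;
        }
    }

    /** Minimum watermark over all non-idle partitions; empty if every partition is idle. */
    static OptionalLong combinedWatermark(List<PartitionTracker> partitions, long nowMillis) {
        return partitions.stream()
                .filter(p -> !p.isIdle(nowMillis))
                .mapToLong(PartitionTracker::currentWatermark)
                .min();
    }

    public static void main(String[] args) {
        Duration bound = Duration.ofSeconds(5);
        Duration timeout = Duration.ofSeconds(60);
        PartitionTracker a = new PartitionTracker(bound, timeout, 0L);
        PartitionTracker b = new PartitionTracker(bound, timeout, 0L);

        a.onEvent(10_000L, 0L);
        b.onEvent(2_000L, 0L);
        // Both partitions are active, so the slower one (b) holds the watermark back.
        System.out.println(combinedWatermark(Arrays.asList(a, b), 1_000L));

        a.onEvent(20_000L, 61_000L);
        // b has been silent for longer than the timeout, so only a is considered.
        System.out.println(combinedWatermark(Arrays.asList(a, b), 61_000L));
    }
}
```

The second call shows the point made in the idleness paragraph: once the silent partition is marked idle, it no longer holds back the combined watermark.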



