Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2022/10/25 07:03:38 UTC

[GitHub] [flink-web] rmetzger commented on a diff in pull request #574: Announcement blogpost for the 1.16 release

rmetzger commented on code in PR #574:
URL: https://github.com/apache/flink-web/pull/574#discussion_r1004049581


##########
_posts/2022-10-15-1.16-announcement.md:
##########
@@ -0,0 +1,400 @@
+---
+layout: post
+title:  "Announcing the Release of Apache Flink 1.16"
+subtitle: ""
+date: 2022-10-27T08:00:00.000Z
+categories: news
+authors:
+- godfreyhe:
+  name: "Godfrey He"
+  twitter: "godfreyhe"
+
+---
+
+Apache Flink continues to grow at a rapid pace and is one of the most active 
+communities in the Apache Software Foundation. Flink 1.16 had over 230 contributors enthusiastically participating, 
+with 19 FLIPs and 900+ issues completed, bringing a lot of exciting features to the community.
+
+Flink has become the leading technology and de facto standard for stream processing, 
+and the concept of unified stream and batch data processing is gradually gaining recognition 
+and is being successfully implemented in more and more companies. Previously, 
+the integrated stream and batch concept placed more emphasis on a unified API and 
+a unified computing framework. This year, based on this, Flink proposed 
+the next development direction of [Flink-Streaming Warehouse](https://www.alibabacloud.com/blog/more-than-computing-a-new-era-led-by-the-warehouse-architecture-of-apache-flink_598821) (Streamhouse), 
+which further upgraded the scope of stream-batch integration: it truly completes not only 
+unified computation but also unified storage, thus realizing unified real-time analysis.
+
+In 1.16, the Flink community has completed many improvements for both batch and stream processing:
+
+- For batch processing, all-round improvements in ease of use, stability and performance 
+ have been completed. 1.16 is a milestone version of Flink batch processing and an important 
+ step towards maturity.
+  - Ease of use: with the introduction of SQL Gateway and full compatibility with HiveServer2, 
+  users can submit Flink SQL jobs and Hive SQL jobs very easily, and it is also easy to 
+  connect to the original Hive ecosystem.
+  - Functionality: Introduce join hints which let Flink SQL users manually specify join strategies 
+  to avoid unreasonable execution plans. The compatibility of Hive SQL has reached 94%, 
+  and users can migrate from Hive to Flink at a very low cost.
+  - Stability: Propose a speculative execution mechanism to mitigate long-tail sub-tasks of 
+  a job and improve stability. Improve HashJoin and introduce a failure rollback mechanism 
+  to avoid join failures.
+  - Performance: Introduce dynamic partition pruning to reduce scan I/O and improve join 
+  processing for star-schema queries, yielding a 30% improvement in the TPC-DS benchmark. 
+  Hybrid shuffle mode can be used to improve resource usage and processing performance.
+- For stream processing, there are a number of significant improvements:
+  - Changelog State Backend provides users with second-level or even millisecond-level checkpoints to 
+  dramatically improve the fault-tolerance experience, while providing smaller end-to-end 
+  latency for jobs with transactional sinks.
+  - Lookup joins are widely used in stream processing. Slow lookup speed, low throughput and 
+  delayed updates are addressed through a common cache mechanism, asynchronous I/O and retriable lookups 
+  (a basic lookup join is sketched after this list). These features are very useful, solving pain points 
+  that users often complain about and supporting richer scenarios.
+  - Since the early days of Flink SQL, there have been non-deterministic operations 
+  that could cause incorrect results or exceptions, which caused great distress to users. 
+  In 1.16, we spent a lot of effort solving most of these problems, and we will continue to 
+  improve in the future.
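+
+As a quick illustration of the lookup join improvements mentioned in the list above, the following is 
+a minimal sketch of a lookup join against an external dimension table. The table and column names are 
+hypothetical; the new caching, asynchronous and retry behaviors are configured on the lookup source or 
+via the `LOOKUP` hint, see the lookup join documentation for the exact options:
+
+```sql
+-- enrich a stream of orders against an external dimension table via a lookup join;
+-- proc_time is a processing-time attribute defined on the orders table
+SELECT
+  o.order_id,
+  o.amount,
+  c.customer_name
+FROM orders AS o
+JOIN customers FOR SYSTEM_TIME AS OF o.proc_time AS c
+  ON o.customer_id = c.customer_id;
+```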
+
+With the further refinement of the integration of stream and batch, and the continuous iteration of 
+the Flink Table Store ([0.2 has been released](https://flink.apache.org/news/2022/08/29/release-table-store-0.2.0.html)), 
+the Flink community is pushing the Streaming warehouse from concept to reality and maturity step by step.
+
+# Understanding Streaming Warehouses
+
+To be precise, a streaming warehouse makes the data warehouse streaming, allowing the data 
+in each layer of the whole warehouse to flow in real time. The goal is to realize 
+a streaming service with end-to-end real-time performance through a unified API and computing framework.
+Please refer to [the article](https://www.alibabacloud.com/blog/more-than-computing-a-new-era-led-by-the-warehouse-architecture-of-apache-flink_598821) 
+for more details.
+
+# Batch processing
+
+Flink is a unified stream and batch processing engine; stream processing has become the leading role 
+thanks to our long-term investment. We are also putting more effort into improving batch processing 
+to make it an excellent computing engine. This makes the overall experience of stream-batch 
+unification smoother.
+
+## SQL Gateway
+
+The feedback from various channels indicates that [SQL Gateway](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/table/sql-gateway/overview/) 
+is a highly anticipated feature for users, especially for batch users. This feature was finally 
+completed in 1.16 (see FLIP-91 for the design). SQL Gateway is an extension and enhancement of SQL Client, 
+supporting multi-tenancy and pluggable API protocols (endpoints), and solving the problem that SQL Client 
+can only serve a single user and cannot be integrated with external services or components. 
+Currently SQL Gateway supports the REST API and the HiveServer2 protocol, and users can connect to 
+SQL Gateway via cURL, Postman, or HTTP clients in various programming languages to submit stream jobs,
+batch jobs and even OLAP jobs. For the HiveServer2 endpoint, please refer to the Hive Compatibility 
+section for more details.
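+
+Once a client is connected (for example via the REST endpoint or, as described below, via Beeline against 
+the HiveServer2 endpoint), a gateway session is simply a sequence of SQL statements. The following is a 
+minimal sketch of such a session; the table, its schema and the connector options are hypothetical and 
+only meant for illustration:
+
+```sql
+-- statements submitted through a SQL Gateway session
+SET 'execution.runtime-mode' = 'batch';
+
+CREATE TABLE orders (
+  order_id BIGINT,
+  amount   DOUBLE
+) WITH (
+  'connector' = 'datagen',
+  'number-of-rows' = '1000'
+);
+
+SELECT COUNT(*) AS order_cnt FROM orders;
+```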
+
+## Hive Compatibility
+
+To reduce the cost of migrating Hive to Flink, we introduce the HiveServer2 Endpoint and 
+Hive syntax improvements in this version:
+
+[The HiveServer2 Endpoint](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/table/hive-compatibility/hiveserver2/) 
+allows users to interact with SQL Gateway with Hive JDBC/Beeline and migrate the Flink into 
+the Hive ecosystem(DBeaver, Apache Superset, Apache DolphinScheduler, and Apache Zeppelin).

Review Comment:
   ```suggestion
   allows users to interact with SQL Gateway with Hive JDBC/Beeline and migrate with Flink into 
   the Hive ecosystem (DBeaver, Apache Superset, Apache DolphinScheduler, and Apache Zeppelin).
   ```



##########
_posts/2022-10-15-1.16-announcement.md:
##########
@@ -0,0 +1,400 @@
+When users connect to the HiveServer2 endpoint, the SQL Gateway registers the Hive Catalog, 
+switches to the Hive dialect, and uses batch execution mode to execute jobs. With these steps, 
+users get the same experience as with HiveServer2.
+
+[Hive syntax](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/table/hive-compatibility/hive-dialect/overview/) 
+is already the de facto standard for big data processing. Flink has improved compatibility 
+with Hive syntax and added support for several Hive syntax features commonly used in production. 
+Hive syntax compatibility helps users migrate existing Hive SQL tasks to Flink, 
+and makes it convenient for users who are familiar with Hive syntax to write SQL that queries 
+tables registered in Flink. The compatibility is measured using 
+the Hive qtest suite which contains more than 12K SQL cases. So far, for Hive 2.3, 
+the compatibility has reached 94.1% for all Hive queries, and 97.3% 
+if ACID queries are excluded.
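+
+For example, after switching the dialect, Hive-style statements can run directly on Flink. The following 
+is a minimal sketch; the table and partition column are hypothetical and assume a Hive catalog has 
+already been registered:
+
+```sql
+-- switch the SQL dialect to Hive (e.g. in the SQL client or through the SQL Gateway)
+SET 'table.sql-dialect' = 'hive';
+
+-- a typical Hive-style statement that now runs on Flink
+INSERT OVERWRITE TABLE sales_summary PARTITION (dt = '2022-10-01')
+SELECT product_id, SUM(amount)
+FROM sales
+WHERE dt = '2022-10-01'
+GROUP BY product_id;
+```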
+
+## Join Hints for Flink SQL
+
+The join hint is a common solution in the industry to improve the shortcomings of the optimizer 
+by affecting the execution plans. Join is the most widely used operator in batch jobs, 

Review Comment:
   ```suggestion
   The join hint is a common solution in the industry to improve the shortcomings of the optimizer 
   by manually modifying the execution plans. Join is the most widely used operator in batch jobs, 
   ```



##########
_posts/2022-10-15-1.16-announcement.md:
##########
@@ -0,0 +1,400 @@
+and Flink supports a variety of join strategies. Missing statistics or a poor cost model of 
+the optimizer can lead to the wrong choice of join strategy, which will cause slow execution 
+or even job failure. By specifying a [Join Hint](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/table/sql/queries/hints/#join-hints), 
+the optimizer will choose the join strategy specified by the user whenever possible. 
+This avoids various shortcomings of the optimizer and helps ensure the production availability of 
+batch jobs.
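+
+A minimal sketch of a join hint is shown below; the table names are hypothetical, and `BROADCAST` is one of 
+the supported strategies next to `SHUFFLE_HASH`, `SHUFFLE_MERGE` and `NEST_LOOP`:
+
+```sql
+-- force a broadcast join: the (small) dimension table is broadcast to all subtasks
+SELECT /*+ BROADCAST(dim) */
+  f.id,
+  f.amount,
+  dim.name
+FROM fact AS f
+JOIN dim
+  ON f.dim_id = dim.id;
+```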
+
+## Adaptive Hash Join
+
+For batch jobs, the hash join operators may fail if the input data is severely skewed, 
+which is a very bad experience for users. To solve this, we introduce an adaptive Hash-Join 
+that automatically falls back to Sort-Merge-Join once the Hash-Join fails at runtime. 
+This mechanism ensures that the join always succeeds and improves stability, 
+while users remain completely unaware of it.
+
+## Speculative Execution for Batch Job
+
+Speculative execution is introduced in Flink 1.16 to mitigate batch job slowness 
+which is caused by problematic nodes. A problematic node may have hardware problems, 
+accident I/O busy, or high CPU load. These problems may make the hosted tasks run 
+much slower than tasks on other nodes, and affect the overall execution time of a batch job.

Review Comment:
   ```suggestion
   Speculative execution is introduced in Flink 1.16 to mitigate batch job slowness 
   which is caused by unhealthy TaskManagers. An unhealthy TaskManager may have hardware problems, 
   high I/O utilization, or high CPU load. These problems may make the hosted tasks run 
   much slower than tasks on other nodes, and affect the overall execution time of a batch job.
   ```



##########
_posts/2022-10-15-1.16-announcement.md:
##########
@@ -0,0 +1,400 @@
+
+When speculative execution is enabled, Flink keeps detecting slow tasks. 
+Once slow tasks are detected, the nodes on which the slow tasks are located will be identified 
+as problematic nodes and get blocked via the blocklist mechanism (FLIP-224). 
+The scheduler will create new attempts for the slow tasks and deploy them to nodes 
+that are not blocked, while the existing attempts will keep running. 
+The new attempts process the same input data and produce the same results as the original attempt. 
+The first attempt to finish will be admitted as the only finished attempt of the task, 
+and the remaining attempts of the task will be canceled.
+
+Most existing sources can work with speculative execution(FLIP-245). Only if a source uses 
+SourceEvent, it must implement SupportsHandleExecutionAttemptSourceEvent to support 

Review Comment:
   ```suggestion
   Most existing sources can work with speculative execution ([FLIP-245](https://cwiki.apache.org/confluence/display/FLINK/FLIP-245%3A+Source+Supports+Speculative+Execution+For+Batch+Job)). Only if a source uses 
   the `SourceEvent` interface, it must implement the `SupportsHandleExecutionAttemptSourceEvent` interface to support 
   ```



##########
_posts/2022-10-15-1.16-announcement.md:
##########
@@ -0,0 +1,400 @@
+speculative execution. Sinks do not support speculative execution yet, so speculative execution 
+will not happen on sinks at the moment.
+
+The Web UI & REST API are also improved(FLIP-249) to display multiple concurrent attempts of 
+tasks and blocked task managers.
+
+## Hybrid Shuffle Mode
+
+We have introduced a new [Hybrid Shuffle](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/ops/batch/batch_shuffle) 
+Mode for batch executions. It combines the advantages of blocking shuffle and 
+pipelined shuffle (in streaming mode):
+ - Like blocking shuffle, it does not require upstream and downstream tasks to run simultaneously, 
+   which allows executing a job with few resources.
+ - Like pipelined shuffle, it does not require downstream tasks to wait 
+   until upstream tasks finish, which reduces the overall execution time of the job when 
+   given sufficient resources.
+ - It adapts to custom preferences between persisting less data and restarting fewer tasks on failures, 
+   by providing different spilling strategies.
+
+Note: This feature is experimental and not activated by default.
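+
+As a sketch of how one might try it out for a batch SQL job, the shuffle mode can be set as a job option, 
+for example from the SQL client. The snippet below assumes that the hybrid values of 
+`execution.batch-shuffle-mode` introduced in 1.16 are `ALL_EXCHANGES_HYBRID_FULL` and 
+`ALL_EXCHANGES_HYBRID_SELECTIVE`; please verify them against the linked batch shuffle documentation:
+
+```sql
+-- enable hybrid shuffle for batch execution (experimental in 1.16)
+SET 'execution.runtime-mode' = 'batch';
+SET 'execution.batch-shuffle-mode' = 'ALL_EXCHANGES_HYBRID_FULL';
+-- 'ALL_EXCHANGES_HYBRID_SELECTIVE' spills less data but may restart more tasks on failures
+```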
+
+## Further improvements of blocking shuffle
+
+We further improved the blocking shuffle usability and performance in this version, 
+including adaptive network buffer allocation, sequential I/O optimization and result partition reuse, 
+which allows multiple consumer vertices to reuse the same physical result partition to reduce disk I/O 
+and storage space. These optimizations achieve an overall 7% performance gain on 
+10 TB scale TPC-DS tests. In addition, two more compression algorithms (LZO and ZSTD) 
+with higher compression ratios were introduced, which can further reduce the storage space at 
+some CPU cost compared to the default LZ4 compression algorithm.
+
+## Dynamic Partition Pruning
+
+For batch jobs, partitioned tables are more widely used than non-partitioned tables in 
+production environments. Flink already supports static partition pruning, 
+where the optimizer pushes the partition-field filter conditions in the WHERE clause 
+down into the source connector during the optimization phase, thus reducing unnecessary partition scan I/O. 
+The [star-schema](https://en.wikipedia.org/wiki/Star_schema) is the simplest of 
+the most commonly used data mart patterns. We have found that many user jobs cannot use 
+static partition pruning because the partition pruning information can only be determined at execution time, 
+which requires [dynamic partition pruning techniques](https://cwiki.apache.org/confluence/display/FLINK/FLIP-248%3A+Introduce+dynamic+partition+pruning),
+where partition pruning information is collected at runtime based on data from other related tables, 
+thus reducing unnecessary partition scan I/O for partitioned tables. Dynamic partition pruning 
+has been validated with the 10 TB TPC-DS dataset to improve performance by up to 30%.
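+
+Dynamic partition pruning kicks in automatically and needs no new syntax. A typical star-schema query 
+that benefits looks like the following sketch (tables and columns are hypothetical), where the partitions 
+of the fact table that need to be scanned only become known after the filter on the dimension table has 
+been evaluated at runtime:
+
+```sql
+-- fact_sales is partitioned by sold_date; the relevant partitions are only known
+-- after evaluating the filter on the dim_date dimension table at runtime
+SELECT d.cal_year, SUM(f.amount) AS total_amount
+FROM fact_sales AS f
+JOIN dim_date AS d
+  ON f.sold_date = d.date_key
+WHERE d.cal_year = 2022
+GROUP BY d.cal_year;
+```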
+
+# Stream Processing
+
+In 1.16, we have made improvements in checkpoints, SQL, connectors and other areas, 
+so that stream processing can continue to lead.
+
+## Generalized incremental checkpoint
+
+The changelog state backend aims at making checkpoint intervals shorter and more predictable. 
+This release makes it production-ready and is dedicated to adapting the changelog state backend to 
+the existing state backends and improving its usability:
+ - Support state migration
+ - Support local recovery
+ - Introduce a file cache to optimize restoring
+ - Support switching the changelog state backend on and off based on checkpoints
+ - Improve the monitoring experience of the changelog state backend
+   - expose changelog metrics
+   - expose the changelog configuration in the web UI
+
+Table 1: The comparison between Changelog Enabled / Changelog Disabled on value state 
+(see [this blog](https://flink.apache.org/2022/05/30/changelog-state-backend.html) for more details)
+
+| Percentile  | End to End Duration | Checkpointed Data Size<sup>*</sup> | Full Checkpoint Data Size<sup>*</sup> |
+|-------------|---------------------|-----------------------------------|--------------------------------------|
+| 50%         | 311ms / 5s          | 14.8MB / 3.05GB                   | 24.2GB / 18.5GB                      |
+| 90%         | 664ms / 6s          | 23.5MB / 4.52GB                   | 25.2GB / 19.3GB                      |
+| 99%         | 1s / 7s             | 36.6MB / 5.19GB                   | 25.6GB / 19.6GB                      |
+| 99.9%       | 1s / 10s            | 52.8MB / 6.49GB                   | 25.7GB / 19.8GB                      |
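+
+As a minimal sketch of how the feature is switched on for a job (for example from the SQL client; it can 
+equally be set in the cluster configuration), only the enable flag is shown here. A durable changelog 
+storage location also has to be configured; see the state backend documentation for those options:
+
+```sql
+-- enable the changelog state backend (generalized incremental checkpoints)
+SET 'state.backend.changelog.enabled' = 'true';
+```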
+
+## RocksDB rescaling improvement & rescaling benchmark
+
+Rescaling is a frequent operation for cloud services built on Apache Flink. This release leverages 
+[deleteRange](https://rocksdb.org/blog/2018/11/21/delete-range.html) to optimize the rescaling of 
+the incremental RocksDB state backend. deleteRange is used to avoid massive scan-and-delete operations; 
+for upscaling with a large number of states that need to be deleted, 
+the speed of restoring can be [increased by almost 2 to 10 times](https://github.com/apache/flink/pull/19033#issuecomment-1072267658).
+
+<center>
+<img src="{{ site.baseurl }}/img/rocksdb_rescaling_benchmark.png" style="width:90%;margin:15px">
+</center>
+
+## Improve monitoring experience and usability of state backend
+
+This release also improves the monitoring experience and usability of state backends.
+Previously, RocksDB's log was located in its own DB folder, which made debugging RocksDB difficult. 
+This release lets RocksDB's log stay in Flink's log directory by default. RocksDB statistics-based 
+metrics are introduced to help debug performance at the database level, e.g. the total block cache hit/miss 
+count within the DB.
+
+## Support overdraft buffer
+
+A new concept of [overdraft network buffer](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/deployment/memory/network_mem_tuning/#overdraft-buffers) 
+is introduced to mitigate the effects of uninterruptibly blocking a subtask thread during back pressure, 
+and it can be configured through taskmanager.network.memory.max-overdraft-buffers-per-gate. 
+Starting from 1.16.0, a Flink subtask can by default request up to 5 extra (overdraft) buffers over 
+the regularly configured amount. This change can slightly increase the memory consumption of 
+a Flink job but vastly reduce the checkpoint duration of unaligned checkpoints.
+If the subtask is back pressure by downstream subtasks and the subtask requires more than 

Review Comment:
   ```suggestion
   If the subtask is back pressured by downstream subtasks and the subtask requires more than 
   ```



##########
_posts/2022-10-15-1.16-announcement.md:
##########
@@ -0,0 +1,400 @@
+---
+layout: post
+title:  "Announcing the Release of Apache Flink 1.16"
+subtitle: ""
+date: 2022-10-27T08:00:00.000Z
+categories: news
+authors:
+- godfreyhe:
+  name: "Godfrey He"
+  twitter: "godfreyhe"
+
+---
+
+Apache Flink continues to grow at a rapid pace and is one of the most active 
+communities in Apache. Flink 1.16 had over 230 contributors enthusiastically participating, 
+with 19 FLIPs and 900+ issues completed, bringing a lot of exciting features to the community.
+
+Flink has become the leading role and factual standard of stream processing, 
+and the concept of the unification of stream and batch data processing is gradually gaining recognition 
+and is being successfully implemented in more and more companies. Previously, 
+the integrated stream and batch concept placed more emphasis on a unified API and 
+a unified computing framework. This year, based on this, Flink proposed 
+the next development direction of [Flink-Streaming Warehouse](https://www.alibabacloud.com/blog/more-than-computing-a-new-era-led-by-the-warehouse-architecture-of-apache-flink_598821) (Streamhouse), 
+which further upgraded the scope of stream-batch integration: it truly completes not only 
+the unified computation but also unified storage, thus realizing unified real-time analysis.
+
+In 1.16, the Flink community has completed many improvements for both batch and stream processing:
+
+- For batch processing, all-round improvements in ease of use, stability and performance 
+ have been completed. 1.16 is a milestone version of Flink batch processing and an important 
+ step towards maturity.
+  - Ease of use:  with the introduction of SQL Gateway and full compatibility with Hive Server2, 
+  users can submit Flink SQL jobs and Hive SQL jobs very easily, and it is also easy to 
+  connect to the original Hive ecosystem.
+  - Functionality: Introduce Join hints which let Flink SQL users manually specify join strategies
+  to avoid unreasonable execution plans. The compatibility of Hive SQL has reached 94%, 
+  and users can migrate Hive to Flink at a very low cost.
+  - Stability: Propose a speculative execution mechanism to reduce the long tail sub-tasks of
+  a job and improve the stability. Improve HashJoin and introduce failure rollback mechanism
+  to avoid join failure.
+  - Performance: Introduce dynamic partition pruning to reduce the Scan I/O and improve join 
+  processing for the star-schema queries. There is 30% improvement in the TPC-DS benchmark. 
+  We can use hybrid shuffle mode to improve resource usage and processing performance.
+- For stream processing, there are a number of significant improvements:
+  - Changelog State Backend provides users with second or even millisecond checkpoints to 
+  dramatically improve the fault tolerance experience, while providing a smaller end-to-end 
+  latency experience for transactional Sink jobs.
+  - Lookup join is widely used in stream processing. Slow lookup speed, low throughput and 
+  delay update are resolved through common cache mechanism, asynchronous io and retriable lookup. 
+  These features are very useful, solving the pain points that users often complain about, 
+  and supporting richer scenarios.
+  - From the first day of the birth of Flink SQL, there were some non-deterministic operations 
+  that could cause incorrect results or exceptions, which caused great distress to users. 
+  In 1.16, we spent a lot of effort to solve most of the problems, and we will continue to 
+  improve in the future.
+
+With the further refinement of the integration of stream and batch, and the continuous iteration of 
+the Flink Table Store ([0.2 has been released](https://flink.apache.org/news/2022/08/29/release-table-store-0.2.0.html)), 
+the Flink community is pushing the Streaming warehouse from concept to reality and maturity step by step.
+
+# Understanding Streaming Warehouses
+
+To be precise, a streaming warehouse is to make data warehouse streaming, which allows the data 
+for each layer in the whole warehouse to flow in real-time. The goal is to realize 
+a Streaming Service with end-to-end real-time performance through a unified API and computing framework.
+Please refer to [the article](https://www.alibabacloud.com/blog/more-than-computing-a-new-era-led-by-the-warehouse-architecture-of-apache-flink_598821) 
+for more details.
+
+# Batch processing
+
+Flink is a unified stream batch processing engine, stream processing has become the leading role 
+thanks to our long-term investment. We’re also putting more effort to improve batch processing 
+to make it an excellent computing engine. This makes the overall experience of stream batch 
+unification smoother.
+
+## SQL Gateway
+
+The feedback from various channels indicates that [SQL Gateway](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/table/sql-gateway/overview/) 
+is a highly anticipated feature for users, especially for batch users. This function finally 
+completed (see FLIP-91 for design) in 1.16. SQL Gateway is an extension and enhancement to SQL Client, 
+supporting multi-tenancy and pluggable API protocols (Endpoints), solving the problem that SQL Client 
+can only serve a single user and cannot be integrated with external services or components. 
+Currently SQL Gateway has support for REST API and HiveServer2 Protocol and users can link to 
+SQL Gateway via cURL, Postman, HTTP clients in various programming languages to submit stream jobs,
+batch jobs and even OLAP jobs. For HiveServer2 Endpoint, please refer to the Hive Compatibility 
+for more details.
+
+## Hive Compatibility
+
+To reduce the cost of migrating Hive to Flink, we introduce HiveServer2 Endpoint and 
+Hive Syntax Improvements in this version:
+
+[The HiveServer2 Endpoint](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/table/hive-compatibility/hiveserver2/) 
+allows users to interact with SQL Gateway with Hive JDBC/Beeline and migrate the Flink into 
+the Hive ecosystem(DBeaver, Apache Superset, Apache DolphinScheduler, and Apache Zeppelin).
+When users connect to the HiveServer2 endpoint, the SQL Gateway registers the Hive Catalog, 
+switches to Hive Dialect, and uses batch execution mode to execute jobs. With these steps, 
+users can have the same experience as HiveServer2.
+
+[Hive syntax](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/table/hive-compatibility/hive-dialect/overview/) 
+is already the de facto standard for big data processing. Flink has improved compatibility 
+with Hive syntax and added support for several Hive syntax constructs commonly used in production. 
+Hive syntax compatibility helps users migrate existing Hive SQL tasks to Flink, 
+and makes it convenient for users who are familiar with Hive syntax to write SQL 
+queries against tables registered in Flink. The compatibility is measured using 
+the Hive qtest suite which contains more than 12K SQL cases. So far, for Hive 2.3, 
+the compatibility has reached 94.1% for all Hive queries, and 97.3% 
+if ACID queries are excluded.
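+
+As a small illustration, the Hive dialect can also be switched on programmatically, as sketched below. 
+Setting up a Hive catalog and the required Hive connector dependencies is omitted here, and the 
+`orders` table is a hypothetical example.
+
+```java
+import org.apache.flink.table.api.EnvironmentSettings;
+import org.apache.flink.table.api.SqlDialect;
+import org.apache.flink.table.api.TableEnvironment;
+
+public class HiveDialectSketch {
+    public static void main(String[] args) {
+        TableEnvironment tableEnv = TableEnvironment.create(EnvironmentSettings.inBatchMode());
+
+        // Switch the parser to the Hive dialect so that Hive SQL syntax is accepted.
+        tableEnv.getConfig().setSqlDialect(SqlDialect.HIVE);
+
+        // A Hive-style query against a (hypothetical) table registered in a Hive catalog.
+        tableEnv.executeSql("SELECT customer_id, COUNT(*) FROM orders GROUP BY customer_id").print();
+    }
+}
+```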
+
+## Join Hints for Flink SQL
+
+Join hints are a common industry solution to work around shortcomings of the optimizer 
+by influencing the execution plans. Join is the most widely used operator in batch jobs, 
+and Flink supports a variety of join strategies. Missing statistics or a poor cost model of 
+the optimizer can lead to the wrong choice of join strategy, which will cause slow execution 
+or even job failure. By specifying a [Join Hint](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/table/sql/queries/hints/#join-hints), 
+the optimizer will choose the join strategy specified by the user whenever possible. 
+This avoids various shortcomings of the optimizer and ensures the production usability of 
+batch jobs.
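+
+For example, the sketch below asks the optimizer to broadcast one side of a join. The `orders` and 
+`customers` tables are hypothetical and assumed to be registered already; besides `BROADCAST`, hints 
+such as `SHUFFLE_HASH`, `SHUFFLE_MERGE` and `NEST_LOOP` are documented as well.
+
+```java
+import org.apache.flink.table.api.EnvironmentSettings;
+import org.apache.flink.table.api.TableEnvironment;
+
+public class JoinHintSketch {
+    public static void main(String[] args) {
+        TableEnvironment tableEnv = TableEnvironment.create(EnvironmentSettings.inBatchMode());
+
+        // Hint the optimizer to broadcast the small dimension table instead of
+        // relying on (possibly missing) statistics when choosing the join strategy.
+        tableEnv.executeSql(
+                "SELECT /*+ BROADCAST(customers) */ o.order_id, c.name "
+                        + "FROM orders o JOIN customers c ON o.customer_id = c.id");
+    }
+}
+```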
+
+## Adaptive Hash Join
+
+For batch jobs, the hash join operators may fail if the input data is severely skewed, 
+which is a very bad experience for users. To solve this, we introduce an adaptive Hash-Join 
+which can automatically fall back to Sort-Merge-Join once the Hash-Join fails at runtime. 
+This mechanism ensures that the join is always successful and improves stability by 
+gracefully degrading to the more robust Sort-Merge-Join, without users having to be aware of it.
+
+## Speculative Execution for Batch Job
+
+Speculative execution is introduced in Flink 1.16 to mitigate batch job slowness 
+caused by problematic nodes. A problematic node may have hardware problems, 
+be accidentally busy with I/O, or have a high CPU load. These problems may make the hosted tasks run 
+much slower than tasks on other nodes, and affect the overall execution time of a batch job.
+
+When speculative execution is enabled, Flink will keep detecting slow tasks. 
+Once slow tasks are detected, the nodes on which the slow tasks are located will be identified 
+as problematic nodes and get blocked via the blocklist mechanism (FLIP-224). 
+The scheduler will create new attempts for the slow tasks and deploy them to nodes 
+that are not blocked, while the existing attempts will keep running. 
+The new attempts process the same input data and produce the same data as the original attempt. 
+The attempt that finishes first will be admitted as the only finished attempt of the task, 
+and the remaining attempts of the task will be canceled.
+
+Most existing sources can work with speculative execution (FLIP-245). Only if a source uses 
+SourceEvent does it have to implement SupportsHandleExecutionAttemptSourceEvent to support 
+speculative execution. Sinks do not support speculative execution yet, so 
+speculative execution will not happen on sinks at the moment.
+
+The Web UI & REST API are also improved (FLIP-249) to display multiple concurrent attempts of 
+tasks and blocked task managers.
+
+## Hybrid Shuffle Mode
+
+We have introduced a new [Hybrid Shuffle](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/ops/batch/batch_shuffle) 
+Mode for batch executions. It combines the advantages of blocking shuffle and 
+pipelined shuffle (in streaming mode).
+ - Like blocking shuffle, it does not require upstream and downstream tasks to run simultaneously, 
+   which allows executing a job with few resources.
+ - Like pipelined shuffle, it does not require downstream tasks to be executed 
+   after upstream tasks finish, which reduces the overall execution time of the job when 
+   given sufficient resources.
+ - It adapts to custom preferences between persisting less data and restarting fewer tasks on failures, 
+   by providing different spilling strategies.
+
+Note: This feature is experimental and not activated by default; a configuration sketch is shown below.
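+
+The sketch below switches a batch job to hybrid shuffle. The option value `ALL_EXCHANGES_HYBRID_FULL` 
+follows the batch shuffle documentation linked above (a selective variant that spills less data also 
+exists), so please double-check it against the documentation of your version.
+
+```java
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
+
+public class HybridShuffleSketch {
+    public static void main(String[] args) {
+        Configuration conf = new Configuration();
+        // Run the job in batch mode and use hybrid shuffle instead of the default blocking shuffle.
+        conf.setString("execution.runtime-mode", "BATCH");
+        conf.setString("execution.batch-shuffle-mode", "ALL_EXCHANGES_HYBRID_FULL");
+
+        StreamExecutionEnvironment env =
+                StreamExecutionEnvironment.getExecutionEnvironment(conf);
+        // ... build and execute the batch job on `env` as usual ...
+    }
+}
+```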
+
+## Further improvements of blocking shuffle
+
+We further improved the usability and performance of blocking shuffle in this version, 
+including adaptive network buffer allocation, sequential IO optimization and result partition reuse, 
+which allows multiple consumer vertices to reuse the same physical result partition to reduce disk IO 
+and storage space. These optimizations can achieve an overall 7% performance gain for 
+TPC-DS tests at 10 TB scale. In addition, two more compression algorithms (LZO and ZSTD) 
+with higher compression ratios were introduced which can further reduce the storage space, at 
+some CPU cost compared to the default LZ4 compression algorithm.
+
+## Dynamic Partition Pruning
+
+For batch jobs, partitioned tables are more widely used than non-partitioned tables in 
+production environments. Currently Flink has support for static partition pruning, 
+where the optimizer pushes down the partition field related filter conditions in the WHERE clause 
+into the Source Connector during the optimization phase, thus reducing unnecessary partition scan IO. 
+The [star-schema](https://en.wikipedia.org/wiki/Star_schema) is the simplest of 
+the most commonly used data mart patterns. We have found that many user jobs cannot use 
+static partition pruning because partition pruning information can only be determined at execution time, 
+which requires [dynamic partition pruning techniques](https://cwiki.apache.org/confluence/display/FLINK/FLIP-248%3A+Introduce+dynamic+partition+pruning),
+where partition pruning information is collected at runtime based on data from other related tables, 
+thus reducing unnecessary partition scan IO for partitioned tables. The use of dynamic partition pruning 
+has been validated on the 10 TB TPC-DS dataset to improve performance by up to 30%.
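+
+As an illustration of the kind of query that benefits, consider the sketch below: the partitions of 
+the (hypothetical) fact table `sales`, partitioned by `sold_date`, that actually need to be read only 
+become known once the filtered rows of the dimension table `dates` are available at runtime.
+
+```java
+import org.apache.flink.table.api.EnvironmentSettings;
+import org.apache.flink.table.api.TableEnvironment;
+
+public class DynamicPartitionPruningSketch {
+    public static void main(String[] args) {
+        TableEnvironment tableEnv = TableEnvironment.create(EnvironmentSettings.inBatchMode());
+
+        // Star-schema style join: the partition filter on "sales" is only known after
+        // the dimension table "dates" has been filtered, so it is pruned dynamically at runtime.
+        tableEnv.executeSql(
+                "SELECT s.item_id, SUM(s.price) "
+                        + "FROM sales s JOIN dates d ON s.sold_date = d.date_id "
+                        + "WHERE d.year = 2022 "
+                        + "GROUP BY s.item_id");
+    }
+}
+```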
+
+# Stream Processing
+
+In 1.16, we have made improvements to checkpoints, SQL, connectors and other areas, 
+so that stream processing can continue to lead.
+
+## Generalized incremental checkpoint
+
+The changelog state backend aims at making checkpoint intervals shorter and more predictable. 
+This release makes it production-ready and is dedicated to adapting the changelog state backend to 
+the existing state backends and improving its usability (a minimal enabling sketch follows the table below):
+ - Support state migration
+ - Support local recovery
+ - Introduce a file cache to optimize restoring
+ - Support switch based on checkpoint
+ - Improve the monitoring experience of the changelog state backend
+   - expose changelog metrics
+   - expose the changelog configuration to the web UI
+
+Table 1: The comparison between Changelog Enabled / Changelog Disabled on value state 
+(see [this blog](https://flink.apache.org/2022/05/30/changelog-state-backend.html) for more details)
+
+| Percentile  | End to End Duration | Checkpointed Data Size<sup>*</sup> | Full Checkpoint Data Size<sup>*</sup> |
+|-------------|---------------------|-----------------------------------|--------------------------------------|
+| 50%         | 311ms / 5s          | 14.8MB / 3.05GB                   | 24.2GB / 18.5GB                      |
+| 90%         | 664ms / 6s          | 23.5MB / 4.52GB                   | 25.2GB / 19.3GB                      |
+| 99%         | 1s / 7s             | 36.6MB / 5.19GB                   | 25.6GB / 19.6GB                      |
+| 99.9%       | 1s / 10s            | 52.8MB / 6.49GB                   | 25.7GB / 19.8GB                      |
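+
+A minimal sketch of enabling the changelog state backend programmatically is shown below; configuring 
+the durable storage for the changelog (e.g. a DFS path) is omitted here and covered in the documentation.
+
+```java
+import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
+
+public class ChangelogStateBackendSketch {
+    public static void main(String[] args) {
+        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
+        env.enableCheckpointing(1_000);
+
+        // Wrap the configured state backend (e.g. RocksDB) with the changelog state backend,
+        // so that state changes are continuously uploaded and checkpoints stay small and fast.
+        env.enableChangelogStateBackend(true);
+
+        // ... define the streaming job on `env` ...
+    }
+}
+```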
+
+## RocksDB rescaling improvement & rescaling benchmark
+
+Rescaling is a frequent operation for cloud services built on Apache Flink. This release leverages 
+[deleteRange](https://rocksdb.org/blog/2018/11/21/delete-range.html) to optimize the rescaling of 
+the incremental RocksDB state backend. deleteRange is used to avoid massive scan-and-delete operations; 
+for upscaling with a large number of states that need to be deleted, 
+the speed of restoring can be almost [increased by 2~10 times](https://github.com/apache/flink/pull/19033#issuecomment-1072267658).
+
+<center>
+<img src="{{ site.baseurl }}/img/rocksdb_rescaling_benchmark.png" style="width:90%;margin:15px">
+</center>
+
+## Improve monitoring experience and usability of state backend
+
+This release also improves the monitoring experience and usability of the state backends.
+Previously, RocksDB's log was located in its own DB folder, which made debugging RocksDB difficult. 
+This release keeps RocksDB's log in Flink's log directory by default. RocksDB statistics-based 
+metrics are introduced to help debug performance at the database level, e.g. the total block cache hit/miss
+count within the DB.
+
+## Support overdraft buffer
+
+A new concept of [overdraft network buffer](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/deployment/memory/network_mem_tuning/#overdraft-buffers) 
+is introduced to mitigate the effects of uninterruptibly blocking a subtask thread during back pressure, 
+which can be configured through taskmanager.network.memory.max-overdraft-buffers-per-gate. 
+Starting from 1.16.0, a Flink subtask can by default request up to 5 extra (overdraft) buffers over 
+the regularly configured amount. This change can slightly increase the memory consumption of 
+the Flink job but vastly reduce the checkpoint duration of unaligned checkpoints.
+If the subtask is back pressured by downstream subtasks and requires more than 
+a single network buffer to finish what it's currently doing, the overdraft buffers come into play.
+Read more about this in [the documentation](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/deployment/memory/network_mem_tuning/#overdraft-buffers).
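+
+As a small sketch, the limit can be adjusted through the option named above; the value 10 used here 
+is purely illustrative.
+
+```java
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
+
+public class OverdraftBufferSketch {
+    public static void main(String[] args) {
+        Configuration conf = new Configuration();
+        // Allow each gate to temporarily borrow up to 10 extra network buffers (default: 5)
+        // when a subtask cannot finish its current work with a single buffer under back pressure.
+        conf.setString("taskmanager.network.memory.max-overdraft-buffers-per-gate", "10");
+
+        StreamExecutionEnvironment env =
+                StreamExecutionEnvironment.getExecutionEnvironment(conf);
+        // ... define the streaming job on `env` ...
+    }
+}
+```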
+
+## Timeout aligned to unaligned checkpoint barrier in the output buffers of an upstream subtask
+
+This release updates the timing of switching from Aligned Checkpoint (AC) to Unaligned Checkpoint (UC).
+With UC enabled, if [execution.checkpointing.aligned-checkpoint-timeout](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/ops/state/checkpointing_under_backpressure/#aligned-checkpoint-timeout) 
+is configured, each checkpoint will still begin as an AC, but when the global checkpoint duration 
+exceeds the aligned-checkpoint-timeout and the AC has not yet been completed, 
+the checkpoint will be switched to an unaligned one.
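+
+A minimal configuration sketch for this behavior is shown below; the interval and timeout values are 
+purely illustrative.
+
+```java
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
+
+public class AlignedCheckpointTimeoutSketch {
+    public static void main(String[] args) {
+        Configuration conf = new Configuration();
+        conf.setString("execution.checkpointing.interval", "1min");
+        // Enable unaligned checkpoints ...
+        conf.setString("execution.checkpointing.unaligned", "true");
+        // ... but let each checkpoint start aligned and only switch to unaligned
+        // if the alignment has not finished after 30 seconds.
+        conf.setString("execution.checkpointing.aligned-checkpoint-timeout", "30s");
+
+        StreamExecutionEnvironment env =
+                StreamExecutionEnvironment.getExecutionEnvironment(conf);
+        // ... define the streaming job on `env` ...
+    }
+}
+```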
+
+Previously, the switch of one subtask needs to wait for all barriers from upstream. 
+If the back pressure is served, the downstream subtask may not receive all barriers 

Review Comment:
   not sure about this one
   ```suggestion
   If the back pressure is severe, the downstream subtask may not receive all barriers 
   ```



##########
_posts/2022-10-15-1.16-announcement.md:
##########
@@ -0,0 +1,400 @@
+Previously, the switch of one subtask needs to wait for all barriers from upstream. 
+If the back pressure is served, the downstream subtask may not receive all barriers 
+within [checkpointing-timeout](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/deployment/config/#execution-checkpointing-timeout), 
+causing the checkpoint to fail.
+
+In this release, if the barrier cannot be sent from the output buffer to the downstream task 
+within the execution.checkpointing.aligned-checkpoint-timeout, Flink lets upstream subtasks 
+switch to UC first to send barriers to downstream, thereby decreasing 
+the probability of checkpoint timeout during back pressure.
+More details can be found in [this documentation](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/ops/state/checkpointing_under_backpressure/#aligned-checkpoint-timeout).
+
+## Non-Determinism In Stream Processing
+
+Users often complain about the high cost of understanding stream processing. 
+One of the pain point is the non-determinisim in stream processing (usually not intuitive) 
+which may cause wrong results or errors, it is long lived since the first day 
+that flink sql was available.

Review Comment:
   ```suggestion
   Users often complain about the high cost of understanding stream processing. 
   One of the pain points is the non-determinisim in stream processing (usually not intuitive) 
   which may cause wrong results or errors. These pain points have been around since the
   early days of Flink SQL.
   ```



##########
_posts/2022-10-15-1.16-announcement.md:
##########
@@ -0,0 +1,400 @@
+## Further improvements of blocking shuffle
+
+We further improved the blocking shuffle usability and performance in this version, 
+including adaptive network buffer allocation, sequential IO optimization and result partition reuse 
+which allows multiple consumer vertices reusing the same physical result partition to reduce disk IO 
+and storage space. These optimizations can achieve an overall 7% performance gain for 
+10 T scale TPC-DS tests. In addition, two more compression algorithms (LZO and ZSTD) 

Review Comment:
   Is the TPC-DS test with a scale of 10 TB?



##########
_posts/2022-10-15-1.16-announcement.md:
##########
@@ -0,0 +1,400 @@
+## Adaptive Hash Join
+
+For batch jobs, the hash join operators may fail if the input data is severely skewed, 
+it's a very bad experience for users. To solve this, we introduce adaptive Hash-Join 
+which could automatically fall back to Sort-Merge-Join once the Hash-Join fails in runtime. 
+This mechanism ensures the Hash-Join could be always successful and improve the stability 
+while the users are completely unaware of it.

Review Comment:
   ```suggestion
   which can automatically fall back to Sort-Merge-Join once the Hash-Join fails in runtime. 
   This mechanism ensures that the Hash-Join is always successful and improves the stability 
   by gracefully degrading from a Hash-Join to a more robust Sort-Merge-Join.
   ```



##########
_posts/2022-10-15-1.16-announcement.md:
##########
@@ -0,0 +1,400 @@
+  - Functionality: Introduce Join hints which let Flink SQL users manually specify join strategies
+  to avoid unreasonable execution plans. The compatibility of Hive SQL has reached 94%, 
+  and users can migrate Hive to Flink at a very low cost.

Review Comment:
   ```suggestion
     - Functionality: Join hints let Flink SQL users manually specify join strategies
     to avoid unreasonable execution plans. The compatibility of Hive SQL has reached 94%, 
     and users can migrate from Hive to Flink at a very low cost.
   ```



##########
_posts/2022-10-15-1.16-announcement.md:
##########
@@ -0,0 +1,400 @@
+---
+layout: post
+title:  "Announcing the Release of Apache Flink 1.16"
+subtitle: ""
+date: 2022-10-27T08:00:00.000Z
+categories: news
+authors:
+- godfreyhe:
+  name: "Godfrey He"
+  twitter: "godfreyhe"
+
+---
+
+Apache Flink continues to grow at a rapid pace and is one of the most active 
+communities in Apache. Flink 1.16 had over 230 contributors enthusiastically participating, 
+with 19 FLIPs and 900+ issues completed, bringing a lot of exciting features to the community.
+
+Flink has become the leading role and factual standard of stream processing, 
+and the concept of the unification of stream and batch data processing is gradually gaining recognition 
+and is being successfully implemented in more and more companies. Previously, 
+the integrated stream and batch concept placed more emphasis on a unified API and 
+a unified computing framework. This year, based on this, Flink proposed 
+the next development direction of [Flink-Streaming Warehouse](https://www.alibabacloud.com/blog/more-than-computing-a-new-era-led-by-the-warehouse-architecture-of-apache-flink_598821) (Streamhouse), 
+which further upgraded the scope of stream-batch integration: it truly completes not only 
+the unified computation but also unified storage, thus realizing unified real-time analysis.
+
+In 1.16, the Flink community has completed many improvements for both batch and stream processing:
+
+- For batch processing, all-round improvements in ease of use, stability and performance 
+ have been completed. 1.16 is a milestone version of Flink batch processing and an important 
+ step towards maturity.
+  - Ease of use:  with the introduction of SQL Gateway and full compatibility with Hive Server2, 
+  users can submit Flink SQL jobs and Hive SQL jobs very easily, and it is also easy to 
+  connect to the original Hive ecosystem.
+  - Functionality: Introduce Join hints which let Flink SQL users manually specify join strategies
+  to avoid unreasonable execution plans. The compatibility of Hive SQL has reached 94%, 
+  and users can migrate Hive to Flink at a very low cost.
+  - Stability: Propose a speculative execution mechanism to reduce the long tail sub-tasks of
+  a job and improve the stability. Improve HashJoin and introduce failure rollback mechanism
+  to avoid join failure.
+  - Performance: Introduce dynamic partition pruning to reduce the Scan I/O and improve join 
+  processing for the star-schema queries. There is 30% improvement in the TPC-DS benchmark. 
+  We can use hybrid shuffle mode to improve resource usage and processing performance.
+- For stream processing, there are a number of significant improvements:
+  - Changelog State Backend provides users with second or even millisecond checkpoints to 
+  dramatically improve the fault tolerance experience, while providing a smaller end-to-end 
+  latency experience for transactional Sink jobs.
+  - Lookup join is widely used in stream processing. Slow lookup speed, low throughput and 
+  delay update are resolved through common cache mechanism, asynchronous io and retriable lookup. 
+  These features are very useful, solving the pain points that users often complain about, 
+  and supporting richer scenarios.
+  - From the first day of the birth of Flink SQL, there were some non-deterministic operations 
+  that could cause incorrect results or exceptions, which caused great distress to users. 
+  In 1.16, we spent a lot of effort to solve most of the problems, and we will continue to 
+  improve in the future.
+
+With the further refinement of the integration of stream and batch, and the continuous iteration of 
+the Flink Table Store ([0.2 has been released](https://flink.apache.org/news/2022/08/29/release-table-store-0.2.0.html)), 
+the Flink community is pushing the Streaming warehouse from concept to reality and maturity step by step.
+
+# Understanding Streaming Warehouses
+
+To be precise, a streaming warehouse is to make data warehouse streaming, which allows the data 
+for each layer in the whole warehouse to flow in real-time. The goal is to realize 
+a Streaming Service with end-to-end real-time performance through a unified API and computing framework.
+Please refer to [the article](https://www.alibabacloud.com/blog/more-than-computing-a-new-era-led-by-the-warehouse-architecture-of-apache-flink_598821) 
+for more details.
+
+# Batch processing
+
+Flink is a unified stream batch processing engine, stream processing has become the leading role 
+thanks to our long-term investment. We’re also putting more effort to improve batch processing 
+to make it an excellent computing engine. This makes the overall experience of stream batch 
+unification smoother.
+
+## SQL Gateway
+
+Feedback from various channels indicates that [SQL Gateway](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/table/sql-gateway/overview/) 
+is a highly anticipated feature, especially for batch users. This feature was finally 
+completed in 1.16 (see FLIP-91 for the design). SQL Gateway is an extension and enhancement of SQL Client, 
+supporting multi-tenancy and pluggable API protocols (Endpoints), and it solves the problem that SQL Client 
+can only serve a single user and cannot be integrated with external services or components. 
+Currently SQL Gateway supports the REST API and the HiveServer2 protocol, and users can connect to 
+SQL Gateway via cURL, Postman, or HTTP clients in various programming languages to submit stream jobs,
+batch jobs and even OLAP jobs. For the HiveServer2 Endpoint, please refer to the Hive Compatibility 
+section for more details.
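
As a rough illustration of how a client can talk to SQL Gateway over its REST API, the sketch below opens a session from Java. The gateway address, port, and endpoint paths are assumptions here and should be checked against the SQL Gateway documentation linked above.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class SqlGatewayRestSketch {
    public static void main(String[] args) throws Exception {
        // Assumed gateway address and REST path; verify against the SQL Gateway docs.
        String gateway = "http://localhost:8083";
        HttpClient client = HttpClient.newHttpClient();

        // Open a session; the response body is expected to contain a session handle.
        HttpRequest openSession = HttpRequest.newBuilder()
                .uri(URI.create(gateway + "/v1/sessions"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString("{}"))
                .build();
        HttpResponse<String> response =
                client.send(openSession, HttpResponse.BodyHandlers.ofString());
        System.out.println("Open session response: " + response.body());

        // With the returned session handle, a statement such as "SELECT 1" could then be
        // submitted to the session's statements endpoint and its results fetched from the
        // corresponding operation's result endpoint.
    }
}
```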
+
+## Hive Compatibility
+
+To reduce the cost of migrating Hive to Flink, we introduce the HiveServer2 Endpoint and 
+Hive syntax improvements in this version:
+
+[The HiveServer2 Endpoint](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/table/hive-compatibility/hiveserver2/) 
+allows users to interact with SQL Gateway through Hive JDBC/Beeline and brings Flink into 
+the Hive ecosystem (DBeaver, Apache Superset, Apache DolphinScheduler, and Apache Zeppelin).
+When users connect to the HiveServer2 endpoint, the SQL Gateway registers the Hive Catalog, 
+switches to the Hive dialect, and uses batch execution mode to execute jobs. With these steps, 
+users get the same experience as with HiveServer2.
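
For example, a plain Hive JDBC client (Beeline works the same way) can talk to the HiveServer2 endpoint directly. A minimal sketch is shown below; the host and port are placeholders, and the Hive JDBC driver is assumed to be on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveServer2EndpointSketch {
    public static void main(String[] args) throws Exception {
        // Placeholder host/port; point this at the SQL Gateway's HiveServer2 endpoint.
        String url = "jdbc:hive2://localhost:10000/default";
        try (Connection conn = DriverManager.getConnection(url);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SHOW TABLES")) {
            while (rs.next()) {
                System.out.println(rs.getString(1));
            }
        }
    }
}
```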
+
+[Hive syntax](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/table/hive-compatibility/hive-dialect/overview/) 
+is already the de facto standard for big data processing. Flink has improved compatibility 
+with Hive syntax and added support for several Hive syntax constructs commonly used in production. 
+Hive syntax compatibility helps users migrate existing Hive SQL tasks to Flink, 
+and makes it convenient for users who are familiar with Hive syntax to write SQL 
+queries against tables registered in Flink. The compatibility is measured using 
+the Hive qtest suite, which contains more than 12K SQL cases. As of now, for Hive 2.3, 
+compatibility has reached 94.1% across all Hive queries, and 97.3% 
+if ACID queries are excluded.
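
A minimal sketch of switching to the Hive dialect from the Table API is shown below. It assumes the Hive dependencies and a HiveCatalog have been set up as described in the Hive integration documentation, and the table name is hypothetical.

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.SqlDialect;
import org.apache.flink.table.api.TableEnvironment;

public class HiveDialectSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(EnvironmentSettings.inBatchMode());

        // Switch the parser to the Hive dialect so existing Hive SQL can be reused.
        // A HiveCatalog and the Hive dependencies are assumed to be configured already.
        tEnv.getConfig().setSqlDialect(SqlDialect.HIVE);

        // 'my_hive_table' is a hypothetical table registered in the Hive catalog.
        tEnv.executeSql("SELECT * FROM my_hive_table LIMIT 10").print();
    }
}
```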
+
+## Join Hints for Flink SQL
+
+Join hints are a common industry solution for working around shortcomings of the optimizer 
+by influencing the execution plan. Join is the most widely used operator in batch jobs, 
+and Flink supports a variety of join strategies. Missing statistics or a poor cost model in 
+the optimizer can lead to the wrong choice of a join strategy, which causes slow execution 
+or even job failure. By specifying a [Join Hint](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/table/sql/queries/hints/#join-hints), 
+the optimizer will choose the join strategy specified by the user whenever possible. 
+This avoids various shortcomings of the optimizer and ensures the production availability of 
+the batch job.
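
For example, a query can ask the optimizer to broadcast a small dimension table instead of relying on statistics; in the sketch below the table names are hypothetical and assumed to be already registered.

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class JoinHintSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(EnvironmentSettings.inBatchMode());

        // 'orders' and 'customers' are hypothetical registered tables. The BROADCAST
        // hint asks the optimizer to broadcast the small 'customers' table rather than
        // relying on statistics to pick the join strategy.
        tEnv.executeSql(
                "SELECT /*+ BROADCAST(customers) */ o.id, c.name "
                        + "FROM orders o JOIN customers c ON o.customer_id = c.id");
    }
}
```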
+
+## Adaptive Hash Join
+
+For batch jobs, the hash join operators may fail if the input data is severely skewed, 
+which is a very bad experience for users. To solve this, we introduce an adaptive Hash-Join 
+that automatically falls back to Sort-Merge-Join once the Hash-Join fails at runtime. 
+This mechanism ensures that the join always succeeds and improves stability, 
+without users having to be aware of it.
+
+## Speculative Execution for Batch Job
+
+Speculative execution is introduced in Flink 1.16 to mitigate batch job slowness 
+caused by problematic nodes. A problematic node may have hardware problems, 
+unexpected I/O pressure, or high CPU load. These problems may make the hosted tasks run 
+much slower than tasks on other nodes and affect the overall execution time of a batch job.
+
+When speculative execution is enabled, Flink will keep detecting slow tasks. 
+Once slow tasks are detected, the nodes that the slow tasks are located on will be identified 
+as problematic nodes and get blocked via the blocklist mechanism (FLIP-224). 
+The scheduler will create new attempts for the slow tasks and deploy them to nodes 
+that are not blocked, while the existing attempts will keep running. 
+The new attempts process the same input data and produce the same results as the original attempt. 
+Whichever attempt finishes first will be admitted as the only finished attempt of the task, 
+and the remaining attempts of the task will be canceled.
+
+Most existing sources can work with speculative execution (FLIP-245). Only if a source uses 
+SourceEvent must it implement SupportsHandleExecutionAttemptSourceEvent to support 
+speculative execution. Sinks do not support speculative execution yet, so 
+speculative execution will not happen on sinks at the moment.
+
+The Web UI & REST API are also improved(FLIP-249) to display multiple concurrent attempts of 

Review Comment:
   ```suggestion
   The Web UI & REST API are also improved ([FLIP-249](https://cwiki.apache.org/confluence/display/FLINK/FLIP-249%3A+Flink+Web+UI+Enhancement+for+Speculative+Execution)) to display multiple concurrent attempts of 
   ```



##########
_posts/2022-10-15-1.16-announcement.md:
##########
@@ -0,0 +1,400 @@
+When speculative execution is enabled, Flink will keep detecting slow tasks. 
+Once slow tasks are detected, the nodes that the slow tasks locate in will be identified 
+as problematic nodes and get blocked via the blocklist mechanism(FLIP-224). 

Review Comment:
   ```suggestion
   When speculative execution is enabled, Flink will keep detecting slow tasks. 
   Once slow tasks are detected, the nodes that the slow tasks locate in will be identified 
   as problematic nodes and get blocked via the blocklist mechanism ([FLIP-224](https://cwiki.apache.org/confluence/display/FLINK/FLIP-224%3A+Blocklist+Mechanism)). 
   ```



##########
_posts/2022-10-15-1.16-announcement.md:
##########
@@ -0,0 +1,400 @@
+The Web UI & REST API are also improved (FLIP-249) to display multiple concurrent attempts of
+tasks and blocked task managers.
+
+## Hybrid Shuffle Mode
+
+We have introduced a new [Hybrid Shuffle](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/ops/batch/batch_shuffle) 
+Mode for batch executions. It combines the advantages of blocking shuffle and 
+pipelined shuffle (in streaming mode):
+ - Like blocking shuffle, it does not require upstream and downstream tasks to run simultaneously, 
+   which allows executing a job with few resources.
+ - Like pipelined shuffle, it does not require downstream tasks to wait until 
+   upstream tasks have finished, which reduces the overall execution time of the job when 
+   given sufficient resources.
+ - It adapts to custom preferences between persisting less data and restarting fewer tasks on failure, 
+   by providing different spilling strategies.
+
+Note: This feature is experimental and not activated by default.
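
A minimal sketch of enabling hybrid shuffle for a batch job is shown below; the option value names (`ALL_EXCHANGES_HYBRID_FULL` / `ALL_EXCHANGES_HYBRID_SELECTIVE`) are assumptions based on the batch shuffle documentation linked above and should be double-checked for your version.

```java
import org.apache.flink.api.common.RuntimeExecutionMode;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class HybridShuffleSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Assumed 1.16 value names; verify against the batch shuffle documentation.
        conf.setString("execution.batch-shuffle-mode", "ALL_EXCHANGES_HYBRID_FULL");

        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment(conf);
        env.setRuntimeMode(RuntimeExecutionMode.BATCH);
        // ... define the batch job as usual and call env.execute() ...
    }
}
```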
+
+## Further improvements of blocking shuffle
+
+We further improved the usability and performance of blocking shuffle in this version, 
+including adaptive network buffer allocation, sequential IO optimization, and result partition reuse, 
+which allows multiple consumer vertices to reuse the same physical result partition to reduce disk IO 
+and storage space. These optimizations can achieve an overall 7% performance gain for 
+10 T scale TPC-DS tests. In addition, two more compression algorithms (LZO and ZSTD) 

Review Comment:
   What does 
   > 10 T scale TPC-DS tests
   
   mean?



##########
_posts/2022-10-15-1.16-announcement.md:
##########
@@ -0,0 +1,400 @@
+## Join Hints for Flink SQL
+
+The join hint is a common solution in the industry to improve the shortcomings of the optimizer 
+by affecting the execution plans. Join is the most widely used operator in batch jobs, 
+and Flink supports a variety of join strategies. Missing statistics or a poor cost model of 
+the optimizer can lead to the wrong choice of join strategy, which will cause slow execution 

Review Comment:
   ```suggestion
   the optimizer can lead to the wrong choice of a join strategy, which will cause slow execution 
   ```



##########
_posts/2022-10-15-1.16-announcement.md:
##########
@@ -0,0 +1,400 @@
+## Further improvements of blocking shuffle
+
+We further improved the usability and performance of blocking shuffle in this version, 
+including adaptive network buffer allocation, sequential IO optimization, and result partition reuse, 
+which allows multiple consumer vertices to reuse the same physical result partition to reduce disk IO 
+and storage space. These optimizations can achieve an overall 7% performance gain on 
+TPC-DS tests at the 10 TB scale. In addition, two more compression algorithms (LZO and ZSTD) 
+with higher compression ratios were introduced, which can further reduce the storage space, 
+at the cost of some extra CPU, compared to the default LZ4 compression algorithm.
+
+## Dynamic Partition Pruning
+
+For batch jobs, partitioned tables are more widely used than non-partitioned tables in 
+production environments. Flink already supports static partition pruning, 
+where the optimizer pushes the partition-field filter conditions in the WHERE clause 
+down into the source connector during the optimization phase, thus reducing unnecessary partition scan IO. 
+The [star schema](https://en.wikipedia.org/wiki/Star_schema) is the simplest and 
+most commonly used data mart pattern. We have found that many user jobs cannot use 
+static partition pruning because the partition pruning information is only known at execution time. 
+This requires [dynamic partition pruning](https://cwiki.apache.org/confluence/display/FLINK/FLIP-248%3A+Introduce+dynamic+partition+pruning),
+where the partition pruning information is collected at runtime based on data from other related tables, 
+thus reducing unnecessary partition scan IO for partitioned tables. Dynamic partition pruning 
+has been validated on the 10 TB TPC-DS dataset and improves performance by up to 30%.
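
The sketch below shows the kind of star-schema query that benefits: which partitions of the (hypothetical) fact table have to be read only becomes known after the dimension table has been filtered at runtime.

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class DynamicPartitionPruningSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(EnvironmentSettings.inBatchMode());

        // 'sales' is a hypothetical fact table partitioned by 'sale_date', and 'dim_date'
        // a small dimension table. The 'sales' partitions to scan are only determined once
        // 'dim_date' has been filtered at runtime.
        tEnv.executeSql(
                "SELECT s.item_id, SUM(s.price) AS revenue "
                        + "FROM sales s JOIN dim_date d ON s.sale_date = d.date_key "
                        + "WHERE d.d_year = 2022 AND d.d_quarter = 4 "
                        + "GROUP BY s.item_id");
    }
}
```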
+
+# Stream Processing
+
+In 1.16, we have made improvements in checkpoints, SQL, connectors and other areas, 
+so that stream processing can continue to lead.
+
+## Generalized incremental checkpoint
+
+The changelog state backend aims to make checkpoint intervals shorter and more predictable. 
+In this release it becomes production-ready; the work focused on adapting the changelog state backend to 
+the existing state backends and on improving its usability:
+ - Support state migration
+ - Support local recovery
+ - Introduce file cache to optimize restoring
+ - Support switch based on checkpoint
+ - Improve the monitoring experience of changelog state backend
+   - expose changelog's metrics
+   - expose changelog's configuration to webUI
+
+Table 1: The comparison between Changelog Enabled / Changelog Disabled on value state 
+(see [this blog](https://flink.apache.org/2022/05/30/changelog-state-backend.html) for more details)
+
+| Percentile  | End to End Duration | Checkpointed Data Size<sup>*</sup> | Full Checkpoint Data Size<sup>*</sup> |
+|-------------|---------------------|-----------------------------------|--------------------------------------|
+| 50%         | 311ms / 5s          | 14.8MB / 3.05GB                   | 24.2GB / 18.5GB                      |
+| 90%         | 664ms / 6s          | 23.5MB / 4.52GB                   | 25.2GB / 19.3GB                      |
+| 99%         | 1s / 7s             | 36.6MB / 5.19GB                   | 25.6GB / 19.6GB                      |
+| 99.9%       | 1s / 10s            | 52.8MB / 6.49GB                   | 25.7GB / 19.8GB                      |
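
A minimal sketch of turning on the changelog state backend for a streaming job is shown below; the configuration keys and the storage path are assumptions based on the changelog state backend documentation and blog post linked above, and should be verified for your setup.

```java
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class ChangelogStateBackendSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Assumed option keys for where the changelog itself is stored; double-check
        // them against the changelog state backend documentation.
        conf.setString("state.backend.changelog.storage", "filesystem");
        conf.setString("dstl.dfs.base-path", "s3://my-bucket/changelog"); // hypothetical path

        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment(conf);
        env.enableChangelogStateBackend(true);
        env.enableCheckpointing(1000); // short, predictable checkpoints are the goal
        // ... define the streaming job as usual ...
    }
}
```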
+
+## RocksDB rescaling improvement & rescaling benchmark
+
+Rescaling is a frequent operation for cloud services built on Apache Flink. This release leverages 
+[deleteRange](https://rocksdb.org/blog/2018/11/21/delete-range.html) to optimize rescaling of 
+the incremental RocksDB state backend. deleteRange is used to avoid massive scan-and-delete operations; 
+for upscaling with a large number of states that need to be deleted, 
+the restore speed can be [increased by almost 2~10 times](https://github.com/apache/flink/pull/19033#issuecomment-1072267658).
+
+<center>
+<img src="{{ site.baseurl }}/img/rocksdb_rescaling_benchmark.png" style="width:90%;margin:15px">
+</center>
+
+## Improve monitoring experience and usability of state backend
+
+This release also improves the monitoring experience and usability of state backends.
+Previously, RocksDB's log was located in its own DB folder, which made debugging RocksDB harder than necessary. 
+This release keeps RocksDB's log in Flink's log directory by default. RocksDB statistics-based 
+metrics are introduced to help debug performance at the database level, e.g. the total block cache hit/miss
+count within the DB.
+
+## Support overdraft buffer
+
+A new concept of [overdraft network buffer](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/deployment/memory/network_mem_tuning/#overdraft-buffers) 
+is introduced to mitigate the effects of uninterruptibly blocking a subtask thread during back pressure, 
+which can be turned on through taskmanager.network.memory.max-overdraft-buffers-per-gate. 

Review Comment:
   ```suggestion
   which can be turned on through the [taskmanager.network.memory.max-overdraft-buffers-per-gate](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/deployment/config/#taskmanager-network-memory-max-overdraft-buffers-per-gate) configuration parameter. 
   ```
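
For reference, a minimal sketch of setting the overdraft buffer option discussed above is shown below; the option is normally set in the TaskManager configuration (flink-conf.yaml), and the value is just illustrative.

```java
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class OverdraftBufferSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Option name from the post above; 5 extra buffers per gate is just an example value.
        conf.setInteger("taskmanager.network.memory.max-overdraft-buffers-per-gate", 5);

        // This takes effect where the configuration is picked up by the TaskManagers,
        // e.g. in a local environment; on a cluster it is normally set in flink-conf.yaml.
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment(conf);
        // ... define the streaming job as usual ...
    }
}
```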



##########
_posts/2022-10-15-1.16-announcement.md:
##########
@@ -0,0 +1,400 @@
+---
+layout: post
+title:  "Announcing the Release of Apache Flink 1.16"
+subtitle: ""
+date: 2022-10-27T08:00:00.000Z
+categories: news
+authors:
+- godfreyhe:
+  name: "Godfrey He"
+  twitter: "godfreyhe"
+
+---
+
+Apache Flink continues to grow at a rapid pace and is one of the most active 
+communities in Apache. Flink 1.16 had over 230 contributors enthusiastically participating, 
+with 19 FLIPs and 900+ issues completed, bringing a lot of exciting features to the community.
+
+Flink has become the leading role and factual standard of stream processing, 
+and the concept of the unification of stream and batch data processing is gradually gaining recognition 
+and is being successfully implemented in more and more companies. Previously, 
+the integrated stream and batch concept placed more emphasis on a unified API and 
+a unified computing framework. This year, based on this, Flink proposed 
+the next development direction of [Flink-Streaming Warehouse](https://www.alibabacloud.com/blog/more-than-computing-a-new-era-led-by-the-warehouse-architecture-of-apache-flink_598821) (Streamhouse), 
+which further upgraded the scope of stream-batch integration: it truly completes not only 
+the unified computation but also unified storage, thus realizing unified real-time analysis.
+
+In 1.16, the Flink community has completed many improvements for both batch and stream processing:
+
+- For batch processing, all-round improvements in ease of use, stability and performance 
+ have been completed. 1.16 is a milestone version of Flink batch processing and an important 
+ step towards maturity.
+  - Ease of use:  with the introduction of SQL Gateway and full compatibility with Hive Server2, 
+  users can submit Flink SQL jobs and Hive SQL jobs very easily, and it is also easy to 
+  connect to the original Hive ecosystem.
+  - Functionality: Introduce Join hints which let Flink SQL users manually specify join strategies
+  to avoid unreasonable execution plans. The compatibility of Hive SQL has reached 94%, 
+  and users can migrate Hive to Flink at a very low cost.
+  - Stability: Propose a speculative execution mechanism to reduce the long tail sub-tasks of
+  a job and improve the stability. Improve HashJoin and introduce failure rollback mechanism
+  to avoid join failure.
+  - Performance: Introduce dynamic partition pruning to reduce the Scan I/O and improve join 
+  processing for the star-schema queries. There is 30% improvement in the TPC-DS benchmark. 
+  We can use hybrid shuffle mode to improve resource usage and processing performance.
+- For stream processing, there are a number of significant improvements:
+  - Changelog State Backend provides users with second or even millisecond checkpoints to 
+  dramatically improve the fault tolerance experience, while providing a smaller end-to-end 
+  latency experience for transactional Sink jobs.
+  - Lookup join is widely used in stream processing. Slow lookup speed, low throughput and 
+  delay update are resolved through common cache mechanism, asynchronous io and retriable lookup. 
+  These features are very useful, solving the pain points that users often complain about, 
+  and supporting richer scenarios.
+  - From the first day of the birth of Flink SQL, there were some non-deterministic operations 
+  that could cause incorrect results or exceptions, which caused great distress to users. 
+  In 1.16, we spent a lot of effort to solve most of the problems, and we will continue to 
+  improve in the future.
+
+With the further refinement of the integration of stream and batch, and the continuous iteration of 
+the Flink Table Store ([0.2 has been released](https://flink.apache.org/news/2022/08/29/release-table-store-0.2.0.html)), 
+the Flink community is pushing the Streaming warehouse from concept to reality and maturity step by step.
+
+# Understanding Streaming Warehouses
+
+To be precise, a streaming warehouse is to make data warehouse streaming, which allows the data 
+for each layer in the whole warehouse to flow in real-time. The goal is to realize 
+a Streaming Service with end-to-end real-time performance through a unified API and computing framework.
+Please refer to [the article](https://www.alibabacloud.com/blog/more-than-computing-a-new-era-led-by-the-warehouse-architecture-of-apache-flink_598821) 
+for more details.
+
+# Batch processing
+
+Flink is a unified stream batch processing engine, stream processing has become the leading role 
+thanks to our long-term investment. We’re also putting more effort to improve batch processing 
+to make it an excellent computing engine. This makes the overall experience of stream batch 
+unification smoother.
+
+## SQL Gateway
+
+The feedback from various channels indicates that [SQL Gateway](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/table/sql-gateway/overview/) 
+is a highly anticipated feature for users, especially for batch users. This function finally 
+completed (see FLIP-91 for design) in 1.16. SQL Gateway is an extension and enhancement to SQL Client, 
+supporting multi-tenancy and pluggable API protocols (Endpoints), solving the problem that SQL Client 
+can only serve a single user and cannot be integrated with external services or components. 
+Currently SQL Gateway has support for REST API and HiveServer2 Protocol and users can link to 
+SQL Gateway via cURL, Postman, HTTP clients in various programming languages to submit stream jobs,
+batch jobs and even OLAP jobs. For HiveServer2 Endpoint, please refer to the Hive Compatibility 
+for more details.
+
+## Hive Compatibility
+
+To reduce the cost of migrating Hive to Flink, we introduce HiveServer2 Endpoint and 
+Hive Syntax Improvements in this version:
+
+[The HiveServer2 Endpoint](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/table/hive-compatibility/hiveserver2/) 
+allows users to interact with SQL Gateway with Hive JDBC/Beeline and migrate the Flink into 
+the Hive ecosystem(DBeaver, Apache Superset, Apache DolphinScheduler, and Apache Zeppelin).
+When users connect to the HiveServer2 endpoint, the SQL Gateway registers the Hive Catalog, 
+switches to Hive Dialect, and uses batch execution mode to execute jobs. With these steps, 
+users can have the same experience as HiveServer2.
+
+[Hive syntax](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/table/hive-compatibility/hive-dialect/overview/) 
+is already the factual standard for big data processing. Flink has improved compatibility 
+with Hive syntax and added support for several Hive syntaxes commonly used in production. 
+Hive syntax compatibility can help users migrate existing Hive SQL tasks to Flink, 
+and it is convenient for users who are familiar with Hive syntax to use Hive syntax to 
+write SQL to query tables registered in Flink. The compatibility is measured using 
+the Hive qtest suite which contains more than 12K SQL cases. Until now, for Hive 2.3, 
+the compatibility has reached 94.1% for whole hive queries, and has reached 97.3% 
+if ACID queries are excluded.
+
+## Join Hints for Flink SQL
+
+Join hints are a common industry solution to work around shortcomings of the optimizer 
+by influencing the execution plan. Join is the most widely used operator in batch jobs, 
+and Flink supports a variety of join strategies. Missing statistics or a poor cost model in 
+the optimizer can lead to a wrong choice of join strategy, which causes slow execution 
+or even job failure. By specifying a [Join Hint](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/table/sql/queries/hints/#join-hints), 
+the optimizer will choose the join strategy specified by the user whenever possible. 
+This avoids various shortcomings of the optimizer and helps ensure the production availability of 
+batch jobs.
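+
+As an illustration (the tables below are made up for the example), a user who knows that the 
+dimension table easily fits into memory can ask for a broadcast hash join directly in the query:
+
+```sql
+-- `orders` is a large fact table, `customers` a small dimension table (both hypothetical).
+-- The BROADCAST hint tells the optimizer to broadcast `customers` and use it as the build side,
+-- instead of relying on possibly missing or inaccurate statistics.
+SELECT /*+ BROADCAST(customers) */
+       o.order_id, o.amount, c.customer_name
+FROM orders o
+JOIN customers c ON o.customer_id = c.customer_id;
+```
+
+Besides BROADCAST, hints such as SHUFFLE_HASH, SHUFFLE_MERGE and NEST_LOOP are available; 
+see the Join Hints documentation linked above for the complete list.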
+
+## Adaptive Hash Join
+
+For batch jobs, hash join operators may fail if the input data is severely skewed, 
+which is a very bad experience for users. To solve this, we introduce an adaptive hash join 
+that automatically falls back to a sort-merge join once the hash join fails at runtime. 
+This mechanism ensures that the join always succeeds and improves stability, 
+while users remain completely unaware of it.
+
+## Speculative Execution for Batch Job
+
+Speculative execution is introduced in Flink 1.16 to mitigate batch job slowness 
+caused by problematic nodes. A problematic node may have hardware problems, 
+accidental I/O bottlenecks, or high CPU load. These problems may make the hosted tasks run 
+much slower than tasks on other nodes, and affect the overall execution time of a batch job.
+
+When speculative execution is enabled, Flink keeps detecting slow tasks. 
+Once slow tasks are detected, the nodes on which the slow tasks are located are identified 
+as problematic nodes and get blocked via the blocklist mechanism (FLIP-224). 
+The scheduler creates new attempts for the slow tasks and deploys them to nodes 
+that are not blocked, while the existing attempts keep running. 
+The new attempts process the same input data and produce the same data as the original attempt. 
+The attempt that finishes first is admitted as the only finished attempt of the task, 
+and the remaining attempts of the task are canceled.
+
+Most existing sources can work with speculative execution (FLIP-245). Only if a source uses 
+SourceEvent must it implement SupportsHandleExecutionAttemptSourceEvent to support 
+speculative execution. Sinks do not support speculative execution yet, so 
+speculative execution will not happen on sinks at the moment.
+
+The Web UI & REST API are also improved (FLIP-249) to display multiple concurrent attempts of 
+tasks and blocked task managers.
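+
+As a rough sketch of the relevant configuration (the option names below follow the speculative 
+execution documentation for 1.16 and should be double-checked against the configuration reference; 
+they can go into the cluster configuration or, depending on the deployment, be set per job — 
+SQL `SET` syntax is used here purely for illustration):
+
+```sql
+-- Speculative execution requires the adaptive batch scheduler.
+SET 'jobmanager.scheduler' = 'AdaptiveBatch';
+-- Enable speculative execution of detected slow tasks.
+SET 'execution.batch.speculative.enabled' = 'true';
+```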
+
+## Hybrid Shuffle Mode
+
+We have introduced a new [Hybrid Shuffle](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/ops/batch/batch_shuffle) 
+Mode for batch executions. It combines the advantages of blocking shuffle and 
+pipelined shuffle (in streaming mode).
+ - Like blocking shuffle, it does not require upstream and downstream tasks to run simultaneously, 
+   which allows executing a job with few resources.
+ - Like pipelined shuffle, it does not require downstream tasks to wait until upstream tasks 
+   have finished, which reduces the overall execution time of the job when 
+   given sufficient resources.
+ - It adapts to custom preferences between persisting less data and restarting fewer tasks on failures, 
+   by providing different spilling strategies.
+
+Note: This feature is experimental and not activated by default.
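+
+To experiment with it, the shuffle mode can be selected per job; below is a minimal sketch using 
+the SQL Client (see the Hybrid Shuffle documentation linked above for the option values and the 
+trade-offs between the spilling strategies):
+
+```sql
+-- Run in batch mode and opt into hybrid shuffle.
+-- Choose ALL_EXCHANGES_HYBRID_FULL or ALL_EXCHANGES_HYBRID_SELECTIVE depending on whether
+-- you prefer persisting more data or restarting fewer tasks on failures (see the docs).
+SET 'execution.runtime-mode' = 'batch';
+SET 'execution.batch-shuffle-mode' = 'ALL_EXCHANGES_HYBRID_FULL';
+```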
+
+## Further improvements of blocking shuffle
+
+We further improved the usability and performance of blocking shuffle in this version, 
+including adaptive network buffer allocation, sequential IO optimization, and result partition reuse, 
+which allows multiple consumer vertices to reuse the same physical result partition and thereby reduces disk IO 
+and storage space. These optimizations achieve an overall 7% performance gain on 
+10 TB scale TPC-DS tests. In addition, two more compression algorithms (LZO and ZSTD) 
+with higher compression ratios were introduced, which can further reduce the storage space at the cost of 
+some CPU overhead compared to the default LZ4 compression algorithm.
+
+## Dynamic Partition Pruning
+
+For batch jobs, partitioned tables are more widely used than non-partitioned tables in 
+production environments. Flink already supports static partition pruning, 
+where the optimizer pushes filter conditions on partition fields in the WHERE clause 
+down into the source connector during the optimization phase, thus reducing unnecessary partition scan IO. 
+The [star schema](https://en.wikipedia.org/wiki/Star_schema) is the simplest of 
+the most commonly used data mart patterns. We have found that many user jobs cannot use 
+static partition pruning because the partition pruning information can only be determined at execution time, 
+which requires [dynamic partition pruning](https://cwiki.apache.org/confluence/display/FLINK/FLIP-248%3A+Introduce+dynamic+partition+pruning),
+where the partition pruning information is collected at runtime based on data from other related tables, 
+thus reducing unnecessary partition scan IO for partitioned tables. Dynamic partition pruning 
+has been validated on the 10 TB TPC-DS dataset to improve performance by up to 30%.
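+
+As an illustration with hypothetical tables, consider a star-schema query where a large fact table 
+`sales` is partitioned by `dt` and joined with a small dimension table `dim_date`. The filter on 
+the dimension table cannot be pushed into `sales` statically, but with dynamic partition pruning 
+the relevant `dt` partitions are determined at runtime from the filtered dimension rows:
+
+```sql
+-- Only the partitions of `sales` whose `dt` matches a holiday row in `dim_date`
+-- are actually scanned; the other partitions are pruned at runtime.
+SELECT s.item_id, SUM(s.price) AS revenue
+FROM sales s
+JOIN dim_date d ON s.dt = d.dt
+WHERE d.is_holiday = TRUE
+GROUP BY s.item_id;
+```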
+
+# Stream Processing
+
+In 1.16, we have made improvements in checkpointing, SQL, connectors and other areas, 
+so that stream processing can continue to lead.
+
+## Generalized incremental checkpoint
+
+The changelog state backend aims at making checkpoint intervals shorter and more predictable. 
+In this release it becomes production-ready, and the work is dedicated to adapting the changelog 
+state backend to the existing state backends and improving its usability:
+ - Support state migration
+ - Support local recovery
+ - Introduce a file cache to optimize restoring
+ - Support switching based on checkpoints
+ - Improve the monitoring experience of the changelog state backend
+   - expose changelog metrics
+   - expose the changelog configuration in the Web UI
+
+Table 1: The comparison between Changelog Enabled / Changelog Disabled on value state 
+(see [this blog](https://flink.apache.org/2022/05/30/changelog-state-backend.html) for more details)
+
+| Percentile  | End to End Duration | Checkpointed Data Size<sup>*</sup> | Full Checkpoint Data Size<sup>*</sup> |
+|-------------|---------------------|-----------------------------------|--------------------------------------|
+| 50%         | 311ms / 5s          | 14.8MB / 3.05GB                   | 24.2GB / 18.5GB                      |
+| 90%         | 664ms / 6s          | 23.5MB / 4.52GB                   | 25.2GB / 19.3GB                      |
+| 99%         | 1s / 7s             | 36.6MB / 5.19GB                   | 25.6GB / 19.6GB                      |
+| 99.9%       | 1s / 10s            | 52.8MB / 6.49GB                   | 25.7GB / 19.8GB                      |
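+
+A minimal sketch of enabling it, assuming a DFS location for the changelog (the options can also 
+be set in flink-conf.yaml; please check the changelog state backend documentation for the related 
+materialization and storage settings — the path below is purely a hypothetical example):
+
+```sql
+-- Enable the changelog state backend on top of the configured state backend
+-- (e.g. HashMap or RocksDB).
+SET 'state.backend.changelog.enabled' = 'true';
+-- Store the changelog on a DFS; the bucket path is made up for illustration.
+SET 'state.backend.changelog.storage' = 'filesystem';
+SET 'dstl.dfs.base-path' = 's3://my-bucket/changelog';
+```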
+
+## RocksDB rescaling improvement & rescaling benchmark
+
+Rescaling is a frequent operation for cloud services built on Apache Flink. This release leverages 
+[deleteRange](https://rocksdb.org/blog/2018/11/21/delete-range.html) to optimize rescaling of 
+the incremental RocksDB state backend. deleteRange is used to avoid massive scan-and-delete operations; 
+for upscaling with a large number of states that need to be deleted, 
+the restore speed can be [increased by 2~10 times](https://github.com/apache/flink/pull/19033#issuecomment-1072267658).
+
+<center>
+<img src="{{ site.baseurl }}/img/rocksdb_rescaling_benchmark.png" style="width:90%;margin:15px">
+</center>
+
+## Improve monitoring experience and usability of state backend
+
+This release also improves the monitoring experience and usability of state backends.
+Previously, RocksDB's log was located in its own DB folder, which made debugging RocksDB not easy. 
+This release keeps RocksDB's log in Flink's log directory by default. RocksDB statistics-based 
+metrics are introduced to help debug performance at the database level, e.g. the total block cache 
+hit/miss count within the DB.
+
+## Support overdraft buffer
+
+A new concept of [overdraft network buffers](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/deployment/memory/network_mem_tuning/#overdraft-buffers) 
+is introduced to mitigate the effects of uninterruptibly blocking a subtask thread during back pressure; 
+the number of overdraft buffers can be configured via `taskmanager.network.memory.max-overdraft-buffers-per-gate`. 
+Starting from 1.16.0, a Flink subtask can by default request up to 5 extra (overdraft) buffers over 
+the regularly configured amount. This change can slightly increase the memory consumption of 
+a Flink job but vastly reduce the checkpoint duration of unaligned checkpoints.
+The overdraft buffers come into play when a subtask is back-pressured by downstream subtasks and 
+needs more than a single network buffer to finish what it is currently doing.
+Read more about this in [the documentation](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/deployment/memory/network_mem_tuning/#overdraft-buffers).
+
+## Timeout aligned to unaligned checkpoint barrier in the output buffers of an upstream subtask
+
+This release updates the timing of switching from Aligned Checkpoint (AC)  to Unaligned Checkpoint (UC).

Review Comment:
   ```suggestion
   This release updates the timing of switching from Aligned Checkpoint (AC) to Unaligned Checkpoint (UC).
   ```



##########
_posts/2022-10-15-1.16-announcement.md:
##########
@@ -0,0 +1,400 @@
+With UC enabled, if [execution.checkpointing.aligned-checkpoint-timeout](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/ops/state/checkpointing_under_backpressure/#aligned-checkpoint-timeout) 
+is configured, each checkpoint still begins as an AC, but if the global checkpoint duration 
+exceeds the aligned-checkpoint-timeout and the AC has not yet been completed, 
+the checkpoint is switched to unaligned.
+
+Previously, the switch of one subtask needed to wait for all barriers from upstream. 
+If the back pressure is severe, the downstream subtask may not receive all barriers 
+within the [checkpointing-timeout](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/deployment/config/#execution-checkpointing-timeout), 
+causing the checkpoint to fail.
+
+In this release, If the barrier cannot be sent from the output buffer to the downstream task 

Review Comment:
   ```suggestion
   In this release, if the barrier cannot be sent from the output buffer to the downstream task 
   ```



##########
_posts/2022-10-15-1.16-announcement.md:
##########
@@ -0,0 +1,400 @@
+In this release, If the barrier cannot be sent from the output buffer to the downstream task 
+within the execution.checkpointing.aligned-checkpoint-timeout, Flink lets upstream subtasks 

Review Comment:
   ```suggestion
   within the `execution.checkpointing.aligned-checkpoint-timeout`, Flink lets upstream subtasks 
   ```



##########
_posts/2022-10-15-1.16-announcement.md:
##########
@@ -0,0 +1,400 @@
+---
+layout: post
+title:  "Announcing the Release of Apache Flink 1.16"
+subtitle: ""
+date: 2022-10-27T08:00:00.000Z
+categories: news
+authors:
+- godfreyhe:
+  name: "Godfrey He"
+  twitter: "godfreyhe"
+
+---
+
+Apache Flink continues to grow at a rapid pace and is one of the most active 
+communities in Apache. Flink 1.16 had over 230 contributors enthusiastically participating, 
+with 19 FLIPs and 900+ issues completed, bringing a lot of exciting features to the community.
+
+Flink has become the leading role and factual standard of stream processing, 
+and the concept of the unification of stream and batch data processing is gradually gaining recognition 
+and is being successfully implemented in more and more companies. Previously, 
+the integrated stream and batch concept placed more emphasis on a unified API and 
+a unified computing framework. This year, based on this, Flink proposed 
+the next development direction of [Flink-Streaming Warehouse](https://www.alibabacloud.com/blog/more-than-computing-a-new-era-led-by-the-warehouse-architecture-of-apache-flink_598821) (Streamhouse), 
+which further upgraded the scope of stream-batch integration: it truly completes not only 
+the unified computation but also unified storage, thus realizing unified real-time analysis.
+
+In 1.16, the Flink community has completed many improvements for both batch and stream processing:
+
+- For batch processing, all-round improvements in ease of use, stability and performance 
+ have been completed. 1.16 is a milestone version of Flink batch processing and an important 
+ step towards maturity.
+  - Ease of use:  with the introduction of SQL Gateway and full compatibility with Hive Server2, 
+  users can submit Flink SQL jobs and Hive SQL jobs very easily, and it is also easy to 
+  connect to the original Hive ecosystem.
+  - Functionality: Introduce Join hints which let Flink SQL users manually specify join strategies
+  to avoid unreasonable execution plans. The compatibility of Hive SQL has reached 94%, 
+  and users can migrate Hive to Flink at a very low cost.
+  - Stability: Propose a speculative execution mechanism to reduce the long tail sub-tasks of
+  a job and improve the stability. Improve HashJoin and introduce failure rollback mechanism
+  to avoid join failure.
+  - Performance: Introduce dynamic partition pruning to reduce the Scan I/O and improve join 
+  processing for the star-schema queries. There is 30% improvement in the TPC-DS benchmark. 
+  We can use hybrid shuffle mode to improve resource usage and processing performance.
+- For stream processing, there are a number of significant improvements:
+  - Changelog State Backend provides users with second or even millisecond checkpoints to 
+  dramatically improve the fault tolerance experience, while providing a smaller end-to-end 
+  latency experience for transactional Sink jobs.
+  - Lookup join is widely used in stream processing. Slow lookup speed, low throughput and 
+  delay update are resolved through common cache mechanism, asynchronous io and retriable lookup. 
+  These features are very useful, solving the pain points that users often complain about, 
+  and supporting richer scenarios.
+  - From the first day of the birth of Flink SQL, there were some non-deterministic operations 
+  that could cause incorrect results or exceptions, which caused great distress to users. 
+  In 1.16, we spent a lot of effort to solve most of the problems, and we will continue to 
+  improve in the future.
+
+With the further refinement of the integration of stream and batch, and the continuous iteration of 
+the Flink Table Store ([0.2 has been released](https://flink.apache.org/news/2022/08/29/release-table-store-0.2.0.html)), 
+the Flink community is pushing the Streaming warehouse from concept to reality and maturity step by step.
+
+# Understanding Streaming Warehouses
+
+To be precise, a streaming warehouse is to make data warehouse streaming, which allows the data 
+for each layer in the whole warehouse to flow in real-time. The goal is to realize 
+a Streaming Service with end-to-end real-time performance through a unified API and computing framework.
+Please refer to [the article](https://www.alibabacloud.com/blog/more-than-computing-a-new-era-led-by-the-warehouse-architecture-of-apache-flink_598821) 
+for more details.
+
+# Batch processing
+
+Flink is a unified stream batch processing engine, stream processing has become the leading role 
+thanks to our long-term investment. We’re also putting more effort to improve batch processing 
+to make it an excellent computing engine. This makes the overall experience of stream batch 
+unification smoother.
+
+## SQL Gateway
+
+The feedback from various channels indicates that [SQL Gateway](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/table/sql-gateway/overview/) 
+is a highly anticipated feature for users, especially for batch users. This function finally 
+completed (see FLIP-91 for design) in 1.16. SQL Gateway is an extension and enhancement to SQL Client, 
+supporting multi-tenancy and pluggable API protocols (Endpoints), solving the problem that SQL Client 
+can only serve a single user and cannot be integrated with external services or components. 
+Currently SQL Gateway has support for REST API and HiveServer2 Protocol and users can link to 
+SQL Gateway via cURL, Postman, HTTP clients in various programming languages to submit stream jobs,
+batch jobs and even OLAP jobs. For HiveServer2 Endpoint, please refer to the Hive Compatibility 
+for more details.
+
+## Hive Compatibility
+
+To reduce the cost of migrating Hive to Flink, we introduce HiveServer2 Endpoint and 
+Hive Syntax Improvements in this version:
+
+[The HiveServer2 Endpoint](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/table/hive-compatibility/hiveserver2/) 
+allows users to interact with SQL Gateway with Hive JDBC/Beeline and migrate the Flink into 
+the Hive ecosystem(DBeaver, Apache Superset, Apache DolphinScheduler, and Apache Zeppelin).
+When users connect to the HiveServer2 endpoint, the SQL Gateway registers the Hive Catalog, 
+switches to Hive Dialect, and uses batch execution mode to execute jobs. With these steps, 
+users can have the same experience as HiveServer2.
+
+[Hive syntax](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/table/hive-compatibility/hive-dialect/overview/) 
+is already the factual standard for big data processing. Flink has improved compatibility 
+with Hive syntax and added support for several Hive syntaxes commonly used in production. 
+Hive syntax compatibility can help users migrate existing Hive SQL tasks to Flink, 
+and it is convenient for users who are familiar with Hive syntax to use Hive syntax to 
+write SQL to query tables registered in Flink. The compatibility is measured using 
+the Hive qtest suite which contains more than 12K SQL cases. Until now, for Hive 2.3, 
+the compatibility has reached 94.1% for whole hive queries, and has reached 97.3% 
+if ACID queries are excluded.
+
+## Join Hints for Flink SQL
+
+The join hint is a common solution in the industry to improve the shortcomings of the optimizer 
+by affecting the execution plans. Join is the most widely used operator in batch jobs, 
+and Flink supports a variety of join strategies. Missing statistics or a poor cost model of 
+the optimizer can lead to the wrong choice of join strategy, which will cause slow execution 
+or even job failure. By specifying a [Join Hint](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/table/sql/queries/hints/#join-hints), 
+the optimizer will choose the join strategy specified by the user whenever possible. 
+It could avoid various shortcomings of the optimizer and ensure the production availability of 
+the batch job.
+
+## Adaptive Hash Join
+
+For batch jobs, the hash join operators may fail if the input data is severely skewed, 
+it's a very bad experience for users. To solve this, we introduce adaptive Hash-Join 
+which could automatically fall back to Sort-Merge-Join once the Hash-Join fails in runtime. 
+This mechanism ensures the Hash-Join could be always successful and improve the stability 
+while the users are completely unaware of it.
+
+## Speculative Execution for Batch Job
+
+Speculative execution is introduced in Flink 1.16 to mitigate batch job slowness 
+which is caused by problematic nodes. A problematic node may have hardware problems, 
+accident I/O busy, or high CPU load. These problems may make the hosted tasks run 
+much slower than tasks on other nodes, and affect the overall execution time of a batch job.
+
+When speculative execution is enabled, Flink will keep detecting slow tasks. 
+Once slow tasks are detected, the nodes that the slow tasks locate in will be identified 
+as problematic nodes and get blocked via the blocklist mechanism(FLIP-224). 
+The scheduler will create new attempts for the slow tasks and deploy them to nodes 
+that are not blocked, while the existing attempts will keep running. 
+The new attempts process the same input data and produce the same data as the original attempt. 
+Once any attempt finishes first, it will be admitted as the only finished attempt of the task, 
+and the remaining attempts of the task will be canceled.
+
+Most existing sources can work with speculative execution(FLIP-245). Only if a source uses 
+SourceEvent, it must implement SupportsHandleExecutionAttemptSourceEvent to support 
+speculative execution. Sinks do not support speculative execution yet so that 
+speculative execution will not happen on sinks at the moment.
+
+The Web UI & REST API are also improved(FLIP-249) to display multiple concurrent attempts of 
+tasks and blocked task managers.
+
+## Hybrid Shuffle Mode
+
+We have introduced a new [Hybrid Shuffle](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/ops/batch/batch_shuffle) 
+Mode for batch executions. It combines the advantages of blocking shuffle and 
+pipelined shuffle (in streaming mode).
+ - Like blocking shuffle, it does not require upstream and downstream tasks to run simultaneously, 
+   which allows executing a job with little resources.
+ - Like pipelined shuffle, it does not require downstream tasks to be executed 
+   after upstream tasks finish, which reduces the overall execution time of the job when 
+   given sufficient resources.
+ - It adapts to custom preferences between persisting less data and restarting less tasks on failures, 
+   by providing different spilling strategies.
+
+Note: This feature is experimental and by default not activated.
+
+## Further improvements of blocking shuffle
+
+We further improved the blocking shuffle usability and performance in this version, 
+including adaptive network buffer allocation, sequential IO optimization and result partition reuse 
+which allows multiple consumer vertices reusing the same physical result partition to reduce disk IO 
+and storage space. These optimizations can achieve an overall 7% performance gain for 
+10 T scale TPC-DS tests. In addition, two more compression algorithms (LZO and ZSTD) 
+with higher compression ratio were introduced which can further reduce the storage space with 
+some CPU cost compared to the default LZ4 compression algorithm.
+
+## Dynamic Partition Pruning
+
+For batch jobs, partitioned tables are more widely used than non-partitioned tables in 
+production environments. Currently Flink has support for static partition pruning, 
+where the optimizer pushes down the partition field related filter conditions in the WHERE clause 
+into the Source Connector during the optimization phase, thus reducing unnecessary partition scan IO. 
+The [star-schema](https://en.wikipedia.org/wiki/Star_schema) is the simplest of 
+the most commonly used data mart patterns. We have found that many user jobs cannot use 
+static partition pruning because partition pruning information can only be determined in execution, 
+which requires [dynamic partition pruning techniques](https://cwiki.apache.org/confluence/display/FLINK/FLIP-248%3A+Introduce+dynamic+partition+pruning),
+where partition pruning information is collected at runtime based on data from other related tables, 
+thus reducing unnecessary partition scan IO for partitioned tables. The use of dynamic partition pruning 
+has been validated with the 10 T dataset TPC-DS to improve performance by up to 30%.
+
+# Stream Processing
+
+In 1.16, we have made improvements in checkpointing, SQL, connectors and other areas, 
+so that Flink continues to lead in stream processing.
+
+## Generalized incremental checkpoint
+
+The changelog state backend aims at making checkpoint intervals shorter and more predictable. 
+This release makes it production-ready and is dedicated to adapting the changelog state backend to 
+the existing state backends and to improving its usability:
+ - Support state migration
+ - Support local recovery
+ - Introduce a file cache to optimize restoring
+ - Support switching the changelog on and off based on checkpoints
+ - Improve the monitoring experience of the changelog state backend
+   - expose changelog metrics
+   - expose the changelog configuration to the web UI
+
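+A minimal sketch of enabling it for a job (the option keys are assumed from the state backends 
+documentation, and the changelog storage path is a placeholder):
+
+```sql
+-- The changelog is layered on top of the configured state backend, e.g. RocksDB.
+SET 'state.backend' = 'rocksdb';
+SET 'state.backend.changelog.enabled' = 'true';
+-- Where the changelog (DSTL) files are written; the path below is a placeholder.
+SET 'state.backend.changelog.storage' = 'filesystem';
+SET 'dstl.dfs.base-path' = 's3://my-bucket/changelog';
+```
+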
+Table 1: The comparison between Changelog Enabled / Changelog Disabled on value state 
+(see [this blog](https://flink.apache.org/2022/05/30/changelog-state-backend.html) for more details)
+
+| Percentile  | End to End Duration | Checkpointed Data Size<sup>*</sup> | Full Checkpoint Data Size<sup>*</sup> |
+|-------------|---------------------|-----------------------------------|--------------------------------------|
+| 50%         | 311ms / 5s          | 14.8MB / 3.05GB                   | 24.2GB / 18.5GB                      |
+| 90%         | 664ms / 6s          | 23.5MB / 4.52GB                   | 25.2GB / 19.3GB                      |
+| 99%         | 1s / 7s             | 36.6MB / 5.19GB                   | 25.6GB / 19.6GB                      |
+| 99.9%       | 1s / 10s            | 52.8MB / 6.49GB                   | 25.7GB / 19.8GB                      |
+
+## RocksDB rescaling improvement & rescaling benchmark
+
+Rescaling is a frequent operation for cloud services built on Apache Flink. This release leverages 
+[deleteRange](https://rocksdb.org/blog/2018/11/21/delete-range.html) to optimize the rescaling of 
+the incremental RocksDB state backend. deleteRange is used to avoid massive scan-and-delete operations; 
+for upscaling with a large amount of state that needs to be deleted, 
+the speed of restoring can be [increased by a factor of 2 to 10](https://github.com/apache/flink/pull/19033#issuecomment-1072267658).
+
+<center>
+<img src="{{ site.baseurl }}/img/rocksdb_rescaling_benchmark.png" style="width:90%;margin:15px">
+</center>
+
+## Improve monitoring experience and usability of state backend
+
+This release also improves the monitoring experience and usability of the state backends.
+Previously, RocksDB's log was located in its own DB folder, which made debugging RocksDB difficult. 
+This release keeps RocksDB's log in Flink's log directory by default. RocksDB statistics-based 
+metrics are introduced to help debug performance at the database level, e.g. the total block cache hit/miss
+count within the DB.
+
+## Support overdraft buffer
+
+A new concept of [overdraft network buffers](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/deployment/memory/network_mem_tuning/#overdraft-buffers) 
+is introduced to mitigate the effects of uninterruptibly blocking a subtask thread during back pressure; 
+the number of overdraft buffers can be configured through taskmanager.network.memory.max-overdraft-buffers-per-gate. 
+Starting from 1.16.0, a Flink subtask can by default request up to 5 extra (overdraft) buffers over 
+the regularly configured amount. This change can slightly increase the memory consumption of 
+a Flink job but vastly reduce the checkpoint duration of unaligned checkpoints.
+If a subtask is back-pressured by downstream subtasks and requires more than 
+a single network buffer to finish what it is currently doing, the overdraft buffers come into play.
+Read more about this in [the documentation](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/deployment/memory/network_mem_tuning/#overdraft-buffers).
+
+## Timeout aligned to unaligned checkpoint barrier in the output buffers of an upstream subtask
+
+This release updates the timing of switching from Aligned Checkpoint (AC) to Unaligned Checkpoint (UC).
+With UC enabled, if [execution.checkpointing.aligned-checkpoint-timeout](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/ops/state/checkpointing_under_backpressure/#aligned-checkpoint-timeout) 
+is configured, each checkpoint will still begin as an AC, but when the global checkpoint duration 
+exceeds the aligned-checkpoint-timeout, if the AC has not been completed, 
+then the checkpoint will be switched to unaligned.
+
+Previously, the switch of one subtask needed to wait for all barriers from upstream. 
+If the back pressure is severe, the downstream subtask may not receive all barriers 
+within [checkpointing-timeout](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/deployment/config/#execution-checkpointing-timeout), 
+causing the checkpoint to fail.
+
+In this release, if the barrier cannot be sent from the output buffer to the downstream task 
+within the execution.checkpointing.aligned-checkpoint-timeout, Flink lets the upstream subtasks 
+switch to UC first to send barriers downstream, thereby decreasing 
+the probability of checkpoint timeouts during back pressure.
+More details can be found in [this documentation](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/ops/state/checkpointing_under_backpressure/#aligned-checkpoint-timeout).
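+
+A minimal sketch of the related settings for a SQL job (the option keys come from the checkpointing 
+documentation linked above; the concrete values are only examples):
+
+```sql
+SET 'execution.checkpointing.interval' = '1 min';
+-- Allow checkpoints to switch to unaligned mode under back pressure.
+SET 'execution.checkpointing.unaligned' = 'true';
+-- Each checkpoint starts aligned and switches to unaligned once the alignment
+-- has taken longer than this timeout.
+SET 'execution.checkpointing.aligned-checkpoint-timeout' = '30 s';
+```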
+
+## Non-Determinism In Stream Processing
+
+Users often complain about the high cost of understanding stream processing. 
+One of the pain points is non-determinism in stream processing (usually not intuitive), 
+which may cause wrong results or errors. It has existed since the first day 
+that Flink SQL was available.
+
+For complex streaming jobs, it is now possible to detect and resolve potential correctness issues 
+before running them. If the problems cannot be resolved completely, a detailed message prompts users to 
+adjust the SQL so as to avoid introducing non-deterministic problems. More details can be found 
+in [the documentation](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/table/concepts/determinism).
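+
+A hedged sketch of how a user could opt in to this check from the SQL client (the option key 
+and value are assumptions based on the determinism documentation linked above; please verify 
+the exact name there):
+
+```sql
+-- Ask the planner to detect non-deterministic update (NDU) problems and try to
+-- resolve them at plan time; a detailed error is raised if it cannot.
+SET 'table.optimizer.non-deterministic-update.strategy' = 'TRY_RESOLVE';
+```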
+
+## Enhanced Lookup Join
+
+Lookup join is widely used in stream processing, and we have introduced several improvements:
+- Adds a unified abstraction for lookup source cache and 
+  [related metrics](https://cwiki.apache.org/confluence/display/FLINK/FLIP-221%3A+Abstraction+for+lookup+source+cache+and+metric) 
+  to speed up lookup queries
+- Introduces the configurable asynchronous mode (ALLOW_UNORDERED) via 
+  [job configuration](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/table/config/#table-exec-async-lookup-output-mode) 
+  or [lookup hint](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/table/sql/queries/hints/#lookup) 
+  to significantly  improve query throughput without compromising correctness.

Review Comment:
   ```suggestion
     to significantly improve query throughput without compromising correctness.
   ```



##########
_posts/2022-10-15-1.16-announcement.md:
##########
@@ -0,0 +1,400 @@
+---
+layout: post
+title:  "Announcing the Release of Apache Flink 1.16"
+subtitle: ""
+date: 2022-10-27T08:00:00.000Z
+categories: news
+authors:
+- godfreyhe:
+  name: "Godfrey He"
+  twitter: "godfreyhe"
+
+---
+
+Apache Flink continues to grow at a rapid pace and is one of the most active 
+communities in Apache. Flink 1.16 had over 230 contributors enthusiastically participating, 
+with 19 FLIPs and 900+ issues completed, bringing a lot of exciting features to the community.
+
+Flink has become the leading role and factual standard of stream processing, 
+and the concept of the unification of stream and batch data processing is gradually gaining recognition 
+and is being successfully implemented in more and more companies. Previously, 
+the integrated stream and batch concept placed more emphasis on a unified API and 
+a unified computing framework. This year, based on this, Flink proposed 
+the next development direction of [Flink-Streaming Warehouse](https://www.alibabacloud.com/blog/more-than-computing-a-new-era-led-by-the-warehouse-architecture-of-apache-flink_598821) (Streamhouse), 
+which further upgraded the scope of stream-batch integration: it truly completes not only 
+the unified computation but also unified storage, thus realizing unified real-time analysis.
+
+In 1.16, the Flink community has completed many improvements for both batch and stream processing:
+
+- For batch processing, all-round improvements in ease of use, stability and performance 
+ have been completed. 1.16 is a milestone version of Flink batch processing and an important 
+ step towards maturity.
+  - Ease of use:  with the introduction of SQL Gateway and full compatibility with Hive Server2, 
+  users can submit Flink SQL jobs and Hive SQL jobs very easily, and it is also easy to 
+  connect to the original Hive ecosystem.
+  - Functionality: Introduce Join hints which let Flink SQL users manually specify join strategies
+  to avoid unreasonable execution plans. The compatibility of Hive SQL has reached 94%, 
+  and users can migrate Hive to Flink at a very low cost.
+  - Stability: Propose a speculative execution mechanism to reduce the long tail sub-tasks of
+  a job and improve the stability. Improve HashJoin and introduce failure rollback mechanism
+  to avoid join failure.
+  - Performance: Introduce dynamic partition pruning to reduce the Scan I/O and improve join 
+  processing for the star-schema queries. There is 30% improvement in the TPC-DS benchmark. 
+  We can use hybrid shuffle mode to improve resource usage and processing performance.
+- For stream processing, there are a number of significant improvements:
+  - Changelog State Backend provides users with second or even millisecond checkpoints to 
+  dramatically improve the fault tolerance experience, while providing a smaller end-to-end 
+  latency experience for transactional Sink jobs.
+  - Lookup join is widely used in stream processing. Slow lookup speed, low throughput and 
+  delay update are resolved through common cache mechanism, asynchronous io and retriable lookup. 
+  These features are very useful, as they solve pain points that users often complain about 
+  and support richer scenarios.
+  - Since the first day of Flink SQL, there have been some non-deterministic operations 
+  that could cause incorrect results or exceptions, which caused great distress to users. 
+  In 1.16, we spent a lot of effort solving most of these problems, and we will continue to 
+  improve in the future.
+
+With the further refinement of stream-batch integration and the continuous iteration of 
+the Flink Table Store ([0.2 has been released](https://flink.apache.org/news/2022/08/29/release-table-store-0.2.0.html)), 
+the Flink community is pushing the streaming warehouse from concept to reality and maturity, step by step.
+
+# Understanding Streaming Warehouses
+
+To be precise, a streaming warehouse makes the data warehouse streaming, allowing the data 
+in each layer of the whole warehouse to flow in real time. The goal is to realize 
+a streaming service with end-to-end real-time performance through a unified API and computing framework.
+Please refer to [the article](https://www.alibabacloud.com/blog/more-than-computing-a-new-era-led-by-the-warehouse-architecture-of-apache-flink_598821) 
+for more details.
+
+# Batch processing
+
+Flink is a unified stream and batch processing engine; stream processing has become the leading use case 
+thanks to our long-term investment. We are also putting more effort into improving batch processing 
+to make it an excellent computing engine as well. This makes the overall experience of stream-batch 
+unification smoother.
+
+## SQL Gateway
+
+The feedback from various channels indicates that [SQL Gateway](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/table/sql-gateway/overview/) 
+is a highly anticipated feature for users, especially for batch users. This feature was finally 
+completed in 1.16 (see FLIP-91 for the design). SQL Gateway is an extension and enhancement of SQL Client, 
+supporting multi-tenancy and pluggable API protocols (endpoints), and solving the problem that SQL Client 
+can only serve a single user and cannot be integrated with external services or components. 
+Currently, SQL Gateway supports the REST API and the HiveServer2 protocol, and users can connect to 
+SQL Gateway via cURL, Postman, or HTTP clients in various programming languages to submit streaming jobs,
+batch jobs and even OLAP jobs. For the HiveServer2 endpoint, please refer to the Hive Compatibility 
+section for more details.
+
+## Hive Compatibility
+
+To reduce the cost of migrating Hive to Flink, we introduce HiveServer2 Endpoint and 
+Hive Syntax Improvements in this version:
+
+[The HiveServer2 Endpoint](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/table/hive-compatibility/hiveserver2/) 
+allows users to interact with SQL Gateway via Hive JDBC/Beeline and brings Flink into 
+the Hive ecosystem (DBeaver, Apache Superset, Apache DolphinScheduler, and Apache Zeppelin).
+When users connect to the HiveServer2 endpoint, the SQL Gateway registers the Hive Catalog, 
+switches to the Hive dialect, and uses batch execution mode to execute jobs. With these steps, 
+users get the same experience as with HiveServer2.
+
+[Hive syntax](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/table/hive-compatibility/hive-dialect/overview/) 
+is already the de facto standard for big data processing. Flink has improved compatibility 
+with Hive syntax and added support for several Hive syntax features commonly used in production. 
+Hive syntax compatibility helps users migrate existing Hive SQL tasks to Flink, 
+and it is convenient for users who are familiar with Hive syntax to use it to 
+write SQL queries against tables registered in Flink. The compatibility is measured using 
+the Hive qtest suite, which contains more than 12K SQL cases. So far, for Hive 2.3, 
+the compatibility has reached 94.1% across all Hive queries, and 97.3% 
+if ACID queries are excluded.
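+
+A small, hypothetical sketch of using the Hive dialect from the SQL client (it assumes a Hive 
+catalog has been registered, and the table names are placeholders; the HiveServer2 endpoint uses 
+the Hive dialect by default):
+
+```sql
+SET 'table.sql-dialect' = 'hive';
+
+-- Hive DDL that is not available in the default dialect.
+CREATE TABLE my_db.orders_orc (
+  order_id BIGINT,
+  order_status STRING
+) STORED AS ORC;
+
+-- Hive-style INSERT OVERWRITE; my_db.orders is a placeholder source table.
+INSERT OVERWRITE TABLE my_db.orders_orc
+SELECT order_id, order_status FROM my_db.orders;
+```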
+
+## Join Hints for Flink SQL
+
+Join hints are a common solution in the industry to work around shortcomings of the optimizer 
+by influencing the execution plan. Join is the most widely used operator in batch jobs, 
+and Flink supports a variety of join strategies. Missing statistics or a poor cost model in 
+the optimizer can lead to the wrong choice of join strategy, which causes slow execution 
+or even job failure. By specifying a [Join Hint](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/table/sql/queries/hints/#join-hints), 
+the optimizer will choose the join strategy specified by the user whenever possible. 
+This avoids various shortcomings of the optimizer and helps ensure the production availability of 
+batch jobs.
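+
+As a brief sketch (the hint name comes from the join hints documentation linked above, and the 
+tables fact and dim are placeholders), a user who knows that dim is small can force a broadcast 
+join instead of relying on possibly missing statistics:
+
+```sql
+SELECT /*+ BROADCAST(dim) */ f.id, f.amount, dim.name
+FROM fact f
+JOIN dim ON f.dim_id = dim.id;
+```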
+
+## Adaptive Hash Join
+
+For batch jobs, the hash join operator may fail if the input data is severely skewed, 
+which is a very bad experience for users. To solve this, we introduce an adaptive hash join 
+that automatically falls back to sort-merge join once the hash join fails at runtime. 
+This mechanism ensures that the hash join always succeeds and improves stability, 
+while users are completely unaware of it.
+
+## Speculative Execution for Batch Job
+
+Speculative execution is introduced in Flink 1.16 to mitigate batch job slowness 
+caused by problematic nodes. A problematic node may have hardware problems, 
+temporary I/O bottlenecks, or high CPU load. These problems may make the hosted tasks run 
+much slower than tasks on other nodes, and affect the overall execution time of a batch job.
+
+When speculative execution is enabled, Flink keeps detecting slow tasks. 
+Once slow tasks are detected, the nodes on which they are located will be identified 
+as problematic nodes and get blocked via the blocklist mechanism (FLIP-224). 
+The scheduler will create new attempts for the slow tasks and deploy them to nodes 
+that are not blocked, while the existing attempts will keep running. 
+The new attempts process the same input data and produce the same output as the original attempt. 
+The first attempt that finishes will be admitted as the only finished attempt of the task, 
+and the remaining attempts of the task will be canceled.
+
+Most existing sources can work with speculative execution (FLIP-245). Only if a source uses 
+SourceEvent must it implement SupportsHandleExecutionAttemptSourceEvent to support 
+speculative execution. Sinks do not support speculative execution yet, so 
+speculative execution will not happen on sinks at the moment.
+
+The Web UI & REST API have also been improved (FLIP-249) to display multiple concurrent attempts of 
+tasks and blocked task managers.
+
+## Hybrid Shuffle Mode
+
+We have introduced a new [Hybrid Shuffle](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/ops/batch/batch_shuffle) 
+Mode for batch executions. It combines the advantages of blocking shuffle and 
+pipelined shuffle (in streaming mode).
+ - Like blocking shuffle, it does not require upstream and downstream tasks to run simultaneously, 
+   which allows executing a job with limited resources.
+ - Like pipelined shuffle, it does not require downstream tasks to wait until upstream tasks 
+   have finished, which reduces the overall execution time of the job when 
+   given sufficient resources.
+ - It adapts to custom preferences between persisting less data and restarting fewer tasks on failures, 
+   by providing different spilling strategies.
+
+Note: This feature is experimental and not activated by default.
+
+## Further improvements of blocking shuffle
+
+We further improved the usability and performance of blocking shuffle in this version, 
+including adaptive network buffer allocation, sequential IO optimization and result partition reuse, 
+which allows multiple consumer vertices to reuse the same physical result partition and thereby reduces disk IO 
+and storage space. These optimizations achieve an overall 7% performance gain in 
+10 TB scale TPC-DS tests. In addition, two more compression algorithms (LZO and ZSTD) 
+with higher compression ratios were introduced, which can further reduce the storage space, at 
+some CPU cost, compared to the default LZ4 compression algorithm.
+
+## Dynamic Partition Pruning
+
+For batch jobs, partitioned tables are more widely used than non-partitioned tables in 
+production environments. Currently Flink has support for static partition pruning, 
+where the optimizer pushes down the partition field related filter conditions in the WHERE clause 
+into the Source Connector during the optimization phase, thus reducing unnecessary partition scan IO. 
+The [star-schema](https://en.wikipedia.org/wiki/Star_schema) is the simplest of 
+the most commonly used data mart patterns. We have found that many user jobs cannot use 
+static partition pruning because the partition pruning information can only be determined at execution time, 
+which requires [dynamic partition pruning techniques](https://cwiki.apache.org/confluence/display/FLINK/FLIP-248%3A+Introduce+dynamic+partition+pruning),
+where the partition pruning information is collected at runtime based on data from other related tables, 
+thus reducing unnecessary partition scan IO for partitioned tables. Dynamic partition pruning 
+has been validated with the 10 TB TPC-DS dataset and improves performance by up to 30%.
+
+# Stream Processing
+
+In 1.16, we have made improvements in checkpointing, SQL, connectors and other areas, 
+so that Flink continues to lead in stream processing.
+
+## Generalized incremental checkpoint
+
+The changelog state backend aims at making checkpoint intervals shorter and more predictable. 
+This release makes it production-ready and is dedicated to adapting the changelog state backend to 
+the existing state backends and to improving its usability:
+ - Support state migration
+ - Support local recovery
+ - Introduce a file cache to optimize restoring
+ - Support switching the changelog on and off based on checkpoints
+ - Improve the monitoring experience of the changelog state backend
+   - expose changelog metrics
+   - expose the changelog configuration to the web UI
+
+Table 1: The comparison between Changelog Enabled / Changelog Disabled on value state 
+(see [this blog](https://flink.apache.org/2022/05/30/changelog-state-backend.html) for more details)
+
+| Percentile  | End to End Duration | Checkpointed Data Size<sup>*</sup> | Full Checkpoint Data Size<sup>*</sup> |
+|-------------|---------------------|-----------------------------------|--------------------------------------|
+| 50%         | 311ms / 5s          | 14.8MB / 3.05GB                   | 24.2GB / 18.5GB                      |
+| 90%         | 664ms / 6s          | 23.5MB / 4.52GB                   | 25.2GB / 19.3GB                      |
+| 99%         | 1s / 7s             | 36.6MB / 5.19GB                   | 25.6GB / 19.6GB                      |
+| 99.9%       | 1s / 10s            | 52.8MB / 6.49GB                   | 25.7GB / 19.8GB                      |
+
+## RocksDB rescaling improvement & rescaling benchmark
+
+Rescaling is a frequent operation for cloud services built on Apache Flink. This release leverages 
+[deleteRange](https://rocksdb.org/blog/2018/11/21/delete-range.html) to optimize the rescaling of 
+the incremental RocksDB state backend. deleteRange is used to avoid massive scan-and-delete operations; 
+for upscaling with a large amount of state that needs to be deleted, 
+the speed of restoring can be [increased by a factor of 2 to 10](https://github.com/apache/flink/pull/19033#issuecomment-1072267658).
+
+<center>
+<img src="{{ site.baseurl }}/img/rocksdb_rescaling_benchmark.png" style="width:90%;margin:15px">
+</center>
+
+## Improve monitoring experience and usability of state backend
+
+This release also improves the monitoring experience and usability of the state backends.
+Previously, RocksDB's log was located in its own DB folder, which made debugging RocksDB difficult. 
+This release keeps RocksDB's log in Flink's log directory by default. RocksDB statistics-based 
+metrics are introduced to help debug performance at the database level, e.g. the total block cache hit/miss
+count within the DB.
+
+## Support overdraft buffer
+
+A new concept of [overdraft network buffers](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/deployment/memory/network_mem_tuning/#overdraft-buffers) 
+is introduced to mitigate the effects of uninterruptibly blocking a subtask thread during back pressure; 
+the number of overdraft buffers can be configured through taskmanager.network.memory.max-overdraft-buffers-per-gate. 
+Starting from 1.16.0, a Flink subtask can by default request up to 5 extra (overdraft) buffers over 
+the regularly configured amount. This change can slightly increase the memory consumption of 
+a Flink job but vastly reduce the checkpoint duration of unaligned checkpoints.
+If a subtask is back-pressured by downstream subtasks and requires more than 
+a single network buffer to finish what it is currently doing, the overdraft buffers come into play.
+Read more about this in [the documentation](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/deployment/memory/network_mem_tuning/#overdraft-buffers).
+
+## Timeout aligned to unaligned checkpoint barrier in the output buffers of an upstream subtask
+
+This release updates the timing of switching from Aligned Checkpoint (AC) to Unaligned Checkpoint (UC).
+With UC enabled, if [execution.checkpointing.aligned-checkpoint-timeout](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/ops/state/checkpointing_under_backpressure/#aligned-checkpoint-timeout) 
+is configured, each checkpoint will still begin as an AC, but when the global checkpoint duration 
+exceeds the aligned-checkpoint-timeout, if the AC has not been completed, 
+then the checkpoint will be switched to unaligned.
+
+Previously, the switch of one subtask needed to wait for all barriers from upstream. 
+If the back pressure is severe, the downstream subtask may not receive all barriers 
+within [checkpointing-timeout](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/deployment/config/#execution-checkpointing-timeout), 
+causing the checkpoint to fail.
+
+In this release, if the barrier cannot be sent from the output buffer to the downstream task 
+within the execution.checkpointing.aligned-checkpoint-timeout, Flink lets the upstream subtasks 
+switch to UC first to send barriers downstream, thereby decreasing 
+the probability of checkpoint timeouts during back pressure.
+More details can be found in [this documentation](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/ops/state/checkpointing_under_backpressure/#aligned-checkpoint-timeout).
+
+## Non-Determinism In Stream Processing
+
+Users often complain about the high cost of understanding stream processing. 
+One of the pain points is non-determinism in stream processing (usually not intuitive), 
+which may cause wrong results or errors. It has existed since the first day 
+that Flink SQL was available.
+
+For complex streaming jobs, it is now possible to detect and resolve potential correctness issues 
+before running them. If the problems cannot be resolved completely, a detailed message prompts users to 
+adjust the SQL so as to avoid introducing non-deterministic problems. More details can be found 
+in [the documentation](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/table/concepts/determinism).
+
+## Enhanced Lookup Join
+
+Lookup join is widely used in stream processing, and we have introduced several improvements:
+- Adds a unified abstraction for lookup source cache and 
+  [related metrics](https://cwiki.apache.org/confluence/display/FLINK/FLIP-221%3A+Abstraction+for+lookup+source+cache+and+metric) 
+  to speed up lookup queries
+- Introduces the configurable asynchronous mode (ALLOW_UNORDERED) via 
+  [job configuration](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/table/config/#table-exec-async-lookup-output-mode) 
+  or [lookup hint](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/table/sql/queries/hints/#lookup) 
+  to significantly  improve query throughput without compromising correctness.
+- Retryable lookup mechanism[10] gives users more tools to solve the delayed updates issue in external systems.

Review Comment:
   what is the `[10]` about?
   ```suggestion
   - Retryable lookup mechanism gives users more tools to solve the delayed updates issue in external systems.
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org