Posted to commits@flink.apache.org by tr...@apache.org on 2022/01/04 13:58:48 UTC

[flink-web] branch asf-site updated (9e9375a -> d9a7d88)

This is an automated email from the ASF dual-hosted git repository.

trohrmann pushed a change to branch asf-site
in repository https://gitbox.apache.org/repos/asf/flink-web.git.


    from 9e9375a  Rebuilt website
     new 6ffe80f  Add blog post "How We Improved Scheduler Performance for Large-scale Jobs"
     new d9a7d88  Rebuild website

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../2022-01-04-scheduler-performance-part-one.md   |  76 +++
 .../2022-01-04-scheduler-performance-part-two.md   | 148 +++++
 .../01/04/scheduler-performance-part-one.html}     | 139 +++--
 .../2022/01/04/scheduler-performance-part-two.html | 403 ++++++++++++++
 content/blog/feed.xml                              | 597 ++++++++-------------
 content/blog/index.html                            |  85 +--
 content/blog/page10/index.html                     |  85 +--
 content/blog/page11/index.html                     |  89 +--
 content/blog/page12/index.html                     |  92 ++--
 content/blog/page13/index.html                     |  90 ++--
 content/blog/page14/index.html                     |  87 +--
 content/blog/page15/index.html                     |  88 +--
 content/blog/page16/index.html                     |  89 +--
 content/blog/page17/index.html                     |  80 ++-
 content/blog/{page11 => page18}/index.html         | 181 ++-----
 content/blog/page2/index.html                      |  85 ++-
 content/blog/page3/index.html                      |  81 ++-
 content/blog/page4/index.html                      |  83 +--
 content/blog/page5/index.html                      |  85 +--
 content/blog/page6/index.html                      |  83 ++-
 content/blog/page7/index.html                      |  81 ++-
 content/blog/page8/index.html                      |  83 +--
 content/blog/page9/index.html                      |  83 ++-
 .../1-distribution-pattern.svg                     |   4 +
 .../2022-01-05-scheduler-performance/2-groups.svg  |   4 +
 .../3-how-shuffle-descriptors-are-distributed.svg  |   4 +
 .../4-pipelined-region.svg                         |   4 +
 .../5-scheduling-deadlock.svg                      |   4 +
 .../6-building-pipelined-region.svg                |   4 +
 content/index.html                                 |  12 +-
 content/zh/index.html                              |  12 +-
 .../1-distribution-pattern.svg                     |   4 +
 .../2022-01-05-scheduler-performance/2-groups.svg  |   4 +
 .../3-how-shuffle-descriptors-are-distributed.svg  |   4 +
 .../4-pipelined-region.svg                         |   4 +
 .../5-scheduling-deadlock.svg                      |   4 +
 .../6-building-pipelined-region.svg                |   4 +
 37 files changed, 1954 insertions(+), 1111 deletions(-)
 create mode 100644 _posts/2022-01-04-scheduler-performance-part-one.md
 create mode 100644 _posts/2022-01-04-scheduler-performance-part-two.md
 copy content/{news/2016/09/05/release-1.1.2.html => 2022/01/04/scheduler-performance-part-one.html} (66%)
 create mode 100644 content/2022/01/04/scheduler-performance-part-two.html
 copy content/blog/{page11 => page18}/index.html (89%)
 create mode 100644 content/img/blog/2022-01-05-scheduler-performance/1-distribution-pattern.svg
 create mode 100644 content/img/blog/2022-01-05-scheduler-performance/2-groups.svg
 create mode 100644 content/img/blog/2022-01-05-scheduler-performance/3-how-shuffle-descriptors-are-distributed.svg
 create mode 100644 content/img/blog/2022-01-05-scheduler-performance/4-pipelined-region.svg
 create mode 100644 content/img/blog/2022-01-05-scheduler-performance/5-scheduling-deadlock.svg
 create mode 100644 content/img/blog/2022-01-05-scheduler-performance/6-building-pipelined-region.svg
 create mode 100644 img/blog/2022-01-05-scheduler-performance/1-distribution-pattern.svg
 create mode 100644 img/blog/2022-01-05-scheduler-performance/2-groups.svg
 create mode 100644 img/blog/2022-01-05-scheduler-performance/3-how-shuffle-descriptors-are-distributed.svg
 create mode 100644 img/blog/2022-01-05-scheduler-performance/4-pipelined-region.svg
 create mode 100644 img/blog/2022-01-05-scheduler-performance/5-scheduling-deadlock.svg
 create mode 100644 img/blog/2022-01-05-scheduler-performance/6-building-pipelined-region.svg

[flink-web] 01/02: Add blog post "How We Improved Scheduler Performance for Large-scale Jobs"

Posted by tr...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

trohrmann pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/flink-web.git

commit 6ffe80f166cfef73ae98a2242b1e7a3df8a52bf0
Author: Thesharing <cy...@outlook.com>
AuthorDate: Wed Dec 29 20:47:53 2021 +0800

    Add blog post "How We Improved Scheduler Performance for Large-scale Jobs"
    
    This closes #494.
---
 .../2022-01-04-scheduler-performance-part-one.md   |  76 +++++++++++
 .../2022-01-04-scheduler-performance-part-two.md   | 148 +++++++++++++++++++++
 .../1-distribution-pattern.svg                     |   4 +
 .../2022-01-05-scheduler-performance/2-groups.svg  |   4 +
 .../3-how-shuffle-descriptors-are-distributed.svg  |   4 +
 .../4-pipelined-region.svg                         |   4 +
 .../5-scheduling-deadlock.svg                      |   4 +
 .../6-building-pipelined-region.svg                |   4 +
 8 files changed, 248 insertions(+)

diff --git a/_posts/2022-01-04-scheduler-performance-part-one.md b/_posts/2022-01-04-scheduler-performance-part-one.md
new file mode 100644
index 0000000..97404a5
--- /dev/null
+++ b/_posts/2022-01-04-scheduler-performance-part-one.md
@@ -0,0 +1,76 @@
+---
+layout: post
+title: "How We Improved Scheduler Performance for Large-scale Jobs - Part One"
+date: 2022-01-04T08:00:00.000Z
+authors:
+- Zhilong Hong:
+  name: "Zhilong Hong"
+- Zhu Zhu:
+  name: "Zhu Zhu"
+- DaisyTsang:
+  name: "Daisy Tsang"
+- Till Rohrmann:
+  name: "Till Rohrmann"
+  twitter: "stsffap"
+
+excerpt: To improve the performance of the scheduler for large-scale jobs, several optimizations were introduced in Flink 1.13 and 1.14. In this blog post we'll take a look at them.
+---
+
+# Introduction
+
+When scheduling large-scale jobs in Flink 1.12, a lot of time is required to initialize jobs and deploy tasks. The scheduler also requires a large amount of heap memory in order to store the execution topology and host temporary deployment descriptors. For example, for a job with a topology that contains two vertices connected with an all-to-all edge and a parallelism of 10k (which means there are 10k source tasks and 10k sink tasks and every source task is connected to all sink tasks),  [...]
+
+Furthermore, task deployment may block the JobManager's main thread for a long time and the JobManager will not be able to respond to any other requests from TaskManagers. This could lead to heartbeat timeouts that trigger a failover. In the worst case, this will render the Flink cluster unusable because it cannot deploy the job.
+
+To improve the performance of the scheduler for large-scale jobs, we've implemented several optimizations in Flink 1.13 and 1.14:
+
+1. Introduce the concept of consuming groups to optimize procedures related to the complexity of topologies, including the initialization, scheduling, failover, and partition release. This also reduces the memory required to store the topology;
+2. Introduce a cache to optimize task deployment, which makes the process faster and requires less memory;
+3. Leverage characteristics of the logical topology and the scheduling topology to speed up the building of pipelined regions.
+
+# Benchmarking Results
+
+To estimate the effect of our optimizations, we conducted several experiments to compare the performance of Flink 1.12 (before the optimization) with Flink 1.14 (after the optimization). The job in our experiments contains two vertices connected with an all-to-all edge. The parallelisms of these vertices are both 10K. To make temporary deployment descriptors distributed via the blob server, we set the configuration [blob.offload.minsize]({{site.DOCS_BASE_URL}}flink-docs-release-1.14/docs [...]
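+
+Below is a minimal sketch of how such a threshold could be set programmatically. The concrete 100 KiB value is purely illustrative (it is not necessarily the value used in the experiments); only the `blob.offload.minsize` key itself is taken from the text above.
+
+```java
+import org.apache.flink.configuration.Configuration;
+
+public class BenchmarkConfigSketch {
+
+    public static Configuration createConfiguration() {
+        Configuration conf = new Configuration();
+        // Any serialized deployment data larger than this threshold is offloaded to the
+        // blob server instead of being shipped inside the RPC message.
+        conf.setInteger("blob.offload.minsize", 100 * 1024); // illustrative value, in bytes
+        return conf;
+    }
+}
+```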
+
+<center>
+Table 1 - The comparison of time cost between Flink 1.12 and 1.14
+<table width="95%" border="1">
+  <thead>
+    <tr>
+      <th style="text-align: center">Procedure</th>
+      <th style="text-align: center">1.12</th>
+      <th style="text-align: center">1.14</th>
+      <th style="text-align: center">Reduction(%)</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td style="text-align: center">Job Initialization</td>
+      <td style="text-align: center">11,431ms</td>
+      <td style="text-align: center">627ms</td>
+      <td style="text-align: center">94.51%</td>
+    </tr>
+    <tr>
+      <td style="text-align: center">Task Deployment</td>
+      <td style="text-align: center">63,118ms</td>
+      <td style="text-align: center">17,183ms</td>
+      <td style="text-align: center">72.78%</td>
+    </tr>
+    <tr>
+      <td style="text-align: center">Computing tasks to restart on failover</td>
+      <td style="text-align: center">37,195ms</td>
+      <td style="text-align: center">170ms</td>
+      <td style="text-align: center">99.55%</td>
+    </tr>
+  </tbody>
+</table>
+</center>
+
+<br/>
+In addition to faster scheduling, memory usage is significantly reduced. With Flink 1.12, the JobManager requires 30 GiB of heap memory to deploy the test job and keep it running stably, while with Flink 1.14 the minimum heap memory required by the JobManager is only 2 GiB.
+
+There are also fewer occurrences of long-lasting garbage collection. When running the test job with Flink 1.12, a garbage collection lasting more than 10 seconds occurs during both job initialization and task deployment. With Flink 1.14, since there is no long-lasting garbage collection, the risk of heartbeat timeouts is also reduced, which improves cluster stability.
+
+In our experiment, it took more than 4 minutes for the large-scale job with Flink 1.12 to transition to running (excluding the time spent on allocating resources). With Flink 1.14, it took no more than 30 seconds (excluding the time spent on allocating resources). The time cost is reduced by 87%. Users who run large-scale jobs in production and want better scheduling performance should therefore consider upgrading to Flink 1.14.
+
+In [part two](/2022/01/04/scheduler-performance-part-two) of this blog post, we are going to talk about these improvements in detail.
diff --git a/_posts/2022-01-04-scheduler-performance-part-two.md b/_posts/2022-01-04-scheduler-performance-part-two.md
new file mode 100644
index 0000000..87ce0af
--- /dev/null
+++ b/_posts/2022-01-04-scheduler-performance-part-two.md
@@ -0,0 +1,148 @@
+---
+layout: post
+title: "How We Improved Scheduler Performance for Large-scale Jobs - Part Two"
+date: 2022-01-04T08:00:00.000Z
+authors:
+- Zhilong Hong:
+  name: "Zhilong Hong"
+- Zhu Zhu:
+  name: "Zhu Zhu"
+- Daisy Tsang:
+  name: "Daisy Tsang"
+- Till Rohrmann:
+  name: "Till Rohrmann"
+  twitter: "stsffap"
+
+excerpt: Part one of this blog post briefly introduced the optimizations we’ve made to improve the performance of the scheduler; compared to Flink 1.12, the time cost and memory usage of scheduling large-scale jobs in Flink 1.14 is significantly reduced. In part two, we will elaborate on the details of these optimizations.
+---
+
+[Part one](/2022/01/04/scheduler-performance-part-one) of this blog post briefly introduced the optimizations we’ve made to improve the performance of the scheduler; compared to Flink 1.12, the time cost and memory usage of scheduling large-scale jobs in Flink 1.14 is significantly reduced. In part two, we will elaborate on the details of these optimizations.
+
+{% toc %}
+
+# Reducing complexity with groups
+
+A distribution pattern describes how consumer tasks are connected to producer tasks. Currently, there are two distribution patterns in Flink: pointwise and all-to-all. When the distribution pattern is pointwise between two vertices, the [computational complexity](https://en.wikipedia.org/wiki/Big_O_notation) of traversing all edges is O(n). When the distribution pattern is all-to-all, the complexity of traversing all edges is O(n<sup>2</sup>), which means that complexity increases rapidl [...]
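+
+As a concrete illustration with the parallelism used throughout this post: for n = 10,000, a pointwise pattern yields 10,000 edges to traverse, while an all-to-all pattern yields 10,000<sup>2</sup> = 100,000,000 edges.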
+
+<center>
+<br/>
+<img src="{{site.baseurl}}/img/blog/2022-01-05-scheduler-performance/1-distribution-pattern.svg" width="75%"/>
+<br/>
+Fig. 1 - Two distribution patterns in Flink
+</center>
+
+<br/>
+In Flink 1.12, the [ExecutionEdge]({{site.DOCS_BASE_URL}}flink-docs-release-1.12/api/java/org/apache/flink/runtime/executiongraph/ExecutionEdge.html) class is used to store the information of connections between tasks. This means that for the all-to-all distribution pattern, there would be O(n<sup>2</sup>) ExecutionEdges, which would take up a lot of memory for large-scale jobs. For two [JobVertices]({{site.DOCS_BASE_URL}}flink-docs-release-1.14/api/java/org/apache/flink/runtime/jobgraph [...]
+
+As we can see in Fig. 1, for two JobVertices connected with the all-to-all distribution pattern, all [IntermediateResultPartitions]({{site.DOCS_BASE_URL}}flink-docs-release-1.14/api/java/org/apache/flink/runtime/executiongraph/IntermediateResultPartition.html) produced by upstream [ExecutionVertices]({{site.DOCS_BASE_URL}}flink-docs-release-1.14/api/java/org/apache/flink/runtime/executiongraph/ExecutionVertex.html) are [isomorphic](https://en.wikipedia.org/wiki/Isomorphism), which means  [...]
+
+For the all-to-all distribution pattern, since all downstream ExecutionVertices belonging to the same JobVertex are isomorphic and belong to a single group, all the result partitions they consume are connected to this group. This group is called [ConsumerVertexGroup]({{site.DOCS_BASE_URL}}flink-docs-release-1.14/api/java/org/apache/flink/runtime/scheduler/strategy/ConsumerVertexGroup.html). Conversely, all the upstream result partitions are grouped into a single group, and all the consume [...]
+
+The basic idea of our optimizations is to put all the vertices that consume the same result partitions into one ConsumerVertexGroup, and put all the result partitions with the same consumer vertices into one ConsumedPartitionGroup.
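+
+The following is a heavily simplified sketch of this grouping idea. The class names mirror the Flink concepts, but the types are illustrative stand-ins rather than the actual runtime classes:
+
+```java
+import java.util.ArrayList;
+import java.util.List;
+
+public class GroupingSketch {
+
+    // All vertices that consume the same result partitions form one group ...
+    static class ConsumerVertexGroup {
+        final List<String> vertices = new ArrayList<>();
+    }
+
+    // ... and all result partitions with the same consumer vertices form one group.
+    static class ConsumedPartitionGroup {
+        final List<String> partitions = new ArrayList<>();
+    }
+
+    public static void main(String[] args) {
+        int n = 10_000; // parallelism of both the producer and the consumer vertex
+
+        // For an all-to-all edge, only two group objects are stored instead of n * n
+        // ExecutionEdges: every result partition points to the same ConsumerVertexGroup,
+        // and every consumer vertex points to the same ConsumedPartitionGroup.
+        ConsumerVertexGroup consumerGroup = new ConsumerVertexGroup();
+        ConsumedPartitionGroup partitionGroup = new ConsumedPartitionGroup();
+        for (int i = 0; i < n; i++) {
+            partitionGroup.partitions.add("IntermediateResultPartition-" + i);
+            consumerGroup.vertices.add("ExecutionVertex-" + i);
+        }
+
+        System.out.println("ExecutionEdges replaced: " + (long) n * n);
+        System.out.println("Group objects stored:    2");
+    }
+}
+```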
+
+<center>
+<br/>
+<img src="{{site.baseurl}}/img/blog/2022-01-05-scheduler-performance/2-groups.svg" width="80%"/>
+<br/>
+Fig. 2 - How partitions and vertices are grouped w.r.t. distribution patterns
+</center>
+
+<br/>
+When scheduling tasks, Flink needs to iterate over all the connections between result partitions and consumer vertices. In the past, since there were O(n<sup>2</sup>) edges in total, the overall complexity of the iteration was O(n<sup>2</sup>). Now ExecutionEdge is replaced with ConsumerVertexGroup and ConsumedPartitionGroup. As all the isomorphic result partitions are connected to the same downstream ConsumerVertexGroup, when the scheduler iterates over all the connections, it just need [...]
+
+For the pointwise distribution pattern, one ConsumedPartitionGroup is connected to one ConsumerVertexGroup point-to-point. The number of groups is the same as the number of ExecutionEdges. Thus, the computational complexity of iterating over the groups is still O(n).
+
+For the example job we mentioned above, replacing ExecutionEdges with the groups can effectively reduce the memory usage of [ExecutionGraph]({{site.DOCS_BASE_URL}}flink-docs-release-1.14/api/java/org/apache/flink/runtime/executiongraph/ExecutionGraph.html) from more than 4 GiB to about 12 MiB. Based on the concept of groups, we further optimized several procedures, including job initialization, scheduling tasks, failover, and partition releasing. These procedures are all involved with tr [...]
+
+# Optimizations related to task deployment
+
+## The problem
+
+In Flink 1.12, it takes a long time to deploy tasks for large-scale jobs if they contain all-to-all edges. Furthermore, a heartbeat timeout may happen during or after task deployment, which makes the cluster unstable.
+
+Currently, task deployment includes the following steps:
+
+1. A JobManager creates [TaskDeploymentDescriptors]({{site.DOCS_BASE_URL}}flink-docs-release-1.14/api/java/org/apache/flink/runtime/deployment/TaskDeploymentDescriptor.html) for each task, which happens in the JobManager's main thread;
+2. The JobManager serializes TaskDeploymentDescriptors asynchronously;
+3. The JobManager ships serialized TaskDeploymentDescriptors to TaskManagers via RPC messages;
+4. TaskManagers create new tasks based on the TaskDeploymentDescriptors and execute them.
+
+A TaskDeploymentDescriptor (TDD) contains all the information required by TaskManagers to create a task. At the beginning of task deployment, a JobManager creates the TDDs for all tasks. Since this happens in the main thread, the JobManager cannot respond to any other requests. For large-scale jobs, the main thread may get blocked for a long time, heartbeat timeouts may happen, and a failover would be triggered.
+
+A JobManager can become a bottleneck during task deployment since all descriptors are transmitted from it to all TaskManagers. For large-scale jobs, these temporary descriptors would require a lot of heap memory and cause frequent long-term garbage collection pauses.
+
+Thus, we need to speed up the creation of the TDDs. Furthermore, if the size of descriptors can be reduced, then they will be transmitted faster, which leads to faster task deployments.
+
+## The solution
+
+### Cache ShuffleDescriptors
+
+[ShuffleDescriptor]({{site.DOCS_BASE_URL}}flink-docs-release-1.14/api/java/org/apache/flink/runtime/shuffle/ShuffleDescriptor.html)s are used to describe the information of result partitions that a task consumes and can be the largest part of a TaskDeploymentDescriptor. For an all-to-all edge, when the parallelisms of both upstream and downstream vertices are n, the number of ShuffleDescriptors for each downstream vertex is n, since they are connected to n upstream vertices. Thus, the to [...]
+
+However, the ShuffleDescriptors for the downstream vertices are all the same since they all consume the same upstream result partitions. Therefore, Flink doesn't need to create ShuffleDescriptors for each downstream vertex individually. Instead, it can create them once and cache them to be reused. This will decrease the overall complexity of creating TaskDeploymentDescriptors for tasks from O(n<sup>2</sup>) to O(n).
+
+To decrease the size of RPC messages and reduce the transmission of replicated data over the network, the cached ShuffleDescriptors can be compressed. For the example job we mentioned above, if the parallelisms of vertices are both 10k, then each downstream vertex has 10k ShuffleDescriptors. After compression, the size of the serialized value would be reduced by 72%.
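+
+A minimal sketch of the caching idea is shown below, assuming the descriptors are Java-serializable. The class and method names are illustrative and do not correspond to the actual Flink API:
+
+```java
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
+import java.io.ObjectOutputStream;
+import java.io.Serializable;
+import java.util.List;
+import java.util.Map;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.zip.Deflater;
+import java.util.zip.DeflaterOutputStream;
+
+public class ShuffleDescriptorCacheSketch {
+
+    // One cache entry per ConsumedPartitionGroup, keyed by an illustrative group id.
+    private final Map<String, byte[]> cache = new ConcurrentHashMap<>();
+
+    /** Serializes and compresses the descriptors once; all consumers reuse the bytes. */
+    public byte[] getOrCreate(String partitionGroupId, List<? extends Serializable> descriptors)
+            throws IOException {
+        byte[] cached = cache.get(partitionGroupId);
+        if (cached != null) {
+            return cached; // reused by every downstream task of this group
+        }
+        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
+        try (ObjectOutputStream out =
+                new ObjectOutputStream(new DeflaterOutputStream(bytes, new Deflater()))) {
+            out.writeObject(descriptors); // the list implementation must be serializable
+        }
+        byte[] value = bytes.toByteArray();
+        cache.put(partitionGroupId, value);
+        return value;
+    }
+
+    /** Removes the entry once the related partitions are no longer valid. */
+    public void invalidate(String partitionGroupId) {
+        cache.remove(partitionGroupId);
+    }
+}
+```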
+
+### Distribute ShuffleDescriptors via the blob server
+
+A [blob](https://en.wikipedia.org/wiki/Binary_large_object) (binary large object) is a collection of binary data stored as a single entity. Flink hosts a blob server to transport large-sized data between the JobManager and TaskManagers. When a JobManager decides to transmit a large file to TaskManagers, it would first store the file in the blob server (which will also upload the file to the distributed file system) and get a token representing the blob, called the blob key. It would then transmi [...]
+
+During task deployment, the JobManager is responsible for distributing the ShuffleDescriptors to TaskManagers via RPC messages. The messages will be garbage collected once they are sent. However, if the JobManager cannot send the messages as fast as they are created, these messages would take up a lot of space in heap memory and become a heavy burden for the garbage collector. This results in more long stop-the-world garbage collection pauses that slow down task deployment.
+
+To solve this problem, the blob server can be used to distribute large ShuffleDescriptors. The JobManager first sends ShuffleDescriptors to the blob server, which stores ShuffleDescriptors in the DFS. TaskManagers request ShuffleDescriptors from the DFS once they begin to process TaskDeploymentDescriptors. With this change, the JobManager doesn't need to keep all the copies of ShuffleDescriptors in heap memory until they are sent. Moreover, the frequency of garbage collections for large- [...]
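+
+The decision itself can be pictured with the following sketch; `BlobStore` and `MaybeOffloaded` are illustrative stand-ins, not the actual Flink classes:
+
+```java
+public class DescriptorOffloaderSketch {
+
+    /** Minimal stand-in for the blob server: stores bytes and returns a fetchable key. */
+    public interface BlobStore {
+        String put(byte[] bytes);
+    }
+
+    /** Holds either the raw bytes (small descriptors) or a blob key (offloaded ones). */
+    public static final class MaybeOffloaded {
+        public final byte[] inlineBytes; // non-null if shipped inside the RPC message
+        public final String blobKey;     // non-null if offloaded to the blob server
+
+        MaybeOffloaded(byte[] inlineBytes, String blobKey) {
+            this.inlineBytes = inlineBytes;
+            this.blobKey = blobKey;
+        }
+    }
+
+    private final BlobStore blobStore;
+    private final int offloadMinSizeBytes; // corresponds to blob.offload.minsize
+
+    public DescriptorOffloaderSketch(BlobStore blobStore, int offloadMinSizeBytes) {
+        this.blobStore = blobStore;
+        this.offloadMinSizeBytes = offloadMinSizeBytes;
+    }
+
+    public MaybeOffloaded prepare(byte[] serializedShuffleDescriptors) {
+        if (serializedShuffleDescriptors.length >= offloadMinSizeBytes) {
+            // Large descriptors are stored once on the blob server; only the small blob key
+            // travels in the RPC message, so the JobManager does not have to keep big byte
+            // arrays on its heap until every message has been sent.
+            return new MaybeOffloaded(null, blobStore.put(serializedShuffleDescriptors));
+        }
+        // Small descriptors keep being shipped inline in the TaskDeploymentDescriptor.
+        return new MaybeOffloaded(serializedShuffleDescriptors, null);
+    }
+}
+```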
+
+<center>
+<br/>
+<img src="{{site.baseurl}}/img/blog/2022-01-05-scheduler-performance/3-how-shuffle-descriptors-are-distributed.svg" width="80%"/>
+<br/>
+Fig. 3 - How ShuffleDescriptors are distributed
+</center>
+
+<br/>
+To avoid running out of space on the local disk, the cache will be cleared when the related partitions are no longer valid and a size limit is added for ShuffleDescriptors in the blob cache on TaskManagers. If the overall size exceeds the limit, the least recently used cached value will be removed. This ensures that the local disks on the JobManager and TaskManagers won't be filled up with ShuffleDescriptors, especially in session mode.
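+
+A minimal sketch of such a size-limited LRU cache on the TaskManager side could look as follows; the real implementation also evicts entries when the related partitions become invalid, and the names here are illustrative:
+
+```java
+import java.util.Iterator;
+import java.util.LinkedHashMap;
+import java.util.Map;
+
+public class SizeLimitedBlobCacheSketch {
+
+    // accessOrder = true means iteration starts with the least recently used entry.
+    private final LinkedHashMap<String, byte[]> entries = new LinkedHashMap<>(16, 0.75f, true);
+    private final long sizeLimitBytes;
+    private long currentSizeBytes;
+
+    public SizeLimitedBlobCacheSketch(long sizeLimitBytes) {
+        this.sizeLimitBytes = sizeLimitBytes;
+    }
+
+    public synchronized byte[] get(String blobKey) {
+        return entries.get(blobKey);
+    }
+
+    public synchronized void put(String blobKey, byte[] bytes) {
+        byte[] previous = entries.put(blobKey, bytes);
+        currentSizeBytes += bytes.length - (previous == null ? 0 : previous.length);
+
+        // Evict the least recently used entries until the overall size fits the limit again.
+        Iterator<Map.Entry<String, byte[]>> it = entries.entrySet().iterator();
+        while (currentSizeBytes > sizeLimitBytes && it.hasNext()) {
+            Map.Entry<String, byte[]> eldest = it.next();
+            if (eldest.getValue() == bytes) {
+                break; // never evict the entry that was just inserted
+            }
+            currentSizeBytes -= eldest.getValue().length;
+            it.remove();
+        }
+    }
+}
+```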
+
+# Optimizations when building pipelined regions
+
+In Flink, there are two types of data exchanges: pipelined and blocking. When using blocking data exchanges, result partitions are first fully produced and then consumed by the downstream vertices. The produced results are persisted and can be consumed multiple times. When using pipelined data exchanges, result partitions are produced and consumed concurrently. The produced results are not persisted and can be consumed only once.
+
+Since the pipelined data stream is produced and consumed simultaneously, Flink needs to make sure that the vertices connected via pipelined data exchanges execute at the same time. These vertices form a [pipelined region]({{site.DOCS_BASE_URL}}flink-docs-release-1.14/api/java/org/apache/flink/runtime/topology/PipelinedRegion.html). The pipelined region is the basic unit of scheduling and failover by default. During scheduling, all vertices in a pipelined region will be scheduled together [...]
+
+<center>
+<br/>
+<img src="{{site.baseurl}}/img/blog/2022-01-05-scheduler-performance/4-pipelined-region.svg" width="90%"/>
+<br/>
+Fig. 4 - The LogicalPipelinedRegion and the SchedulingPipelinedRegion
+</center>
+
+<br/>
+Currently, there are two types of pipelined regions in the scheduler: [LogicalPipelinedRegion]({{site.DOCS_BASE_URL}}flink-docs-release-1.14/api/java/org/apache/flink/runtime/jobgraph/topology/LogicalPipelinedRegion.html) and [SchedulingPipelinedRegion]({{site.DOCS_BASE_URL}}flink-docs-release-1.14/api/java/org/apache/flink/runtime/scheduler/strategy/SchedulingPipelinedRegion.html). The LogicalPipelinedRegion denotes the pipelined regions on the logical level. It consists of JobVertices  [...]
+
+During the construction of pipelined regions, a problem arises: There may be cyclic dependencies between pipelined regions. A pipelined region can be scheduled if and only if all its dependencies have finished. However, if there are two pipelined regions with cyclic dependencies between each other, there will be a scheduling [deadlock](https://en.wikipedia.org/wiki/Deadlock): each region waits for the other one to be scheduled first, so neither of them can ever be scheduled. Therefore, [Tar [...]
+
+<center>
+<br/>
+<img src="{{site.baseurl}}/img/blog/2022-01-05-scheduler-performance/5-scheduling-deadlock.svg" width="90%"/>
+<br/>
+Fig. 5 - The topology with scheduling deadlock
+</center>
+
+<br/>
+To speed up the construction of pipelined regions, the correspondence between the logical topology and the scheduling topology can be leveraged. Since a SchedulingPipelinedRegion is derived from just one LogicalPipelinedRegion, Flink traverses all LogicalPipelinedRegions and converts them into SchedulingPipelinedRegions one by one. The conversion varies based on the distribution patterns of edges that connect vertices in the LogicalPipelinedRegion.
+
+If there are any all-to-all distribution patterns inside the region, the entire region can just be converted into one SchedulingPipelinedRegion directly. That's because for an all-to-all edge with a pipelined data exchange, all the regions connected to this edge must execute simultaneously, which means they are merged into one region. An all-to-all edge with a blocking data exchange introduces cyclic dependencies, as Fig. 5 shows. All the regions it connects must be merge [...]
+
+If there are only pointwise distribution patterns inside a region, Tarjan's strongly connected components algorithm is still used to ensure that there are no cyclic dependencies. Since there are only pointwise distribution patterns, the number of edges in the topology is O(n), and the computational complexity of the algorithm is therefore O(n).
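+
+For reference, a compact, illustrative implementation of Tarjan's strongly connected components algorithm (not the one used inside Flink) over a graph given as adjacency lists could look like this:
+
+```java
+import java.util.ArrayDeque;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Deque;
+import java.util.List;
+
+public class TarjanSccSketch {
+
+    private final List<List<Integer>> adj; // adjacency lists over vertices 0..n-1
+    private final int[] index;             // discovery order, -1 = unvisited
+    private final int[] lowLink;           // smallest discovery index reachable from the vertex
+    private final boolean[] onStack;
+    private final Deque<Integer> stack = new ArrayDeque<>();
+    private final List<List<Integer>> components = new ArrayList<>();
+    private int nextIndex;
+
+    public TarjanSccSketch(List<List<Integer>> adjacencyLists) {
+        this.adj = adjacencyLists;
+        int n = adjacencyLists.size();
+        this.index = new int[n];
+        this.lowLink = new int[n];
+        this.onStack = new boolean[n];
+        Arrays.fill(index, -1);
+    }
+
+    /** Returns the strongly connected components; the vertices of one component must end up in one region. */
+    public List<List<Integer>> run() {
+        for (int v = 0; v < adj.size(); v++) {
+            if (index[v] == -1) {
+                strongConnect(v);
+            }
+        }
+        return components;
+    }
+
+    private void strongConnect(int v) {
+        index[v] = lowLink[v] = nextIndex++;
+        stack.push(v);
+        onStack[v] = true;
+
+        for (int w : adj.get(v)) {
+            if (index[w] == -1) {
+                strongConnect(w);
+                lowLink[v] = Math.min(lowLink[v], lowLink[w]);
+            } else if (onStack[w]) {
+                lowLink[v] = Math.min(lowLink[v], index[w]);
+            }
+        }
+
+        if (lowLink[v] == index[v]) { // v is the root of a strongly connected component
+            List<Integer> component = new ArrayList<>();
+            int w;
+            do {
+                w = stack.pop();
+                onStack[w] = false;
+                component.add(w);
+            } while (w != v);
+            components.add(component);
+        }
+    }
+}
+```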
+
+<center>
+<br/>
+<img src="{{site.baseurl}}/img/blog/2022-01-05-scheduler-performance/6-building-pipelined-region.svg" width="90%"/>
+<br/>
+Fig. 6 - How to convert a LogicalPipelinedRegion to SchedulingPipelinedRegions
+</center>
+
+<br/>
+After the optimization, the overall computational complexity of building pipelined regions decreases from O(n<sup>2</sup>) to O(n). In our experiments, for the job which contains two vertices connected with a blocking all-to-all edge, when their parallelisms are both 10K, the time of building pipelined regions decreases by 99%, from 8,257 ms to 120 ms.
+
+# Summary
+
+All in all, we've done several optimizations to improve the scheduler’s performance for large-scale jobs in Flink 1.13 and 1.14. The optimizations involve procedures including job initialization, scheduling, task deployment, and failover. If you have any questions about them, please feel free to start a discussion on the dev mailing list.
diff --git a/img/blog/2022-01-05-scheduler-performance/1-distribution-pattern.svg b/img/blog/2022-01-05-scheduler-performance/1-distribution-pattern.svg
new file mode 100644
index 0000000..5424fbd
--- /dev/null
+++ b/img/blog/2022-01-05-scheduler-performance/1-distribution-pattern.svg
@@ -0,0 +1,4 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!-- Do not edit this file with editors other than diagrams.net -->
+<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
+<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1" width="987px" height="357px" viewBox="-0.5 -0.5 987 357" content="&lt;mxfile host=&quot;app.diagrams.net&quot; modified=&quot;2021-12-29T11:32:47.369Z&quot; agent=&quot;5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36&quot; etag=&quot;WNKLNoexVU8kdb9qBtNl&quot; version=&quot;16.1.0&quot; type=&quot;google&quot;&gt;&lt;diagram id [...]
\ No newline at end of file
diff --git a/img/blog/2022-01-05-scheduler-performance/2-groups.svg b/img/blog/2022-01-05-scheduler-performance/2-groups.svg
new file mode 100644
index 0000000..f62484b
--- /dev/null
+++ b/img/blog/2022-01-05-scheduler-performance/2-groups.svg
@@ -0,0 +1,4 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!-- Do not edit this file with editors other than diagrams.net -->
+<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
+<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1" width="1117px" height="367px" viewBox="-0.5 -0.5 1117 367" content="&lt;mxfile host=&quot;app.diagrams.net&quot; modified=&quot;2021-12-29T11:48:39.835Z&quot; agent=&quot;5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36&quot; etag=&quot;r17mJOWVV4jHEWX0ACX3&quot; version=&quot;16.1.0&quot; type=&quot;google&quot;&gt;&lt;diagram  [...]
\ No newline at end of file
diff --git a/img/blog/2022-01-05-scheduler-performance/3-how-shuffle-descriptors-are-distributed.svg b/img/blog/2022-01-05-scheduler-performance/3-how-shuffle-descriptors-are-distributed.svg
new file mode 100644
index 0000000..9032535
--- /dev/null
+++ b/img/blog/2022-01-05-scheduler-performance/3-how-shuffle-descriptors-are-distributed.svg
@@ -0,0 +1,4 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!-- Do not edit this file with editors other than diagrams.net -->
+<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
+<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1" width="832px" height="422px" viewBox="-0.5 -0.5 832 422" content="&lt;mxfile host=&quot;app.diagrams.net&quot; modified=&quot;2021-12-29T11:49:38.587Z&quot; agent=&quot;5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36&quot; etag=&quot;sj7fJ-_3TWIaCKJk82m5&quot; version=&quot;16.1.0&quot; type=&quot;google&quot;&gt;&lt;diagram id [...]
\ No newline at end of file
diff --git a/img/blog/2022-01-05-scheduler-performance/4-pipelined-region.svg b/img/blog/2022-01-05-scheduler-performance/4-pipelined-region.svg
new file mode 100644
index 0000000..0f4494c
--- /dev/null
+++ b/img/blog/2022-01-05-scheduler-performance/4-pipelined-region.svg
@@ -0,0 +1,4 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!-- Do not edit this file with editors other than diagrams.net -->
+<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
+<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1" width="962px" height="382px" viewBox="-0.5 -0.5 962 382" content="&lt;mxfile host=&quot;app.diagrams.net&quot; modified=&quot;2022-01-04T12:41:09.588Z&quot; agent=&quot;5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36&quot; etag=&quot;M1L6mcgOaCav-WM3zpr-&quot; version=&quot;16.1.4&quot; type=&quot;google&quot;&gt;&lt;diagram id [...]
\ No newline at end of file
diff --git a/img/blog/2022-01-05-scheduler-performance/5-scheduling-deadlock.svg b/img/blog/2022-01-05-scheduler-performance/5-scheduling-deadlock.svg
new file mode 100644
index 0000000..2c743e8
--- /dev/null
+++ b/img/blog/2022-01-05-scheduler-performance/5-scheduling-deadlock.svg
@@ -0,0 +1,4 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!-- Do not edit this file with editors other than diagrams.net -->
+<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
+<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1" width="918px" height="361px" viewBox="-0.5 -0.5 918 361" content="&lt;mxfile host=&quot;app.diagrams.net&quot; modified=&quot;2022-01-04T12:36:25.839Z&quot; agent=&quot;5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36&quot; etag=&quot;bH7J1WTlE5dDkxqGV3PL&quot; version=&quot;16.1.0&quot; type=&quot;google&quot;&gt;&lt;diagram id [...]
\ No newline at end of file
diff --git a/img/blog/2022-01-05-scheduler-performance/6-building-pipelined-region.svg b/img/blog/2022-01-05-scheduler-performance/6-building-pipelined-region.svg
new file mode 100644
index 0000000..b2a44e0
--- /dev/null
+++ b/img/blog/2022-01-05-scheduler-performance/6-building-pipelined-region.svg
@@ -0,0 +1,4 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!-- Do not edit this file with editors other than diagrams.net -->
+<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
+<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1" width="971px" height="942px" viewBox="-0.5 -0.5 971 942" content="&lt;mxfile host=&quot;app.diagrams.net&quot; modified=&quot;2021-12-29T11:52:06.980Z&quot; agent=&quot;5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36&quot; etag=&quot;W0obqumORf-6iY1HI_oF&quot; version=&quot;16.1.0&quot; type=&quot;google&quot;&gt;&lt;diagram id [...]
\ No newline at end of file

[flink-web] 02/02: Rebuild website

Posted by tr...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

trohrmann pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/flink-web.git

commit d9a7d88e60f674a284c71df60ceaa01cf2a278e0
Author: Till Rohrmann <tr...@apache.org>
AuthorDate: Tue Jan 4 14:45:05 2022 +0100

    Rebuild website
---
 .../01/04/scheduler-performance-part-one.html}     | 337 +++---------
 .../2022/01/04/scheduler-performance-part-two.html | 403 ++++++++++++++
 content/blog/feed.xml                              | 597 ++++++++-------------
 content/blog/index.html                            |  85 +--
 content/blog/page10/index.html                     |  85 +--
 content/blog/page11/index.html                     |  89 +--
 content/blog/page12/index.html                     |  92 ++--
 content/blog/page13/index.html                     |  90 ++--
 content/blog/page14/index.html                     |  87 +--
 content/blog/page15/index.html                     |  88 +--
 content/blog/page16/index.html                     |  89 +--
 content/blog/page17/index.html                     |  80 ++-
 content/blog/{page11 => page18}/index.html         | 181 ++-----
 content/blog/page2/index.html                      |  85 ++-
 content/blog/page3/index.html                      |  81 ++-
 content/blog/page4/index.html                      |  83 +--
 content/blog/page5/index.html                      |  85 +--
 content/blog/page6/index.html                      |  83 ++-
 content/blog/page7/index.html                      |  81 ++-
 content/blog/page8/index.html                      |  83 +--
 content/blog/page9/index.html                      |  83 ++-
 .../1-distribution-pattern.svg                     |   4 +
 .../2022-01-05-scheduler-performance/2-groups.svg  |   4 +
 .../3-how-shuffle-descriptors-are-distributed.svg  |   4 +
 .../4-pipelined-region.svg                         |   4 +
 .../5-scheduling-deadlock.svg                      |   4 +
 .../6-building-pipelined-region.svg                |   4 +
 content/index.html                                 |  12 +-
 content/zh/index.html                              |  12 +-
 29 files changed, 1725 insertions(+), 1290 deletions(-)

diff --git a/content/index.html b/content/2022/01/04/scheduler-performance-part-one.html
similarity index 55%
copy from content/index.html
copy to content/2022/01/04/scheduler-performance-part-one.html
index 6f78a70..8a4ef82 100644
--- a/content/index.html
+++ b/content/2022/01/04/scheduler-performance-part-one.html
@@ -5,7 +5,7 @@
     <meta http-equiv="X-UA-Compatible" content="IE=edge">
     <meta name="viewport" content="width=device-width, initial-scale=1">
     <!-- The above 3 meta tags *must* come first in the head; any other head content must come *after* these tags -->
-    <title>Apache Flink: Stateful Computations over Data Streams</title>
+    <title>Apache Flink: How We Improved Scheduler Performance for Large-scale Jobs - Part One</title>
     <link rel="shortcut icon" href="/favicon.ico" type="image/x-icon">
     <link rel="icon" href="/favicon.ico" type="image/x-icon">
 
@@ -145,7 +145,7 @@
             <li>
               
                 
-                  <a href="/zh/">中文版</a>
+                  <a href="/zh/2022/01/04/scheduler-performance-part-one.html">中文版</a>
                 
               
             </li>
@@ -193,261 +193,94 @@
       </div>
       <div class="col-sm-9">
       <div class="row-fluid">
-
   <div class="col-sm-12">
-    <p class="lead">
-      <strong>Apache Flink<sup>®</sup> — Stateful Computations over Data Streams</strong>
-    </p>
-  </div>
-
-<div class="col-sm-12">
-  <hr />
-</div>
-
-</div>
-
-<!-- High-level architecture figure -->
-
-<div class="row front-graphic">
-  <hr />
-  <img src="/img/flink-home-graphic.png" width="800px" />
-</div>
-
-<!-- Feature grid -->
-
-<!--
-<div class="row">
-  <div class="col-sm-12">
-    <hr />
-    <h2><a href="/features.html">Features</a></h2>
-  </div>
-</div>
--->
-<div class="row">
-  <div class="col-sm-4">
-    <div class="panel panel-default">
-      <div class="panel-heading">
-        <span class="glyphicon glyphicon-th"></span> <b>All streaming use cases</b>
-      </div>
-      <div class="panel-body">
-        <ul style="font-size: small;">
-          <li>Event-driven Applications</li>
-          <li>Stream &amp; Batch Analytics</li>
-          <li>Data Pipelines &amp; ETL</li>
-        </ul>
-        <a href="/usecases.html">Learn more</a>
-      </div>
-    </div>
-  </div>
-  <div class="col-sm-4">
-    <div class="panel panel-default">
-      <div class="panel-heading">
-        <span class="glyphicon glyphicon-ok"></span> <b>Guaranteed correctness</b>
-      </div>
-      <div class="panel-body">
-        <ul style="font-size: small;">
-          <li>Exactly-once state consistency</li>
-          <li>Event-time processing</li>
-          <li>Sophisticated late data handling</li>
-        </ul>
-        <a href="/flink-applications.html#building-blocks-for-streaming-applications">Learn more</a>
-      </div>
-    </div>
-  </div>
-  <div class="col-sm-4">
-    <div class="panel panel-default">
-      <div class="panel-heading">
-        <span class="glyphicon glyphicon glyphicon-sort-by-attributes"></span> <b>Layered APIs</b>
-      </div>
-      <div class="panel-body">
-        <ul style="font-size: small;">
-          <li>SQL on Stream &amp; Batch Data</li>
-          <li>DataStream API &amp; DataSet API</li>
-          <li>ProcessFunction (Time &amp; State)</li>
-        </ul>
-        <a href="/flink-applications.html#layered-apis">Learn more</a>
-      </div>
-    </div>
-  </div>
-</div>
-<div class="row">
-  <div class="col-sm-4">
-    <div class="panel panel-default">
-      <div class="panel-heading">
-        <span class="glyphicon glyphicon-dashboard"></span> <b>Operational Focus</b>
-      </div>
-      <div class="panel-body">
-        <ul style="font-size: small;">
-          <li>Flexible deployment</li>
-          <li>High-availability setup</li>
-          <li>Savepoints</li>
-        </ul>
-        <a href="/flink-operations.html">Learn more</a>
-      </div>
-    </div>
-  </div>
-  <div class="col-sm-4">
-    <div class="panel panel-default">
-      <div class="panel-heading">
-        <span class="glyphicon glyphicon-fullscreen"></span> <b>Scales to any use case</b>
-      </div>
-      <div class="panel-body">
-        <ul style="font-size: small;">
-          <li>Scale-out architecture</li>
-          <li>Support for very large state</li>
-          <li>Incremental checkpointing</li>
-        </ul>
-        <a href="/flink-architecture.html#run-applications-at-any-scale">Learn more</a>
-      </div>
+    <div class="row">
+      <h1>How We Improved Scheduler Performance for Large-scale Jobs - Part One</h1>
+      <p><i></i></p>
+
+      <article>
+        <p>04 Jan 2022 Zhilong Hong , Zhu Zhu , Daisy Tsang , &amp; Till Rohrmann (<a href="https://twitter.com/stsffap">@stsffap</a>)</p>
+
+<h1 id="introduction">Introduction</h1>
+
+<p>When scheduling large-scale jobs in Flink 1.12, a lot of time is required to initialize jobs and deploy tasks. The scheduler also requires a large amount of heap memory in order to store the execution topology and host temporary deployment descriptors. For example, for a job with a topology that contains two vertices connected with an all-to-all edge and a parallelism of 10k (which means there are 10k source tasks and 10k sink tasks and every source task is connected to all sink tasks [...]
+
+<p>Furthermore, task deployment may block the JobManager’s main thread for a long time and the JobManager will not be able to respond to any other requests from TaskManagers. This could lead to heartbeat timeouts that trigger a failover. In the worst case, this will render the Flink cluster unusable because it cannot deploy the job.</p>
+
+<p>To improve the performance of the scheduler for large-scale jobs, we’ve implemented several optimizations in Flink 1.13 and 1.14:</p>
+
+<ol>
+  <li>Introduce the concept of consuming groups to optimize procedures related to the complexity of topologies, including the initialization, scheduling, failover, and partition release. This also reduces the memory required to store the topology;</li>
+  <li>Introduce a cache to optimize task deployment, which makes the process faster and requires less memory;</li>
+  <li>Leverage characteristics of the logical topology and the scheduling topology to speed up the building of pipelined regions.</li>
+</ol>
+
+<h1 id="benchmarking-results">Benchmarking Results</h1>
+
+<p>To estimate the effect of our optimizations, we conducted several experiments to compare the performance of Flink 1.12 (before the optimization) with Flink 1.14 (after the optimization). The job in our experiments contains two vertices connected with an all-to-all edge. The parallelisms of these vertices are both 10K. To make temporary deployment descriptors distributed via the blob server, we set the configuration <a href="https://nightlies.apache.org/flink/flink-docs-release-1.14/do [...]
+
+<center>
+Table 1 - The comparison of time cost between Flink 1.12 and 1.14
+<table width="95%" border="1">
+  <thead>
+    <tr>
+      <th style="text-align: center">Procedure</th>
+      <th style="text-align: center">1.12</th>
+      <th style="text-align: center">1.14</th>
+      <th style="text-align: center">Reduction(%)</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td style="text-align: center">Job Initialization</td>
+      <td style="text-align: center">11,431ms</td>
+      <td style="text-align: center">627ms</td>
+      <td style="text-align: center">94.51%</td>
+    </tr>
+    <tr>
+      <td style="text-align: center">Task Deployment</td>
+      <td style="text-align: center">63,118ms</td>
+      <td style="text-align: center">17,183ms</td>
+      <td style="text-align: center">72.78%</td>
+    </tr>
+    <tr>
+      <td style="text-align: center">Computing tasks to restart on failover</td>
+      <td style="text-align: center">37,195ms</td>
+      <td style="text-align: center">170ms</td>
+      <td style="text-align: center">99.55%</td>
+    </tr>
+  </tbody>
+</table>
+</center>
+
+<p><br />
+In addition to faster scheduling, memory usage is significantly reduced. With Flink 1.12, the JobManager requires 30 GiB of heap memory to deploy the test job and keep it running stably, while with Flink 1.14 the minimum heap memory required by the JobManager is only 2 GiB.</p>
+
+<p>There are also fewer occurrences of long-lasting garbage collection. When running the test job with Flink 1.12, a garbage collection lasting more than 10 seconds occurs during both job initialization and task deployment. With Flink 1.14, since there is no long-lasting garbage collection, the risk of heartbeat timeouts is also reduced, which improves cluster stability.</p>
+
+<p>In our experiment, it took more than 4 minutes for the large-scale job with Flink 1.12 to transition to running (excluding the time spent on allocating resources). With Flink 1.14, it took no more than 30 seconds (excluding the time spent on allocating resources). The time cost is reduced by 87%. Users who run large-scale jobs in production and want better scheduling performance should therefore consider upgrading to Flink 1.14.</p>
+
+<p>In <a href="/2022/01/04/scheduler-performance-part-two">part two</a> of this blog post, we are going to talk about these improvements in detail.</p>
+
+      </article>
     </div>
-  </div>
-  <div class="col-sm-4">
-    <div class="panel panel-default">
-      <div class="panel-heading">
-        <span class="glyphicon glyphicon-flash"></span> <b>Excellent Performance</b>
-      </div>
-      <div class="panel-body">
-        <ul style="font-size: small;">
-          <li>Low latency</li>
-          <li>High throughput</li>
-          <li>In-Memory computing</li>
-        </ul>
-        <a href="/flink-architecture.html#leverage-in-memory-performance">Learn more</a>
-      </div>
+
+    <div class="row">
+      <div id="disqus_thread"></div>
+      <script type="text/javascript">
+        /* * * CONFIGURATION VARIABLES: EDIT BEFORE PASTING INTO YOUR WEBPAGE * * */
+        var disqus_shortname = 'stratosphere-eu'; // required: replace example with your forum shortname
+
+        /* * * DON'T EDIT BELOW THIS LINE * * */
+        (function() {
+            var dsq = document.createElement('script'); dsq.type = 'text/javascript'; dsq.async = true;
+            dsq.src = '//' + disqus_shortname + '.disqus.com/embed.js';
+             (document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(dsq);
+        })();
+      </script>
     </div>
   </div>
 </div>
-
-<!-- Events section -->
-<div class="row">
-
-<div class="col-sm-12">
-  <hr />
-</div>
-
-<div class="col-sm-3">
-
-  <h2><a>Upcoming Events</a></h2>
-
-</div>
-<div class="col-sm-9">
-  <!-- Flink Forward -->
-  <a href="https://flink-forward.org" target="_blank">
-    <img style="width: 180px; padding-right: 10px" src="/img/flink-forward.png" alt="Flink Forward" />
-  </a>
-  <!-- ApacheCon -->
-  <a href="https://www.apache.org/events/current-event" target="_blank">
-    <img style="width: 200px; padding-right: 10px" src="https://www.apache.org/events/current-event-234x60.png" alt="ApacheCon" />
-  </a>
-    <!-- Flink Forward Asia -->
-    <a href="https://flink-forward.org.cn/" target="_blank">
-      <img style="width: 230px" src="/img/flink-forward-asia.png" alt="Flink Forward Asia" />
-    </a>
-</div>
-
-</div>
-
-<!-- Updates section -->
-
-<div class="row">
-
-<div class="col-sm-12">
-  <hr />
-</div>
-
-<div class="col-sm-3">
-
-  <h2><a href="/blog">Latest Blog Posts</a></h2>
-
-</div>
-
-<div class="col-sm-9">
-
-  <dl>
-      
-        <dt> <a href="/news/2021/12/22/log4j-statefun-release.html">Apache Flink StateFun Log4j emergency release</a></dt>
-        <dd><p>The Apache Flink community has released an emergency bugfix version of Apache Flink Stateful Function 3.1.1.</p>
-
-</dd>
-      
-        <dt> <a href="/news/2021/12/16/log4j-patch-releases.html">Apache Flink Log4j emergency releases</a></dt>
-        <dd><p>The Apache Flink community has released emergency bugfix versions of Apache Flink for the 1.11, 1.12, 1.13 and 1.14 series.</p>
-
-</dd>
-      
-        <dt> <a href="/2021/12/10/log4j-cve.html">Advise on Apache Log4j Zero Day (CVE-2021-44228)</a></dt>
-        <dd>Apache Flink is affected by an Apache Log4j Zero Day (CVE-2021-44228). This blog post contains advise for users on how to address this.</dd>
-      
-        <dt> <a href="/2021/11/03/flink-backward.html">Flink Backward - The Apache Flink Retrospective</a></dt>
-        <dd>A look back at the development cycle for Flink 1.14</dd>
-      
-        <dt> <a href="/2021/10/26/sort-shuffle-part2.html">Sort-Based Blocking Shuffle Implementation in Flink - Part Two</a></dt>
-        <dd>Flink has implemented the sort-based blocking shuffle (FLIP-148) for batch data processing. In this blog post, we will take a close look at the design &amp; implementation details and see what we can gain from it.</dd>
-    
-  </dl>
-
-</div>
-
-<!-- Scripts section -->
-
-<script type="text/javascript" src="/js/jquery.jcarousel.min.js"></script>
-
-<script type="text/javascript">
-
-  $(window).load(function(){
-   $(function() {
-        var jcarousel = $('.jcarousel');
-
-        jcarousel
-            .on('jcarousel:reload jcarousel:create', function () {
-                var carousel = $(this),
-                    width = carousel.innerWidth();
-
-                if (width >= 600) {
-                    width = width / 4;
-                } else if (width >= 350) {
-                    width = width / 3;
-                }
-
-                carousel.jcarousel('items').css('width', Math.ceil(width) + 'px');
-            })
-            .jcarousel({
-                wrap: 'circular',
-                autostart: true
-            });
-
-        $('.jcarousel-control-prev')
-            .jcarouselControl({
-                target: '-=1'
-            });
-
-        $('.jcarousel-control-next')
-            .jcarouselControl({
-                target: '+=1'
-            });
-
-        $('.jcarousel-pagination')
-            .on('jcarouselpagination:active', 'a', function() {
-                $(this).addClass('active');
-            })
-            .on('jcarouselpagination:inactive', 'a', function() {
-                $(this).removeClass('active');
-            })
-            .on('click', function(e) {
-                e.preventDefault();
-            })
-            .jcarouselPagination({
-                perPage: 1,
-                item: function(page) {
-                    return '<a href="#' + page + '">' + page + '</a>';
-                }
-            });
-    });
-  });
-
-</script>
-</div>
-
       </div>
     </div>
 
diff --git a/content/2022/01/04/scheduler-performance-part-two.html b/content/2022/01/04/scheduler-performance-part-two.html
new file mode 100644
index 0000000..35cb684
--- /dev/null
+++ b/content/2022/01/04/scheduler-performance-part-two.html
@@ -0,0 +1,403 @@
+<!DOCTYPE html>
+<html lang="en">
+  <head>
+    <meta charset="utf-8">
+    <meta http-equiv="X-UA-Compatible" content="IE=edge">
+    <meta name="viewport" content="width=device-width, initial-scale=1">
+    <!-- The above 3 meta tags *must* come first in the head; any other head content must come *after* these tags -->
+    <title>Apache Flink: How We Improved Scheduler Performance for Large-scale Jobs - Part Two</title>
+    <link rel="shortcut icon" href="/favicon.ico" type="image/x-icon">
+    <link rel="icon" href="/favicon.ico" type="image/x-icon">
+
+    <!-- Bootstrap -->
+    <link rel="stylesheet" href="/css/bootstrap.min.css">
+    <link rel="stylesheet" href="/css/flink.css">
+    <link rel="stylesheet" href="/css/syntax.css">
+
+    <!-- Blog RSS feed -->
+    <link href="/blog/feed.xml" rel="alternate" type="application/rss+xml" title="Apache Flink Blog: RSS feed" />
+
+    <!-- jQuery (necessary for Bootstrap's JavaScript plugins) -->
+    <!-- We need to load Jquery in the header for custom google analytics event tracking-->
+    <script src="/js/jquery.min.js"></script>
+
+    <!-- HTML5 shim and Respond.js for IE8 support of HTML5 elements and media queries -->
+    <!-- WARNING: Respond.js doesn't work if you view the page via file:// -->
+    <!--[if lt IE 9]>
+      <script src="https://oss.maxcdn.com/html5shiv/3.7.2/html5shiv.min.js"></script>
+      <script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script>
+    <![endif]-->
+  </head>
+  <body>  
+    
+
+    <!-- Main content. -->
+    <div class="container">
+    <div class="row">
+
+      
+     <div id="sidebar" class="col-sm-3">
+        
+
+<!-- Top navbar. -->
+    <nav class="navbar navbar-default">
+        <!-- The logo. -->
+        <div class="navbar-header">
+          <button type="button" class="navbar-toggle collapsed" data-toggle="collapse" data-target="#bs-example-navbar-collapse-1">
+            <span class="icon-bar"></span>
+            <span class="icon-bar"></span>
+            <span class="icon-bar"></span>
+          </button>
+          <div class="navbar-logo">
+            <a href="/">
+              <img alt="Apache Flink" src="/img/flink-header-logo.svg" width="147px" height="73px">
+            </a>
+          </div>
+        </div><!-- /.navbar-header -->
+
+        <!-- The navigation links. -->
+        <div class="collapse navbar-collapse" id="bs-example-navbar-collapse-1">
+          <ul class="nav navbar-nav navbar-main">
+
+            <!-- First menu section explains visitors what Flink is -->
+
+            <!-- What is Stream Processing? -->
+            <!--
+            <li><a href="/streamprocessing1.html">What is Stream Processing?</a></li>
+            -->
+
+            <!-- What is Flink? -->
+            <li><a href="/flink-architecture.html">What is Apache Flink?</a></li>
+
+            
+
+            <!-- What is Stateful Functions? -->
+
+            <li><a href="/stateful-functions.html">What is Stateful Functions?</a></li>
+
+            <!-- Use cases -->
+            <li><a href="/usecases.html">Use Cases</a></li>
+
+            <!-- Powered by -->
+            <li><a href="/poweredby.html">Powered By</a></li>
+
+
+            &nbsp;
+            <!-- Second menu section aims to support Flink users -->
+
+            <!-- Downloads -->
+            <li><a href="/downloads.html">Downloads</a></li>
+
+            <!-- Getting Started -->
+            <li class="dropdown">
+              <a class="dropdown-toggle" data-toggle="dropdown" href="#">Getting Started<span class="caret"></span></a>
+              <ul class="dropdown-menu">
+                <li><a href="https://nightlies.apache.org/flink/flink-docs-release-1.14//docs/try-flink/local_installation/" target="_blank">With Flink <small><span class="glyphicon glyphicon-new-window"></span></small></a></li>
+                <li><a href="https://nightlies.apache.org/flink/flink-statefun-docs-release-3.1/getting-started/project-setup.html" target="_blank">With Flink Stateful Functions <small><span class="glyphicon glyphicon-new-window"></span></small></a></li>
+                <li><a href="/training.html">Training Course</a></li>
+              </ul>
+            </li>
+
+            <!-- Documentation -->
+            <li class="dropdown">
+              <a class="dropdown-toggle" data-toggle="dropdown" href="#">Documentation<span class="caret"></span></a>
+              <ul class="dropdown-menu">
+                <li><a href="https://nightlies.apache.org/flink/flink-docs-release-1.14" target="_blank">Flink 1.14 (Latest stable release) <small><span class="glyphicon glyphicon-new-window"></span></small></a></li>
+                <li><a href="https://nightlies.apache.org/flink/flink-docs-master" target="_blank">Flink Master (Latest Snapshot) <small><span class="glyphicon glyphicon-new-window"></span></small></a></li>
+                <li><a href="https://nightlies.apache.org/flink/flink-statefun-docs-release-3.1" target="_blank">Flink Stateful Functions 3.1 (Latest stable release) <small><span class="glyphicon glyphicon-new-window"></span></small></a></li>
+                <li><a href="https://nightlies.apache.org/flink/flink-statefun-docs-master" target="_blank">Flink Stateful Functions Master (Latest Snapshot) <small><span class="glyphicon glyphicon-new-window"></span></small></a></li>
+              </ul>
+            </li>
+
+            <!-- getting help -->
+            <li><a href="/gettinghelp.html">Getting Help</a></li>
+
+            <!-- Blog -->
+            <li><a href="/blog/"><b>Flink Blog</b></a></li>
+
+
+            <!-- Flink-packages -->
+            <li>
+              <a href="https://flink-packages.org" target="_blank">flink-packages.org <small><span class="glyphicon glyphicon-new-window"></span></small></a>
+            </li>
+            &nbsp;
+
+            <!-- Third menu section aim to support community and contributors -->
+
+            <!-- Community -->
+            <li><a href="/community.html">Community &amp; Project Info</a></li>
+
+            <!-- Roadmap -->
+            <li><a href="/roadmap.html">Roadmap</a></li>
+
+            <!-- Contribute -->
+            <li><a href="/contributing/how-to-contribute.html">How to Contribute</a></li>
+            
+
+            <!-- GitHub -->
+            <li>
+              <a href="https://github.com/apache/flink" target="_blank">Flink on GitHub <small><span class="glyphicon glyphicon-new-window"></span></small></a>
+            </li>
+
+            &nbsp;
+
+            <!-- Language Switcher -->
+            <li>
+              
+                
+                  <a href="/zh/2022/01/04/scheduler-performance-part-two.html">中文版</a>
+                
+              
+            </li>
+
+          </ul>
+
+          <style>
+            .smalllinks:link {
+              display: inline-block !important; background: none; padding-top: 0px; padding-bottom: 0px; padding-right: 0px; min-width: 75px;
+            }
+          </style>
+
+          <ul class="nav navbar-nav navbar-bottom">
+          <hr />
+
+            <!-- Twitter -->
+            <li><a href="https://twitter.com/apacheflink" target="_blank">@ApacheFlink <small><span class="glyphicon glyphicon-new-window"></span></small></a></li>
+
+            <!-- Visualizer -->
+            <li class=" hidden-md hidden-sm"><a href="/visualizer/" target="_blank">Plan Visualizer <small><span class="glyphicon glyphicon-new-window"></span></small></a></li>
+
+            <li >
+                  <a href="/security.html">Flink Security</a>
+            </li>
+
+          <hr />
+
+            <li><a href="https://apache.org" target="_blank">Apache Software Foundation <small><span class="glyphicon glyphicon-new-window"></span></small></a></li>
+
+            <li>
+
+              <a class="smalllinks" href="https://www.apache.org/licenses/" target="_blank">License</a> <small><span class="glyphicon glyphicon-new-window"></span></small>
+
+              <a class="smalllinks" href="https://www.apache.org/security/" target="_blank">Security</a> <small><span class="glyphicon glyphicon-new-window"></span></small>
+
+              <a class="smalllinks" href="https://www.apache.org/foundation/sponsorship.html" target="_blank">Donate</a> <small><span class="glyphicon glyphicon-new-window"></span></small>
+
+              <a class="smalllinks" href="https://www.apache.org/foundation/thanks.html" target="_blank">Thanks</a> <small><span class="glyphicon glyphicon-new-window"></span></small>
+            </li>
+
+          </ul>
+        </div><!-- /.navbar-collapse -->
+    </nav>
+
+      </div>
+      <div class="col-sm-9">
+      <div class="row-fluid">
+  <div class="col-sm-12">
+    <div class="row">
+      <h1>How We Improved Scheduler Performance for Large-scale Jobs - Part Two</h1>
+      <p><i></i></p>
+
+      <article>
+        <p>04 Jan 2022 Zhilong Hong , Zhu Zhu , Daisy Tsang , &amp; Till Rohrmann (<a href="https://twitter.com/stsffap">@stsffap</a>)</p>
+
+<p><a href="/2022/01/04/scheduler-performance-part-one">Part one</a> of this blog post briefly introduced the optimizations we’ve made to improve the performance of the scheduler; compared to Flink 1.12, the time cost and memory usage of scheduling large-scale jobs in Flink 1.14 is significantly reduced. In part two, we will elaborate on the details of these optimizations.</p>
+
+<div class="page-toc">
+<ul id="markdown-toc">
+  <li><a href="#reducing-complexity-with-groups" id="markdown-toc-reducing-complexity-with-groups">Reducing complexity with groups</a></li>
+  <li><a href="#optimizations-related-to-task-deployment" id="markdown-toc-optimizations-related-to-task-deployment">Optimizations related to task deployment</a>    <ul>
+      <li><a href="#the-problem" id="markdown-toc-the-problem">The problem</a></li>
+      <li><a href="#the-solution" id="markdown-toc-the-solution">The solution</a>        <ul>
+          <li><a href="#cache-shuffledescriptors" id="markdown-toc-cache-shuffledescriptors">Cache ShuffleDescriptors</a></li>
+          <li><a href="#distribute-shuffledescriptors-via-the-blob-server" id="markdown-toc-distribute-shuffledescriptors-via-the-blob-server">Distribute ShuffleDescriptors via the blob server</a></li>
+        </ul>
+      </li>
+    </ul>
+  </li>
+  <li><a href="#optimizations-when-building-pipelined-regions" id="markdown-toc-optimizations-when-building-pipelined-regions">Optimizations when building pipelined regions</a></li>
+  <li><a href="#summary" id="markdown-toc-summary">Summary</a></li>
+</ul>
+
+</div>
+
+<h1 id="reducing-complexity-with-groups">Reducing complexity with groups</h1>
+
+<p>A distribution pattern describes how consumer tasks are connected to producer tasks. Currently, there are two distribution patterns in Flink: pointwise and all-to-all. When the distribution pattern is pointwise between two vertices, the <a href="https://en.wikipedia.org/wiki/Big_O_notation">computational complexity</a> of traversing all edges is O(n). When the distribution pattern is all-to-all, the complexity of traversing all edges is O(n<sup>2</sup>), which means that complexity in [...]
+
+<center>
+<br />
+<img src="/img/blog/2022-01-05-scheduler-performance/1-distribution-pattern.svg" width="75%" />
+<br />
+Fig. 1 - Two distribution patterns in Flink
+</center>
+
+<p><br />
+In Flink 1.12, the <a href="https://nightlies.apache.org/flink/flink-docs-release-1.12/api/java/org/apache/flink/runtime/executiongraph/ExecutionEdge.html">ExecutionEdge</a> class is used to store the information of connections between tasks. This means that for the all-to-all distribution pattern, there would be O(n<sup>2</sup>) ExecutionEdges, which would take up a lot of memory for large-scale jobs. For two <a href="https://nightlies.apache.org/flink/flink-docs-release-1.14/api/java/o [...]
+
+<p>As we can see in Fig. 1, for two JobVertices connected with the all-to-all distribution pattern, all <a href="https://nightlies.apache.org/flink/flink-docs-release-1.14/api/java/org/apache/flink/runtime/executiongraph/IntermediateResultPartition.html">IntermediateResultPartitions</a> produced by upstream <a href="https://nightlies.apache.org/flink/flink-docs-release-1.14/api/java/org/apache/flink/runtime/executiongraph/ExecutionVertex.html">ExecutionVertices</a> are <a href="https://e [...]
+
+<p>For the all-to-all distribution pattern, since all downstream ExecutionVertices belonging to the same JobVertex are isomorphic and belong to a single group, all the result partitions they consume are connected to this group. This group is called <a href="https://nightlies.apache.org/flink/flink-docs-release-1.14/api/java/org/apache/flink/runtime/scheduler/strategy/ConsumerVertexGroup.html">ConsumerVertexGroup</a>. Inversely, all the upstream result partitions are grouped into a single [...]
+
+<p>The basic idea of our optimizations is to put all the vertices that consume the same result partitions into one ConsumerVertexGroup, and put all the result partitions with the same consumer vertices into one ConsumedPartitionGroup.</p>
+
+<center>
+<br />
+<img src="/img/blog/2022-01-05-scheduler-performance/2-groups.svg" width="80%" />
+<br />
+Fig. 2 - How partitions and vertices are grouped w.r.t. distribution patterns
+</center>
+
+<p><br />
+When scheduling tasks, Flink needs to iterate over all the connections between result partitions and consumer vertices. In the past, since there were O(n<sup>2</sup>) edges in total, the overall complexity of the iteration was O(n<sup>2</sup>). Now ExecutionEdge is replaced with ConsumerVertexGroup and ConsumedPartitionGroup. As all the isomorphic result partitions are connected to the same downstream ConsumerVertexGroup, when the scheduler iterates over all the connections, it just need [...]
+
+<p>For the pointwise distribution pattern, one ConsumedPartitionGroup is connected to one ConsumerVertexGroup point-to-point. The number of groups is the same as the number of ExecutionEdges. Thus, the computational complexity of iterating over the groups is still O(n).</p>
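+
+<p>To make the grouping idea concrete, here is a minimal Java sketch. It uses hypothetical <code>Partition</code> and <code>Vertex</code> types (not the actual Flink classes) to show how two shared groups replace the O(n<sup>2</sup>) per-edge objects of an all-to-all pattern: every consumer vertex references one shared group of consumed partitions, and every partition references one shared group of consumer vertices, so only O(n) objects are kept and iterated.</p>
+
+<div class="highlight"><pre><code class="language-java">import java.util.ArrayList;
+import java.util.List;
+
+// Sketch only: hypothetical Partition/Vertex types illustrating the idea behind
+// ConsumedPartitionGroup and ConsumerVertexGroup; not the actual Flink classes.
+public class GroupingSketch {
+
+    static class Partition { final int id; Partition(int id) { this.id = id; } }
+    static class Vertex    { final int id; Vertex(int id)    { this.id = id; } }
+
+    public static void main(String[] args) {
+        int parallelism = 4;
+        List&lt;Partition&gt; consumedPartitionGroup = new ArrayList&lt;&gt;();
+        List&lt;Vertex&gt; consumerVertexGroup = new ArrayList&lt;&gt;();
+        for (int i = 0; i &lt; parallelism; i++) {
+            consumedPartitionGroup.add(new Partition(i));
+            consumerVertexGroup.add(new Vertex(i));
+        }
+
+        // With an all-to-all pattern, every consumer vertex references the SAME
+        // consumedPartitionGroup and every result partition references the SAME
+        // consumerVertexGroup, so no n * n edge objects are materialized.
+        for (Partition p : consumedPartitionGroup) {
+            System.out.println("partition " + p.id + " feeds one group of "
+                    + consumerVertexGroup.size() + " consumer vertices");
+        }
+    }
+}
+</code></pre></div>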
+
+<p>For the example job we mentioned above, replacing ExecutionEdges with the groups can effectively reduce the memory usage of <a href="https://nightlies.apache.org/flink/flink-docs-release-1.14/api/java/org/apache/flink/runtime/executiongraph/ExecutionGraph.html">ExecutionGraph</a> from more than 4 GiB to about 12 MiB. Based on the concept of groups, we further optimized several procedures, including job initialization, scheduling tasks, failover, and partition releasing. These procedur [...]
+
+<h1 id="optimizations-related-to-task-deployment">Optimizations related to task deployment</h1>
+
+<h2 id="the-problem">The problem</h2>
+
+<p>In Flink 1.12, it takes a long time to deploy tasks for large-scale jobs if they contain all-to-all edges. Furthermore, a heartbeat timeout may happen during or after task deployment, which makes the cluster unstable.</p>
+
+<p>Currently, task deployment includes the following steps:</p>
+
+<ol>
+  <li>A JobManager creates <a href="https://nightlies.apache.org/flink/flink-docs-release-1.14/api/java/org/apache/flink/runtime/deployment/TaskDeploymentDescriptor.html">TaskDeploymentDescriptors</a> for each task, which happens in the JobManager’s main thread;</li>
+  <li>The JobManager serializes TaskDeploymentDescriptors asynchronously;</li>
+  <li>The JobManager ships serialized TaskDeploymentDescriptors to TaskManagers via RPC messages;</li>
+  <li>TaskManagers create new tasks based on the TaskDeploymentDescriptors and execute them.</li>
+</ol>
+
+<p>A TaskDeploymentDescriptor (TDD) contains all the information required by TaskManagers to create a task. At the beginning of task deployment, a JobManager creates the TDDs for all tasks. Since this happens in the main thread, the JobManager cannot respond to any other requests. For large-scale jobs, the main thread may get blocked for a long time, heartbeat timeouts may happen, and a failover would be triggered.</p>
+
+<p>A JobManager can become a bottleneck during task deployment since all descriptors are transmitted from it to all TaskManagers. For large-scale jobs, these temporary descriptors would require a lot of heap memory and cause frequent long-term garbage collection pauses.</p>
+
+<p>Thus, we need to speed up the creation of the TDDs. Furthermore, if the size of descriptors can be reduced, then they will be transmitted faster, which leads to faster task deployments.</p>
+
+<h2 id="the-solution">The solution</h2>
+
+<h3 id="cache-shuffledescriptors">Cache ShuffleDescriptors</h3>
+
+<p><a href="https://nightlies.apache.org/flink/flink-docs-release-1.14/api/java/org/apache/flink/runtime/shuffle/ShuffleDescriptor.html">ShuffleDescriptor</a>s are used to describe the information of result partitions that a task consumes and can be the largest part of a TaskDeploymentDescriptor. For an all-to-all edge, when the parallelisms of both upstream and downstream vertices are n, the number of ShuffleDescriptors for each downstream vertex is n, since they are connected to n upst [...]
+
+<p>However, the ShuffleDescriptors for the downstream vertices are all the same since they all consume the same upstream result partitions. Therefore, Flink doesn’t need to create ShuffleDescriptors for each downstream vertex individually. Instead, it can create them once and cache them to be reused. This will decrease the overall complexity of creating TaskDeploymentDescriptors for tasks from O(n<sup>2</sup>) to O(n).</p>
+
+<p>To decrease the size of RPC messages and reduce the transmission of replicated data over the network, the cached ShuffleDescriptors can be compressed. For the example job we mentioned above, if the parallelisms of vertices are both 10k, then each downstream vertex has 10k ShuffleDescriptors. After compression, the size of the serialized value would be reduced by 72%.</p>
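+
+<p>A simplified sketch of the combined idea looks as follows. The cache key, class names, and helper methods are hypothetical and this is not the actual Flink code path; it only illustrates that the serialized ShuffleDescriptors of a ConsumedPartitionGroup are compressed and cached once, and the cached bytes are reused for every downstream task instead of being rebuilt per task.</p>
+
+<div class="highlight"><pre><code class="language-java">import java.io.ByteArrayOutputStream;
+import java.util.HashMap;
+import java.util.Map;
+import java.util.zip.Deflater;
+
+// Sketch only (hypothetical cache keyed by a group id), not Flink's API.
+public class ShuffleDescriptorCacheSketch {
+
+    private final Map&lt;String, byte[]&gt; cacheByGroup = new HashMap&lt;&gt;();
+
+    // Serialize + compress once per group, then reuse for all consumer tasks.
+    byte[] getOrCreate(String groupId, byte[] serializedDescriptors) {
+        return cacheByGroup.computeIfAbsent(groupId,
+                id -&gt; compress(serializedDescriptors));
+    }
+
+    private static byte[] compress(byte[] input) {
+        Deflater deflater = new Deflater();
+        deflater.setInput(input);
+        deflater.finish();
+        ByteArrayOutputStream out = new ByteArrayOutputStream();
+        byte[] buffer = new byte[4096];
+        while (!deflater.finished()) {
+            int written = deflater.deflate(buffer);
+            out.write(buffer, 0, written);
+        }
+        deflater.end();
+        return out.toByteArray();
+    }
+}
+</code></pre></div>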
+
+<h3 id="distribute-shuffledescriptors-via-the-blob-server">Distribute ShuffleDescriptors via the blob server</h3>
+
+<p>A <a href="https://en.wikipedia.org/wiki/Binary_large_object">blob</a> (binary large object) is a collection of binary data used to store large files. Flink hosts a blob server to transport large-sized data between the JobManager and TaskManagers. When a JobManager decides to transmit a large file to TaskManagers, it would first store the file in the blob server (which also uploads the file to the distributed file system) and get a token representing the blob, called the blob key. It woul [...]

+
+<p>During task deployment, the JobManager is responsible for distributing the ShuffleDescriptors to TaskManagers via RPC messages. The messages will be garbage collected once they are sent. However, if the JobManager cannot send the messages as fast as they are created, these messages would take up a lot of space in heap memory and become a heavy burden for the garbage collector to deal with. There will be more long-term garbage collections that stop the world and slow down the task depl [...]
+
+<p>To solve this problem, the blob server can be used to distribute large ShuffleDescriptors. The JobManager first sends ShuffleDescriptors to the blob server, which stores ShuffleDescriptors in the DFS. TaskManagers request ShuffleDescriptors from the DFS once they begin to process TaskDeploymentDescriptors. With this change, the JobManager doesn’t need to keep all the copies of ShuffleDescriptors in heap memory until they are sent. Moreover, the frequency of garbage collections for lar [...]
+
+<center>
+<br />
+<img src="/img/blog/2022-01-05-scheduler-performance/3-how-shuffle-descriptors-are-distributed.svg" width="80%" />
+<br />
+Fig. 3 - How ShuffleDescriptors are distributed
+</center>
+
+<p><br />
+To avoid running out of space on the local disk, the cache will be cleared when the related partitions are no longer valid and a size limit is added for ShuffleDescriptors in the blob cache on TaskManagers. If the overall size exceeds the limit, the least recently used cached value will be removed. This ensures that the local disks on the JobManager and TaskManagers won’t be filled up with ShuffleDescriptors, especially in session mode.</p>
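+
+<p>A size-bounded, least-recently-used cache like the one described above can be sketched with a <code>LinkedHashMap</code> in access order. The class and parameter names below are made up for illustration and this is not Flink's actual blob cache implementation:</p>
+
+<div class="highlight"><pre><code class="language-java">import java.util.Iterator;
+import java.util.LinkedHashMap;
+import java.util.Map;
+
+// Sketch only: evict least recently used blobs once a total size limit is exceeded.
+public class BoundedBlobCacheSketch {
+
+    private final long sizeLimitBytes;
+    private long currentSizeBytes;
+
+    // accessOrder = true: iteration starts at the least recently used entry.
+    private final LinkedHashMap&lt;String, byte[]&gt; cache =
+            new LinkedHashMap&lt;&gt;(16, 0.75f, true);
+
+    BoundedBlobCacheSketch(long sizeLimitBytes) {
+        this.sizeLimitBytes = sizeLimitBytes;
+    }
+
+    synchronized void put(String blobKey, byte[] bytes) {
+        byte[] previous = cache.put(blobKey, bytes);
+        if (previous != null) {
+            currentSizeBytes -= previous.length;
+        }
+        currentSizeBytes += bytes.length;
+        // Remove least recently used entries until we are back under the limit.
+        Iterator&lt;Map.Entry&lt;String, byte[]&gt;&gt; it = cache.entrySet().iterator();
+        while (currentSizeBytes &gt; sizeLimitBytes &amp;&amp; it.hasNext()) {
+            Map.Entry&lt;String, byte[]&gt; leastRecentlyUsed = it.next();
+            currentSizeBytes -= leastRecentlyUsed.getValue().length;
+            it.remove();
+        }
+    }
+
+    synchronized byte[] get(String blobKey) {
+        return cache.get(blobKey);
+    }
+}
+</code></pre></div>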
+
+<h1 id="optimizations-when-building-pipelined-regions">Optimizations when building pipelined regions</h1>
+
+<p>In Flink, there are two types of data exchanges: pipelined and blocking. When using blocking data exchanges, result partitions are first fully produced and then consumed by the downstream vertices. The produced results are persisted and can be consumed multiple times. When using pipelined data exchanges, result partitions are produced and consumed concurrently. The produced results are not persisted and can be consumed only once.</p>
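+
+<p>As a toy illustration (plain Java, not Flink code) of the difference between the two exchange types: a blocking exchange materializes the full result before any consumer reads it, while a pipelined exchange hands records to the consumer while they are still being produced.</p>
+
+<div class="highlight"><pre><code class="language-java">import java.util.ArrayList;
+import java.util.List;
+import java.util.concurrent.ArrayBlockingQueue;
+import java.util.concurrent.BlockingQueue;
+
+// Toy illustration only, not Flink code.
+public class ExchangeTypesSketch {
+
+    public static void main(String[] args) throws InterruptedException {
+        // Blocking-style: produce everything first, then consume (re-readable).
+        List&lt;Integer&gt; blockingResult = new ArrayList&lt;&gt;();
+        for (int i = 0; i &lt; 5; i++) { blockingResult.add(i); }
+        blockingResult.forEach(r -&gt; System.out.println("blocking consumer read " + r));
+
+        // Pipelined-style: producer and consumer run at the same time; records
+        // are consumed once and are not persisted.
+        BlockingQueue&lt;Integer&gt; channel = new ArrayBlockingQueue&lt;&gt;(2);
+        Thread producer = new Thread(() -&gt; {
+            for (int i = 0; i &lt; 5; i++) {
+                try { channel.put(i); } catch (InterruptedException e) { return; }
+            }
+        });
+        producer.start();
+        for (int i = 0; i &lt; 5; i++) {
+            System.out.println("pipelined consumer read " + channel.take());
+        }
+        producer.join();
+    }
+}
+</code></pre></div>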
+
+<p>Since the pipelined data stream is produced and consumed simultaneously, Flink needs to make sure that the vertices connected via pipelined data exchanges execute at the same time. These vertices form a <a href="https://nightlies.apache.org/flink/flink-docs-release-1.14/api/java/org/apache/flink/runtime/topology/PipelinedRegion.html">pipelined region</a>. The pipelined region is the basic unit of scheduling and failover by default. During scheduling, all vertices in a pipelined region [...]
+
+<center>
+<br />
+<img src="/img/blog/2022-01-05-scheduler-performance/4-pipelined-region.svg" width="90%" />
+<br />
+Fig. 4 - The LogicalPipelinedRegion and the SchedulingPipelinedRegion
+</center>
+
+<p><br />
+Currently, there are two types of pipelined regions in the scheduler: <a href="https://nightlies.apache.org/flink/flink-docs-release-1.14/api/java/org/apache/flink/runtime/jobgraph/topology/LogicalPipelinedRegion.html">LogicalPipelinedRegion</a> and <a href="https://nightlies.apache.org/flink/flink-docs-release-1.14/api/java/org/apache/flink/runtime/scheduler/strategy/SchedulingPipelinedRegion.html">SchedulingPipelinedRegion</a>. The LogicalPipelinedRegion denotes the pipelined regions o [...]
+
+<p>During the construction of pipelined regions, a problem arises: There may be cyclic dependencies between pipelined regions. A pipelined region can be scheduled if and only if all its dependencies have finished. However, if there are two pipelined regions with cyclic dependencies between each other, there will be a scheduling <a href="https://en.wikipedia.org/wiki/Deadlock">deadlock</a>. They are both waiting for the other one to be scheduled first, and none of them can be scheduled. T [...]
+
+<center>
+<br />
+<img src="/img/blog/2022-01-05-scheduler-performance/5-scheduling-deadlock.svg" width="90%" />
+<br />
+Fig. 5 - The topology with scheduling deadlock
+</center>
+
+<p><br />
+To speed up the construction of pipelined regions, the relationship between the logical topology and the scheduling topology can be leveraged. Since a SchedulingPipelinedRegion is derived from just one LogicalPipelinedRegion, Flink traverses all LogicalPipelinedRegions and converts them into SchedulingPipelinedRegions one by one. The conversion varies based on the distribution patterns of edges that connect vertices in the LogicalPipelinedRegion.</p>
+
+<p>If there are any all-to-all distribution patterns inside the region, the entire region can just be converted into one SchedulingPipelinedRegion directly. That’s because for the all-to-all edge with the pipelined data exchange, all the regions connected to this edge must execute simultaneously, which means they are merged into one region. For the all-to-all edge with a blocking data exchange, it will introduce cyclic dependencies, as Fig. 5 shows. All the regions it connects must be me [...]
+
+<p>If there are only pointwise distribution patterns inside a region, Tarjan’s strongly connected components algorithm is still used to ensure no cyclic dependencies. Since there are only pointwise distribution patterns, the number of edges in the topology is O(n), and the computational complexity of the algorithm will be O(n).</p>
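+
+<p>The conversion rule can be sketched as follows. The types are hypothetical, and the pointwise branch is simplified to merging vertices connected by pipelined edges; the real implementation uses Tarjan's strongly connected components algorithm as described above, which is omitted here for brevity.</p>
+
+<div class="highlight"><pre><code class="language-java">import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+// Sketch only (hypothetical types, not Flink's API): any all-to-all edge turns the
+// whole logical region into one scheduling region without expanding the O(n^2)
+// connections; with only pointwise edges, vertices connected by pipelined edges are
+// merged (the real implementation additionally runs Tarjan's SCC algorithm).
+public class RegionConversionSketch {
+
+    enum Pattern { ALL_TO_ALL, POINTWISE }
+
+    static class Edge {
+        final int source, target; final Pattern pattern; final boolean pipelined;
+        Edge(int source, int target, Pattern pattern, boolean pipelined) {
+            this.source = source; this.target = target;
+            this.pattern = pattern; this.pipelined = pipelined;
+        }
+    }
+
+    static List&lt;List&lt;Integer&gt;&gt; convert(int vertexCount, List&lt;Edge&gt; edges) {
+        for (Edge e : edges) {
+            if (e.pattern == Pattern.ALL_TO_ALL) {
+                List&lt;Integer&gt; all = new ArrayList&lt;&gt;();
+                for (int v = 0; v &lt; vertexCount; v++) { all.add(v); }
+                return List.of(all); // one region, decided without edge expansion
+            }
+        }
+        // Pointwise only: merge vertices connected by pipelined edges (union-find).
+        int[] parent = new int[vertexCount];
+        for (int v = 0; v &lt; vertexCount; v++) { parent[v] = v; }
+        for (Edge e : edges) {
+            if (e.pipelined) { parent[find(parent, e.source)] = find(parent, e.target); }
+        }
+        Map&lt;Integer, List&lt;Integer&gt;&gt; regions = new HashMap&lt;&gt;();
+        for (int v = 0; v &lt; vertexCount; v++) {
+            regions.computeIfAbsent(find(parent, v), k -&gt; new ArrayList&lt;&gt;()).add(v);
+        }
+        return new ArrayList&lt;&gt;(regions.values());
+    }
+
+    static int find(int[] parent, int v) {
+        while (parent[v] != v) { v = parent[v]; }
+        return v;
+    }
+}
+</code></pre></div>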
+
+<center>
+<br />
+<img src="/img/blog/2022-01-05-scheduler-performance/6-building-pipelined-region.svg" width="90%" />
+<br />
+Fig. 6 - How to convert a LogicalPipelinedRegion to SchedulingPipelinedRegions
+</center>
+
+<p><br />
+After the optimization, the overall computational complexity of building pipelined regions decreases from O(n<sup>2</sup>) to O(n). In our experiments, for the job which contains two vertices connected with a blocking all-to-all edge, when their parallelisms are both 10K, the time of building pipelined regions decreases by 99%, from 8,257 ms to 120 ms.</p>
+
+<h1 id="summary">Summary</h1>
+
+<p>All in all, we’ve done several optimizations to improve the scheduler’s performance for large-scale jobs in Flink 1.13 and 1.14. The optimizations involve procedures including job initialization, scheduling, task deployment, and failover. If you have any questions about them, please feel free to start a discussion in the dev mailing list.</p>
+
+      </article>
+    </div>
+
+    <div class="row">
+      <div id="disqus_thread"></div>
+      <script type="text/javascript">
+        /* * * CONFIGURATION VARIABLES: EDIT BEFORE PASTING INTO YOUR WEBPAGE * * */
+        var disqus_shortname = 'stratosphere-eu'; // required: replace example with your forum shortname
+
+        /* * * DON'T EDIT BELOW THIS LINE * * */
+        (function() {
+            var dsq = document.createElement('script'); dsq.type = 'text/javascript'; dsq.async = true;
+            dsq.src = '//' + disqus_shortname + '.disqus.com/embed.js';
+             (document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(dsq);
+        })();
+      </script>
+    </div>
+  </div>
+</div>
+      </div>
+    </div>
+
+    <hr />
+
+    <div class="row">
+      <div class="footer text-center col-sm-12">
+        <p>Copyright © 2014-2021 <a href="http://apache.org">The Apache Software Foundation</a>. All Rights Reserved.</p>
+        <p>Apache Flink, Flink®, Apache®, the squirrel logo, and the Apache feather logo are either registered trademarks or trademarks of The Apache Software Foundation.</p>
+        <p><a href="/privacy-policy.html">Privacy Policy</a> &middot; <a href="/blog/feed.xml">RSS feed</a></p>
+      </div>
+    </div>
+    </div><!-- /.container -->
+
+    <!-- Include all compiled plugins (below), or include individual files as needed -->
+    <script src="/js/jquery.matchHeight-min.js"></script>
+    <script src="/js/bootstrap.min.js"></script>
+    <script src="/js/codetabs.js"></script>
+    <script src="/js/stickysidebar.js"></script>
+
+    <!-- Google Analytics -->
+    <script>
+      (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
+      (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
+      m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
+      })(window,document,'script','//www.google-analytics.com/analytics.js','ga');
+
+      ga('create', 'UA-52545728-1', 'auto');
+      ga('send', 'pageview');
+    </script>
+  </body>
+</html>
diff --git a/content/blog/feed.xml b/content/blog/feed.xml
index 861940b..b192c25 100644
--- a/content/blog/feed.xml
+++ b/content/blog/feed.xml
@@ -7,6 +7,230 @@
 <atom:link href="https://flink.apache.org/blog/feed.xml" rel="self" type="application/rss+xml" />
 
 <item>
+<title>How We Improved Scheduler Performance for Large-scale Jobs - Part Two</title>
+<description>&lt;p&gt;&lt;a href=&quot;/2022/01/04/scheduler-performance-part-one&quot;&gt;Part one&lt;/a&gt; of this blog post briefly introduced the optimizations we’ve made to improve the performance of the scheduler; compared to Flink 1.12, the time cost and memory usage of scheduling large-scale jobs in Flink 1.14 is significantly reduced. In part two, we will elaborate on the details of these optimizations.&lt;/p&gt;
+
+&lt;div class=&quot;page-toc&quot;&gt;
+&lt;ul id=&quot;markdown-toc&quot;&gt;
+  &lt;li&gt;&lt;a href=&quot;#reducing-complexity-with-groups&quot; id=&quot;markdown-toc-reducing-complexity-with-groups&quot;&gt;Reducing complexity with groups&lt;/a&gt;&lt;/li&gt;
+  &lt;li&gt;&lt;a href=&quot;#optimizations-related-to-task-deployment&quot; id=&quot;markdown-toc-optimizations-related-to-task-deployment&quot;&gt;Optimizations related to task deployment&lt;/a&gt;    &lt;ul&gt;
+      &lt;li&gt;&lt;a href=&quot;#the-problem&quot; id=&quot;markdown-toc-the-problem&quot;&gt;The problem&lt;/a&gt;&lt;/li&gt;
+      &lt;li&gt;&lt;a href=&quot;#the-solution&quot; id=&quot;markdown-toc-the-solution&quot;&gt;The solution&lt;/a&gt;        &lt;ul&gt;
+          &lt;li&gt;&lt;a href=&quot;#cache-shuffledescriptors&quot; id=&quot;markdown-toc-cache-shuffledescriptors&quot;&gt;Cache ShuffleDescriptors&lt;/a&gt;&lt;/li&gt;
+          &lt;li&gt;&lt;a href=&quot;#distribute-shuffledescriptors-via-the-blob-server&quot; id=&quot;markdown-toc-distribute-shuffledescriptors-via-the-blob-server&quot;&gt;Distribute ShuffleDescriptors via the blob server&lt;/a&gt;&lt;/li&gt;
+        &lt;/ul&gt;
+      &lt;/li&gt;
+    &lt;/ul&gt;
+  &lt;/li&gt;
+  &lt;li&gt;&lt;a href=&quot;#optimizations-when-building-pipelined-regions&quot; id=&quot;markdown-toc-optimizations-when-building-pipelined-regions&quot;&gt;Optimizations when building pipelined regions&lt;/a&gt;&lt;/li&gt;
+  &lt;li&gt;&lt;a href=&quot;#summary&quot; id=&quot;markdown-toc-summary&quot;&gt;Summary&lt;/a&gt;&lt;/li&gt;
+&lt;/ul&gt;
+
+&lt;/div&gt;
+
+&lt;h1 id=&quot;reducing-complexity-with-groups&quot;&gt;Reducing complexity with groups&lt;/h1&gt;
+
+&lt;p&gt;A distribution pattern describes how consumer tasks are connected to producer tasks. Currently, there are two distribution patterns in Flink: pointwise and all-to-all. When the distribution pattern is pointwise between two vertices, the &lt;a href=&quot;https://en.wikipedia.org/wiki/Big_O_notation&quot;&gt;computational complexity&lt;/a&gt; of traversing all edges is O(n). When the distribution pattern is all-to-all, the complexity of traversing all edges is O(n&lt;sup&gt;2&lt;/ [...]
+
+&lt;center&gt;
+&lt;br /&gt;
+&lt;img src=&quot;/img/blog/2022-01-05-scheduler-performance/1-distribution-pattern.svg&quot; width=&quot;75%&quot; /&gt;
+&lt;br /&gt;
+Fig. 1 - Two distribution patterns in Flink
+&lt;/center&gt;
+
+&lt;p&gt;&lt;br /&gt;
+In Flink 1.12, the &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.12/api/java/org/apache/flink/runtime/executiongraph/ExecutionEdge.html&quot;&gt;ExecutionEdge&lt;/a&gt; class is used to store the information of connections between tasks. This means that for the all-to-all distribution pattern, there would be O(n&lt;sup&gt;2&lt;/sup&gt;) ExecutionEdges, which would take up a lot of memory for large-scale jobs. For two &lt;a href=&quot;https://nightlies.apache.or [...]
+
+&lt;p&gt;As we can see in Fig. 1, for two JobVertices connected with the all-to-all distribution pattern, all &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.14/api/java/org/apache/flink/runtime/executiongraph/IntermediateResultPartition.html&quot;&gt;IntermediateResultPartitions&lt;/a&gt; produced by upstream &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.14/api/java/org/apache/flink/runtime/executiongraph/ExecutionVertex.html&quot;&gt; [...]
+
+&lt;p&gt;For the all-to-all distribution pattern, since all downstream ExecutionVertices belonging to the same JobVertex are isomorphic and belong to a single group, all the result partitions they consume are connected to this group. This group is called &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.14/api/java/org/apache/flink/runtime/scheduler/strategy/ConsumerVertexGroup.html&quot;&gt;ConsumerVertexGroup&lt;/a&gt;. Inversely, all the upstream result partitio [...]
+
+&lt;p&gt;The basic idea of our optimizations is to put all the vertices that consume the same result partitions into one ConsumerVertexGroup, and put all the result partitions with the same consumer vertices into one ConsumedPartitionGroup.&lt;/p&gt;
+
+&lt;center&gt;
+&lt;br /&gt;
+&lt;img src=&quot;/img/blog/2022-01-05-scheduler-performance/2-groups.svg&quot; width=&quot;80%&quot; /&gt;
+&lt;br /&gt;
+Fig. 2 - How partitions and vertices are grouped w.r.t. distribution patterns
+&lt;/center&gt;
+
+&lt;p&gt;&lt;br /&gt;
+When scheduling tasks, Flink needs to iterate over all the connections between result partitions and consumer vertices. In the past, since there were O(n&lt;sup&gt;2&lt;/sup&gt;) edges in total, the overall complexity of the iteration was O(n&lt;sup&gt;2&lt;/sup&gt;). Now ExecutionEdge is replaced with ConsumerVertexGroup and ConsumedPartitionGroup. As all the isomorphic result partitions are connected to the same downstream ConsumerVertexGroup, when the scheduler iterates over all the c [...]
+
+&lt;p&gt;For the pointwise distribution pattern, one ConsumedPartitionGroup is connected to one ConsumerVertexGroup point-to-point. The number of groups is the same as the number of ExecutionEdges. Thus, the computational complexity of iterating over the groups is still O(n).&lt;/p&gt;
+
+&lt;p&gt;For the example job we mentioned above, replacing ExecutionEdges with the groups can effectively reduce the memory usage of &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.14/api/java/org/apache/flink/runtime/executiongraph/ExecutionGraph.html&quot;&gt;ExecutionGraph&lt;/a&gt; from more than 4 GiB to about 12 MiB. Based on the concept of groups, we further optimized several procedures, including job initialization, scheduling tasks, failover, and partiti [...]
+
+&lt;h1 id=&quot;optimizations-related-to-task-deployment&quot;&gt;Optimizations related to task deployment&lt;/h1&gt;
+
+&lt;h2 id=&quot;the-problem&quot;&gt;The problem&lt;/h2&gt;
+
+&lt;p&gt;In Flink 1.12, it takes a long time to deploy tasks for large-scale jobs if they contain all-to-all edges. Furthermore, a heartbeat timeout may happen during or after task deployment, which makes the cluster unstable.&lt;/p&gt;
+
+&lt;p&gt;Currently, task deployment includes the following steps:&lt;/p&gt;
+
+&lt;ol&gt;
+  &lt;li&gt;A JobManager creates &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.14/api/java/org/apache/flink/runtime/deployment/TaskDeploymentDescriptor.html&quot;&gt;TaskDeploymentDescriptors&lt;/a&gt; for each task, which happens in the JobManager’s main thread;&lt;/li&gt;
+  &lt;li&gt;The JobManager serializes TaskDeploymentDescriptors asynchronously;&lt;/li&gt;
+  &lt;li&gt;The JobManager ships serialized TaskDeploymentDescriptors to TaskManagers via RPC messages;&lt;/li&gt;
+  &lt;li&gt;TaskManagers create new tasks based on the TaskDeploymentDescriptors and execute them.&lt;/li&gt;
+&lt;/ol&gt;
+
+&lt;p&gt;A TaskDeploymentDescriptor (TDD) contains all the information required by TaskManagers to create a task. At the beginning of task deployment, a JobManager creates the TDDs for all tasks. Since this happens in the main thread, the JobManager cannot respond to any other requests. For large-scale jobs, the main thread may get blocked for a long time, heartbeat timeouts may happen, and a failover would be triggered.&lt;/p&gt;
+
+&lt;p&gt;A JobManager can become a bottleneck during task deployment since all descriptors are transmitted from it to all TaskManagers. For large-scale jobs, these temporary descriptors would require a lot of heap memory and cause frequent long-term garbage collection pauses.&lt;/p&gt;
+
+&lt;p&gt;Thus, we need to speed up the creation of the TDDs. Furthermore, if the size of descriptors can be reduced, then they will be transmitted faster, which leads to faster task deployments.&lt;/p&gt;
+
+&lt;h2 id=&quot;the-solution&quot;&gt;The solution&lt;/h2&gt;
+
+&lt;h3 id=&quot;cache-shuffledescriptors&quot;&gt;Cache ShuffleDescriptors&lt;/h3&gt;
+
+&lt;p&gt;&lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.14/api/java/org/apache/flink/runtime/shuffle/ShuffleDescriptor.html&quot;&gt;ShuffleDescriptor&lt;/a&gt;s are used to describe the information of result partitions that a task consumes and can be the largest part of a TaskDeploymentDescriptor. For an all-to-all edge, when the parallelisms of both upstream and downstream vertices are n, the number of ShuffleDescriptors for each downstream vertex is n, since  [...]
+
+&lt;p&gt;However, the ShuffleDescriptors for the downstream vertices are all the same since they all consume the same upstream result partitions. Therefore, Flink doesn’t need to create ShuffleDescriptors for each downstream vertex individually. Instead, it can create them once and cache them to be reused. This will decrease the overall complexity of creating TaskDeploymentDescriptors for tasks from O(n&lt;sup&gt;2&lt;/sup&gt;) to O(n).&lt;/p&gt;
+
+&lt;p&gt;To decrease the size of RPC messages and reduce the transmission of replicated data over the network, the cached ShuffleDescriptors can be compressed. For the example job we mentioned above, if the parallelisms of vertices are both 10k, then each downstream vertex has 10k ShuffleDescriptors. After compression, the size of the serialized value would be reduced by 72%.&lt;/p&gt;
+
+&lt;h3 id=&quot;distribute-shuffledescriptors-via-the-blob-server&quot;&gt;Distribute ShuffleDescriptors via the blob server&lt;/h3&gt;
+
+&lt;p&gt;A &lt;a href=&quot;https://en.wikipedia.org/wiki/Binary_large_object&quot;&gt;blob&lt;/a&gt; (binary large object) is a collection of binary data used to store large files. Flink hosts a blob server to transport large-sized data between the JobManager and TaskManagers. When a JobManager decides to transmit a large file to TaskManagers, it would first store the file in the blob server (which also uploads the file to the distributed file system) and get a token representing the blob,  [...]
+
+&lt;p&gt;During task deployment, the JobManager is responsible for distributing the ShuffleDescriptors to TaskManagers via RPC messages. The messages will be garbage collected once they are sent. However, if the JobManager cannot send the messages as fast as they are created, these messages would take up a lot of space in heap memory and become a heavy burden for the garbage collector to deal with. There will be more long-term garbage collections that stop the world and slow down the tas [...]
+
+&lt;p&gt;To solve this problem, the blob server can be used to distribute large ShuffleDescriptors. The JobManager first sends ShuffleDescriptors to the blob server, which stores ShuffleDescriptors in the DFS. TaskManagers request ShuffleDescriptors from the DFS once they begin to process TaskDeploymentDescriptors. With this change, the JobManager doesn’t need to keep all the copies of ShuffleDescriptors in heap memory until they are sent. Moreover, the frequency of garbage collections f [...]
+
+&lt;center&gt;
+&lt;br /&gt;
+&lt;img src=&quot;/img/blog/2022-01-05-scheduler-performance/3-how-shuffle-descriptors-are-distributed.svg&quot; width=&quot;80%&quot; /&gt;
+&lt;br /&gt;
+Fig. 3 - How ShuffleDescriptors are distributed
+&lt;/center&gt;
+
+&lt;p&gt;&lt;br /&gt;
+To avoid running out of space on the local disk, the cache will be cleared when the related partitions are no longer valid and a size limit is added for ShuffleDescriptors in the blob cache on TaskManagers. If the overall size exceeds the limit, the least recently used cached value will be removed. This ensures that the local disks on the JobManager and TaskManagers won’t be filled up with ShuffleDescriptors, especially in session mode.&lt;/p&gt;
+
+&lt;h1 id=&quot;optimizations-when-building-pipelined-regions&quot;&gt;Optimizations when building pipelined regions&lt;/h1&gt;
+
+&lt;p&gt;In Flink, there are two types of data exchanges: pipelined and blocking. When using blocking data exchanges, result partitions are first fully produced and then consumed by the downstream vertices. The produced results are persisted and can be consumed multiple times. When using pipelined data exchanges, result partitions are produced and consumed concurrently. The produced results are not persisted and can be consumed only once.&lt;/p&gt;
+
+&lt;p&gt;Since the pipelined data stream is produced and consumed simultaneously, Flink needs to make sure that the vertices connected via pipelined data exchanges execute at the same time. These vertices form a &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.14/api/java/org/apache/flink/runtime/topology/PipelinedRegion.html&quot;&gt;pipelined region&lt;/a&gt;. The pipelined region is the basic unit of scheduling and failover by default. During scheduling, all ve [...]
+
+&lt;center&gt;
+&lt;br /&gt;
+&lt;img src=&quot;/img/blog/2022-01-05-scheduler-performance/4-pipelined-region.svg&quot; width=&quot;90%&quot; /&gt;
+&lt;br /&gt;
+Fig. 4 - The LogicalPipelinedRegion and the SchedulingPipelinedRegion
+&lt;/center&gt;
+
+&lt;p&gt;&lt;br /&gt;
+Currently, there are two types of pipelined regions in the scheduler: &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.14/api/java/org/apache/flink/runtime/jobgraph/topology/LogicalPipelinedRegion.html&quot;&gt;LogicalPipelinedRegion&lt;/a&gt; and &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-release-1.14/api/java/org/apache/flink/runtime/scheduler/strategy/SchedulingPipelinedRegion.html&quot;&gt;SchedulingPipelinedRegion&lt;/a&gt;. The LogicalPip [...]
+
+&lt;p&gt;During the construction of pipelined regions, a problem arises: There may be cyclic dependencies between pipelined regions. A pipelined region can be scheduled if and only if all its dependencies have finished. However, if there are two pipelined regions with cyclic dependencies between each other, there will be a scheduling &lt;a href=&quot;https://en.wikipedia.org/wiki/Deadlock&quot;&gt;deadlock&lt;/a&gt;. They are both waiting for the other one to be scheduled first, and none [...]
+
+&lt;center&gt;
+&lt;br /&gt;
+&lt;img src=&quot;/img/blog/2022-01-05-scheduler-performance/5-scheduling-deadlock.svg&quot; width=&quot;90%&quot; /&gt;
+&lt;br /&gt;
+Fig. 5 - The topology with scheduling deadlock
+&lt;/center&gt;
+
+&lt;p&gt;&lt;br /&gt;
+To speed up the construction of pipelined regions, the relationship between the logical topology and the scheduling topology can be leveraged. Since a SchedulingPipelinedRegion is derived from just one LogicalPipelinedRegion, Flink traverses all LogicalPipelinedRegions and converts them into SchedulingPipelinedRegions one by one. The conversion varies based on the distribution patterns of edges that connect vertices in the LogicalPipelinedRegion.&lt;/p&gt;
+
+&lt;p&gt;If there are any all-to-all distribution patterns inside the region, the entire region can just be converted into one SchedulingPipelinedRegion directly. That’s because for the all-to-all edge with the pipelined data exchange, all the regions connected to this edge must execute simultaneously, which means they are merged into one region. For the all-to-all edge with a blocking data exchange, it will introduce cyclic dependencies, as Fig. 5 shows. All the regions it connects must [...]
+
+&lt;p&gt;If there are only pointwise distribution patterns inside a region, Tarjan’s strongly connected components algorithm is still used to ensure no cyclic dependencies. Since there are only pointwise distribution patterns, the number of edges in the topology is O(n), and the computational complexity of the algorithm will be O(n).&lt;/p&gt;
+
+&lt;center&gt;
+&lt;br /&gt;
+&lt;img src=&quot;/img/blog/2022-01-05-scheduler-performance/6-building-pipelined-region.svg&quot; width=&quot;90%&quot; /&gt;
+&lt;br /&gt;
+Fig. 6 - How to convert a LogicalPipelinedRegion to SchedulingPipelinedRegions
+&lt;/center&gt;
+
+&lt;p&gt;&lt;br /&gt;
+After the optimization, the overall computational complexity of building pipelined regions decreases from O(n&lt;sup&gt;2&lt;/sup&gt;) to O(n). In our experiments, for the job which contains two vertices connected with a blocking all-to-all edge, when their parallelisms are both 10K, the time of building pipelined regions decreases by 99%, from 8,257 ms to 120 ms.&lt;/p&gt;
+
+&lt;h1 id=&quot;summary&quot;&gt;Summary&lt;/h1&gt;
+
+&lt;p&gt;All in all, we’ve done several optimizations to improve the scheduler’s performance for large-scale jobs in Flink 1.13 and 1.14. The optimizations involve procedures including job initialization, scheduling, task deployment, and failover. If you have any questions about them, please feel free to start a discussion in the dev mailing list.&lt;/p&gt;
+</description>
+<pubDate>Tue, 04 Jan 2022 09:00:00 +0100</pubDate>
+<link>https://flink.apache.org/2022/01/04/scheduler-performance-part-two.html</link>
+<guid isPermaLink="true">/2022/01/04/scheduler-performance-part-two.html</guid>
+</item>
+
+<item>
+<title>How We Improved Scheduler Performance for Large-scale Jobs - Part One</title>
+<description>&lt;h1 id=&quot;introduction&quot;&gt;Introduction&lt;/h1&gt;
+
+&lt;p&gt;When scheduling large-scale jobs in Flink 1.12, a lot of time is required to initialize jobs and deploy tasks. The scheduler also requires a large amount of heap memory in order to store the execution topology and host temporary deployment descriptors. For example, for a job with a topology that contains two vertices connected with an all-to-all edge and a parallelism of 10k (which means there are 10k source tasks and 10k sink tasks and every source task is connected to all sink [...]
+
+&lt;p&gt;Furthermore, task deployment may block the JobManager’s main thread for a long time and the JobManager will not be able to respond to any other requests from TaskManagers. This could lead to heartbeat timeouts that trigger a failover. In the worst case, this will render the Flink cluster unusable because it cannot deploy the job.&lt;/p&gt;
+
+&lt;p&gt;To improve the performance of the scheduler for large-scale jobs, we’ve implemented several optimizations in Flink 1.13 and 1.14:&lt;/p&gt;
+
+&lt;ol&gt;
+  &lt;li&gt;Introduce the concept of consuming groups to optimize procedures related to the complexity of topologies, including the initialization, scheduling, failover, and partition release. This also reduces the memory required to store the topology;&lt;/li&gt;
+  &lt;li&gt;Introduce a cache to optimize task deployment, which makes the process faster and requires less memory;&lt;/li&gt;
+  &lt;li&gt;Leverage characteristics of the logical topology and the scheduling topology to speed up the building of pipelined regions.&lt;/li&gt;
+&lt;/ol&gt;
+
+&lt;h1 id=&quot;benchmarking-results&quot;&gt;Benchmarking Results&lt;/h1&gt;
+
+&lt;p&gt;To estimate the effect of our optimizations, we conducted several experiments to compare the performance of Flink 1.12 (before the optimization) with Flink 1.14 (after the optimization). The job in our experiments contains two vertices connected with an all-to-all edge. The parallelisms of these vertices are both 10K. To make temporary deployment descriptors distributed via the blob server, we set the configuration &lt;a href=&quot;https://nightlies.apache.org/flink/flink-docs-r [...]
+
+&lt;center&gt;
+Table 1 - The comparison of time cost between Flink 1.12 and 1.14
+&lt;table width=&quot;95%&quot; border=&quot;1&quot;&gt;
+  &lt;thead&gt;
+    &lt;tr&gt;
+      &lt;th style=&quot;text-align: center&quot;&gt;Procedure&lt;/th&gt;
+      &lt;th style=&quot;text-align: center&quot;&gt;1.12&lt;/th&gt;
+      &lt;th style=&quot;text-align: center&quot;&gt;1.14&lt;/th&gt;
+      &lt;th style=&quot;text-align: center&quot;&gt;Reduction(%)&lt;/th&gt;
+    &lt;/tr&gt;
+  &lt;/thead&gt;
+  &lt;tbody&gt;
+    &lt;tr&gt;
+      &lt;td style=&quot;text-align: center&quot;&gt;Job Initialization&lt;/td&gt;
+      &lt;td style=&quot;text-align: center&quot;&gt;11,431ms&lt;/td&gt;
+      &lt;td style=&quot;text-align: center&quot;&gt;627ms&lt;/td&gt;
+      &lt;td style=&quot;text-align: center&quot;&gt;94.51%&lt;/td&gt;
+    &lt;/tr&gt;
+    &lt;tr&gt;
+      &lt;td style=&quot;text-align: center&quot;&gt;Task Deployment&lt;/td&gt;
+      &lt;td style=&quot;text-align: center&quot;&gt;63,118ms&lt;/td&gt;
+      &lt;td style=&quot;text-align: center&quot;&gt;17,183ms&lt;/td&gt;
+      &lt;td style=&quot;text-align: center&quot;&gt;72.78%&lt;/td&gt;
+    &lt;/tr&gt;
+    &lt;tr&gt;
+      &lt;td style=&quot;text-align: center&quot;&gt;Computing tasks to restart when failover&lt;/td&gt;
+      &lt;td style=&quot;text-align: center&quot;&gt;37,195ms&lt;/td&gt;
+      &lt;td style=&quot;text-align: center&quot;&gt;170ms&lt;/td&gt;
+      &lt;td style=&quot;text-align: center&quot;&gt;99.55%&lt;/td&gt;
+    &lt;/tr&gt;
+  &lt;/tbody&gt;
+&lt;/table&gt;
+&lt;/center&gt;
+
+&lt;p&gt;&lt;br /&gt;
+In addition to the speedup, the memory usage is significantly reduced. With Flink 1.12, a JobManager requires 30 GiB of heap memory to deploy the test job and keep it running stably, while with Flink 1.14 the minimum heap memory required by the JobManager is only 2 GiB.&lt;/p&gt;
+
+&lt;p&gt;There are also fewer occurrences of long-term garbage collection. When running the test job with Flink 1.12, a garbage collection that lasts more than 10 seconds occurs during both job initialization and task deployment. With Flink 1.14, since there is no long-term garbage collection, there is also a decreased risk of heartbeat timeouts, which improves cluster stability.&lt;/p&gt;
+
+&lt;p&gt;In our experiment, it took more than 4 minutes for the large-scale job with Flink 1.12 to transition to running (excluding the time spent on allocating resources). With Flink 1.14, it took no more than 30 seconds (excluding the time spent on allocating resources). The time cost is reduced by 87%. Thus, users who are running large-scale jobs in production and want better scheduling performance should consider upgrading to Flink 1.14.&lt;/p&gt;
+
+&lt;p&gt;In &lt;a href=&quot;/2022/01/04/scheduler-performance-part-two&quot;&gt;part two&lt;/a&gt; of this blog post, we are going to talk about these improvements in detail.&lt;/p&gt;
+</description>
+<pubDate>Tue, 04 Jan 2022 09:00:00 +0100</pubDate>
+<link>https://flink.apache.org/2022/01/04/scheduler-performance-part-one.html</link>
+<guid isPermaLink="true">/2022/01/04/scheduler-performance-part-one.html</guid>
+</item>
+
+<item>
 <title>Apache Flink StateFun Log4j emergency release</title>
 <description>&lt;p&gt;The Apache Flink community has released an emergency bugfix version of Apache Flink Stateful Function 3.1.1.&lt;/p&gt;
 
@@ -19602,378 +19826,5 @@ Once this effort is finished, we can add Blink’s scheduling and recovery strat
 <guid isPermaLink="true">/news/2019/02/13/unified-batch-streaming-blink.html</guid>
 </item>
 
-<item>
-<title>Apache Flink 1.5.6 Released</title>
-<description>&lt;p&gt;The Apache Flink community released the sixth and last bugfix version of the Apache Flink 1.5 series.&lt;/p&gt;
-
-&lt;p&gt;This release includes more than 47 fixes and minor improvements for Flink 1.5.5. The list below includes a detailed list of all fixes.&lt;/p&gt;
-
-&lt;p&gt;We highly recommend all users to upgrade to Flink 1.5.6.&lt;/p&gt;
-
-&lt;p&gt;Updated Maven dependencies:&lt;/p&gt;
-
-&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-xml&quot;&gt;&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
-  &lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
-  &lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-java&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
-  &lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.5.6&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
-&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
-&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
-  &lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
-  &lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-streaming-java_2.11&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
-  &lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.5.6&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
-&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
-&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
-  &lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
-  &lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-clients_2.11&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
-  &lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.5.6&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
-&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
-
-&lt;p&gt;You can find the binaries on the updated &lt;a href=&quot;http://flink.apache.org/downloads.html&quot;&gt;Downloads page&lt;/a&gt;.&lt;/p&gt;
-
-&lt;p&gt;List of resolved issues:&lt;/p&gt;
-
-&lt;h2&gt;        Sub-task
-&lt;/h2&gt;
-&lt;ul&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10252&quot;&gt;FLINK-10252&lt;/a&gt;] -         Handle oversized metric messages
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10863&quot;&gt;FLINK-10863&lt;/a&gt;] -         Assign uids to all operators
-&lt;/li&gt;
-&lt;/ul&gt;
-
-&lt;h2&gt;        Bug
-&lt;/h2&gt;
-&lt;ul&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-8336&quot;&gt;FLINK-8336&lt;/a&gt;] -         YarnFileStageTestS3ITCase.testRecursiveUploadForYarnS3 test instability
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-9646&quot;&gt;FLINK-9646&lt;/a&gt;] -         ExecutionGraphCoLocationRestartTest.testConstraintsAfterRestart failed on Travis
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10166&quot;&gt;FLINK-10166&lt;/a&gt;] -         Dependency problems when executing SQL query in sql-client
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10309&quot;&gt;FLINK-10309&lt;/a&gt;] -         Cancel with savepoint fails with java.net.ConnectException when using the per job-mode
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10419&quot;&gt;FLINK-10419&lt;/a&gt;] -         ClassNotFoundException while deserializing user exceptions from checkpointing
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10455&quot;&gt;FLINK-10455&lt;/a&gt;] -         Potential Kafka producer leak in case of failures
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10482&quot;&gt;FLINK-10482&lt;/a&gt;] -         java.lang.IllegalArgumentException: Negative number of in progress checkpoints
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10491&quot;&gt;FLINK-10491&lt;/a&gt;] -         Deadlock during spilling data in SpillableSubpartition 
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10566&quot;&gt;FLINK-10566&lt;/a&gt;] -         Flink Planning is exponential in the number of stages
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10581&quot;&gt;FLINK-10581&lt;/a&gt;] -         YarnConfigurationITCase.testFlinkContainerMemory test instability
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10642&quot;&gt;FLINK-10642&lt;/a&gt;] -         CodeGen split fields errors when maxGeneratedCodeLength equals 1
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10655&quot;&gt;FLINK-10655&lt;/a&gt;] -         RemoteRpcInvocation not overwriting ObjectInputStream&amp;#39;s ClassNotFoundException
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10669&quot;&gt;FLINK-10669&lt;/a&gt;] -         Exceptions &amp;amp; errors are not properly checked in logs in e2e tests
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10670&quot;&gt;FLINK-10670&lt;/a&gt;] -         Fix Correlate codegen error
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10674&quot;&gt;FLINK-10674&lt;/a&gt;] -         Fix handling of retractions after clean up
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10690&quot;&gt;FLINK-10690&lt;/a&gt;] -         Tests leak resources via Files.list
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10693&quot;&gt;FLINK-10693&lt;/a&gt;] -         Fix Scala EitherSerializer duplication
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10715&quot;&gt;FLINK-10715&lt;/a&gt;] -         E2e tests fail with ConcurrentModificationException in MetricRegistryImpl
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10750&quot;&gt;FLINK-10750&lt;/a&gt;] -         SocketClientSinkTest.testRetry fails on Travis
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10752&quot;&gt;FLINK-10752&lt;/a&gt;] -         Result of AbstractYarnClusterDescriptor#validateClusterResources is ignored
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10753&quot;&gt;FLINK-10753&lt;/a&gt;] -         Propagate and log snapshotting exceptions
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10770&quot;&gt;FLINK-10770&lt;/a&gt;] -         Some generated functions are not opened properly.
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10773&quot;&gt;FLINK-10773&lt;/a&gt;] -         Resume externalized checkpoint end-to-end test fails
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10821&quot;&gt;FLINK-10821&lt;/a&gt;] -         Resuming Externalized Checkpoint E2E test does not resume from Externalized Checkpoint
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10839&quot;&gt;FLINK-10839&lt;/a&gt;] -         Fix implementation of PojoSerializer.duplicate() w.r.t. subclass serializer
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10856&quot;&gt;FLINK-10856&lt;/a&gt;] -         Harden resume from externalized checkpoint E2E test
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10857&quot;&gt;FLINK-10857&lt;/a&gt;] -         Conflict between JMX and Prometheus Metrics reporter
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10880&quot;&gt;FLINK-10880&lt;/a&gt;] -         Failover strategies should not be applied to Batch Execution
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10913&quot;&gt;FLINK-10913&lt;/a&gt;] -         ExecutionGraphRestartTest.testRestartAutomatically unstable on Travis
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10925&quot;&gt;FLINK-10925&lt;/a&gt;] -         NPE in PythonPlanStreamer
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10990&quot;&gt;FLINK-10990&lt;/a&gt;] -         Enforce minimum timespan in MeterView
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10998&quot;&gt;FLINK-10998&lt;/a&gt;] -         flink-metrics-ganglia has LGPL dependency
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-11011&quot;&gt;FLINK-11011&lt;/a&gt;] -         Elasticsearch 6 sink end-to-end test unstable
-&lt;/li&gt;
-&lt;/ul&gt;
-
-&lt;h2&gt;        Improvement
-&lt;/h2&gt;
-&lt;ul&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-4173&quot;&gt;FLINK-4173&lt;/a&gt;] -         Replace maven-assembly-plugin by maven-shade-plugin in flink-metrics
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-9869&quot;&gt;FLINK-9869&lt;/a&gt;] -         Send PartitionInfo in batch to Improve perfornance
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10613&quot;&gt;FLINK-10613&lt;/a&gt;] -         Remove logger casts in HBaseConnectorITCase
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10614&quot;&gt;FLINK-10614&lt;/a&gt;] -         Update test_batch_allround.sh e2e to new testing infrastructure
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10637&quot;&gt;FLINK-10637&lt;/a&gt;] -         Start MiniCluster with random REST port
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10678&quot;&gt;FLINK-10678&lt;/a&gt;] -         Add a switch to run_test to configure if logs should be checked for errors/excepions
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10906&quot;&gt;FLINK-10906&lt;/a&gt;] -         docker-entrypoint.sh logs credentails during startup
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10916&quot;&gt;FLINK-10916&lt;/a&gt;] -         Include duplicated user-specified uid into error message
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-11005&quot;&gt;FLINK-11005&lt;/a&gt;] -         Define flink-sql-client uber-jar dependencies via artifactSet
-&lt;/li&gt;
-&lt;/ul&gt;
-
-&lt;h2&gt;        Test
-&lt;/h2&gt;
-&lt;ul&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10606&quot;&gt;FLINK-10606&lt;/a&gt;] -         Construct NetworkEnvironment simple for tests
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10607&quot;&gt;FLINK-10607&lt;/a&gt;] -         Unify to remove duplicated NoOpResultPartitionConsumableNotifier
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10827&quot;&gt;FLINK-10827&lt;/a&gt;] -         Add test for duplicate() to SerializerTestBase
-&lt;/li&gt;
-&lt;/ul&gt;
-</description>
-<pubDate>Wed, 26 Dec 2018 13:00:00 +0100</pubDate>
-<link>https://flink.apache.org/news/2018/12/26/release-1.5.6.html</link>
-<guid isPermaLink="true">/news/2018/12/26/release-1.5.6.html</guid>
-</item>
-
-<item>
-<title>Apache Flink 1.6.3 Released</title>
-<description>&lt;p&gt;The Apache Flink community released the third bugfix version of the Apache Flink 1.6 series.&lt;/p&gt;
-
-&lt;p&gt;This release includes more than 80 fixes and minor improvements for Flink 1.6.2. The list below includes a detailed list of all fixes.&lt;/p&gt;
-
-&lt;p&gt;We highly recommend all users to upgrade to Flink 1.6.3.&lt;/p&gt;
-
-&lt;p&gt;Updated Maven dependencies:&lt;/p&gt;
-
-&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-xml&quot;&gt;&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
-  &lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
-  &lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-java&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
-  &lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.6.3&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
-&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
-&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
-  &lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
-  &lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-streaming-java_2.11&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
-  &lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.6.3&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
-&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
-&lt;span class=&quot;nt&quot;&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
-  &lt;span class=&quot;nt&quot;&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.flink&lt;span class=&quot;nt&quot;&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
-  &lt;span class=&quot;nt&quot;&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;flink-clients_2.11&lt;span class=&quot;nt&quot;&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
-  &lt;span class=&quot;nt&quot;&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.6.3&lt;span class=&quot;nt&quot;&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
-&lt;span class=&quot;nt&quot;&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
-
-&lt;p&gt;You can find the binaries on the updated &lt;a href=&quot;http://flink.apache.org/downloads.html&quot;&gt;Downloads page&lt;/a&gt;.&lt;/p&gt;
-
-&lt;p&gt;List of resolved issues:&lt;/p&gt;
-
-&lt;h2&gt;        Sub-task
-&lt;/h2&gt;
-&lt;ul&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10097&quot;&gt;FLINK-10097&lt;/a&gt;] -         More tests to increase StreamingFileSink test coverage
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10252&quot;&gt;FLINK-10252&lt;/a&gt;] -         Handle oversized metric messages
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10367&quot;&gt;FLINK-10367&lt;/a&gt;] -         Avoid recursion stack overflow during releasing SingleInputGate
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10863&quot;&gt;FLINK-10863&lt;/a&gt;] -         Assign uids to all operators in general purpose testing job
-&lt;/li&gt;
-&lt;/ul&gt;
-
-&lt;h2&gt;        Bug
-&lt;/h2&gt;
-&lt;ul&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-8336&quot;&gt;FLINK-8336&lt;/a&gt;] -         YarnFileStageTestS3ITCase.testRecursiveUploadForYarnS3 test instability
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-9635&quot;&gt;FLINK-9635&lt;/a&gt;] -         Local recovery scheduling can cause spread out of tasks
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-9646&quot;&gt;FLINK-9646&lt;/a&gt;] -         ExecutionGraphCoLocationRestartTest.testConstraintsAfterRestart failed on Travis
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-9878&quot;&gt;FLINK-9878&lt;/a&gt;] -         IO worker threads BLOCKED on SSL Session Cache while CMS full gc
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10149&quot;&gt;FLINK-10149&lt;/a&gt;] -         Flink Mesos allocates extra port when not configured to do so.
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10166&quot;&gt;FLINK-10166&lt;/a&gt;] -         Dependency problems when executing SQL query in sql-client
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10309&quot;&gt;FLINK-10309&lt;/a&gt;] -         Cancel with savepoint fails with java.net.ConnectException when using the per job-mode
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10357&quot;&gt;FLINK-10357&lt;/a&gt;] -         Streaming File Sink end-to-end test failed with mismatch
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10359&quot;&gt;FLINK-10359&lt;/a&gt;] -         Scala example in DataSet docs is broken
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10364&quot;&gt;FLINK-10364&lt;/a&gt;] -         Test instability in NonHAQueryableStateFsBackendITCase#testMapState
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10419&quot;&gt;FLINK-10419&lt;/a&gt;] -         ClassNotFoundException while deserializing user exceptions from checkpointing
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10425&quot;&gt;FLINK-10425&lt;/a&gt;] -         taskmanager.host is not respected
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10455&quot;&gt;FLINK-10455&lt;/a&gt;] -         Potential Kafka producer leak in case of failures
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10463&quot;&gt;FLINK-10463&lt;/a&gt;] -         Null literal cannot be properly parsed in Java Table API function call
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10481&quot;&gt;FLINK-10481&lt;/a&gt;] -         Wordcount end-to-end test in docker env unstable
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10482&quot;&gt;FLINK-10482&lt;/a&gt;] -         java.lang.IllegalArgumentException: Negative number of in progress checkpoints
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10491&quot;&gt;FLINK-10491&lt;/a&gt;] -         Deadlock during spilling data in SpillableSubpartition 
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10566&quot;&gt;FLINK-10566&lt;/a&gt;] -         Flink Planning is exponential in the number of stages
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10567&quot;&gt;FLINK-10567&lt;/a&gt;] -         Lost serialize fields when ttl state store with the mutable serializer
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10570&quot;&gt;FLINK-10570&lt;/a&gt;] -         State grows unbounded when &amp;quot;within&amp;quot; constraint not applied
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10581&quot;&gt;FLINK-10581&lt;/a&gt;] -         YarnConfigurationITCase.testFlinkContainerMemory test instability
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10642&quot;&gt;FLINK-10642&lt;/a&gt;] -         CodeGen split fields errors when maxGeneratedCodeLength equals 1
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10655&quot;&gt;FLINK-10655&lt;/a&gt;] -         RemoteRpcInvocation not overwriting ObjectInputStream&amp;#39;s ClassNotFoundException
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10663&quot;&gt;FLINK-10663&lt;/a&gt;] -         Closing StreamingFileSink can cause NPE
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10669&quot;&gt;FLINK-10669&lt;/a&gt;] -         Exceptions &amp;amp; errors are not properly checked in logs in e2e tests
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10670&quot;&gt;FLINK-10670&lt;/a&gt;] -         Fix Correlate codegen error
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10674&quot;&gt;FLINK-10674&lt;/a&gt;] -         Fix handling of retractions after clean up
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10681&quot;&gt;FLINK-10681&lt;/a&gt;] -         elasticsearch6.ElasticsearchSinkITCase fails if wrong JNA library installed
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10690&quot;&gt;FLINK-10690&lt;/a&gt;] -         Tests leak resources via Files.list
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10693&quot;&gt;FLINK-10693&lt;/a&gt;] -         Fix Scala EitherSerializer duplication
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10715&quot;&gt;FLINK-10715&lt;/a&gt;] -         E2e tests fail with ConcurrentModificationException in MetricRegistryImpl
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10750&quot;&gt;FLINK-10750&lt;/a&gt;] -         SocketClientSinkTest.testRetry fails on Travis
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10752&quot;&gt;FLINK-10752&lt;/a&gt;] -         Result of AbstractYarnClusterDescriptor#validateClusterResources is ignored
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10753&quot;&gt;FLINK-10753&lt;/a&gt;] -         Propagate and log snapshotting exceptions
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10763&quot;&gt;FLINK-10763&lt;/a&gt;] -         Interval join produces wrong result type in Scala API
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10770&quot;&gt;FLINK-10770&lt;/a&gt;] -         Some generated functions are not opened properly.
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10773&quot;&gt;FLINK-10773&lt;/a&gt;] -         Resume externalized checkpoint end-to-end test fails
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10809&quot;&gt;FLINK-10809&lt;/a&gt;] -         Using DataStreamUtils.reinterpretAsKeyedStream produces corrupted keyed state after restore
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10816&quot;&gt;FLINK-10816&lt;/a&gt;] -         Fix LockableTypeSerializer.duplicate() 
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10821&quot;&gt;FLINK-10821&lt;/a&gt;] -         Resuming Externalized Checkpoint E2E test does not resume from Externalized Checkpoint
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10839&quot;&gt;FLINK-10839&lt;/a&gt;] -         Fix implementation of PojoSerializer.duplicate() w.r.t. subclass serializer
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10842&quot;&gt;FLINK-10842&lt;/a&gt;] -         Waiting loops are broken in e2e/common.sh
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10856&quot;&gt;FLINK-10856&lt;/a&gt;] -         Harden resume from externalized checkpoint E2E test
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10857&quot;&gt;FLINK-10857&lt;/a&gt;] -         Conflict between JMX and Prometheus Metrics reporter
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10880&quot;&gt;FLINK-10880&lt;/a&gt;] -         Failover strategies should not be applied to Batch Execution
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10913&quot;&gt;FLINK-10913&lt;/a&gt;] -         ExecutionGraphRestartTest.testRestartAutomatically unstable on Travis
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10925&quot;&gt;FLINK-10925&lt;/a&gt;] -         NPE in PythonPlanStreamer
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10946&quot;&gt;FLINK-10946&lt;/a&gt;] -         Resuming Externalized Checkpoint (rocks, incremental, scale up) end-to-end test failed on Travis
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10990&quot;&gt;FLINK-10990&lt;/a&gt;] -         Enforce minimum timespan in MeterView
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10992&quot;&gt;FLINK-10992&lt;/a&gt;] -         Jepsen: Do not use /tmp as HDFS Data Directory
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10997&quot;&gt;FLINK-10997&lt;/a&gt;] -         Avro-confluent-registry does not bundle any dependency
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10998&quot;&gt;FLINK-10998&lt;/a&gt;] -         flink-metrics-ganglia has LGPL dependency
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-11011&quot;&gt;FLINK-11011&lt;/a&gt;] -         Elasticsearch 6 sink end-to-end test unstable
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-11017&quot;&gt;FLINK-11017&lt;/a&gt;] -         Time interval for window aggregations in SQL is wrongly translated if specified with YEAR_MONTH resolution
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-11029&quot;&gt;FLINK-11029&lt;/a&gt;] -         Incorrect parameter in Working with state doc
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-11041&quot;&gt;FLINK-11041&lt;/a&gt;] -         ReinterpretDataStreamAsKeyedStreamITCase.testReinterpretAsKeyedStream failed on Travis
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-11045&quot;&gt;FLINK-11045&lt;/a&gt;] -         UserCodeClassLoader has not been set correctly for RuntimeUDFContext in CollectionExecutor
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-11083&quot;&gt;FLINK-11083&lt;/a&gt;] -         CRowSerializerConfigSnapshot is not instantiable
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-11087&quot;&gt;FLINK-11087&lt;/a&gt;] -         Broadcast state migration Incompatibility from 1.5.3 to 1.7.0
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-11123&quot;&gt;FLINK-11123&lt;/a&gt;] -         Missing import in ML quickstart docs
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-11136&quot;&gt;FLINK-11136&lt;/a&gt;] -         Fix the merge logic for DISTINCT aggregates
-&lt;/li&gt;
-&lt;/ul&gt;
-
-&lt;h2&gt;        Improvement
-&lt;/h2&gt;
-&lt;ul&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-4173&quot;&gt;FLINK-4173&lt;/a&gt;] -         Replace maven-assembly-plugin by maven-shade-plugin in flink-metrics
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10353&quot;&gt;FLINK-10353&lt;/a&gt;] -         Restoring a KafkaProducer with Semantic.EXACTLY_ONCE from a savepoint written with Semantic.AT_LEAST_ONCE fails with NPE
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10608&quot;&gt;FLINK-10608&lt;/a&gt;] -         Add avro files generated by datastream-allround-test to RAT exclusions
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10613&quot;&gt;FLINK-10613&lt;/a&gt;] -         Remove logger casts in HBaseConnectorITCase
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10614&quot;&gt;FLINK-10614&lt;/a&gt;] -         Update test_batch_allround.sh e2e to new testing infrastructure
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10637&quot;&gt;FLINK-10637&lt;/a&gt;] -         Start MiniCluster with random REST port
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10678&quot;&gt;FLINK-10678&lt;/a&gt;] -         Add a switch to run_test to configure if logs should be checked for errors/exceptions
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10692&quot;&gt;FLINK-10692&lt;/a&gt;] -         Harden Confluent schema E2E test
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10883&quot;&gt;FLINK-10883&lt;/a&gt;] -         Submitting a job without enough slots times out due to an unspecified timeout
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10906&quot;&gt;FLINK-10906&lt;/a&gt;] -         docker-entrypoint.sh logs credentials during startup
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10916&quot;&gt;FLINK-10916&lt;/a&gt;] -         Include duplicated user-specified uid into error message
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10951&quot;&gt;FLINK-10951&lt;/a&gt;] -         Disable enforcing of YARN container virtual memory limits in tests
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-11005&quot;&gt;FLINK-11005&lt;/a&gt;] -         Define flink-sql-client uber-jar dependencies via artifactSet
-&lt;/li&gt;
-&lt;/ul&gt;
-
-&lt;h2&gt;        Test
-&lt;/h2&gt;
-&lt;ul&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10606&quot;&gt;FLINK-10606&lt;/a&gt;] -         Construct NetworkEnvironment simple for tests
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10607&quot;&gt;FLINK-10607&lt;/a&gt;] -         Unify to remove duplicated NoOpResultPartitionConsumableNotifier
-&lt;/li&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10827&quot;&gt;FLINK-10827&lt;/a&gt;] -         Add test for duplicate() to SerializerTestBase
-&lt;/li&gt;
-&lt;/ul&gt;
-
-&lt;h2&gt;        Wish
-&lt;/h2&gt;
-&lt;ul&gt;
-&lt;li&gt;[&lt;a href=&quot;https://issues.apache.org/jira/browse/FLINK-10793&quot;&gt;FLINK-10793&lt;/a&gt;] -         Change visibility of TtlValue and TtlSerializer to public for external tools
-&lt;/li&gt;
-&lt;/ul&gt;
-</description>
-<pubDate>Sat, 22 Dec 2018 13:00:00 +0100</pubDate>
-<link>https://flink.apache.org/news/2018/12/22/release-1.6.3.html</link>
-<guid isPermaLink="true">/news/2018/12/22/release-1.6.3.html</guid>
-</item>
-
 </channel>
 </rss>
diff --git a/content/blog/index.html b/content/blog/index.html
index 87c3707..fed055d 100644
--- a/content/blog/index.html
+++ b/content/blog/index.html
@@ -201,6 +201,32 @@
     <!-- Blog posts -->
     
     <article>
+      <h2 class="blog-title"><a href="/2022/01/04/scheduler-performance-part-two.html">How We Improved Scheduler Performance for Large-scale Jobs - Part Two</a></h2>
+
+      <p>04 Jan 2022
+       Zhilong Hong, Zhu Zhu, Daisy Tsang, &amp; Till Rohrmann (<a href="https://twitter.com/stsffap">@stsffap</a>)</p>
+
+      <p>Part one of this blog post briefly introduced the optimizations we’ve made to improve the performance of the scheduler; compared to Flink 1.12, the time cost and memory usage of scheduling large-scale jobs in Flink 1.14 is significantly reduced. In part two, we will elaborate on the details of these optimizations.</p>
+
+      <p><a href="/2022/01/04/scheduler-performance-part-two.html">Continue reading &raquo;</a></p>
+    </article>
+
+    <hr>
+    
+    <article>
+      <h2 class="blog-title"><a href="/2022/01/04/scheduler-performance-part-one.html">How We Improved Scheduler Performance for Large-scale Jobs - Part One</a></h2>
+
+      <p>04 Jan 2022
+       Zhilong Hong, Zhu Zhu, Daisy Tsang, &amp; Till Rohrmann (<a href="https://twitter.com/stsffap">@stsffap</a>)</p>
+
+      <p>To improve the performance of the scheduler for large-scale jobs, several optimizations were introduced in Flink 1.13 and 1.14. In this blog post we'll take a look at them.</p>
+
+      <p><a href="/2022/01/04/scheduler-performance-part-one.html">Continue reading &raquo;</a></p>
+    </article>
+
+    <hr>
+    
+    <article>
       <h2 class="blog-title"><a href="/news/2021/12/22/log4j-statefun-release.html">Apache Flink StateFun Log4j emergency release</a></h2>
 
       <p>22 Dec 2021
@@ -310,36 +336,6 @@
 
     <hr>
     
-    <article>
-      <h2 class="blog-title"><a href="/2021/09/07/connector-table-sql-api-part2.html">Implementing a custom source connector for Table API and SQL - Part Two </a></h2>
-
-      <p>07 Sep 2021
-       Ingo Buerk  &amp; Daisy Tsang </p>
-
-      <p><p>In <a href="/2021/09/07/connector-table-sql-api-part1">part one</a> of this tutorial, you learned how to build a custom source connector for Flink. In part two, you will learn how to integrate the connector with a test email inbox through the IMAP protocol and filter out emails using Flink SQL.</p>
-
-</p>
-
-      <p><a href="/2021/09/07/connector-table-sql-api-part2.html">Continue reading &raquo;</a></p>
-    </article>
-
-    <hr>
-    
-    <article>
-      <h2 class="blog-title"><a href="/2021/09/07/connector-table-sql-api-part1.html">Implementing a Custom Source Connector for Table API and SQL - Part One </a></h2>
-
-      <p>07 Sep 2021
-       Ingo Buerk  &amp; Daisy Tsang </p>
-
-      <p><p>Part one of this tutorial will teach you how to build and run a custom source connector to be used with Table API and SQL, two high-level abstractions in Flink. The tutorial comes with a bundled <a href="https://docs.docker.com/compose/">docker-compose</a> setup that lets you easily run the connector. You can then try it out with Flink’s SQL client.</p>
-
-</p>
-
-      <p><a href="/2021/09/07/connector-table-sql-api-part1.html">Continue reading &raquo;</a></p>
-    </article>
-
-    <hr>
-    
 
     <!-- Pagination links -->
     
@@ -350,7 +346,7 @@
       
       </li>
       <li>
-        <span class="page_number ">Page: 1 of 17</span>
+        <span class="page_number ">Page: 1 of 18</span>
       </li>
       <li>
       
@@ -368,10 +364,35 @@
       
 
       
-    <h2>2021</h2>
+    <h2>2022</h2>
 
     <ul id="markdown-toc">
       
+      <li><a href="/2022/01/04/scheduler-performance-part-two.html">How We Improved Scheduler Performance for Large-scale Jobs - Part Two</a></li>
+
+      
+        
+      
+    
+      
+      
+
+      
+      <li><a href="/2022/01/04/scheduler-performance-part-one.html">How We Improved Scheduler Performance for Large-scale Jobs - Part One</a></li>
+
+      
+        
+    </ul>
+        <hr>
+        <h2>2021</h2>
+    <ul id="markdown-toc">
+        
+      
+    
+      
+      
+
+      
       <li><a href="/news/2021/12/22/log4j-statefun-release.html">Apache Flink StateFun Log4j emergency release</a></li>
 
       
diff --git a/content/blog/page10/index.html b/content/blog/page10/index.html
index 75dec8f..0e6c0fe 100644
--- a/content/blog/page10/index.html
+++ b/content/blog/page10/index.html
@@ -201,6 +201,32 @@
     <!-- Blog posts -->
     
     <article>
+      <h2 class="blog-title"><a href="/2019/05/14/temporal-tables.html">Flux capacitor, huh? Temporal Tables and Joins in Streaming SQL</a></h2>
+
+      <p>14 May 2019
+       Marta Paes (<a href="https://twitter.com/morsapaes">@morsapaes</a>)</p>
+
+      <p>Apache Flink natively supports temporal table joins since the 1.7 release for straightforward temporal data handling. In this blog post, we provide an overview of how this new concept can be leveraged for effective point-in-time analysis in streaming scenarios.</p>
+
+      <p><a href="/2019/05/14/temporal-tables.html">Continue reading &raquo;</a></p>
+    </article>
+
+    <hr>
+    
+    <article>
+      <h2 class="blog-title"><a href="/2019/05/03/pulsar-flink.html">When Flink & Pulsar Come Together</a></h2>
+
+      <p>03 May 2019
+       Sijie Guo (<a href="https://twitter.com/sijieg">@sijieg</a>)</p>
+
+      <p>Apache Flink and Apache Pulsar are distributed data processing systems. When combined, they offer elastic data processing at large scale. This post describes how Pulsar and Flink can work together to provide a seamless developer experience.</p>
+
+      <p><a href="/2019/05/03/pulsar-flink.html">Continue reading &raquo;</a></p>
+    </article>
+
+    <hr>
+    
+    <article>
       <h2 class="blog-title"><a href="/news/2019/04/17/sod.html">Apache Flink's Application to Season of Docs</a></h2>
 
       <p>17 Apr 2019
@@ -316,36 +342,6 @@ for more details.</p>
 
     <hr>
     
-    <article>
-      <h2 class="blog-title"><a href="/news/2018/12/26/release-1.5.6.html">Apache Flink 1.5.6 Released</a></h2>
-
-      <p>26 Dec 2018
-      </p>
-
-      <p><p>The Apache Flink community released the sixth and last bugfix version of the Apache Flink 1.5 series.</p>
-
-</p>
-
-      <p><a href="/news/2018/12/26/release-1.5.6.html">Continue reading &raquo;</a></p>
-    </article>
-
-    <hr>
-    
-    <article>
-      <h2 class="blog-title"><a href="/news/2018/12/22/release-1.6.3.html">Apache Flink 1.6.3 Released</a></h2>
-
-      <p>22 Dec 2018
-      </p>
-
-      <p><p>The Apache Flink community released the third bugfix version of the Apache Flink 1.6 series.</p>
-
-</p>
-
-      <p><a href="/news/2018/12/22/release-1.6.3.html">Continue reading &raquo;</a></p>
-    </article>
-
-    <hr>
-    
 
     <!-- Pagination links -->
     
@@ -356,7 +352,7 @@ for more details.</p>
       
       </li>
       <li>
-        <span class="page_number ">Page: 10 of 17</span>
+        <span class="page_number ">Page: 10 of 18</span>
       </li>
       <li>
       
@@ -374,9 +370,34 @@ for more details.</p>
       
 
       
-    <h2>2021</h2>
+    <h2>2022</h2>
+
+    <ul id="markdown-toc">
+      
+      <li><a href="/2022/01/04/scheduler-performance-part-two.html">How We Improved Scheduler Performance for Large-scale Jobs - Part Two</a></li>
+
+      
+        
+      
+    
+      
+      
+
+      
+      <li><a href="/2022/01/04/scheduler-performance-part-one.html">How We Improved Scheduler Performance for Large-scale Jobs - Part One</a></li>
 
+      
+        
+    </ul>
+        <hr>
+        <h2>2021</h2>
     <ul id="markdown-toc">
+        
+      
+    
+      
+      
+
       
       <li><a href="/news/2021/12/22/log4j-statefun-release.html">Apache Flink StateFun Log4j emergency release</a></li>
 
diff --git a/content/blog/page11/index.html b/content/blog/page11/index.html
index 623b1a2..4dd371b 100644
--- a/content/blog/page11/index.html
+++ b/content/blog/page11/index.html
@@ -201,6 +201,36 @@
     <!-- Blog posts -->
     
     <article>
+      <h2 class="blog-title"><a href="/news/2018/12/26/release-1.5.6.html">Apache Flink 1.5.6 Released</a></h2>
+
+      <p>26 Dec 2018
+      </p>
+
+      <p><p>The Apache Flink community released the sixth and last bugfix version of the Apache Flink 1.5 series.</p>
+
+</p>
+
+      <p><a href="/news/2018/12/26/release-1.5.6.html">Continue reading &raquo;</a></p>
+    </article>
+
+    <hr>
+    
+    <article>
+      <h2 class="blog-title"><a href="/news/2018/12/22/release-1.6.3.html">Apache Flink 1.6.3 Released</a></h2>
+
+      <p>22 Dec 2018
+      </p>
+
+      <p><p>The Apache Flink community released the third bugfix version of the Apache Flink 1.6 series.</p>
+
+</p>
+
+      <p><a href="/news/2018/12/22/release-1.6.3.html">Continue reading &raquo;</a></p>
+    </article>
+
+    <hr>
+    
+    <article>
       <h2 class="blog-title"><a href="/news/2018/12/21/release-1.7.1.html">Apache Flink 1.7.1 Released</a></h2>
 
       <p>21 Dec 2018
@@ -322,36 +352,6 @@ Please check the <a href="https://issues.apache.org/jira/secure/ReleaseNote.jspa
 
     <hr>
     
-    <article>
-      <h2 class="blog-title"><a href="/news/2018/07/31/release-1.5.2.html">Apache Flink 1.5.2 Released</a></h2>
-
-      <p>31 Jul 2018
-      </p>
-
-      <p><p>The Apache Flink community released the second bugfix version of the Apache Flink 1.5 series.</p>
-
-</p>
-
-      <p><a href="/news/2018/07/31/release-1.5.2.html">Continue reading &raquo;</a></p>
-    </article>
-
-    <hr>
-    
-    <article>
-      <h2 class="blog-title"><a href="/news/2018/07/12/release-1.5.1.html">Apache Flink 1.5.1 Released</a></h2>
-
-      <p>12 Jul 2018
-      </p>
-
-      <p><p>The Apache Flink community released the first bugfix version of the Apache Flink 1.5 series.</p>
-
-</p>
-
-      <p><a href="/news/2018/07/12/release-1.5.1.html">Continue reading &raquo;</a></p>
-    </article>
-
-    <hr>
-    
 
     <!-- Pagination links -->
     
@@ -362,7 +362,7 @@ Please check the <a href="https://issues.apache.org/jira/secure/ReleaseNote.jspa
       
       </li>
       <li>
-        <span class="page_number ">Page: 11 of 17</span>
+        <span class="page_number ">Page: 11 of 18</span>
       </li>
       <li>
       
@@ -380,10 +380,35 @@ Please check the <a href="https://issues.apache.org/jira/secure/ReleaseNote.jspa
       
 
       
-    <h2>2021</h2>
+    <h2>2022</h2>
 
     <ul id="markdown-toc">
       
+      <li><a href="/2022/01/04/scheduler-performance-part-two.html">How We Improved Scheduler Performance for Large-scale Jobs - Part Two</a></li>
+
+      
+        
+      
+    
+      
+      
+
+      
+      <li><a href="/2022/01/04/scheduler-performance-part-one.html">How We Improved Scheduler Performance for Large-scale Jobs - Part One</a></li>
+
+      
+        
+    </ul>
+        <hr>
+        <h2>2021</h2>
+    <ul id="markdown-toc">
+        
+      
+    
+      
+      
+
+      
       <li><a href="/news/2021/12/22/log4j-statefun-release.html">Apache Flink StateFun Log4j emergency release</a></li>
 
       
diff --git a/content/blog/page12/index.html b/content/blog/page12/index.html
index 2819fe0..8833c3f 100644
--- a/content/blog/page12/index.html
+++ b/content/blog/page12/index.html
@@ -201,6 +201,36 @@
     <!-- Blog posts -->
     
     <article>
+      <h2 class="blog-title"><a href="/news/2018/07/31/release-1.5.2.html">Apache Flink 1.5.2 Released</a></h2>
+
+      <p>31 Jul 2018
+      </p>
+
+      <p><p>The Apache Flink community released the second bugfix version of the Apache Flink 1.5 series.</p>
+
+</p>
+
+      <p><a href="/news/2018/07/31/release-1.5.2.html">Continue reading &raquo;</a></p>
+    </article>
+
+    <hr>
+    
+    <article>
+      <h2 class="blog-title"><a href="/news/2018/07/12/release-1.5.1.html">Apache Flink 1.5.1 Released</a></h2>
+
+      <p>12 Jul 2018
+      </p>
+
+      <p><p>The Apache Flink community released the first bugfix version of the Apache Flink 1.5 series.</p>
+
+</p>
+
+      <p><a href="/news/2018/07/12/release-1.5.1.html">Continue reading &raquo;</a></p>
+    </article>
+
+    <hr>
+    
+    <article>
       <h2 class="blog-title"><a href="/news/2018/05/25/release-1.5.0.html">Apache Flink 1.5.0 Release Announcement</a></h2>
 
       <p>25 May 2018
@@ -316,39 +346,6 @@ for more detail.</p>
 
     <hr>
     
-    <article>
-      <h2 class="blog-title"><a href="/news/2017/11/22/release-1.4-and-1.5-timeline.html">Looking Ahead to Apache Flink 1.4.0 and 1.5.0</a></h2>
-
-      <p>22 Nov 2017
-       Stephan Ewen (<a href="https://twitter.com/StephanEwen">@StephanEwen</a>), Aljoscha Krettek (<a href="https://twitter.com/aljoscha">@aljoscha</a>), &amp; Mike Winters (<a href="https://twitter.com/wints">@wints</a>)</p>
-
-      <p><p>The Apache Flink 1.4.0 release is on track to happen in the next couple of weeks, and for all of the
-readers out there who haven’t been following the release discussion on <a href="http://flink.apache.org/community.html#mailing-lists">Flink’s developer mailing
-list</a>, we’d like to provide some details on
-what’s coming in Flink 1.4.0 as well as a preview of what the Flink community will save for 1.5.0.</p>
-
-</p>
-
-      <p><a href="/news/2017/11/22/release-1.4-and-1.5-timeline.html">Continue reading &raquo;</a></p>
-    </article>
-
-    <hr>
-    
-    <article>
-      <h2 class="blog-title"><a href="/news/2017/08/05/release-1.3.2.html">Apache Flink 1.3.2 Released</a></h2>
-
-      <p>05 Aug 2017
-      </p>
-
-      <p><p>The Apache Flink community released the second bugfix version of the Apache Flink 1.3 series.</p>
-
-</p>
-
-      <p><a href="/news/2017/08/05/release-1.3.2.html">Continue reading &raquo;</a></p>
-    </article>
-
-    <hr>
-    
 
     <!-- Pagination links -->
     
@@ -359,7 +356,7 @@ what’s coming in Flink 1.4.0 as well as a preview of what the Flink community
       
       </li>
       <li>
-        <span class="page_number ">Page: 12 of 17</span>
+        <span class="page_number ">Page: 12 of 18</span>
       </li>
       <li>
       
@@ -377,9 +374,34 @@ what’s coming in Flink 1.4.0 as well as a preview of what the Flink community
       
 
       
-    <h2>2021</h2>
+    <h2>2022</h2>
+
+    <ul id="markdown-toc">
+      
+      <li><a href="/2022/01/04/scheduler-performance-part-two.html">How We Improved Scheduler Performance for Large-scale Jobs - Part Two</a></li>
+
+      
+        
+      
+    
+      
+      
+
+      
+      <li><a href="/2022/01/04/scheduler-performance-part-one.html">How We Improved Scheduler Performance for Large-scale Jobs - Part One</a></li>
 
+      
+        
+    </ul>
+        <hr>
+        <h2>2021</h2>
     <ul id="markdown-toc">
+        
+      
+    
+      
+      
+
       
       <li><a href="/news/2021/12/22/log4j-statefun-release.html">Apache Flink StateFun Log4j emergency release</a></li>
 
diff --git a/content/blog/page13/index.html b/content/blog/page13/index.html
index 33b4b2e..bd2e924 100644
--- a/content/blog/page13/index.html
+++ b/content/blog/page13/index.html
@@ -201,6 +201,39 @@
     <!-- Blog posts -->
     
     <article>
+      <h2 class="blog-title"><a href="/news/2017/11/22/release-1.4-and-1.5-timeline.html">Looking Ahead to Apache Flink 1.4.0 and 1.5.0</a></h2>
+
+      <p>22 Nov 2017
+       Stephan Ewen (<a href="https://twitter.com/StephanEwen">@StephanEwen</a>), Aljoscha Krettek (<a href="https://twitter.com/aljoscha">@aljoscha</a>), &amp; Mike Winters (<a href="https://twitter.com/wints">@wints</a>)</p>
+
+      <p><p>The Apache Flink 1.4.0 release is on track to happen in the next couple of weeks, and for all of the
+readers out there who haven’t been following the release discussion on <a href="http://flink.apache.org/community.html#mailing-lists">Flink’s developer mailing
+list</a>, we’d like to provide some details on
+what’s coming in Flink 1.4.0 as well as a preview of what the Flink community will save for 1.5.0.</p>
+
+</p>
+
+      <p><a href="/news/2017/11/22/release-1.4-and-1.5-timeline.html">Continue reading &raquo;</a></p>
+    </article>
+
+    <hr>
+    
+    <article>
+      <h2 class="blog-title"><a href="/news/2017/08/05/release-1.3.2.html">Apache Flink 1.3.2 Released</a></h2>
+
+      <p>05 Aug 2017
+      </p>
+
+      <p><p>The Apache Flink community released the second bugfix version of the Apache Flink 1.3 series.</p>
+
+</p>
+
+      <p><a href="/news/2017/08/05/release-1.3.2.html">Continue reading &raquo;</a></p>
+    </article>
+
+    <hr>
+    
+    <article>
       <h2 class="blog-title"><a href="/features/2017/07/04/flink-rescalable-state.html">A Deep Dive into Rescalable State in Apache Flink</a></h2>
 
       <p>04 Jul 2017 by Stefan Richter (<a href="https://twitter.com/">@StefanRRichter</a>)
@@ -315,34 +348,6 @@
 
     <hr>
     
-    <article>
-      <h2 class="blog-title"><a href="/news/2017/02/06/release-1.2.0.html">Announcing Apache Flink 1.2.0</a></h2>
-
-      <p>06 Feb 2017 by Robert Metzger
-      </p>
-
-      <p><p>The Apache Flink community is excited to announce the 1.2.0 release.</p></p>
-
-      <p><a href="/news/2017/02/06/release-1.2.0.html">Continue reading &raquo;</a></p>
-    </article>
-
-    <hr>
-    
-    <article>
-      <h2 class="blog-title"><a href="/news/2016/12/21/release-1.1.4.html">Apache Flink 1.1.4 Released</a></h2>
-
-      <p>21 Dec 2016
-      </p>
-
-      <p><p>The Apache Flink community released the next bugfix version of the Apache Flink 1.1 series.</p>
-
-</p>
-
-      <p><a href="/news/2016/12/21/release-1.1.4.html">Continue reading &raquo;</a></p>
-    </article>
-
-    <hr>
-    
 
     <!-- Pagination links -->
     
@@ -353,7 +358,7 @@
       
       </li>
       <li>
-        <span class="page_number ">Page: 13 of 17</span>
+        <span class="page_number ">Page: 13 of 18</span>
       </li>
       <li>
       
@@ -371,9 +376,34 @@
       
 
       
-    <h2>2021</h2>
+    <h2>2022</h2>
+
+    <ul id="markdown-toc">
+      
+      <li><a href="/2022/01/04/scheduler-performance-part-two.html">How We Improved Scheduler Performance for Large-scale Jobs - Part Two</a></li>
+
+      
+        
+      
+    
+      
+      
+
+      
+      <li><a href="/2022/01/04/scheduler-performance-part-one.html">How We Improved Scheduler Performance for Large-scale Jobs - Part One</a></li>
 
+      
+        
+    </ul>
+        <hr>
+        <h2>2021</h2>
     <ul id="markdown-toc">
+        
+      
+    
+      
+      
+
       
       <li><a href="/news/2021/12/22/log4j-statefun-release.html">Apache Flink StateFun Log4j emergency release</a></li>
 
diff --git a/content/blog/page14/index.html b/content/blog/page14/index.html
index 69995df..87ac1a9 100644
--- a/content/blog/page14/index.html
+++ b/content/blog/page14/index.html
@@ -201,6 +201,34 @@
     <!-- Blog posts -->
     
     <article>
+      <h2 class="blog-title"><a href="/news/2017/02/06/release-1.2.0.html">Announcing Apache Flink 1.2.0</a></h2>
+
+      <p>06 Feb 2017 by Robert Metzger
+      </p>
+
+      <p><p>The Apache Flink community is excited to announce the 1.2.0 release.</p></p>
+
+      <p><a href="/news/2017/02/06/release-1.2.0.html">Continue reading &raquo;</a></p>
+    </article>
+
+    <hr>
+    
+    <article>
+      <h2 class="blog-title"><a href="/news/2016/12/21/release-1.1.4.html">Apache Flink 1.1.4 Released</a></h2>
+
+      <p>21 Dec 2016
+      </p>
+
+      <p><p>The Apache Flink community released the next bugfix version of the Apache Flink 1.1 series.</p>
+
+</p>
+
+      <p><a href="/news/2016/12/21/release-1.1.4.html">Continue reading &raquo;</a></p>
+    </article>
+
+    <hr>
+    
+    <article>
       <h2 class="blog-title"><a href="/news/2016/12/19/2016-year-in-review.html">Apache Flink in 2016: Year in Review</a></h2>
 
       <p>19 Dec 2016 by Mike Winters
@@ -317,36 +345,6 @@
 
     <hr>
     
-    <article>
-      <h2 class="blog-title"><a href="/news/2016/04/22/release-1.0.2.html">Flink 1.0.2 Released</a></h2>
-
-      <p>22 Apr 2016
-      </p>
-
-      <p><p>Today, the Flink community released Flink version <strong>1.0.2</strong>, the second bugfix release of the 1.0 series.</p>
-
-</p>
-
-      <p><a href="/news/2016/04/22/release-1.0.2.html">Continue reading &raquo;</a></p>
-    </article>
-
-    <hr>
-    
-    <article>
-      <h2 class="blog-title"><a href="/news/2016/04/14/flink-forward-announce.html">Flink Forward 2016 Call for Submissions Is Now Open</a></h2>
-
-      <p>14 Apr 2016 by Aljoscha Krettek (<a href="https://twitter.com/">@aljoscha</a>)
-      </p>
-
-      <p><p>We are happy to announce that the call for submissions for Flink Forward 2016 is now open! The conference will take place September 12-14, 2016 in Berlin, Germany, bringing together the open source stream processing community. Most Apache Flink committers will attend the conference, making it the ideal venue to learn more about the project and its roadmap and connect with the community.</p>
-
-</p>
-
-      <p><a href="/news/2016/04/14/flink-forward-announce.html">Continue reading &raquo;</a></p>
-    </article>
-
-    <hr>
-    
 
     <!-- Pagination links -->
     
@@ -357,7 +355,7 @@
       
       </li>
       <li>
-        <span class="page_number ">Page: 14 of 17</span>
+        <span class="page_number ">Page: 14 of 18</span>
       </li>
       <li>
       
@@ -375,9 +373,34 @@
       
 
       
-    <h2>2021</h2>
+    <h2>2022</h2>
+
+    <ul id="markdown-toc">
+      
+      <li><a href="/2022/01/04/scheduler-performance-part-two.html">How We Improved Scheduler Performance for Large-scale Jobs - Part Two</a></li>
+
+      
+        
+      
+    
+      
+      
+
+      
+      <li><a href="/2022/01/04/scheduler-performance-part-one.html">How We Improved Scheduler Performance for Large-scale Jobs - Part One</a></li>
 
+      
+        
+    </ul>
+        <hr>
+        <h2>2021</h2>
     <ul id="markdown-toc">
+        
+      
+    
+      
+      
+
       
       <li><a href="/news/2021/12/22/log4j-statefun-release.html">Apache Flink StateFun Log4j emergency release</a></li>
 
diff --git a/content/blog/page15/index.html b/content/blog/page15/index.html
index a8c6703..6a22a44 100644
--- a/content/blog/page15/index.html
+++ b/content/blog/page15/index.html
@@ -201,6 +201,36 @@
     <!-- Blog posts -->
     
     <article>
+      <h2 class="blog-title"><a href="/news/2016/04/22/release-1.0.2.html">Flink 1.0.2 Released</a></h2>
+
+      <p>22 Apr 2016
+      </p>
+
+      <p><p>Today, the Flink community released Flink version <strong>1.0.2</strong>, the second bugfix release of the 1.0 series.</p>
+
+</p>
+
+      <p><a href="/news/2016/04/22/release-1.0.2.html">Continue reading &raquo;</a></p>
+    </article>
+
+    <hr>
+    
+    <article>
+      <h2 class="blog-title"><a href="/news/2016/04/14/flink-forward-announce.html">Flink Forward 2016 Call for Submissions Is Now Open</a></h2>
+
+      <p>14 Apr 2016 by Aljoscha Krettek (<a href="https://twitter.com/">@aljoscha</a>)
+      </p>
+
+      <p><p>We are happy to announce that the call for submissions for Flink Forward 2016 is now open! The conference will take place September 12-14, 2016 in Berlin, Germany, bringing together the open source stream processing community. Most Apache Flink committers will attend the conference, making it the ideal venue to learn more about the project and its roadmap and connect with the community.</p>
+
+</p>
+
+      <p><a href="/news/2016/04/14/flink-forward-announce.html">Continue reading &raquo;</a></p>
+    </article>
+
+    <hr>
+    
+    <article>
       <h2 class="blog-title"><a href="/news/2016/04/06/cep-monitoring.html">Introducing Complex Event Processing (CEP) with Apache Flink</a></h2>
 
       <p>06 Apr 2016 by Till Rohrmann (<a href="https://twitter.com/">@stsffap</a>)
@@ -313,35 +343,6 @@
 
     <hr>
     
-    <article>
-      <h2 class="blog-title"><a href="/news/2015/11/16/release-0.10.0.html">Announcing Apache Flink 0.10.0</a></h2>
-
-      <p>16 Nov 2015
-      </p>
-
-      <p><p>The Apache Flink community is pleased to announce the availability of the 0.10.0 release. The community put significant effort into improving and extending Apache Flink since the last release, focusing on data stream processing and operational features. About 80 contributors provided bug fixes, improvements, and new features such that in total more than 400 JIRA issues could be resolved.</p>
-
-</p>
-
-      <p><a href="/news/2015/11/16/release-0.10.0.html">Continue reading &raquo;</a></p>
-    </article>
-
-    <hr>
-    
-    <article>
-      <h2 class="blog-title"><a href="/news/2015/09/16/off-heap-memory.html">Off-heap Memory in Apache Flink and the curious JIT compiler</a></h2>
-
-      <p>16 Sep 2015 by Stephan Ewen (<a href="https://twitter.com/">@stephanewen</a>)
-      </p>
-
-      <p><p>Running data-intensive code in the JVM and making it well-behaved is tricky. Systems that put billions of data objects naively onto the JVM heap face unpredictable OutOfMemoryErrors and Garbage Collection stalls. Of course, you still want to keep your data in memory as much as possible, for speed and responsiveness of the processing applications. In that context, &quot;off-heap&quot; has become almost something like a magic word to solve these problems.</p>
-<p>In this blog post, we will look at how Flink exploits off-heap memory. The feature is part of the upcoming release, but you can try it out with the latest nightly builds. We will also give a few interesting insights into the behavior of Java's JIT compiler for highly optimized methods and loops.</p></p>
-
-      <p><a href="/news/2015/09/16/off-heap-memory.html">Continue reading &raquo;</a></p>
-    </article>
-
-    <hr>
-    
 
     <!-- Pagination links -->
     
@@ -352,7 +353,7 @@
       
       </li>
       <li>
-        <span class="page_number ">Page: 15 of 17</span>
+        <span class="page_number ">Page: 15 of 18</span>
       </li>
       <li>
       
@@ -370,9 +371,34 @@
       
 
       
-    <h2>2021</h2>
+    <h2>2022</h2>
+
+    <ul id="markdown-toc">
+      
+      <li><a href="/2022/01/04/scheduler-performance-part-two.html">How We Improved Scheduler Performance for Large-scale Jobs - Part Two</a></li>
+
+      
+        
+      
+    
+      
+      
+
+      
+      <li><a href="/2022/01/04/scheduler-performance-part-one.html">How We Improved Scheduler Performance for Large-scale Jobs - Part One</a></li>
 
+      
+        
+    </ul>
+        <hr>
+        <h2>2021</h2>
     <ul id="markdown-toc">
+        
+      
+    
+      
+      
+
       
       <li><a href="/news/2021/12/22/log4j-statefun-release.html">Apache Flink StateFun Log4j emergency release</a></li>
 
diff --git a/content/blog/page16/index.html b/content/blog/page16/index.html
index f55fc65..4d77f99 100644
--- a/content/blog/page16/index.html
+++ b/content/blog/page16/index.html
@@ -201,6 +201,35 @@
     <!-- Blog posts -->
     
     <article>
+      <h2 class="blog-title"><a href="/news/2015/11/16/release-0.10.0.html">Announcing Apache Flink 0.10.0</a></h2>
+
+      <p>16 Nov 2015
+      </p>
+
+      <p><p>The Apache Flink community is pleased to announce the availability of the 0.10.0 release. The community put significant effort into improving and extending Apache Flink since the last release, focusing on data stream processing and operational features. About 80 contributors provided bug fixes, improvements, and new features such that in total more than 400 JIRA issues could be resolved.</p>
+
+</p>
+
+      <p><a href="/news/2015/11/16/release-0.10.0.html">Continue reading &raquo;</a></p>
+    </article>
+
+    <hr>
+    
+    <article>
+      <h2 class="blog-title"><a href="/news/2015/09/16/off-heap-memory.html">Off-heap Memory in Apache Flink and the curious JIT compiler</a></h2>
+
+      <p>16 Sep 2015 by Stephan Ewen (<a href="https://twitter.com/">@stephanewen</a>)
+      </p>
+
+      <p><p>Running data-intensive code in the JVM and making it well-behaved is tricky. Systems that put billions of data objects naively onto the JVM heap face unpredictable OutOfMemoryErrors and Garbage Collection stalls. Of course, you still want to keep your data in memory as much as possible, for speed and responsiveness of the processing applications. In that context, &quot;off-heap&quot; has become almost something like a magic word to solve these problems.</p>
+<p>In this blog post, we will look at how Flink exploits off-heap memory. The feature is part of the upcoming release, but you can try it out with the latest nightly builds. We will also give a few interesting insights into the behavior of Java's JIT compiler for highly optimized methods and loops.</p></p>
+
+      <p><a href="/news/2015/09/16/off-heap-memory.html">Continue reading &raquo;</a></p>
+    </article>
+
+    <hr>
+    
+    <article>
       <h2 class="blog-title"><a href="/news/2015/09/03/flink-forward.html">Announcing Flink Forward 2015</a></h2>
 
       <p>03 Sep 2015
@@ -329,37 +358,6 @@ release is a preview release that contains known issues.</p>
 
     <hr>
     
-    <article>
-      <h2 class="blog-title"><a href="/news/2015/03/13/peeking-into-Apache-Flinks-Engine-Room.html">Peeking into Apache Flink's Engine Room</a></h2>
-
-      <p>13 Mar 2015 by Fabian Hüske (<a href="https://twitter.com/">@fhueske</a>)
-      </p>
-
-      <p>Joins are prevalent operations in many data processing applications. Most data processing systems feature APIs that make joining data sets very easy. However, the internal algorithms for join processing are much more involved – especially if large data sets need to be efficiently handled. In this blog post, we cut through Apache Flink’s layered architecture and take a look at its internals with a focus on how it handles joins.</p>
-
-      <p><a href="/news/2015/03/13/peeking-into-Apache-Flinks-Engine-Room.html">Continue reading &raquo;</a></p>
-    </article>
-
-    <hr>
-    
-    <article>
-      <h2 class="blog-title"><a href="/news/2015/03/02/february-2015-in-flink.html">February 2015 in the Flink community</a></h2>
-
-      <p>02 Mar 2015
-      </p>
-
-      <p><p>February might be the shortest month of the year, but this does not
-mean that the Flink community has not been busy adding features to the
-system and fixing bugs. Here’s a rundown of the activity in the Flink
-community last month.</p>
-
-</p>
-
-      <p><a href="/news/2015/03/02/february-2015-in-flink.html">Continue reading &raquo;</a></p>
-    </article>
-
-    <hr>
-    
 
     <!-- Pagination links -->
     
@@ -370,7 +368,7 @@ community last month.</p>
       
       </li>
       <li>
-        <span class="page_number ">Page: 16 of 17</span>
+        <span class="page_number ">Page: 16 of 18</span>
       </li>
       <li>
       
@@ -388,10 +386,35 @@ community last month.</p>
       
 
       
-    <h2>2021</h2>
+    <h2>2022</h2>
 
     <ul id="markdown-toc">
       
+      <li><a href="/2022/01/04/scheduler-performance-part-two.html">How We Improved Scheduler Performance for Large-scale Jobs - Part Two</a></li>
+
+      
+        
+      
+    
+      
+      
+
+      
+      <li><a href="/2022/01/04/scheduler-performance-part-one.html">How We Improved Scheduler Performance for Large-scale Jobs - Part One</a></li>
+
+      
+        
+    </ul>
+        <hr>
+        <h2>2021</h2>
+    <ul id="markdown-toc">
+        
+      
+    
+      
+      
+
+      
       <li><a href="/news/2021/12/22/log4j-statefun-release.html">Apache Flink StateFun Log4j emergency release</a></li>
 
       
diff --git a/content/blog/page17/index.html b/content/blog/page17/index.html
index 799193b..3f2d350 100644
--- a/content/blog/page17/index.html
+++ b/content/blog/page17/index.html
@@ -201,6 +201,37 @@
     <!-- Blog posts -->
     
     <article>
+      <h2 class="blog-title"><a href="/news/2015/03/13/peeking-into-Apache-Flinks-Engine-Room.html">Peeking into Apache Flink's Engine Room</a></h2>
+
+      <p>13 Mar 2015 by Fabian Hüske (<a href="https://twitter.com/">@fhueske</a>)
+      </p>
+
+      <p>Joins are prevalent operations in many data processing applications. Most data processing systems feature APIs that make joining data sets very easy. However, the internal algorithms for join processing are much more involved – especially if large data sets need to be efficiently handled. In this blog post, we cut through Apache Flink’s layered architecture and take a look at its internals with a focus on how it handles joins.</p>
+
+      <p><a href="/news/2015/03/13/peeking-into-Apache-Flinks-Engine-Room.html">Continue reading &raquo;</a></p>
+    </article>
+
+    <hr>
+    
+    <article>
+      <h2 class="blog-title"><a href="/news/2015/03/02/february-2015-in-flink.html">February 2015 in the Flink community</a></h2>
+
+      <p>02 Mar 2015
+      </p>
+
+      <p><p>February might be the shortest month of the year, but this does not
+mean that the Flink community has not been busy adding features to the
+system and fixing bugs. Here’s a rundown of the activity in the Flink
+community last month.</p>
+
+</p>
+
+      <p><a href="/news/2015/03/02/february-2015-in-flink.html">Continue reading &raquo;</a></p>
+    </article>
+
+    <hr>
+    
+    <article>
       <h2 class="blog-title"><a href="/news/2015/02/09/streaming-example.html">Introducing Flink Streaming</a></h2>
 
       <p>09 Feb 2015
@@ -324,24 +355,6 @@ and offers a new API including definition of flexible windows.</p>
 
     <hr>
     
-    <article>
-      <h2 class="blog-title"><a href="/news/2014/08/26/release-0.6.html">Apache Flink 0.6 available</a></h2>
-
-      <p>26 Aug 2014
-      </p>
-
-      <p><p>We are happy to announce the availability of Flink 0.6. This is the
-first release of the system inside the Apache Incubator and under the
-name Flink. Releases up to 0.5 were under the name Stratosphere, the
-academic and open source project that Flink originates from.</p>
-
-</p>
-
-      <p><a href="/news/2014/08/26/release-0.6.html">Continue reading &raquo;</a></p>
-    </article>
-
-    <hr>
-    
 
     <!-- Pagination links -->
     
@@ -352,11 +365,11 @@ academic and open source project that Flink originates from.</p>
       
       </li>
       <li>
-        <span class="page_number ">Page: 17 of 17</span>
+        <span class="page_number ">Page: 17 of 18</span>
       </li>
       <li>
       
-        <span>Next</span>
+        <a href="/blog/page18" class="next">Next</a>
       
       </li>
     </ul>
@@ -370,9 +383,34 @@ academic and open source project that Flink originates from.</p>
       
 
       
-    <h2>2021</h2>
+    <h2>2022</h2>
+
+    <ul id="markdown-toc">
+      
+      <li><a href="/2022/01/04/scheduler-performance-part-two.html">How We Improved Scheduler Performance for Large-scale Jobs - Part Two</a></li>
+
+      
+        
+      
+    
+      
+      
+
+      
+      <li><a href="/2022/01/04/scheduler-performance-part-one.html">How We Improved Scheduler Performance for Large-scale Jobs - Part One</a></li>
 
+      
+        
+    </ul>
+        <hr>
+        <h2>2021</h2>
     <ul id="markdown-toc">
+        
+      
+    
+      
+      
+
       
       <li><a href="/news/2021/12/22/log4j-statefun-release.html">Apache Flink StateFun Log4j emergency release</a></li>
 
diff --git a/content/blog/page11/index.html b/content/blog/page18/index.html
similarity index 89%
copy from content/blog/page11/index.html
copy to content/blog/page18/index.html
index 623b1a2..641eec0 100644
--- a/content/blog/page11/index.html
+++ b/content/blog/page18/index.html
@@ -201,153 +201,19 @@
     <!-- Blog posts -->
     
     <article>
-      <h2 class="blog-title"><a href="/news/2018/12/21/release-1.7.1.html">Apache Flink 1.7.1 Released</a></h2>
+      <h2 class="blog-title"><a href="/news/2014/08/26/release-0.6.html">Apache Flink 0.6 available</a></h2>
 
-      <p>21 Dec 2018
+      <p>26 Aug 2014
       </p>
 
-      <p><p>The Apache Flink community released the first bugfix version of the Apache Flink 1.7 series.</p>
+      <p><p>We are happy to announce the availability of Flink 0.6. This is the
+first release of the system inside the Apache Incubator and under the
+name Flink. Releases up to 0.5 were under the name Stratosphere, the
+academic and open source project that Flink originates from.</p>
 
 </p>
 
-      <p><a href="/news/2018/12/21/release-1.7.1.html">Continue reading &raquo;</a></p>
-    </article>
-
-    <hr>
-    
-    <article>
-      <h2 class="blog-title"><a href="/news/2018/11/30/release-1.7.0.html">Apache Flink 1.7.0 Release Announcement</a></h2>
-
-      <p>30 Nov 2018
-       Till Rohrmann (<a href="https://twitter.com/stsffap">@stsffap</a>)</p>
-
-      <p><p>The Apache Flink community is pleased to announce Apache Flink 1.7.0. 
-The latest release includes more than 420 resolved issues and some exciting additions to Flink that we describe in the following sections of this post. 
-Please check the <a href="https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&amp;version=12343585">complete changelog</a> for more details.</p>
-
-</p>
-
-      <p><a href="/news/2018/11/30/release-1.7.0.html">Continue reading &raquo;</a></p>
-    </article>
-
-    <hr>
-    
-    <article>
-      <h2 class="blog-title"><a href="/news/2018/10/29/release-1.6.2.html">Apache Flink 1.6.2 Released</a></h2>
-
-      <p>29 Oct 2018
-      </p>
-
-      <p><p>The Apache Flink community released the second bugfix version of the Apache Flink 1.6 series.</p>
-
-</p>
-
-      <p><a href="/news/2018/10/29/release-1.6.2.html">Continue reading &raquo;</a></p>
-    </article>
-
-    <hr>
-    
-    <article>
-      <h2 class="blog-title"><a href="/news/2018/10/29/release-1.5.5.html">Apache Flink 1.5.5 Released</a></h2>
-
-      <p>29 Oct 2018
-      </p>
-
-      <p><p>The Apache Flink community released the fifth bugfix version of the Apache Flink 1.5 series.</p>
-
-</p>
-
-      <p><a href="/news/2018/10/29/release-1.5.5.html">Continue reading &raquo;</a></p>
-    </article>
-
-    <hr>
-    
-    <article>
-      <h2 class="blog-title"><a href="/news/2018/09/20/release-1.6.1.html">Apache Flink 1.6.1 Released</a></h2>
-
-      <p>20 Sep 2018
-      </p>
-
-      <p><p>The Apache Flink community released the first bugfix version of the Apache Flink 1.6 series.</p>
-
-</p>
-
-      <p><a href="/news/2018/09/20/release-1.6.1.html">Continue reading &raquo;</a></p>
-    </article>
-
-    <hr>
-    
-    <article>
-      <h2 class="blog-title"><a href="/news/2018/09/20/release-1.5.4.html">Apache Flink 1.5.4 Released</a></h2>
-
-      <p>20 Sep 2018
-      </p>
-
-      <p><p>The Apache Flink community released the fourth bugfix version of the Apache Flink 1.5 series.</p>
-
-</p>
-
-      <p><a href="/news/2018/09/20/release-1.5.4.html">Continue reading &raquo;</a></p>
-    </article>
-
-    <hr>
-    
-    <article>
-      <h2 class="blog-title"><a href="/news/2018/08/21/release-1.5.3.html">Apache Flink 1.5.3 Released</a></h2>
-
-      <p>21 Aug 2018
-      </p>
-
-      <p><p>The Apache Flink community released the third bugfix version of the Apache Flink 1.5 series.</p>
-
-</p>
-
-      <p><a href="/news/2018/08/21/release-1.5.3.html">Continue reading &raquo;</a></p>
-    </article>
-
-    <hr>
-    
-    <article>
-      <h2 class="blog-title"><a href="/news/2018/08/09/release-1.6.0.html">Apache Flink 1.6.0 Release Announcement</a></h2>
-
-      <p>09 Aug 2018
-       Till Rohrmann (<a href="https://twitter.com/stsffap">@stsffap</a>)</p>
-
-      <p><p>The Apache Flink community is proud to announce the 1.6.0 release. Over the past 2 months, the Flink community has worked hard to resolve more than 360 issues. Please check the <a href="https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&amp;version=12342760">complete changelog</a> for more details.</p>
-
-</p>
-
-      <p><a href="/news/2018/08/09/release-1.6.0.html">Continue reading &raquo;</a></p>
-    </article>
-
-    <hr>
-    
-    <article>
-      <h2 class="blog-title"><a href="/news/2018/07/31/release-1.5.2.html">Apache Flink 1.5.2 Released</a></h2>
-
-      <p>31 Jul 2018
-      </p>
-
-      <p><p>The Apache Flink community released the second bugfix version of the Apache Flink 1.5 series.</p>
-
-</p>
-
-      <p><a href="/news/2018/07/31/release-1.5.2.html">Continue reading &raquo;</a></p>
-    </article>
-
-    <hr>
-    
-    <article>
-      <h2 class="blog-title"><a href="/news/2018/07/12/release-1.5.1.html">Apache Flink 1.5.1 Released</a></h2>
-
-      <p>12 Jul 2018
-      </p>
-
-      <p><p>The Apache Flink community released the first bugfix version of the Apache Flink 1.5 series.</p>
-
-</p>
-
-      <p><a href="/news/2018/07/12/release-1.5.1.html">Continue reading &raquo;</a></p>
+      <p><a href="/news/2014/08/26/release-0.6.html">Continue reading &raquo;</a></p>
     </article>
 
     <hr>
@@ -358,15 +224,15 @@ Please check the <a href="https://issues.apache.org/jira/secure/ReleaseNote.jspa
     <ul class="pager">
       <li>
       
-        <a href="/blog/page10" class="previous">Previous</a>
+        <a href="/blog/page17" class="previous">Previous</a>
       
       </li>
       <li>
-        <span class="page_number ">Page: 11 of 17</span>
+        <span class="page_number ">Page: 18 of 18</span>
       </li>
       <li>
       
-        <a href="/blog/page12" class="next">Next</a>
+        <span>Next</span>
       
       </li>
     </ul>
@@ -380,10 +246,35 @@ Please check the <a href="https://issues.apache.org/jira/secure/ReleaseNote.jspa
       
 
       
-    <h2>2021</h2>
+    <h2>2022</h2>
 
     <ul id="markdown-toc">
       
+      <li><a href="/2022/01/04/scheduler-performance-part-two.html">How We Improved Scheduler Performance for Large-scale Jobs - Part Two</a></li>
+
+      
+        
+      
+    
+      
+      
+
+      
+      <li><a href="/2022/01/04/scheduler-performance-part-one.html">How We Improved Scheduler Performance for Large-scale Jobs - Part One</a></li>
+
+      
+        
+    </ul>
+        <hr>
+        <h2>2021</h2>
+    <ul id="markdown-toc">
+        
+      
+    
+      
+      
+
+      
       <li><a href="/news/2021/12/22/log4j-statefun-release.html">Apache Flink StateFun Log4j emergency release</a></li>
 
       
diff --git a/content/blog/page2/index.html b/content/blog/page2/index.html
index 00df501..3748350 100644
--- a/content/blog/page2/index.html
+++ b/content/blog/page2/index.html
@@ -201,6 +201,36 @@
     <!-- Blog posts -->
     
     <article>
+      <h2 class="blog-title"><a href="/2021/09/07/connector-table-sql-api-part2.html">Implementing a custom source connector for Table API and SQL - Part Two </a></h2>
+
+      <p>07 Sep 2021
+       Ingo Buerk  &amp; Daisy Tsang </p>
+
+      <p><p>In <a href="/2021/09/07/connector-table-sql-api-part1">part one</a> of this tutorial, you learned how to build a custom source connector for Flink. In part two, you will learn how to integrate the connector with a test email inbox through the IMAP protocol and filter out emails using Flink SQL.</p>
+
+</p>
+
+      <p><a href="/2021/09/07/connector-table-sql-api-part2.html">Continue reading &raquo;</a></p>
+    </article>
+
+    <hr>
+    
+    <article>
+      <h2 class="blog-title"><a href="/2021/09/07/connector-table-sql-api-part1.html">Implementing a Custom Source Connector for Table API and SQL - Part One </a></h2>
+
+      <p>07 Sep 2021
+       Ingo Buerk  &amp; Daisy Tsang </p>
+
+      <p><p>Part one of this tutorial will teach you how to build and run a custom source connector to be used with Table API and SQL, two high-level abstractions in Flink. The tutorial comes with a bundled <a href="https://docs.docker.com/compose/">docker-compose</a> setup that lets you easily run the connector. You can then try it out with Flink’s SQL client.</p>
+
+</p>
+
+      <p><a href="/2021/09/07/connector-table-sql-api-part1.html">Continue reading &raquo;</a></p>
+    </article>
+
+    <hr>
+    
+    <article>
       <h2 class="blog-title"><a href="/news/2021/08/31/release-statefun-3.1.0.html">Stateful Functions 3.1.0 Release Announcement</a></h2>
 
       <p>31 Aug 2021
@@ -326,32 +356,6 @@ This new release brings various improvements to the StateFun runtime, a leaner w
 
     <hr>
     
-    <article>
-      <h2 class="blog-title"><a href="/2021/05/06/reactive-mode.html">Scaling Flink automatically with Reactive Mode</a></h2>
-
-      <p>06 May 2021
-       Robert Metzger (<a href="https://twitter.com/rmetzger_">@rmetzger_</a>)</p>
-
-      <p>Apache Flink 1.13 introduced Reactive Mode, a big step forward in Flink's ability to dynamically adjust to changing workloads, reducing resource utilization and overall costs. This blog post showcases how to use this new feature on Kubernetes, including some lessons learned.</p>
-
-      <p><a href="/2021/05/06/reactive-mode.html">Continue reading &raquo;</a></p>
-    </article>
-
-    <hr>
-    
-    <article>
-      <h2 class="blog-title"><a href="/news/2021/05/03/release-1.13.0.html">Apache Flink 1.13.0 Release Announcement</a></h2>
-
-      <p>03 May 2021
-       Stephan Ewen (<a href="https://twitter.com/StephanEwen">@StephanEwen</a>) &amp; Dawid Wysakowicz (<a href="https://twitter.com/dwysakowicz">@dwysakowicz</a>)</p>
-
-      <p>The Apache Flink community is excited to announce the release of Flink 1.13.0! Around 200 contributors worked on over 1,000 issues to bring significant improvements to usability and observability as well as new features that improve the elasticity of Flink's Application-style deployments.</p>
-
-      <p><a href="/news/2021/05/03/release-1.13.0.html">Continue reading &raquo;</a></p>
-    </article>
-
-    <hr>
-    
 
     <!-- Pagination links -->
     
@@ -362,7 +366,7 @@ This new release brings various improvements to the StateFun runtime, a leaner w
       
       </li>
       <li>
-        <span class="page_number ">Page: 2 of 17</span>
+        <span class="page_number ">Page: 2 of 18</span>
       </li>
       <li>
       
@@ -380,10 +384,35 @@ This new release brings various improvements to the StateFun runtime, a leaner w
       
 
       
-    <h2>2021</h2>
+    <h2>2022</h2>
 
     <ul id="markdown-toc">
       
+      <li><a href="/2022/01/04/scheduler-performance-part-two.html">How We Improved Scheduler Performance for Large-scale Jobs - Part Two</a></li>
+
+      
+        
+      
+    
+      
+      
+
+      
+      <li><a href="/2022/01/04/scheduler-performance-part-one.html">How We Improved Scheduler Performance for Large-scale Jobs - Part One</a></li>
+
+      
+        
+    </ul>
+        <hr>
+        <h2>2021</h2>
+    <ul id="markdown-toc">
+        
+      
+    
+      
+      
+
+      
       <li><a href="/news/2021/12/22/log4j-statefun-release.html">Apache Flink StateFun Log4j emergency release</a></li>
 
       
diff --git a/content/blog/page3/index.html b/content/blog/page3/index.html
index f9b9516..bae0363 100644
--- a/content/blog/page3/index.html
+++ b/content/blog/page3/index.html
@@ -201,6 +201,32 @@
     <!-- Blog posts -->
     
     <article>
+      <h2 class="blog-title"><a href="/2021/05/06/reactive-mode.html">Scaling Flink automatically with Reactive Mode</a></h2>
+
+      <p>06 May 2021
+       Robert Metzger (<a href="https://twitter.com/rmetzger_">@rmetzger_</a>)</p>
+
+      <p>Apache Flink 1.13 introduced Reactive Mode, a big step forward in Flink's ability to dynamically adjust to changing workloads, reducing resource utilization and overall costs. This blog post showcases how to use this new feature on Kubernetes, including some lessons learned.</p>
+
+      <p><a href="/2021/05/06/reactive-mode.html">Continue reading &raquo;</a></p>
+    </article>
+
+    <hr>
+    
+    <article>
+      <h2 class="blog-title"><a href="/news/2021/05/03/release-1.13.0.html">Apache Flink 1.13.0 Release Announcement</a></h2>
+
+      <p>03 May 2021
+       Stephan Ewen (<a href="https://twitter.com/StephanEwen">@StephanEwen</a>) &amp; Dawid Wysakowicz (<a href="https://twitter.com/dwysakowicz">@dwysakowicz</a>)</p>
+
+      <p>The Apache Flink community is excited to announce the release of Flink 1.13.0! Around 200 contributors worked on over 1,000 issues to bring significant improvements to usability and observability as well as new features that improve the elasticity of Flink's Application-style deployments.</p>
+
+      <p><a href="/news/2021/05/03/release-1.13.0.html">Continue reading &raquo;</a></p>
+    </article>
+
+    <hr>
+    
+    <article>
       <h2 class="blog-title"><a href="/news/2021/04/29/release-1.12.3.html">Apache Flink 1.12.3 Released</a></h2>
 
       <p>29 Apr 2021
@@ -316,32 +342,6 @@ to develop scalable, consistent, and elastic distributed applications.</p>
 
     <hr>
     
-    <article>
-      <h2 class="blog-title"><a href="/news/2021/01/11/batch-fine-grained-fault-tolerance.html">Exploring fine-grained recovery of bounded data sets on Flink</a></h2>
-
-      <p>11 Jan 2021
-       Robert Metzger (<a href="https://twitter.com/rmetzger_">@rmetzger_</a>)</p>
-
-      <p>Apache Flink 1.9 introduced fine-grained recovery through FLIP-1. The Flink APIs that are made for bounded workloads benefit from this change by individually recovering failed operators, re-using results from the previous processing step. This blog post gives an overview of these changes and evaluates their effectiveness.</p>
-
-      <p><a href="/news/2021/01/11/batch-fine-grained-fault-tolerance.html">Continue reading &raquo;</a></p>
-    </article>
-
-    <hr>
-    
-    <article>
-      <h2 class="blog-title"><a href="/2021/01/07/pulsar-flink-connector-270.html">What's New in the Pulsar Flink Connector 2.7.0</a></h2>
-
-      <p>07 Jan 2021
-       Jianyun Zhao (<a href="https://twitter.com/yihy8023">@yihy8023</a>) &amp; Jennifer Huang (<a href="https://twitter.com/Jennife06125739">@Jennife06125739</a>)</p>
-
-      <p>With the unification of batch and streaming regarded as the future in data processing, the Pulsar Flink Connector provides an ideal solution for unified batch and stream processing with Apache Pulsar and Apache Flink. The Pulsar Flink Connector 2.7.0 supports features in Pulsar 2.7 and Flink 1.12 and is fully compatible with Flink's data format. The Pulsar Flink Connector 2.7.0 will be contributed to the Flink repository soon and the contribution process is ongoing.</p>
-
-      <p><a href="/2021/01/07/pulsar-flink-connector-270.html">Continue reading &raquo;</a></p>
-    </article>
-
-    <hr>
-    
 
     <!-- Pagination links -->
     
@@ -352,7 +352,7 @@ to develop scalable, consistent, and elastic distributed applications.</p>
       
       </li>
       <li>
-        <span class="page_number ">Page: 3 of 17</span>
+        <span class="page_number ">Page: 3 of 18</span>
       </li>
       <li>
       
@@ -370,9 +370,34 @@ to develop scalable, consistent, and elastic distributed applications.</p>
       
 
       
-    <h2>2021</h2>
+    <h2>2022</h2>
+
+    <ul id="markdown-toc">
+      
+      <li><a href="/2022/01/04/scheduler-performance-part-two.html">How We Improved Scheduler Performance for Large-scale Jobs - Part Two</a></li>
+
+      
+        
+      
+    
+      
+      
+
+      
+      <li><a href="/2022/01/04/scheduler-performance-part-one.html">How We Improved Scheduler Performance for Large-scale Jobs - Part One</a></li>
 
+      
+        
+    </ul>
+        <hr>
+        <h2>2021</h2>
     <ul id="markdown-toc">
+        
+      
+    
+      
+      
+
       
       <li><a href="/news/2021/12/22/log4j-statefun-release.html">Apache Flink StateFun Log4j emergency release</a></li>
 
diff --git a/content/blog/page4/index.html b/content/blog/page4/index.html
index 5a3b9f4..f8da703 100644
--- a/content/blog/page4/index.html
+++ b/content/blog/page4/index.html
@@ -201,6 +201,32 @@
     <!-- Blog posts -->
     
     <article>
+      <h2 class="blog-title"><a href="/news/2021/01/11/batch-fine-grained-fault-tolerance.html">Exploring fine-grained recovery of bounded data sets on Flink</a></h2>
+
+      <p>11 Jan 2021
+       Robert Metzger (<a href="https://twitter.com/rmetzger_">@rmetzger_</a>)</p>
+
+      <p>Apache Flink 1.9 introduced fine-grained recovery through FLIP-1. The Flink APIs that are made for bounded workloads benefit from this change by individually recovering failed operators, re-using results from the previous processing step. This blog post gives an overview of these changes and evaluates their effectiveness.</p>
+
+      <p><a href="/news/2021/01/11/batch-fine-grained-fault-tolerance.html">Continue reading &raquo;</a></p>
+    </article>
+
+    <hr>
+    
+    <article>
+      <h2 class="blog-title"><a href="/2021/01/07/pulsar-flink-connector-270.html">What's New in the Pulsar Flink Connector 2.7.0</a></h2>
+
+      <p>07 Jan 2021
+       Jianyun Zhao (<a href="https://twitter.com/yihy8023">@yihy8023</a>) &amp; Jennifer Huang (<a href="https://twitter.com/Jennife06125739">@Jennife06125739</a>)</p>
+
+      <p>With the unification of batch and streaming regarded as the future in data processing, the Pulsar Flink Connector provides an ideal solution for unified batch and stream processing with Apache Pulsar and Apache Flink. The Pulsar Flink Connector 2.7.0 supports features in Pulsar 2.7 and Flink 1.12 and is fully compatible with Flink's data format. The Pulsar Flink Connector 2.7.0 will be contributed to the Flink repository soon and the contribution process is ongoing.</p>
+
+      <p><a href="/2021/01/07/pulsar-flink-connector-270.html">Continue reading &raquo;</a></p>
+    </article>
+
+    <hr>
+    
+    <article>
       <h2 class="blog-title"><a href="/news/2021/01/02/release-statefun-2.2.2.html">Stateful Functions 2.2.2 Release Announcement</a></h2>
 
       <p>02 Jan 2021
@@ -316,34 +342,6 @@ as well as increased observability for operational purposes.</p>
 
     <hr>
     
-    <article>
-      <h2 class="blog-title"><a href="/news/2020/09/17/release-1.11.2.html">Apache Flink 1.11.2 Released</a></h2>
-
-      <p>17 Sep 2020
-       Zhu Zhu (<a href="https://twitter.com/zhuzhv">@zhuzhv</a>)</p>
-
-      <p><p>The Apache Flink community released the second bugfix version of the Apache Flink 1.11 series.</p>
-
-</p>
-
-      <p><a href="/news/2020/09/17/release-1.11.2.html">Continue reading &raquo;</a></p>
-    </article>
-
-    <hr>
-    
-    <article>
-      <h2 class="blog-title"><a href="/news/2020/09/04/community-update.html">Flink Community Update - August'20</a></h2>
-
-      <p>04 Sep 2020
-       Marta Paes (<a href="https://twitter.com/morsapaes">@morsapaes</a>)</p>
-
-      <p>Ah, so much for a quiet August month. This time around, we bring you some new Flink Improvement Proposals (FLIPs), a preview of the upcoming Flink Stateful Functions 2.2 release and a look into how far Flink has come in comparison to 2019.</p>
-
-      <p><a href="/news/2020/09/04/community-update.html">Continue reading &raquo;</a></p>
-    </article>
-
-    <hr>
-    
 
     <!-- Pagination links -->
     
@@ -354,7 +352,7 @@ as well as increased observability for operational purposes.</p>
       
       </li>
       <li>
-        <span class="page_number ">Page: 4 of 17</span>
+        <span class="page_number ">Page: 4 of 18</span>
       </li>
       <li>
       
@@ -372,9 +370,34 @@ as well as increased observability for operational purposes.</p>
       
 
       
-    <h2>2021</h2>
+    <h2>2022</h2>
+
+    <ul id="markdown-toc">
+      
+      <li><a href="/2022/01/04/scheduler-performance-part-two.html">How We Improved Scheduler Performance for Large-scale Jobs - Part Two</a></li>
+
+      
+        
+      
+    
+      
+      
+
+      
+      <li><a href="/2022/01/04/scheduler-performance-part-one.html">How We Improved Scheduler Performance for Large-scale Jobs - Part One</a></li>
 
+      
+        
+    </ul>
+        <hr>
+        <h2>2021</h2>
     <ul id="markdown-toc">
+        
+      
+    
+      
+      
+
       
       <li><a href="/news/2021/12/22/log4j-statefun-release.html">Apache Flink StateFun Log4j emergency release</a></li>
 
diff --git a/content/blog/page5/index.html b/content/blog/page5/index.html
index 86457af..8e8a18b 100644
--- a/content/blog/page5/index.html
+++ b/content/blog/page5/index.html
@@ -201,6 +201,34 @@
     <!-- Blog posts -->
     
     <article>
+      <h2 class="blog-title"><a href="/news/2020/09/17/release-1.11.2.html">Apache Flink 1.11.2 Released</a></h2>
+
+      <p>17 Sep 2020
+       Zhu Zhu (<a href="https://twitter.com/zhuzhv">@zhuzhv</a>)</p>
+
+      <p><p>The Apache Flink community released the second bugfix version of the Apache Flink 1.11 series.</p>
+
+</p>
+
+      <p><a href="/news/2020/09/17/release-1.11.2.html">Continue reading &raquo;</a></p>
+    </article>
+
+    <hr>
+    
+    <article>
+      <h2 class="blog-title"><a href="/news/2020/09/04/community-update.html">Flink Community Update - August'20</a></h2>
+
+      <p>04 Sep 2020
+       Marta Paes (<a href="https://twitter.com/morsapaes">@morsapaes</a>)</p>
+
+      <p>Ah, so much for a quiet August month. This time around, we bring you some new Flink Improvement Proposals (FLIPs), a preview of the upcoming Flink Stateful Functions 2.2 release and a look into how far Flink has come in comparison to 2019.</p>
+
+      <p><a href="/news/2020/09/04/community-update.html">Continue reading &raquo;</a></p>
+    </article>
+
+    <hr>
+    
+    <article>
       <h2 class="blog-title"><a href="/2020/09/01/flink-1.11-memory-management-improvements.html">Memory Management improvements for Flink’s JobManager in Apache Flink 1.11</a></h2>
 
       <p>01 Sep 2020
@@ -308,34 +336,6 @@
 
     <hr>
     
-    <article>
-      <h2 class="blog-title"><a href="/news/2020/07/27/community-update.html">Flink Community Update - July'20</a></h2>
-
-      <p>27 Jul 2020
-       Marta Paes (<a href="https://twitter.com/morsapaes">@morsapaes</a>)</p>
-
-      <p>As July draws to an end, we look back at a monthful of activity in the Flink community, including two releases (!) and some work around improving the first-time contribution experience in the project. Also, events are starting to pick up again, so we've put together a list of some great events you can (virtually) attend in August!</p>
-
-      <p><a href="/news/2020/07/27/community-update.html">Continue reading &raquo;</a></p>
-    </article>
-
-    <hr>
-    
-    <article>
-      <h2 class="blog-title"><a href="/2020/07/23/catalogs.html">Sharing is caring - Catalogs in Flink SQL</a></h2>
-
-      <p>23 Jul 2020
-       Dawid Wysakowicz (<a href="https://twitter.com/dwysakowicz">@dwysakowicz</a>)</p>
-
-      <p><p>With an ever-growing number of people working with data, it’s a common practice for companies to build self-service platforms with the goal of democratizing their access across different teams and — especially — to enable users from any background to be independent in their data needs. In such environments, metadata management becomes a crucial aspect. Without it, users often work blindly, spending too much time searching for datasets and their location, figuring out data for [...]
-
-</p>
-
-      <p><a href="/2020/07/23/catalogs.html">Continue reading &raquo;</a></p>
-    </article>
-
-    <hr>
-    
 
     <!-- Pagination links -->
     
@@ -346,7 +346,7 @@
       
       </li>
       <li>
-        <span class="page_number ">Page: 5 of 17</span>
+        <span class="page_number ">Page: 5 of 18</span>
       </li>
       <li>
       
@@ -364,9 +364,34 @@
       
 
       
-    <h2>2021</h2>
+    <h2>2022</h2>
+
+    <ul id="markdown-toc">
+      
+      <li><a href="/2022/01/04/scheduler-performance-part-two.html">How We Improved Scheduler Performance for Large-scale Jobs - Part Two</a></li>
+
+      
+        
+      
+    
+      
+      
+
+      
+      <li><a href="/2022/01/04/scheduler-performance-part-one.html">How We Improved Scheduler Performance for Large-scale Jobs - Part One</a></li>
 
+      
+        
+    </ul>
+        <hr>
+        <h2>2021</h2>
     <ul id="markdown-toc">
+        
+      
+    
+      
+      
+
       
       <li><a href="/news/2021/12/22/log4j-statefun-release.html">Apache Flink StateFun Log4j emergency release</a></li>
 
diff --git a/content/blog/page6/index.html b/content/blog/page6/index.html
index b0c7d41..68602b0 100644
--- a/content/blog/page6/index.html
+++ b/content/blog/page6/index.html
@@ -201,6 +201,34 @@
     <!-- Blog posts -->
     
     <article>
+      <h2 class="blog-title"><a href="/news/2020/07/27/community-update.html">Flink Community Update - July'20</a></h2>
+
+      <p>27 Jul 2020
+       Marta Paes (<a href="https://twitter.com/morsapaes">@morsapaes</a>)</p>
+
+      <p>As July draws to an end, we look back at a monthful of activity in the Flink community, including two releases (!) and some work around improving the first-time contribution experience in the project. Also, events are starting to pick up again, so we've put together a list of some great events you can (virtually) attend in August!</p>
+
+      <p><a href="/news/2020/07/27/community-update.html">Continue reading &raquo;</a></p>
+    </article>
+
+    <hr>
+    
+    <article>
+      <h2 class="blog-title"><a href="/2020/07/23/catalogs.html">Sharing is caring - Catalogs in Flink SQL</a></h2>
+
+      <p>23 Jul 2020
+       Dawid Wysakowicz (<a href="https://twitter.com/dwysakowicz">@dwysakowicz</a>)</p>
+
+      <p><p>With an ever-growing number of people working with data, it’s a common practice for companies to build self-service platforms with the goal of democratizing their access across different teams and — especially — to enable users from any background to be independent in their data needs. In such environments, metadata management becomes a crucial aspect. Without it, users often work blindly, spending too much time searching for datasets and their location, figuring out data for [...]
+
+</p>
+
+      <p><a href="/2020/07/23/catalogs.html">Continue reading &raquo;</a></p>
+    </article>
+
+    <hr>
+    
+    <article>
       <h2 class="blog-title"><a href="/news/2020/07/21/release-1.11.1.html">Apache Flink 1.11.1 Released</a></h2>
 
       <p>21 Jul 2020
@@ -324,32 +352,6 @@ and provide a tutorial for running Streaming ETL with Flink on Zeppelin.</p>
 
     <hr>
     
-    <article>
-      <h2 class="blog-title"><a href="/news/2020/05/07/community-update.html">Flink Community Update - May'20</a></h2>
-
-      <p>07 May 2020
-       Marta Paes (<a href="https://twitter.com/morsapaes">@morsapaes</a>)</p>
-
-      <p>Can you smell it? It’s release month! This time around, we’re warming up for Flink 1.11 and peeping back to the past month in the Flink community — with the release of Stateful Functions 2.0, a new self-paced Flink training and some efforts to improve the Flink documentation experience.</p>
-
-      <p><a href="/news/2020/05/07/community-update.html">Continue reading &raquo;</a></p>
-    </article>
-
-    <hr>
-    
-    <article>
-      <h2 class="blog-title"><a href="/news/2020/05/04/season-of-docs.html">Applying to Google Season of Docs 2020</a></h2>
-
-      <p>04 May 2020
-       Marta Paes (<a href="https://twitter.com/morsapaes">@morsapaes</a>)</p>
-
-      <p>The Flink community is thrilled to share that the project is applying again to Google Season of Docs (GSoD) this year! If you’re unfamiliar with the program, GSoD is a great initiative organized by Google Open Source to pair technical writers with mentors to work on documentation for open source projects. Does working shoulder to shoulder with the Flink community on documentation sound exciting? We’d love to hear from you!</p>
-
-      <p><a href="/news/2020/05/04/season-of-docs.html">Continue reading &raquo;</a></p>
-    </article>
-
-    <hr>
-    
 
     <!-- Pagination links -->
     
@@ -360,7 +362,7 @@ and provide a tutorial for running Streaming ETL with Flink on Zeppelin.</p>
       
       </li>
       <li>
-        <span class="page_number ">Page: 6 of 17</span>
+        <span class="page_number ">Page: 6 of 18</span>
       </li>
       <li>
       
@@ -378,10 +380,35 @@ and provide a tutorial for running Streaming ETL with Flink on Zeppelin.</p>
       
 
       
-    <h2>2021</h2>
+    <h2>2022</h2>
 
     <ul id="markdown-toc">
       
+      <li><a href="/2022/01/04/scheduler-performance-part-two.html">How We Improved Scheduler Performance for Large-scale Jobs - Part Two</a></li>
+
+      
+        
+      
+    
+      
+      
+
+      
+      <li><a href="/2022/01/04/scheduler-performance-part-one.html">How We Improved Scheduler Performance for Large-scale Jobs - Part One</a></li>
+
+      
+        
+    </ul>
+        <hr>
+        <h2>2021</h2>
+    <ul id="markdown-toc">
+        
+      
+    
+      
+      
+
+      
       <li><a href="/news/2021/12/22/log4j-statefun-release.html">Apache Flink StateFun Log4j emergency release</a></li>
 
       
diff --git a/content/blog/page7/index.html b/content/blog/page7/index.html
index 5d68b87..1c0943e 100644
--- a/content/blog/page7/index.html
+++ b/content/blog/page7/index.html
@@ -201,6 +201,32 @@
     <!-- Blog posts -->
     
     <article>
+      <h2 class="blog-title"><a href="/news/2020/05/07/community-update.html">Flink Community Update - May'20</a></h2>
+
+      <p>07 May 2020
+       Marta Paes (<a href="https://twitter.com/morsapaes">@morsapaes</a>)</p>
+
+      <p>Can you smell it? It’s release month! This time around, we’re warming up for Flink 1.11 and peeping back to the past month in the Flink community — with the release of Stateful Functions 2.0, a new self-paced Flink training and some efforts to improve the Flink documentation experience.</p>
+
+      <p><a href="/news/2020/05/07/community-update.html">Continue reading &raquo;</a></p>
+    </article>
+
+    <hr>
+    
+    <article>
+      <h2 class="blog-title"><a href="/news/2020/05/04/season-of-docs.html">Applying to Google Season of Docs 2020</a></h2>
+
+      <p>04 May 2020
+       Marta Paes (<a href="https://twitter.com/morsapaes">@morsapaes</a>)</p>
+
+      <p>The Flink community is thrilled to share that the project is applying again to Google Season of Docs (GSoD) this year! If you’re unfamiliar with the program, GSoD is a great initiative organized by Google Open Source to pair technical writers with mentors to work on documentation for open source projects. Does working shoulder to shoulder with the Flink community on documentation sound exciting? We’d love to hear from you!</p>
+
+      <p><a href="/news/2020/05/04/season-of-docs.html">Continue reading &raquo;</a></p>
+    </article>
+
+    <hr>
+    
+    <article>
       <h2 class="blog-title"><a href="/news/2020/04/24/release-1.9.3.html">Apache Flink 1.9.3 Released</a></h2>
 
       <p>24 Apr 2020
@@ -311,32 +337,6 @@ This release marks a big milestone: Stateful Functions 2.0 is not only an API up
 
     <hr>
     
-    <article>
-      <h2 class="blog-title"><a href="/ecosystem/2020/02/22/apache-beam-how-beam-runs-on-top-of-flink.html">Apache Beam: How Beam Runs on Top of Flink</a></h2>
-
-      <p>22 Feb 2020
-       Maximilian Michels (<a href="https://twitter.com/stadtlegende">@stadtlegende</a>) &amp; Markos Sfikas (<a href="https://twitter.com/MarkSfik">@MarkSfik</a>)</p>
-
-      <p>This blog post discusses the reasons to use Flink together with Beam for your stream processing needs and takes a closer look at how Flink works with Beam under the hood.</p>
-
-      <p><a href="/ecosystem/2020/02/22/apache-beam-how-beam-runs-on-top-of-flink.html">Continue reading &raquo;</a></p>
-    </article>
-
-    <hr>
-    
-    <article>
-      <h2 class="blog-title"><a href="/news/2020/02/20/ddl.html">No Java Required: Configuring Sources and Sinks in SQL</a></h2>
-
-      <p>20 Feb 2020
-       Seth Wiesman (<a href="https://twitter.com/sjwiesman">@sjwiesman</a>)</p>
-
-      <p>This post discusses the efforts of the Flink community as they relate to end to end applications with SQL in Apache Flink.</p>
-
-      <p><a href="/news/2020/02/20/ddl.html">Continue reading &raquo;</a></p>
-    </article>
-
-    <hr>
-    
 
     <!-- Pagination links -->
     
@@ -347,7 +347,7 @@ This release marks a big milestone: Stateful Functions 2.0 is not only an API up
       
       </li>
       <li>
-        <span class="page_number ">Page: 7 of 17</span>
+        <span class="page_number ">Page: 7 of 18</span>
       </li>
       <li>
       
@@ -365,9 +365,34 @@ This release marks a big milestone: Stateful Functions 2.0 is not only an API up
       
 
       
-    <h2>2021</h2>
+    <h2>2022</h2>
+
+    <ul id="markdown-toc">
+      
+      <li><a href="/2022/01/04/scheduler-performance-part-two.html">How We Improved Scheduler Performance for Large-scale Jobs - Part Two</a></li>
+
+      
+        
+      
+    
+      
+      
+
+      
+      <li><a href="/2022/01/04/scheduler-performance-part-one.html">How We Improved Scheduler Performance for Large-scale Jobs - Part One</a></li>
 
+      
+        
+    </ul>
+        <hr>
+        <h2>2021</h2>
     <ul id="markdown-toc">
+        
+      
+    
+      
+      
+
       
       <li><a href="/news/2021/12/22/log4j-statefun-release.html">Apache Flink StateFun Log4j emergency release</a></li>
 
diff --git a/content/blog/page8/index.html b/content/blog/page8/index.html
index 690880e..13eb939 100644
--- a/content/blog/page8/index.html
+++ b/content/blog/page8/index.html
@@ -201,6 +201,32 @@
     <!-- Blog posts -->
     
     <article>
+      <h2 class="blog-title"><a href="/ecosystem/2020/02/22/apache-beam-how-beam-runs-on-top-of-flink.html">Apache Beam: How Beam Runs on Top of Flink</a></h2>
+
+      <p>22 Feb 2020
+       Maximilian Michels (<a href="https://twitter.com/stadtlegende">@stadtlegende</a>) &amp; Markos Sfikas (<a href="https://twitter.com/MarkSfik">@MarkSfik</a>)</p>
+
+      <p>This blog post discusses the reasons to use Flink together with Beam for your stream processing needs and takes a closer look at how Flink works with Beam under the hood.</p>
+
+      <p><a href="/ecosystem/2020/02/22/apache-beam-how-beam-runs-on-top-of-flink.html">Continue reading &raquo;</a></p>
+    </article>
+
+    <hr>
+    
+    <article>
+      <h2 class="blog-title"><a href="/news/2020/02/20/ddl.html">No Java Required: Configuring Sources and Sinks in SQL</a></h2>
+
+      <p>20 Feb 2020
+       Seth Wiesman (<a href="https://twitter.com/sjwiesman">@sjwiesman</a>)</p>
+
+      <p>This post discusses the efforts of the Flink community as they relate to end to end applications with SQL in Apache Flink.</p>
+
+      <p><a href="/news/2020/02/20/ddl.html">Continue reading &raquo;</a></p>
+    </article>
+
+    <hr>
+    
+    <article>
       <h2 class="blog-title"><a href="/news/2020/02/11/release-1.10.0.html">Apache Flink 1.10.0 Release Announcement</a></h2>
 
       <p>11 Feb 2020
@@ -310,34 +336,6 @@
 
     <hr>
     
-    <article>
-      <h2 class="blog-title"><a href="/news/2019/10/18/release-1.9.1.html">Apache Flink 1.9.1 Released</a></h2>
-
-      <p>18 Oct 2019
-       Jark Wu (<a href="https://twitter.com/JarkWu">@JarkWu</a>)</p>
-
-      <p><p>The Apache Flink community released the first bugfix version of the Apache Flink 1.9 series.</p>
-
-</p>
-
-      <p><a href="/news/2019/10/18/release-1.9.1.html">Continue reading &raquo;</a></p>
-    </article>
-
-    <hr>
-    
-    <article>
-      <h2 class="blog-title"><a href="/feature/2019/09/13/state-processor-api.html">The State Processor API: How to Read, write and modify the state of Flink applications</a></h2>
-
-      <p>13 Sep 2019
-       Seth Wiesman (<a href="https://twitter.com/sjwiesman">@sjwiesman</a>) &amp; Fabian Hueske (<a href="https://twitter.com/fhueske">@fhueske</a>)</p>
-
-      <p>This post explores the State Processor API, introduced with Flink 1.9.0, why this feature is a big step for Flink, what you can use it for, how to use it, and some future directions that align the feature with Apache Flink's evolution into a system for unified batch and stream processing.</p>
-
-      <p><a href="/feature/2019/09/13/state-processor-api.html">Continue reading &raquo;</a></p>
-    </article>
-
-    <hr>
-    
 
     <!-- Pagination links -->
     
@@ -348,7 +346,7 @@
       
       </li>
       <li>
-        <span class="page_number ">Page: 8 of 17</span>
+        <span class="page_number ">Page: 8 of 18</span>
       </li>
       <li>
       
@@ -366,9 +364,34 @@
       
 
       
-    <h2>2021</h2>
+    <h2>2022</h2>
+
+    <ul id="markdown-toc">
+      
+      <li><a href="/2022/01/04/scheduler-performance-part-two.html">How We Improved Scheduler Performance for Large-scale Jobs - Part Two</a></li>
+
+      
+        
+      
+    
+      
+      
+
+      
+      <li><a href="/2022/01/04/scheduler-performance-part-one.html">How We Improved Scheduler Performance for Large-scale Jobs - Part One</a></li>
 
+      
+        
+    </ul>
+        <hr>
+        <h2>2021</h2>
     <ul id="markdown-toc">
+        
+      
+    
+      
+      
+
       
       <li><a href="/news/2021/12/22/log4j-statefun-release.html">Apache Flink StateFun Log4j emergency release</a></li>
 
diff --git a/content/blog/page9/index.html b/content/blog/page9/index.html
index 8003af4..b111b0e 100644
--- a/content/blog/page9/index.html
+++ b/content/blog/page9/index.html
@@ -201,6 +201,34 @@
     <!-- Blog posts -->
     
     <article>
+      <h2 class="blog-title"><a href="/news/2019/10/18/release-1.9.1.html">Apache Flink 1.9.1 Released</a></h2>
+
+      <p>18 Oct 2019
+       Jark Wu (<a href="https://twitter.com/JarkWu">@JarkWu</a>)</p>
+
+      <p><p>The Apache Flink community released the first bugfix version of the Apache Flink 1.9 series.</p>
+
+</p>
+
+      <p><a href="/news/2019/10/18/release-1.9.1.html">Continue reading &raquo;</a></p>
+    </article>
+
+    <hr>
+    
+    <article>
+      <h2 class="blog-title"><a href="/feature/2019/09/13/state-processor-api.html">The State Processor API: How to Read, write and modify the state of Flink applications</a></h2>
+
+      <p>13 Sep 2019
+       Seth Wiesman (<a href="https://twitter.com/sjwiesman">@sjwiesman</a>) &amp; Fabian Hueske (<a href="https://twitter.com/fhueske">@fhueske</a>)</p>
+
+      <p>This post explores the State Processor API, introduced with Flink 1.9.0, why this feature is a big step for Flink, what you can use it for, how to use it, and some future directions that align the feature with Apache Flink's evolution into a system for unified batch and stream processing.</p>
+
+      <p><a href="/feature/2019/09/13/state-processor-api.html">Continue reading &raquo;</a></p>
+    </article>
+
+    <hr>
+    
+    <article>
       <h2 class="blog-title"><a href="/news/2019/09/11/release-1.8.2.html">Apache Flink 1.8.2 Released</a></h2>
 
       <p>11 Sep 2019
@@ -311,32 +339,6 @@
 
     <hr>
     
-    <article>
-      <h2 class="blog-title"><a href="/2019/05/14/temporal-tables.html">Flux capacitor, huh? Temporal Tables and Joins in Streaming SQL</a></h2>
-
-      <p>14 May 2019
-       Marta Paes (<a href="https://twitter.com/morsapaes">@morsapaes</a>)</p>
-
-      <p>Apache Flink natively supports temporal table joins since the 1.7 release for straightforward temporal data handling. In this blog post, we provide an overview of how this new concept can be leveraged for effective point-in-time analysis in streaming scenarios.</p>
-
-      <p><a href="/2019/05/14/temporal-tables.html">Continue reading &raquo;</a></p>
-    </article>
-
-    <hr>
-    
-    <article>
-      <h2 class="blog-title"><a href="/2019/05/03/pulsar-flink.html">When Flink & Pulsar Come Together</a></h2>
-
-      <p>03 May 2019
-       Sijie Guo (<a href="https://twitter.com/sijieg">@sijieg</a>)</p>
-
-      <p>Apache Flink and Apache Pulsar are distributed data processing systems. When combined, they offer elastic data processing at large scale. This post describes how Pulsar and Flink can work together to provide a seamless developer experience.</p>
-
-      <p><a href="/2019/05/03/pulsar-flink.html">Continue reading &raquo;</a></p>
-    </article>
-
-    <hr>
-    
 
     <!-- Pagination links -->
     
@@ -347,7 +349,7 @@
       
       </li>
       <li>
-        <span class="page_number ">Page: 9 of 17</span>
+        <span class="page_number ">Page: 9 of 18</span>
       </li>
       <li>
       
@@ -365,10 +367,35 @@
       
 
       
-    <h2>2021</h2>
+    <h2>2022</h2>
 
     <ul id="markdown-toc">
       
+      <li><a href="/2022/01/04/scheduler-performance-part-two.html">How We Improved Scheduler Performance for Large-scale Jobs - Part Two</a></li>
+
+      
+        
+      
+    
+      
+      
+
+      
+      <li><a href="/2022/01/04/scheduler-performance-part-one.html">How We Improved Scheduler Performance for Large-scale Jobs - Part One</a></li>
+
+      
+        
+    </ul>
+        <hr>
+        <h2>2021</h2>
+    <ul id="markdown-toc">
+        
+      
+    
+      
+      
+
+      
       <li><a href="/news/2021/12/22/log4j-statefun-release.html">Apache Flink StateFun Log4j emergency release</a></li>
 
       
diff --git a/content/img/blog/2022-01-05-scheduler-performance/1-distribution-pattern.svg b/content/img/blog/2022-01-05-scheduler-performance/1-distribution-pattern.svg
new file mode 100644
index 0000000..5424fbd
--- /dev/null
+++ b/content/img/blog/2022-01-05-scheduler-performance/1-distribution-pattern.svg
@@ -0,0 +1,4 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!-- Do not edit this file with editors other than diagrams.net -->
+<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
+<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1" width="987px" height="357px" viewBox="-0.5 -0.5 987 357" content="&lt;mxfile host=&quot;app.diagrams.net&quot; modified=&quot;2021-12-29T11:32:47.369Z&quot; agent=&quot;5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36&quot; etag=&quot;WNKLNoexVU8kdb9qBtNl&quot; version=&quot;16.1.0&quot; type=&quot;google&quot;&gt;&lt;diagram id [...]
\ No newline at end of file
diff --git a/content/img/blog/2022-01-05-scheduler-performance/2-groups.svg b/content/img/blog/2022-01-05-scheduler-performance/2-groups.svg
new file mode 100644
index 0000000..f62484b
--- /dev/null
+++ b/content/img/blog/2022-01-05-scheduler-performance/2-groups.svg
@@ -0,0 +1,4 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!-- Do not edit this file with editors other than diagrams.net -->
+<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
+<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1" width="1117px" height="367px" viewBox="-0.5 -0.5 1117 367" content="&lt;mxfile host=&quot;app.diagrams.net&quot; modified=&quot;2021-12-29T11:48:39.835Z&quot; agent=&quot;5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36&quot; etag=&quot;r17mJOWVV4jHEWX0ACX3&quot; version=&quot;16.1.0&quot; type=&quot;google&quot;&gt;&lt;diagram  [...]
\ No newline at end of file
diff --git a/content/img/blog/2022-01-05-scheduler-performance/3-how-shuffle-descriptors-are-distributed.svg b/content/img/blog/2022-01-05-scheduler-performance/3-how-shuffle-descriptors-are-distributed.svg
new file mode 100644
index 0000000..9032535
--- /dev/null
+++ b/content/img/blog/2022-01-05-scheduler-performance/3-how-shuffle-descriptors-are-distributed.svg
@@ -0,0 +1,4 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!-- Do not edit this file with editors other than diagrams.net -->
+<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
+<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1" width="832px" height="422px" viewBox="-0.5 -0.5 832 422" content="&lt;mxfile host=&quot;app.diagrams.net&quot; modified=&quot;2021-12-29T11:49:38.587Z&quot; agent=&quot;5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36&quot; etag=&quot;sj7fJ-_3TWIaCKJk82m5&quot; version=&quot;16.1.0&quot; type=&quot;google&quot;&gt;&lt;diagram id [...]
\ No newline at end of file
diff --git a/content/img/blog/2022-01-05-scheduler-performance/4-pipelined-region.svg b/content/img/blog/2022-01-05-scheduler-performance/4-pipelined-region.svg
new file mode 100644
index 0000000..0f4494c
--- /dev/null
+++ b/content/img/blog/2022-01-05-scheduler-performance/4-pipelined-region.svg
@@ -0,0 +1,4 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!-- Do not edit this file with editors other than diagrams.net -->
+<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
+<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1" width="962px" height="382px" viewBox="-0.5 -0.5 962 382" content="&lt;mxfile host=&quot;app.diagrams.net&quot; modified=&quot;2022-01-04T12:41:09.588Z&quot; agent=&quot;5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36&quot; etag=&quot;M1L6mcgOaCav-WM3zpr-&quot; version=&quot;16.1.4&quot; type=&quot;google&quot;&gt;&lt;diagram id [...]
\ No newline at end of file
diff --git a/content/img/blog/2022-01-05-scheduler-performance/5-scheduling-deadlock.svg b/content/img/blog/2022-01-05-scheduler-performance/5-scheduling-deadlock.svg
new file mode 100644
index 0000000..2c743e8
--- /dev/null
+++ b/content/img/blog/2022-01-05-scheduler-performance/5-scheduling-deadlock.svg
@@ -0,0 +1,4 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!-- Do not edit this file with editors other than diagrams.net -->
+<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
+<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1" width="918px" height="361px" viewBox="-0.5 -0.5 918 361" content="&lt;mxfile host=&quot;app.diagrams.net&quot; modified=&quot;2022-01-04T12:36:25.839Z&quot; agent=&quot;5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36&quot; etag=&quot;bH7J1WTlE5dDkxqGV3PL&quot; version=&quot;16.1.0&quot; type=&quot;google&quot;&gt;&lt;diagram id [...]
\ No newline at end of file
diff --git a/content/img/blog/2022-01-05-scheduler-performance/6-building-pipelined-region.svg b/content/img/blog/2022-01-05-scheduler-performance/6-building-pipelined-region.svg
new file mode 100644
index 0000000..b2a44e0
--- /dev/null
+++ b/content/img/blog/2022-01-05-scheduler-performance/6-building-pipelined-region.svg
@@ -0,0 +1,4 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!-- Do not edit this file with editors other than diagrams.net -->
+<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
+<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1" width="971px" height="942px" viewBox="-0.5 -0.5 971 942" content="&lt;mxfile host=&quot;app.diagrams.net&quot; modified=&quot;2021-12-29T11:52:06.980Z&quot; agent=&quot;5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36&quot; etag=&quot;W0obqumORf-6iY1HI_oF&quot; version=&quot;16.1.0&quot; type=&quot;google&quot;&gt;&lt;diagram id [...]
\ No newline at end of file
diff --git a/content/index.html b/content/index.html
index 6f78a70..4e601df 100644
--- a/content/index.html
+++ b/content/index.html
@@ -365,6 +365,12 @@
 
   <dl>
       
+        <dt> <a href="/2022/01/04/scheduler-performance-part-two.html">How We Improved Scheduler Performance for Large-scale Jobs - Part Two</a></dt>
+        <dd>Part one of this blog post briefly introduced the optimizations we’ve made to improve the performance of the scheduler; compared to Flink 1.12, the time cost and memory usage of scheduling large-scale jobs in Flink 1.14 is significantly reduced. In part two, we will elaborate on the details of these optimizations.</dd>
+      
+        <dt> <a href="/2022/01/04/scheduler-performance-part-one.html">How We Improved Scheduler Performance for Large-scale Jobs - Part One</a></dt>
+        <dd>To improve the performance of the scheduler for large-scale jobs, several optimizations were introduced in Flink 1.13 and 1.14. In this blog post we'll take a look at them.</dd>
+      
         <dt> <a href="/news/2021/12/22/log4j-statefun-release.html">Apache Flink StateFun Log4j emergency release</a></dt>
         <dd><p>The Apache Flink community has released an emergency bugfix version of Apache Flink Stateful Function 3.1.1.</p>
 
@@ -377,12 +383,6 @@
       
         <dt> <a href="/2021/12/10/log4j-cve.html">Advise on Apache Log4j Zero Day (CVE-2021-44228)</a></dt>
         <dd>Apache Flink is affected by an Apache Log4j Zero Day (CVE-2021-44228). This blog post contains advice for users on how to address this.</dd>
-      
-        <dt> <a href="/2021/11/03/flink-backward.html">Flink Backward - The Apache Flink Retrospective</a></dt>
-        <dd>A look back at the development cycle for Flink 1.14</dd>
-      
-        <dt> <a href="/2021/10/26/sort-shuffle-part2.html">Sort-Based Blocking Shuffle Implementation in Flink - Part Two</a></dt>
-        <dd>Flink has implemented the sort-based blocking shuffle (FLIP-148) for batch data processing. In this blog post, we will take a close look at the design &amp; implementation details and see what we can gain from it.</dd>
     
   </dl>
 
diff --git a/content/zh/index.html b/content/zh/index.html
index baf9dd6..406fc98 100644
--- a/content/zh/index.html
+++ b/content/zh/index.html
@@ -362,6 +362,12 @@
 
   <dl>
       
+        <dt> <a href="/2022/01/04/scheduler-performance-part-two.html">How We Improved Scheduler Performance for Large-scale Jobs - Part Two</a></dt>
+        <dd>Part one of this blog post briefly introduced the optimizations we’ve made to improve the performance of the scheduler; compared to Flink 1.12, the time cost and memory usage of scheduling large-scale jobs in Flink 1.14 is significantly reduced. In part two, we will elaborate on the details of these optimizations.</dd>
+      
+        <dt> <a href="/2022/01/04/scheduler-performance-part-one.html">How We Improved Scheduler Performance for Large-scale Jobs - Part One</a></dt>
+        <dd>To improve the performance of the scheduler for large-scale jobs, several optimizations were introduced in Flink 1.13 and 1.14. In this blog post we'll take a look at them.</dd>
+      
         <dt> <a href="/news/2021/12/22/log4j-statefun-release.html">Apache Flink StateFun Log4j emergency release</a></dt>
         <dd><p>The Apache Flink community has released an emergency bugfix version of Apache Flink Stateful Function 3.1.1.</p>
 
@@ -374,12 +380,6 @@
       
         <dt> <a href="/2021/12/10/log4j-cve.html">Advise on Apache Log4j Zero Day (CVE-2021-44228)</a></dt>
         <dd>Apache Flink is affected by an Apache Log4j Zero Day (CVE-2021-44228). This blog post contains advice for users on how to address this.</dd>
-      
-        <dt> <a href="/2021/11/03/flink-backward.html">Flink Backward - The Apache Flink Retrospective</a></dt>
-        <dd>A look back at the development cycle for Flink 1.14</dd>
-      
-        <dt> <a href="/2021/10/26/sort-shuffle-part2.html">Sort-Based Blocking Shuffle Implementation in Flink - Part Two</a></dt>
-        <dd>Flink has implemented the sort-based blocking shuffle (FLIP-148) for batch data processing. In this blog post, we will take a close look at the design &amp; implementation details and see what we can gain from it.</dd>
     
   </dl>