Posted to commits@flink.apache.org by nk...@apache.org on 2019/07/23 15:48:19 UTC

[flink-web] 03/05: [Blog] style-tuning for Network Stack Vol. 1

This is an automated email from the ASF dual-hosted git repository.

nkruber pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/flink-web.git

commit 9dba6e8756560ed0ad00c482bc06932a70e63a17
Author: Nico Kruber <ni...@ververica.com>
AuthorDate: Wed Jul 17 15:00:35 2019 +0200

    [Blog] style-tuning for Network Stack Vol. 1
---
 _posts/2019-06-05-flink-network-stack.md    | 151 ++++++++++++------------
 content/2019/06/05/flink-network-stack.html | 172 +++++++++++++++------------
 content/blog/feed.xml                       | 174 +++++++++++++++-------------
 3 files changed, 266 insertions(+), 231 deletions(-)

diff --git a/_posts/2019-06-05-flink-network-stack.md b/_posts/2019-06-05-flink-network-stack.md
index 6e26fbd..f7c36d9 100644
--- a/_posts/2019-06-05-flink-network-stack.md
+++ b/_posts/2019-06-05-flink-network-stack.md
@@ -5,16 +5,27 @@ date: 2019-06-05T08:45:00.000Z
 authors:
 - Nico:
   name: "Nico Kruber"
-  
+
 
 excerpt: Flink’s network stack is one of the core components that make up Apache Flink's runtime module sitting at the core of every Flink job. In this post, which is the first in a series of posts about the network stack, we look at the abstractions exposed to the stream operators and detail their physical implementation and various optimisations in Apache Flink.
 ---
 
+<style type="text/css">
+.tg  {border-collapse:collapse;border-spacing:0;}
+.tg td{padding:10px 20px;border-style:solid;border-width:1px;overflow:hidden;word-break:normal;}
+.tg th{padding:10px 20px;border-style:solid;border-width:1px;overflow:hidden;word-break:normal;background-color:#eff0f1;}
+.tg .tg-wide{padding:10px 30px;}
+.tg .tg-top{vertical-align:top}
+.tg .tg-center{text-align:center;vertical-align:middle}
+</style>
+
 Flink’s network stack is one of the core components that make up the `flink-runtime` module and sit at the heart of every Flink job. It connects individual work units (subtasks) from all TaskManagers. This is where your streamed-in data flows, and it is therefore crucial to the performance of your Flink job for both the throughput and the latency you observe. In contrast to the coordination channels between TaskManagers and JobManagers, which use RPCs via Akka, the network [...]
 
 This blog post is the first in a series of posts about the network stack. In the sections below, we will first have a high-level look at what abstractions are exposed to the stream operators and then go into detail on the physical implementation and various optimisations Flink did. We will briefly present the result of these optimisations and Flink’s trade-off between throughput and latency. Future blog posts in this series will elaborate more on monitoring and metrics, tuning parameters [...]
 
-# Logical View
+{% toc %}
+
+## Logical View
 
 Flink’s network stack provides the following logical view to the subtasks when communicating with each other, for example during a network shuffle as required by a `keyBy()`.
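To make this concrete, here is a minimal sketch of a job in which a `keyBy()` triggers such a network shuffle (assuming the Flink 1.8 DataStream API; class and job names are illustrative):

```java
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.time.Time;

public class ShuffleExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
            StreamExecutionEnvironment.getExecutionEnvironment();

        env.fromElements(Tuple2.of("a", 1), Tuple2.of("b", 2), Tuple2.of("a", 3))
            // records are hash-partitioned by the key here and exchanged
            // between subtasks, i.e. they travel through the network stack
            .keyBy(0)
            .timeWindow(Time.seconds(5))
            .sum(1)
            .print();

        env.execute("keyBy network shuffle example");
    }
}
```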
 
@@ -54,42 +65,34 @@ Batch jobs may also produce results in a blocking fashion, depending on the oper
 The following table summarises the valid combinations:
 <br>
 <center>
-<style type="text/css">
-.tg  {border-collapse:collapse;border-spacing:0;}
-.tg td{font-family:Arial, sans-serif;font-size:14px;padding:10px 30px;border-style:solid;border-width:1px;overflow:hidden;word-break:normal;border-color:black;}
-.tg th{font-family:Arial, sans-serif;font-size:14px;font-weight:normal;padding:10px 30px;border-style:solid;border-width:1px;overflow:hidden;word-break:normal;border-color:black;}
-.tg .tg-wwp9{font-size:15px;background-color:#9b9b9b;border-color:#343434;text-align:left}
-.tg .tg-sogj{font-size:15px;text-align:left}
-.tg .tg-cbs6{font-size:15px;text-align:left;vertical-align:top}
-</style>
 <table class="tg">
   <tr>
-    <th class="tg-wwp9">Output Type</th>
-    <th class="tg-wwp9">Scheduling Type</th>
-    <th class="tg-wwp9">Applies to…</th>
+    <th>Output Type</th>
+    <th>Scheduling Type</th>
+    <th>Applies to…</th>
   </tr>
   <tr>
-    <td class="tg-sogj" rowspan="2">pipelined, unbounded</td>
-    <td class="tg-sogj">all at once</td>
-    <td class="tg-sogj">Streaming jobs</td>
+    <td rowspan="2">pipelined, unbounded</td>
+    <td>all at once</td>
+    <td>Streaming jobs</td>
   </tr>
   <tr>
-    <td class="tg-sogj">next stage on first output</td>
-    <td class="tg-sogj">n/a¹</td>
+    <td>next stage on first output</td>
+    <td>n/a¹</td>
   </tr>
   <tr>
-    <td class="tg-sogj" rowspan="2">pipelined, bounded</td>
-    <td class="tg-sogj">all at once</td>
-    <td class="tg-sogj">n/a²</td>
+    <td rowspan="2">pipelined, bounded</td>
+    <td>all at once</td>
+    <td>n/a²</td>
   </tr>
   <tr>
-    <td class="tg-sogj">next stage on first output</td>
-    <td class="tg-sogj">Batch jobs</td>
+    <td>next stage on first output</td>
+    <td>Batch jobs</td>
   </tr>
   <tr>
-    <td class="tg-cbs6">blocking</td>
-    <td class="tg-cbs6">next stage on complete output</td>
-    <td class="tg-cbs6">Batch jobs</td>
+    <td>blocking</td>
+    <td>next stage on complete output</td>
+    <td>Batch jobs</td>
   </tr>
 </table>
 </center>
@@ -105,7 +108,7 @@ Additionally, for subtasks with more than one input, scheduling start in two way
 
 <br>
 
-# Physical Transport
+## Physical Transport
 
 In order to understand the physical data connections, please recall that, in Flink, different tasks may share the same slot via [slot sharing groups]({{ site.DOCS_BASE_URL }}flink-docs-release-1.8/dev/stream/operators/#task-chaining-and-resource-groups). TaskManagers may also provide more than one slot to allow multiple subtasks of the same task to be scheduled onto the same TaskManager.
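For illustration, slot sharing can be influenced from the DataStream API as described in the linked documentation; a minimal sketch (the group name and pipeline are illustrative):

```java
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class SlotSharingExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
            StreamExecutionEnvironment.getExecutionEnvironment();

        DataStream<String> mapped = env
            .fromElements("a", "b", "c")
            .map(new MapFunction<String, String>() {
                @Override
                public String map(String value) {
                    return value.toUpperCase();
                }
            })
            // put this operator (and, unless overridden, its downstream
            // operators) into its own slot sharing group
            .slotSharingGroup("my-group");

        mapped.print();
        env.execute("slot sharing example");
    }
}
```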
 
@@ -113,37 +116,29 @@ For the example pictured below, we will assume a parallelism of 4 and a deployme
 <br>
 
 <center>
-<style type="text/css">
-.tg  {border-collapse:collapse;border-spacing:10;}
-.tg td{font-family:Arial, sans-serif;font-size:15px;padding:10px 80px;border-style:solid;border-width:1px;overflow:hidden;word-break:normal;border-color:black;}
-.tg th{font-family:Arial, sans-serif;font-size:15px;font-weight:normal;padding:10px 80px;border-style:solid;border-width:1px;overflow:hidden;word-break:normal;border-color:black;}
-.tg .tg-266k{background-color:#9b9b9b;border-color:inherit;text-align:left;vertical-align:center}
-.tg .tg-c3ow{border-color:inherit;text-align:center;vertical-align:center}
-.tg .tg-0pky{border-color:inherit;text-align:left;vertical-align:center}
-</style>
 <table class="tg">
   <tr>
-    <th class="tg-266k"></th>
-    <th class="tg-266k">B.1</th>
-    <th class="tg-266k">B.2</th>
-    <th class="tg-266k">B.3</th>
-    <th class="tg-266k">B.4</th>
+    <th></th>
+    <th class="tg-wide">B.1</th>
+    <th class="tg-wide">B.2</th>
+    <th class="tg-wide">B.3</th>
+    <th class="tg-wide">B.4</th>
   </tr>
   <tr>
-    <td class="tg-0pky">A.1</td>
-    <td class="tg-c3ow" colspan="2" rowspan="2">local</td>
-    <td class="tg-c3ow" colspan="2" rowspan="2">remote</td>
+    <th class="tg-wide">A.1</th>
+    <td class="tg-center" colspan="2" rowspan="2">local</td>
+    <td class="tg-center" colspan="2" rowspan="2">remote</td>
   </tr>
   <tr>
-    <td class="tg-0pky">A.2</td>
+    <th class="tg-wide">A.2</th>
   </tr>
   <tr>
-    <td class="tg-0pky">A.3</td>
-    <td class="tg-c3ow" colspan="2" rowspan="2">remote</td>
-    <td class="tg-c3ow" colspan="2" rowspan="2">local</td>
+    <th class="tg-wide">A.3</th>
+    <td class="tg-center" colspan="2" rowspan="2">remote</td>
+    <td class="tg-center" colspan="2" rowspan="2">local</td>
   </tr>
   <tr>
-    <td class="tg-0pky">A.4</td>
+    <th class="tg-wide">A.4</th>
   </tr>
 </table>
 </center>
@@ -164,7 +159,7 @@ The results of each subtask are called [ResultPartition]({{ site.DOCS_BASE_URL }
 
 The total number of buffers on a single TaskManager usually does not need configuration. See the [Configuring the Network Buffers]({{ site.DOCS_BASE_URL }}flink-docs-release-1.8/ops/config.html#configuring-the-network-buffers) documentation for details on how to do so if needed.
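If you do need to size the network memory, these are the relevant `flink-conf.yaml` keys; the values shown below are the defaults documented for Flink 1.8, so please verify them against your version:

```yaml
# Fraction of JVM memory reserved for network buffers,
# bounded by the min/max values below:
taskmanager.network.memory.fraction: 0.1
taskmanager.network.memory.min: 64mb
taskmanager.network.memory.max: 1gb
```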
 
-## Inflicting Backpressure (1)
+### Inflicting Backpressure (1)
 
 Whenever a subtask’s sending buffer pool is exhausted — buffers reside in either a result subpartition's buffer queue or inside the lower, Netty-backed network stack — the producer is blocked, cannot continue, and experiences backpressure. The receiver works in a similar fashion: any incoming Netty buffer in the lower network stack needs to be made available to Flink via a network buffer. If there is no network buffer available in the appropriate subtask's buffer pool, Flink will stop re [...]
 
@@ -179,7 +174,7 @@ To prevent this situation from even happening, Flink 1.5 introduced its own flow
 
 <br>
 
-# Credit-based Flow Control
+## Credit-based Flow Control
 
 Credit-based flow control makes sure that whatever is “on the wire” will have capacity at the receiver to handle. It is based on the availability of network buffers as a natural extension of the mechanisms Flink had before. Instead of only having a shared local buffer pool, each remote input channel now has its own set of **exclusive buffers**. Conversely, buffers in the local buffer pool are called **floating buffers** as they will float around and are available to every input channel.
 
@@ -196,11 +191,11 @@ Credit-based flow control will use [buffers-per-channel]({{ site.DOCS_BASE_URL }
 
 <sup>3</sup>If there are not enough buffers available, each buffer pool will get the same share of the globally available ones (± 1).
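As a sketch of how this adds up with the defaults (2 exclusive buffers per channel, 8 floating buffers per gate): an input gate with 8 remote channels would request 2 × 8 + 8 = 24 network buffers in total. The corresponding `flink-conf.yaml` keys, with their documented Flink 1.8 defaults:

```yaml
# exclusive buffers per (remote) input channel:
taskmanager.network.memory.buffers-per-channel: 2
# floating buffers shared by all channels of one input gate:
taskmanager.network.memory.floating-buffers-per-gate: 8
```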
 
-## Inflicting Backpressure (2)
+### Inflicting Backpressure (2)
 
 As opposed to the receiver's backpressure mechanisms without flow control, credits provide more direct control: if a receiver cannot keep up, its available credits will eventually hit 0 and stop the sender from forwarding buffers to the lower network stack. There is backpressure on this logical channel only and there is no need to block reading from the multiplexed TCP channel. Other receivers are therefore not affected and can keep processing available buffers.
 
-## What do we Gain? Where is the Catch?
+### What do we Gain? Where is the Catch?
 
 <img align="right" src="{{ site.baseurl }}/img/blog/2019-06-05-network-stack/flink-network-stack5.png" width="300" height="200" alt="Physical-transport-credit-flow-checkpoints-Flink's Network Stack"/>
 
@@ -213,36 +208,40 @@ There is one more thing you may notice when using credit-based flow control: sin
 <br>
 
 <center>
-<style type="text/css">
-.tg  {border-collapse:collapse;border-spacing:0;}
-.tg td{font-family:Arial, sans-serif;font-size:14px;padding:10px 30px;border-style:solid;border-width:1px;overflow:hidden;word-break:normal;border-color:black;}
-.tg th{font-family:Arial, sans-serif;font-size:14px;font-weight:normal;padding:10px 30px;border-style:solid;border-width:1px;overflow:hidden;word-break:normal;border-color:black;}
-.tg .tg-0vnf{font-size:15px;text-align:center}
-.tg .tg-rc1r{font-size:15px;background-color:#9b9b9b;text-align:left}
-.tg .tg-sogj{font-size:15px;text-align:left}
-</style>
 <table class="tg">
   <tr>
-    <th class="tg-rc1r">Advantages</th>
-    <th class="tg-rc1r">Disadvantages</th>
+    <th>Advantages</th>
+    <th>Disadvantages</th>
   </tr>
   <tr>
-    <td class="tg-sogj">• better resource utilisation with data skew in multiplexed connections <br><br>• improved checkpoint alignment<br><br>• reduced memory use (less data in lower network layers)</td>
-    <td class="tg-sogj">• additional credit-announce messages<br><br>• additional backlog-announce messages (piggy-backed with buffer messages, almost no overhead)<br><br>• potential round-trip latency</td>
+    <td class="tg-top">
+    • better resource utilisation with data skew in multiplexed connections <br><br>
+    • improved checkpoint alignment<br><br>
+    • reduced memory use (less data in lower network layers)</td>
+    <td class="tg-top">
+    • additional credit-announce messages<br><br>
+    • additional backlog-announce messages (piggy-backed with buffer messages, almost no overhead)<br><br>
+    • potential round-trip latency</td>
   </tr>
   <tr>
-    <td class="tg-0vnf" colspan="2">• backpressure appears earlier</td>
+    <td class="tg-center" colspan="2">• backpressure appears earlier</td>
   </tr>
 </table>
 </center>
 <br>
 
-> _NOTE:_ If you need to turn off credit-based flow control, you can add this to your `flink-conf.yaml`: `taskmanager.network.credit-model: false`. 
-> This parameter, however, is deprecated and will eventually be removed along with the non-credit-based flow control code.
+<div class="alert alert-info" markdown="1">
+<span class="label label-info" style="display: inline-block"><span class="glyphicon glyphicon-info-sign" aria-hidden="true"></span> Note</span>
+If you need to turn off credit-based flow control, you can add this to your `flink-conf.yaml`:
+
+`taskmanager.network.credit-model: false`
+
+This parameter, however, is deprecated and will eventually be removed along with the non-credit-based flow control code.
+</div>
 
 <br>
 
-# Writing Records into Network Buffers and Reading them again
+## Writing Records into Network Buffers and Reading them again
 
 The following picture extends the slightly more high-level view from above with further details of the network stack and its surrounding components, from the collection of a record in your sending operator to the receiving operator getting it:
 <br>
@@ -257,7 +256,7 @@ After creating a record and passing it along, for example via `Collector#collect
 On the receiver’s side, the lower network stack (Netty) is writing received buffers into the appropriate input channels. The (stream) task’s thread eventually reads from these queues and tries to deserialise the accumulated bytes into Java objects with the help of the [RecordReader]({{ site.DOCS_BASE_URL }}flink-docs-release-1.8/api/java/org/apache/flink/runtime/io/network/api/reader/RecordReader.html), going through the [SpillingAdaptiveSpanningRecordDeserializer]({{ site.DOCS_BASE_ [...]
 <br>
 
-## Flushing Buffers to Netty
+### Flushing Buffers to Netty
 
 In the picture above, the credit-based flow control mechanics actually sit inside the “Netty Server” (and “Netty Client”) components, and the buffer the RecordWriter is writing to is always added to the result subpartition in an empty state and then gradually filled with (serialised) records. But when does Netty actually get the buffer? Obviously, it cannot take bytes whenever they become available, since that would not only add substantial costs due to cross-thread communication and synch [...]
 
@@ -268,7 +267,7 @@ In Flink, there are three situations that make a buffer available for consumptio
 * a special event such as a checkpoint barrier is sent.<br>
 <br>
 
-### Flush after Buffer Full
+#### Flush after Buffer Full
 
 The RecordWriter works with a local serialisation buffer for the current record and will gradually write these bytes to one or more network buffers sitting at the appropriate result subpartition queue. Although a RecordWriter can work on multiple subpartitions, each subpartition has only one RecordWriter writing data to it. The Netty server, on the other hand, is reading from multiple result subpartitions and multiplexing the appropriate ones into a single channel as described above. Thi [...]
 <br>
@@ -281,7 +280,7 @@ The RecordWriter works with a local serialisation buffer for the current record
 <sup>4</sup>We can assume it already got the notification if there are more finished buffers in the queue.
 <br>
 
-### Flush after Buffer Timeout
+#### Flush after Buffer Timeout
 
 In order to support low-latency use cases, we cannot rely only on buffers being full in order to send data downstream. There may be cases where a certain communication channel does not have many records flowing through it, which would unnecessarily increase the latency of the few records you actually have. Therefore, a periodic process will flush whatever data is available down the stack: the output flusher. The periodic interval can be configured via [StreamExecutionEnvironment#setBufferTimeout [...]
 <br>
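A minimal sketch of configuring the timeout (class and values are illustrative; individual operators can also override the job-wide default via `setBufferTimeout()` on the operator itself):

```java
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class BufferTimeoutExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
            StreamExecutionEnvironment.getExecutionEnvironment();

        // flush output buffers at least every 10 ms (default: 100 ms);
        // 0 flushes after every record, -1 disables the periodic flusher
        // so buffers are only sent once they are full
        env.setBufferTimeout(10);

        env.fromElements(1, 2, 3).print();
        env.execute("buffer timeout example");
    }
}
```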
@@ -294,12 +293,12 @@ In order to support low-latency use cases, we cannot only rely on buffers being
 <sup>5</sup>Strictly speaking, the output flusher does not give any guarantees: it only sends a notification to Netty, which picks it up at will and as capacity allows. This also means that the output flusher has no effect if the channel is backpressured.
 <br>
 
-### Flush after special event
+#### Flush after special event
 
 Some special events also trigger immediate flushes when sent through the RecordWriter. The most important ones are checkpoint barriers and end-of-partition events, which obviously should go out quickly and not wait for the output flusher to kick in.
 <br>
 
-### Further remarks
+#### Further remarks
 
 In contrast to Flink < 1.5, please note that (a) network buffers are now placed in the subpartition queues directly and (b) we are not closing the buffer on each flush. This gives us a few advantages:
 
@@ -310,13 +309,13 @@ In contrast to Flink < 1.5, please note that (a) network buffers are now placed
 However, you may notice an increased CPU use and TCP packet rate during low load scenarios. This is because, with the changes, Flink will use any *available* CPU cycles to try to maintain the desired latency. Once the load increases, this will self-adjust by buffers filling up more. High load scenarios are not affected and even get a better throughput because of the reduced synchronisation overhead.
 <br>
 
-## Buffer Builder & Buffer Consumer
+### Buffer Builder & Buffer Consumer
 
 If you want to dig deeper into how the producer-consumer mechanics are implemented in Flink, please take a closer look at the [BufferBuilder]({{ site.DOCS_BASE_URL }}flink-docs-release-1.8/api/java/org/apache/flink/runtime/io/network/buffer/BufferBuilder.html) and [BufferConsumer]({{ site.DOCS_BASE_URL }}flink-docs-release-1.8/api/java/org/apache/flink/runtime/io/network/buffer/BufferConsumer.html) classes which have been introduced in Flink 1.5. While reading is potentially only *per bu [...]
 
 <br>
 
-# Latency vs. Throughput
+## Latency vs. Throughput
 
 Network buffers were introduced to get higher resource utilisation and higher throughput at the cost of having some records wait in buffers a little longer. Although an upper limit to this wait time can be given via the buffer timeout, you may be curious to find out more about the trade-off between these two dimensions: latency and throughput, as, obviously, you cannot get both. The following plot shows various values for the buffer timeout starting at 0 (flush with every record) to 100m [...]
 <br>
@@ -330,7 +329,7 @@ As you can see, with Flink 1.5+, even very low buffer timeouts such as 1ms (for
 
 <br>
 
-# Conclusion
+## Conclusion
 
 Now you know about result partitions, the different network connections and scheduling types for both batch and streaming. You also know about credit-based flow control and how the network stack works internally, in order to reason about network-related tuning parameters and about certain job behaviours. Future blog posts in this series will build upon this knowledge and go into more operational details including relevant metrics to look at, further network stack tuning, and common antip [...]
 
diff --git a/content/2019/06/05/flink-network-stack.html b/content/2019/06/05/flink-network-stack.html
index 2445ec5..c1e0fcd 100644
--- a/content/2019/06/05/flink-network-stack.html
+++ b/content/2019/06/05/flink-network-stack.html
@@ -173,11 +173,43 @@
       <article>
         <p>05 Jun 2019 Nico Kruber </p>
 
+<style type="text/css">
+.tg  {border-collapse:collapse;border-spacing:0;}
+.tg td{padding:10px 20px;border-style:solid;border-width:1px;overflow:hidden;word-break:normal;}
+.tg th{padding:10px 20px;border-style:solid;border-width:1px;overflow:hidden;word-break:normal;background-color:#eff0f1;}
+.tg .tg-wide{padding:10px 30px;}
+.tg .tg-top{vertical-align:top}
+.tg .tg-center{text-align:center;vertical-align:middle}
+</style>
+
 <p>Flink’s network stack is one of the core components that make up the <code>flink-runtime</code> module and sit at the heart of every Flink job. It connects individual work units (subtasks) from all TaskManagers. This is where your streamed-in data flows, and it is therefore crucial to the performance of your Flink job for both the throughput and the latency you observe. In contrast to the coordination channels between TaskManagers and JobManagers, which use RPCs via Akk [...]
 
 <p>This blog post is the first in a series of posts about the network stack. In the sections below, we will first have a high-level look at what abstractions are exposed to the stream operators and then go into detail on the physical implementation and various optimisations Flink did. We will briefly present the result of these optimisations and Flink’s trade-off between throughput and latency. Future blog posts in this series will elaborate more on monitoring and metrics, tuning paramet [...]
 
-<h1 id="logical-view">Logical View</h1>
+<div class="page-toc">
+<ul id="markdown-toc">
+  <li><a href="#logical-view" id="markdown-toc-logical-view">Logical View</a></li>
+  <li><a href="#physical-transport" id="markdown-toc-physical-transport">Physical Transport</a>    <ul>
+      <li><a href="#inflicting-backpressure-1" id="markdown-toc-inflicting-backpressure-1">Inflicting Backpressure (1)</a></li>
+    </ul>
+  </li>
+  <li><a href="#credit-based-flow-control" id="markdown-toc-credit-based-flow-control">Credit-based Flow Control</a>    <ul>
+      <li><a href="#inflicting-backpressure-2" id="markdown-toc-inflicting-backpressure-2">Inflicting Backpressure (2)</a></li>
+      <li><a href="#what-do-we-gain-where-is-the-catch" id="markdown-toc-what-do-we-gain-where-is-the-catch">What do we Gain? Where is the Catch?</a></li>
+    </ul>
+  </li>
+  <li><a href="#writing-records-into-network-buffers-and-reading-them-again" id="markdown-toc-writing-records-into-network-buffers-and-reading-them-again">Writing Records into Network Buffers and Reading them again</a>    <ul>
+      <li><a href="#flushing-buffers-to-netty" id="markdown-toc-flushing-buffers-to-netty">Flushing Buffers to Netty</a></li>
+      <li><a href="#buffer-builder--buffer-consumer" id="markdown-toc-buffer-builder--buffer-consumer">Buffer Builder &amp; Buffer Consumer</a></li>
+    </ul>
+  </li>
+  <li><a href="#latency-vs-throughput" id="markdown-toc-latency-vs-throughput">Latency vs. Throughput</a></li>
+  <li><a href="#conclusion" id="markdown-toc-conclusion">Conclusion</a></li>
+</ul>
+
+</div>
+
+<h2 id="logical-view">Logical View</h2>
 
 <p>Flink’s network stack provides the following logical view to the subtasks when communicating with each other, for example during a network shuffle as required by a <code>keyBy()</code>.</p>
 
@@ -226,42 +258,34 @@
 <p>The following table summarises the valid combinations:
 <br /></p>
 <center>
-<style type="text/css">
-.tg  {border-collapse:collapse;border-spacing:0;}
-.tg td{font-family:Arial, sans-serif;font-size:14px;padding:10px 30px;border-style:solid;border-width:1px;overflow:hidden;word-break:normal;border-color:black;}
-.tg th{font-family:Arial, sans-serif;font-size:14px;font-weight:normal;padding:10px 30px;border-style:solid;border-width:1px;overflow:hidden;word-break:normal;border-color:black;}
-.tg .tg-wwp9{font-size:15px;background-color:#9b9b9b;border-color:#343434;text-align:left}
-.tg .tg-sogj{font-size:15px;text-align:left}
-.tg .tg-cbs6{font-size:15px;text-align:left;vertical-align:top}
-</style>
 <table class="tg">
   <tr>
-    <th class="tg-wwp9">Output Type</th>
-    <th class="tg-wwp9">Scheduling Type</th>
-    <th class="tg-wwp9">Applies to…</th>
+    <th>Output Type</th>
+    <th>Scheduling Type</th>
+    <th>Applies to…</th>
   </tr>
   <tr>
-    <td class="tg-sogj" rowspan="2">pipelined, unbounded</td>
-    <td class="tg-sogj">all at once</td>
-    <td class="tg-sogj">Streaming jobs</td>
+    <td rowspan="2">pipelined, unbounded</td>
+    <td>all at once</td>
+    <td>Streaming jobs</td>
   </tr>
   <tr>
-    <td class="tg-sogj">next stage on first output</td>
-    <td class="tg-sogj">n/a¹</td>
+    <td>next stage on first output</td>
+    <td>n/a¹</td>
   </tr>
   <tr>
-    <td class="tg-sogj" rowspan="2">pipelined, bounded</td>
-    <td class="tg-sogj">all at once</td>
-    <td class="tg-sogj">n/a²</td>
+    <td rowspan="2">pipelined, bounded</td>
+    <td>all at once</td>
+    <td>n/a²</td>
   </tr>
   <tr>
-    <td class="tg-sogj">next stage on first output</td>
-    <td class="tg-sogj">Batch jobs</td>
+    <td>next stage on first output</td>
+    <td>Batch jobs</td>
   </tr>
   <tr>
-    <td class="tg-cbs6">blocking</td>
-    <td class="tg-cbs6">next stage on complete output</td>
-    <td class="tg-cbs6">Batch jobs</td>
+    <td>blocking</td>
+    <td>next stage on complete output</td>
+    <td>Batch jobs</td>
   </tr>
 </table>
 </center>
@@ -275,7 +299,7 @@ Additionally, for subtasks with more than one input, scheduling start in two way
 
 <p><br /></p>
 
-<h1 id="physical-transport">Physical Transport</h1>
+<h2 id="physical-transport">Physical Transport</h2>
 
 <p>In order to understand the physical data connections, please recall that, in Flink, different tasks may share the same slot via <a href="https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/stream/operators/#task-chaining-and-resource-groups">slot sharing groups</a>. TaskManagers may also provide more than one slot to allow multiple subtasks of the same task to be scheduled onto the same TaskManager.</p>
 
@@ -283,37 +307,29 @@ Additionally, for subtasks with more than one input, scheduling start in two way
 <br /></p>
 
 <center>
-<style type="text/css">
-.tg  {border-collapse:collapse;border-spacing:10;}
-.tg td{font-family:Arial, sans-serif;font-size:15px;padding:10px 80px;border-style:solid;border-width:1px;overflow:hidden;word-break:normal;border-color:black;}
-.tg th{font-family:Arial, sans-serif;font-size:15px;font-weight:normal;padding:10px 80px;border-style:solid;border-width:1px;overflow:hidden;word-break:normal;border-color:black;}
-.tg .tg-266k{background-color:#9b9b9b;border-color:inherit;text-align:left;vertical-align:center}
-.tg .tg-c3ow{border-color:inherit;text-align:center;vertical-align:center}
-.tg .tg-0pky{border-color:inherit;text-align:left;vertical-align:center}
-</style>
 <table class="tg">
   <tr>
-    <th class="tg-266k"></th>
-    <th class="tg-266k">B.1</th>
-    <th class="tg-266k">B.2</th>
-    <th class="tg-266k">B.3</th>
-    <th class="tg-266k">B.4</th>
+    <th></th>
+    <th class="tg-wide">B.1</th>
+    <th class="tg-wide">B.2</th>
+    <th class="tg-wide">B.3</th>
+    <th class="tg-wide">B.4</th>
   </tr>
   <tr>
-    <td class="tg-0pky">A.1</td>
-    <td class="tg-c3ow" colspan="2" rowspan="2">local</td>
-    <td class="tg-c3ow" colspan="2" rowspan="2">remote</td>
+    <th class="tg-wide">A.1</th>
+    <td class="tg-center" colspan="2" rowspan="2">local</td>
+    <td class="tg-center" colspan="2" rowspan="2">remote</td>
   </tr>
   <tr>
-    <td class="tg-0pky">A.2</td>
+    <th class="tg-wide">A.2</th>
   </tr>
   <tr>
-    <td class="tg-0pky">A.3</td>
-    <td class="tg-c3ow" colspan="2" rowspan="2">remote</td>
-    <td class="tg-c3ow" colspan="2" rowspan="2">local</td>
+    <th class="tg-wide">A.3</th>
+    <td class="tg-center" colspan="2" rowspan="2">remote</td>
+    <td class="tg-center" colspan="2" rowspan="2">local</td>
   </tr>
   <tr>
-    <td class="tg-0pky">A.4</td>
+    <th class="tg-wide">A.4</th>
   </tr>
 </table>
 </center>
@@ -335,7 +351,7 @@ Additionally, for subtasks with more than one input, scheduling start in two way
 
 <p>The total number of buffers on a single TaskManager usually does not need configuration. See the <a href="https://ci.apache.org/projects/flink/flink-docs-release-1.8/ops/config.html#configuring-the-network-buffers">Configuring the Network Buffers</a> documentation for details on how to do so if needed.</p>
 
-<h2 id="inflicting-backpressure-1">Inflicting Backpressure (1)</h2>
+<h3 id="inflicting-backpressure-1">Inflicting Backpressure (1)</h3>
 
 <p>Whenever a subtask’s sending buffer pool is exhausted — buffers reside in either a result subpartition’s buffer queue or inside the lower, Netty-backed network stack — the producer is blocked, cannot continue, and experiences backpressure. The receiver works in a similar fashion: any incoming Netty buffer in the lower network stack needs to be made available to Flink via a network buffer. If there is no network buffer available in the appropriate subtask’s buffer pool, Flink will stop [...]
 
@@ -350,7 +366,7 @@ Additionally, for subtasks with more than one input, scheduling start in two way
 
 <p><br /></p>
 
-<h1 id="credit-based-flow-control">Credit-based Flow Control</h1>
+<h2 id="credit-based-flow-control">Credit-based Flow Control</h2>
 
 <p>Credit-based flow control makes sure that whatever is “on the wire” will have capacity at the receiver to handle. It is based on the availability of network buffers as a natural extension of the mechanisms Flink had before. Instead of only having a shared local buffer pool, each remote input channel now has its own set of <strong>exclusive buffers</strong>. Conversely, buffers in the local buffer pool are called <strong>floating buffers</strong> as they will float around and are avail [...]
 
@@ -367,11 +383,11 @@ Additionally, for subtasks with more than one input, scheduling start in two way
 
 <p><sup>3</sup>If there are not enough buffers available, each buffer pool will get the same share of the globally available ones (± 1).</p>
 
-<h2 id="inflicting-backpressure-2">Inflicting Backpressure (2)</h2>
+<h3 id="inflicting-backpressure-2">Inflicting Backpressure (2)</h3>
 
 <p>As opposed to the receiver’s backpressure mechanisms without flow control, credits provide more direct control: if a receiver cannot keep up, its available credits will eventually hit 0 and stop the sender from forwarding buffers to the lower network stack. There is backpressure on this logical channel only and there is no need to block reading from the multiplexed TCP channel. Other receivers are therefore not affected and can keep processing available buffers.</p>
 
-<h2 id="what-do-we-gain-where-is-the-catch">What do we Gain? Where is the Catch?</h2>
+<h3 id="what-do-we-gain-where-is-the-catch">What do we Gain? Where is the Catch?</h3>
 
 <p><img align="right" src="/img/blog/2019-06-05-network-stack/flink-network-stack5.png" width="300" height="200" alt="Physical-transport-credit-flow-checkpoints-Flink's Network Stack" /></p>
 
@@ -384,38 +400,40 @@ Additionally, for subtasks with more than one input, scheduling start in two way
 <p><br /></p>
 
 <center>
-<style type="text/css">
-.tg  {border-collapse:collapse;border-spacing:0;}
-.tg td{font-family:Arial, sans-serif;font-size:14px;padding:10px 30px;border-style:solid;border-width:1px;overflow:hidden;word-break:normal;border-color:black;}
-.tg th{font-family:Arial, sans-serif;font-size:14px;font-weight:normal;padding:10px 30px;border-style:solid;border-width:1px;overflow:hidden;word-break:normal;border-color:black;}
-.tg .tg-0vnf{font-size:15px;text-align:center}
-.tg .tg-rc1r{font-size:15px;background-color:#9b9b9b;text-align:left}
-.tg .tg-sogj{font-size:15px;text-align:left}
-</style>
 <table class="tg">
   <tr>
-    <th class="tg-rc1r">Advantages</th>
-    <th class="tg-rc1r">Disadvantages</th>
+    <th>Advantages</th>
+    <th>Disadvantages</th>
   </tr>
   <tr>
-    <td class="tg-sogj">• better resource utilisation with data skew in multiplexed connections <br /><br />• improved checkpoint alignment<br /><br />• reduced memory use (less data in lower network layers)</td>
-    <td class="tg-sogj">• additional credit-announce messages<br /><br />• additional backlog-announce messages (piggy-backed with buffer messages, almost no overhead)<br /><br />• potential round-trip latency</td>
+    <td class="tg-top">
+    • better resource utilisation with data skew in multiplexed connections <br /><br />
+    • improved checkpoint alignment<br /><br />
+    • reduced memory use (less data in lower network layers)</td>
+    <td class="tg-top">
+    • additional credit-announce messages<br /><br />
+    • additional backlog-announce messages (piggy-backed with buffer messages, almost no overhead)<br /><br />
+    • potential round-trip latency</td>
   </tr>
   <tr>
-    <td class="tg-0vnf" colspan="2">• backpressure appears earlier</td>
+    <td class="tg-center" colspan="2">• backpressure appears earlier</td>
   </tr>
 </table>
 </center>
 <p><br /></p>
 
-<blockquote>
-  <p><em>NOTE:</em> If you need to turn off credit-based flow control, you can add this to your <code>flink-conf.yaml</code>: <code>taskmanager.network.credit-model: false</code>. 
-This parameter, however, is deprecated and will eventually be removed along with the non-credit-based flow control code.</p>
-</blockquote>
+<div class="alert alert-info">
+  <p><span class="label label-info" style="display: inline-block"><span class="glyphicon glyphicon-info-sign" aria-hidden="true"></span> Note</span>
+If you need to turn off credit-based flow control, you can add this to your <code>flink-conf.yaml</code>:</p>
+
+  <p><code>taskmanager.network.credit-model: false</code></p>
+
+  <p>This parameter, however, is deprecated and will eventually be removed along with the non-credit-based flow control code.</p>
+</div>
 
 <p><br /></p>
 
-<h1 id="writing-records-into-network-buffers-and-reading-them-again">Writing Records into Network Buffers and Reading them again</h1>
+<h2 id="writing-records-into-network-buffers-and-reading-them-again">Writing Records into Network Buffers and Reading them again</h2>
 
 <p>The following picture extends the slightly more high-level view from above with further details of the network stack and its surrounding components, from the collection of a record in your sending operator to the receiving operator getting it:
 <br /></p>
@@ -430,7 +448,7 @@ This parameter, however, is deprecated and will eventually be removed along with
 <p>On the receiver’s side, the lower network stack (Netty) is writing received buffers into the appropriate input channels. The (stream) task’s thread eventually reads from these queues and tries to deserialise the accumulated bytes into Java objects with the help of the <a href="https://ci.apache.org/projects/flink/flink-docs-release-1.8/api/java/org/apache/flink/runtime/io/network/api/reader/RecordReader.html">RecordReader</a>, going through the <a href="https://ci.apache.org/proje [...]
 <br /></p>
 
-<h2 id="flushing-buffers-to-netty">Flushing Buffers to Netty</h2>
+<h3 id="flushing-buffers-to-netty">Flushing Buffers to Netty</h3>
 
 <p>In the picture above, the credit-based flow control mechanics actually sit inside the “Netty Server” (and “Netty Client”) components, and the buffer the RecordWriter is writing to is always added to the result subpartition in an empty state and then gradually filled with (serialised) records. But when does Netty actually get the buffer? Obviously, it cannot take bytes whenever they become available, since that would not only add substantial costs due to cross-thread communication and sy [...]
 
@@ -443,7 +461,7 @@ This parameter, however, is deprecated and will eventually be removed along with
 <br /></li>
 </ul>
 
-<h3 id="flush-after-buffer-full">Flush after Buffer Full</h3>
+<h4 id="flush-after-buffer-full">Flush after Buffer Full</h4>
 
 <p>The RecordWriter works with a local serialisation buffer for the current record and will gradually write these bytes to one or more network buffers sitting at the appropriate result subpartition queue. Although a RecordWriter can work on multiple subpartitions, each subpartition has only one RecordWriter writing data to it. The Netty server, on the other hand, is reading from multiple result subpartitions and multiplexing the appropriate ones into a single channel as described above.  [...]
 <br /></p>
@@ -456,7 +474,7 @@ This parameter, however, is deprecated and will eventually be removed along with
 <p><sup>4</sup>We can assume it already got the notification if there are more finished buffers in the queue.
 <br /></p>
 
-<h3 id="flush-after-buffer-timeout">Flush after Buffer Timeout</h3>
+<h4 id="flush-after-buffer-timeout">Flush after Buffer Timeout</h4>
 
 <p>In order to support low-latency use cases, we cannot rely only on buffers being full in order to send data downstream. There may be cases where a certain communication channel does not have many records flowing through it, which would unnecessarily increase the latency of the few records you actually have. Therefore, a periodic process will flush whatever data is available down the stack: the output flusher. The periodic interval can be configured via <a href="https://ci.apache.org/projects/f [...]
 <br /></p>
@@ -469,12 +487,12 @@ This parameter, however, is deprecated and will eventually be removed along with
 <p><sup>5</sup>Strictly speaking, the output flusher does not give any guarantees: it only sends a notification to Netty, which picks it up at will and as capacity allows. This also means that the output flusher has no effect if the channel is backpressured.
 <br /></p>
 
-<h3 id="flush-after-special-event">Flush after special event</h3>
+<h4 id="flush-after-special-event">Flush after special event</h4>
 
 <p>Some special events also trigger immediate flushes when sent through the RecordWriter. The most important ones are checkpoint barriers and end-of-partition events, which obviously should go out quickly and not wait for the output flusher to kick in.
 <br /></p>
 
-<h3 id="further-remarks">Further remarks</h3>
+<h4 id="further-remarks">Further remarks</h4>
 
 <p>In contrast to Flink &lt; 1.5, please note that (a) network buffers are now placed in the subpartition queues directly and (b) we are not closing the buffer on each flush. This gives us a few advantages:</p>
 
@@ -487,13 +505,13 @@ This parameter, however, is deprecated and will eventually be removed along with
 <p>However, you may notice an increased CPU use and TCP packet rate during low load scenarios. This is because, with the changes, Flink will use any <em>available</em> CPU cycles to try to maintain the desired latency. Once the load increases, this will self-adjust by buffers filling up more. High load scenarios are not affected and even get a better throughput because of the reduced synchronisation overhead.
 <br /></p>
 
-<h2 id="buffer-builder--buffer-consumer">Buffer Builder &amp; Buffer Consumer</h2>
+<h3 id="buffer-builder--buffer-consumer">Buffer Builder &amp; Buffer Consumer</h3>
 
 <p>If you want to dig deeper into how the producer-consumer mechanics are implemented in Flink, please take a closer look at the <a href="https://ci.apache.org/projects/flink/flink-docs-release-1.8/api/java/org/apache/flink/runtime/io/network/buffer/BufferBuilder.html">BufferBuilder</a> and <a href="https://ci.apache.org/projects/flink/flink-docs-release-1.8/api/java/org/apache/flink/runtime/io/network/buffer/BufferConsumer.html">BufferConsumer</a> classes which have been introduced in F [...]
 
 <p><br /></p>
 
-<h1 id="latency-vs-throughput">Latency vs. Throughput</h1>
+<h2 id="latency-vs-throughput">Latency vs. Throughput</h2>
 
 <p>Network buffers were introduced to get higher resource utilisation and higher throughput at the cost of having some records wait in buffers a little longer. Although an upper limit to this wait time can be given via the buffer timeout, you may be curious to find out more about the trade-off between these two dimensions: latency and throughput, as, obviously, you cannot get both. The following plot shows various values for the buffer timeout starting at 0 (flush with every record) to 1 [...]
 <br /></p>
@@ -507,7 +525,7 @@ This parameter, however, is deprecated and will eventually be removed along with
 
 <p><br /></p>
 
-<h1 id="conclusion">Conclusion</h1>
+<h2 id="conclusion">Conclusion</h2>
 
 <p>Now you know about result partitions, the different network connections and scheduling types for both batch and streaming. You also know about credit-based flow control and how the network stack works internally, in order to reason about network-related tuning parameters and about certain job behaviours. Future blog posts in this series will build upon this knowledge and go into more operational details including relevant metrics to look at, further network stack tuning, and common an [...]
 
diff --git a/content/blog/feed.xml b/content/blog/feed.xml
index d4d033b..ebfe803 100644
--- a/content/blog/feed.xml
+++ b/content/blog/feed.xml
@@ -359,11 +359,43 @@ The website implements a streaming application that detects a pattern on the str
 
 <item>
 <title>A Deep-Dive into Flink&#39;s Network Stack</title>
-<description>&lt;p&gt;Flink’s network stack is one of the core components that make up the &lt;code&gt;flink-runtime&lt;/code&gt; module and sit at the heart of every Flink job. It connects individual work units (subtasks) from all TaskManagers. This is where your streamed-in data flows through and it is therefore crucial to the performance of your Flink job for both the throughput as well as latency you observe. In contrast to the coordination channels between TaskManagers and JobManage [...]
+<description>&lt;style type=&quot;text/css&quot;&gt;
+.tg  {border-collapse:collapse;border-spacing:0;}
+.tg td{padding:10px 20px;border-style:solid;border-width:1px;overflow:hidden;word-break:normal;}
+.tg th{padding:10px 20px;border-style:solid;border-width:1px;overflow:hidden;word-break:normal;background-color:#eff0f1;}
+.tg .tg-wide{padding:10px 30px;}
+.tg .tg-top{vertical-align:top}
+.tg .tg-center{text-align:center;vertical-align:middle}
+&lt;/style&gt;
+
+&lt;p&gt;Flink’s network stack is one of the core components that make up the &lt;code&gt;flink-runtime&lt;/code&gt; module and sit at the heart of every Flink job. It connects individual work units (subtasks) from all TaskManagers. This is where your streamed-in data flows, and it is therefore crucial to the performance of your Flink job for both the throughput and the latency you observe. In contrast to the coordination channels between TaskManagers and JobManagers, which are  [...]
 
 &lt;p&gt;This blog post is the first in a series of posts about the network stack. In the sections below, we will first have a high-level look at what abstractions are exposed to the stream operators and then go into detail on the physical implementation and various optimisations Flink did. We will briefly present the result of these optimisations and Flink’s trade-off between throughput and latency. Future blog posts in this series will elaborate more on monitoring and metrics, tuning p [...]
 
-&lt;h1 id=&quot;logical-view&quot;&gt;Logical View&lt;/h1&gt;
+&lt;div class=&quot;page-toc&quot;&gt;
+&lt;ul id=&quot;markdown-toc&quot;&gt;
+  &lt;li&gt;&lt;a href=&quot;#logical-view&quot; id=&quot;markdown-toc-logical-view&quot;&gt;Logical View&lt;/a&gt;&lt;/li&gt;
+  &lt;li&gt;&lt;a href=&quot;#physical-transport&quot; id=&quot;markdown-toc-physical-transport&quot;&gt;Physical Transport&lt;/a&gt;    &lt;ul&gt;
+      &lt;li&gt;&lt;a href=&quot;#inflicting-backpressure-1&quot; id=&quot;markdown-toc-inflicting-backpressure-1&quot;&gt;Inflicting Backpressure (1)&lt;/a&gt;&lt;/li&gt;
+    &lt;/ul&gt;
+  &lt;/li&gt;
+  &lt;li&gt;&lt;a href=&quot;#credit-based-flow-control&quot; id=&quot;markdown-toc-credit-based-flow-control&quot;&gt;Credit-based Flow Control&lt;/a&gt;    &lt;ul&gt;
+      &lt;li&gt;&lt;a href=&quot;#inflicting-backpressure-2&quot; id=&quot;markdown-toc-inflicting-backpressure-2&quot;&gt;Inflicting Backpressure (2)&lt;/a&gt;&lt;/li&gt;
+      &lt;li&gt;&lt;a href=&quot;#what-do-we-gain-where-is-the-catch&quot; id=&quot;markdown-toc-what-do-we-gain-where-is-the-catch&quot;&gt;What do we Gain? Where is the Catch?&lt;/a&gt;&lt;/li&gt;
+    &lt;/ul&gt;
+  &lt;/li&gt;
+  &lt;li&gt;&lt;a href=&quot;#writing-records-into-network-buffers-and-reading-them-again&quot; id=&quot;markdown-toc-writing-records-into-network-buffers-and-reading-them-again&quot;&gt;Writing Records into Network Buffers and Reading them again&lt;/a&gt;    &lt;ul&gt;
+      &lt;li&gt;&lt;a href=&quot;#flushing-buffers-to-netty&quot; id=&quot;markdown-toc-flushing-buffers-to-netty&quot;&gt;Flushing Buffers to Netty&lt;/a&gt;&lt;/li&gt;
+      &lt;li&gt;&lt;a href=&quot;#buffer-builder--buffer-consumer&quot; id=&quot;markdown-toc-buffer-builder--buffer-consumer&quot;&gt;Buffer Builder &amp;amp; Buffer Consumer&lt;/a&gt;&lt;/li&gt;
+    &lt;/ul&gt;
+  &lt;/li&gt;
+  &lt;li&gt;&lt;a href=&quot;#latency-vs-throughput&quot; id=&quot;markdown-toc-latency-vs-throughput&quot;&gt;Latency vs. Throughput&lt;/a&gt;&lt;/li&gt;
+  &lt;li&gt;&lt;a href=&quot;#conclusion&quot; id=&quot;markdown-toc-conclusion&quot;&gt;Conclusion&lt;/a&gt;&lt;/li&gt;
+&lt;/ul&gt;
+
+&lt;/div&gt;
+
+&lt;h2 id=&quot;logical-view&quot;&gt;Logical View&lt;/h2&gt;
 
 &lt;p&gt;Flink’s network stack provides the following logical view to the subtasks when communicating with each other, for example during a network shuffle as required by a &lt;code&gt;keyBy()&lt;/code&gt;.&lt;/p&gt;
 
@@ -412,42 +444,34 @@ The website implements a streaming application that detects a pattern on the str
 &lt;p&gt;The following table summarises the valid combinations:
 &lt;br /&gt;&lt;/p&gt;
 &lt;center&gt;
-&lt;style type=&quot;text/css&quot;&gt;
-.tg  {border-collapse:collapse;border-spacing:0;}
-.tg td{font-family:Arial, sans-serif;font-size:14px;padding:10px 30px;border-style:solid;border-width:1px;overflow:hidden;word-break:normal;border-color:black;}
-.tg th{font-family:Arial, sans-serif;font-size:14px;font-weight:normal;padding:10px 30px;border-style:solid;border-width:1px;overflow:hidden;word-break:normal;border-color:black;}
-.tg .tg-wwp9{font-size:15px;background-color:#9b9b9b;border-color:#343434;text-align:left}
-.tg .tg-sogj{font-size:15px;text-align:left}
-.tg .tg-cbs6{font-size:15px;text-align:left;vertical-align:top}
-&lt;/style&gt;
 &lt;table class=&quot;tg&quot;&gt;
   &lt;tr&gt;
-    &lt;th class=&quot;tg-wwp9&quot;&gt;Output Type&lt;/th&gt;
-    &lt;th class=&quot;tg-wwp9&quot;&gt;Scheduling Type&lt;/th&gt;
-    &lt;th class=&quot;tg-wwp9&quot;&gt;Applies to…&lt;/th&gt;
+    &lt;th&gt;Output Type&lt;/th&gt;
+    &lt;th&gt;Scheduling Type&lt;/th&gt;
+    &lt;th&gt;Applies to…&lt;/th&gt;
   &lt;/tr&gt;
   &lt;tr&gt;
-    &lt;td class=&quot;tg-sogj&quot; rowspan=&quot;2&quot;&gt;pipelined, unbounded&lt;/td&gt;
-    &lt;td class=&quot;tg-sogj&quot;&gt;all at once&lt;/td&gt;
-    &lt;td class=&quot;tg-sogj&quot;&gt;Streaming jobs&lt;/td&gt;
+    &lt;td rowspan=&quot;2&quot;&gt;pipelined, unbounded&lt;/td&gt;
+    &lt;td&gt;all at once&lt;/td&gt;
+    &lt;td&gt;Streaming jobs&lt;/td&gt;
   &lt;/tr&gt;
   &lt;tr&gt;
-    &lt;td class=&quot;tg-sogj&quot;&gt;next stage on first output&lt;/td&gt;
-    &lt;td class=&quot;tg-sogj&quot;&gt;n/a¹&lt;/td&gt;
+    &lt;td&gt;next stage on first output&lt;/td&gt;
+    &lt;td&gt;n/a¹&lt;/td&gt;
   &lt;/tr&gt;
   &lt;tr&gt;
-    &lt;td class=&quot;tg-sogj&quot; rowspan=&quot;2&quot;&gt;pipelined, bounded&lt;/td&gt;
-    &lt;td class=&quot;tg-sogj&quot;&gt;all at once&lt;/td&gt;
-    &lt;td class=&quot;tg-sogj&quot;&gt;n/a²&lt;/td&gt;
+    &lt;td rowspan=&quot;2&quot;&gt;pipelined, bounded&lt;/td&gt;
+    &lt;td&gt;all at once&lt;/td&gt;
+    &lt;td&gt;n/a²&lt;/td&gt;
   &lt;/tr&gt;
   &lt;tr&gt;
-    &lt;td class=&quot;tg-sogj&quot;&gt;next stage on first output&lt;/td&gt;
-    &lt;td class=&quot;tg-sogj&quot;&gt;Batch jobs&lt;/td&gt;
+    &lt;td&gt;next stage on first output&lt;/td&gt;
+    &lt;td&gt;Batch jobs&lt;/td&gt;
   &lt;/tr&gt;
   &lt;tr&gt;
-    &lt;td class=&quot;tg-cbs6&quot;&gt;blocking&lt;/td&gt;
-    &lt;td class=&quot;tg-cbs6&quot;&gt;next stage on complete output&lt;/td&gt;
-    &lt;td class=&quot;tg-cbs6&quot;&gt;Batch jobs&lt;/td&gt;
+    &lt;td&gt;blocking&lt;/td&gt;
+    &lt;td&gt;next stage on complete output&lt;/td&gt;
+    &lt;td&gt;Batch jobs&lt;/td&gt;
   &lt;/tr&gt;
 &lt;/table&gt;
 &lt;/center&gt;
@@ -461,7 +485,7 @@ Additionally, for subtasks with more than one input, scheduling start in two way
 
 &lt;p&gt;&lt;br /&gt;&lt;/p&gt;
 
-&lt;h1 id=&quot;physical-transport&quot;&gt;Physical Transport&lt;/h1&gt;
+&lt;h2 id=&quot;physical-transport&quot;&gt;Physical Transport&lt;/h2&gt;
 
 &lt;p&gt;In order to understand the physical data connections, please recall that, in Flink, different tasks may share the same slot via &lt;a href=&quot;https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/stream/operators/#task-chaining-and-resource-groups&quot;&gt;slot sharing groups&lt;/a&gt;. TaskManagers may also provide more than one slot to allow multiple subtasks of the same task to be scheduled onto the same TaskManager.&lt;/p&gt;
 
@@ -469,37 +493,29 @@ Additionally, for subtasks with more than one input, scheduling start in two way
 &lt;br /&gt;&lt;/p&gt;
 
 &lt;center&gt;
-&lt;style type=&quot;text/css&quot;&gt;
-.tg  {border-collapse:collapse;border-spacing:10;}
-.tg td{font-family:Arial, sans-serif;font-size:15px;padding:10px 80px;border-style:solid;border-width:1px;overflow:hidden;word-break:normal;border-color:black;}
-.tg th{font-family:Arial, sans-serif;font-size:15px;font-weight:normal;padding:10px 80px;border-style:solid;border-width:1px;overflow:hidden;word-break:normal;border-color:black;}
-.tg .tg-266k{background-color:#9b9b9b;border-color:inherit;text-align:left;vertical-align:center}
-.tg .tg-c3ow{border-color:inherit;text-align:center;vertical-align:center}
-.tg .tg-0pky{border-color:inherit;text-align:left;vertical-align:center}
-&lt;/style&gt;
 &lt;table class=&quot;tg&quot;&gt;
   &lt;tr&gt;
-    &lt;th class=&quot;tg-266k&quot;&gt;&lt;/th&gt;
-    &lt;th class=&quot;tg-266k&quot;&gt;B.1&lt;/th&gt;
-    &lt;th class=&quot;tg-266k&quot;&gt;B.2&lt;/th&gt;
-    &lt;th class=&quot;tg-266k&quot;&gt;B.3&lt;/th&gt;
-    &lt;th class=&quot;tg-266k&quot;&gt;B.4&lt;/th&gt;
+    &lt;th&gt;&lt;/th&gt;
+    &lt;th class=&quot;tg-wide&quot;&gt;B.1&lt;/th&gt;
+    &lt;th class=&quot;tg-wide&quot;&gt;B.2&lt;/th&gt;
+    &lt;th class=&quot;tg-wide&quot;&gt;B.3&lt;/th&gt;
+    &lt;th class=&quot;tg-wide&quot;&gt;B.4&lt;/th&gt;
   &lt;/tr&gt;
   &lt;tr&gt;
-    &lt;td class=&quot;tg-0pky&quot;&gt;A.1&lt;/td&gt;
-    &lt;td class=&quot;tg-c3ow&quot; colspan=&quot;2&quot; rowspan=&quot;2&quot;&gt;local&lt;/td&gt;
-    &lt;td class=&quot;tg-c3ow&quot; colspan=&quot;2&quot; rowspan=&quot;2&quot;&gt;remote&lt;/td&gt;
+    &lt;th class=&quot;tg-wide&quot;&gt;A.1&lt;/th&gt;
+    &lt;td class=&quot;tg-center&quot; colspan=&quot;2&quot; rowspan=&quot;2&quot;&gt;local&lt;/td&gt;
+    &lt;td class=&quot;tg-center&quot; colspan=&quot;2&quot; rowspan=&quot;2&quot;&gt;remote&lt;/td&gt;
   &lt;/tr&gt;
   &lt;tr&gt;
-    &lt;td class=&quot;tg-0pky&quot;&gt;A.2&lt;/td&gt;
+    &lt;th class=&quot;tg-wide&quot;&gt;A.2&lt;/th&gt;
   &lt;/tr&gt;
   &lt;tr&gt;
-    &lt;td class=&quot;tg-0pky&quot;&gt;A.3&lt;/td&gt;
-    &lt;td class=&quot;tg-c3ow&quot; colspan=&quot;2&quot; rowspan=&quot;2&quot;&gt;remote&lt;/td&gt;
-    &lt;td class=&quot;tg-c3ow&quot; colspan=&quot;2&quot; rowspan=&quot;2&quot;&gt;local&lt;/td&gt;
+    &lt;th class=&quot;tg-wide&quot;&gt;A.3&lt;/th&gt;
+    &lt;td class=&quot;tg-center&quot; colspan=&quot;2&quot; rowspan=&quot;2&quot;&gt;remote&lt;/td&gt;
+    &lt;td class=&quot;tg-center&quot; colspan=&quot;2&quot; rowspan=&quot;2&quot;&gt;local&lt;/td&gt;
   &lt;/tr&gt;
   &lt;tr&gt;
-    &lt;td class=&quot;tg-0pky&quot;&gt;A.4&lt;/td&gt;
+    &lt;th class=&quot;tg-wide&quot;&gt;A.4&lt;/th&gt;
   &lt;/tr&gt;
 &lt;/table&gt;
 &lt;/center&gt;
@@ -521,7 +537,7 @@ Additionally, for subtasks with more than one input, scheduling start in two way
 
 &lt;p&gt;The total number of buffers on a single TaskManager usually does not need configuration. See the &lt;a href=&quot;https://ci.apache.org/projects/flink/flink-docs-release-1.8/ops/config.html#configuring-the-network-buffers&quot;&gt;Configuring the Network Buffers&lt;/a&gt; documentation for details on how to do so if needed.&lt;/p&gt;
 
-&lt;h2 id=&quot;inflicting-backpressure-1&quot;&gt;Inflicting Backpressure (1)&lt;/h2&gt;
+&lt;h3 id=&quot;inflicting-backpressure-1&quot;&gt;Inflicting Backpressure (1)&lt;/h3&gt;
 
 &lt;p&gt;Whenever a subtask’s sending buffer pool is exhausted — buffers reside in either a result subpartition’s buffer queue or inside the lower, Netty-backed network stack — the producer is blocked, cannot continue, and experiences backpressure. The receiver works in a similar fashion: any incoming Netty buffer in the lower network stack needs to be made available to Flink via a network buffer. If there is no network buffer available in the appropriate subtask’s buffer pool, Flink wil [...]
 
@@ -536,7 +552,7 @@ Additionally, for subtasks with more than one input, scheduling start in two way
 
 &lt;p&gt;&lt;br /&gt;&lt;/p&gt;
 
-&lt;h1 id=&quot;credit-based-flow-control&quot;&gt;Credit-based Flow Control&lt;/h1&gt;
+&lt;h2 id=&quot;credit-based-flow-control&quot;&gt;Credit-based Flow Control&lt;/h2&gt;
 
 &lt;p&gt;Credit-based flow control makes sure that whatever is “on the wire” will have capacity at the receiver to handle. It is based on the availability of network buffers as a natural extension of the mechanisms Flink had before. Instead of only having a shared local buffer pool, each remote input channel now has its own set of &lt;strong&gt;exclusive buffers&lt;/strong&gt;. Conversely, buffers in the local buffer pool are called &lt;strong&gt;floating buffers&lt;/strong&gt; as they w [...]
 
@@ -553,11 +569,11 @@ Additionally, for subtasks with more than one input, scheduling start in two way
 
 &lt;p&gt;&lt;sup&gt;3&lt;/sup&gt;If there are not enough buffers available, each buffer pool will get the same share of the globally available ones (± 1).&lt;/p&gt;
 
-&lt;h2 id=&quot;inflicting-backpressure-2&quot;&gt;Inflicting Backpressure (2)&lt;/h2&gt;
+&lt;h3 id=&quot;inflicting-backpressure-2&quot;&gt;Inflicting Backpressure (2)&lt;/h3&gt;
 
 &lt;p&gt;As opposed to the receiver’s backpressure mechanisms without flow control, credits provide more direct control: if a receiver cannot keep up, its available credits will eventually hit 0 and stop the sender from forwarding buffers to the lower network stack. There is backpressure on this logical channel only and there is no need to block reading from the multiplexed TCP channel. Other receivers are therefore not affected and can keep processing available buffers.&lt;/p&gt;
 
-&lt;h2 id=&quot;what-do-we-gain-where-is-the-catch&quot;&gt;What do we Gain? Where is the Catch?&lt;/h2&gt;
+&lt;h3 id=&quot;what-do-we-gain-where-is-the-catch&quot;&gt;What do we Gain? Where is the Catch?&lt;/h3&gt;
 
 &lt;p&gt;&lt;img align=&quot;right&quot; src=&quot;/img/blog/2019-06-05-network-stack/flink-network-stack5.png&quot; width=&quot;300&quot; height=&quot;200&quot; alt=&quot;Physical-transport-credit-flow-checkpoints-Flink&#39;s Network Stack&quot; /&gt;&lt;/p&gt;
 
@@ -570,38 +586,40 @@ Additionally, for subtasks with more than one input, scheduling start in two way
 &lt;p&gt;&lt;br /&gt;&lt;/p&gt;
 
 &lt;center&gt;
-&lt;style type=&quot;text/css&quot;&gt;
-.tg  {border-collapse:collapse;border-spacing:0;}
-.tg td{font-family:Arial, sans-serif;font-size:14px;padding:10px 30px;border-style:solid;border-width:1px;overflow:hidden;word-break:normal;border-color:black;}
-.tg th{font-family:Arial, sans-serif;font-size:14px;font-weight:normal;padding:10px 30px;border-style:solid;border-width:1px;overflow:hidden;word-break:normal;border-color:black;}
-.tg .tg-0vnf{font-size:15px;text-align:center}
-.tg .tg-rc1r{font-size:15px;background-color:#9b9b9b;text-align:left}
-.tg .tg-sogj{font-size:15px;text-align:left}
-&lt;/style&gt;
 &lt;table class=&quot;tg&quot;&gt;
   &lt;tr&gt;
-    &lt;th class=&quot;tg-rc1r&quot;&gt;Advantages&lt;/th&gt;
-    &lt;th class=&quot;tg-rc1r&quot;&gt;Disadvantages&lt;/th&gt;
+    &lt;th&gt;Advantages&lt;/th&gt;
+    &lt;th&gt;Disadvantages&lt;/th&gt;
   &lt;/tr&gt;
   &lt;tr&gt;
-    &lt;td class=&quot;tg-sogj&quot;&gt;• better resource utilisation with data skew in multiplexed connections &lt;br /&gt;&lt;br /&gt;• improved checkpoint alignment&lt;br /&gt;&lt;br /&gt;• reduced memory use (less data in lower network layers)&lt;/td&gt;
-    &lt;td class=&quot;tg-sogj&quot;&gt;• additional credit-announce messages&lt;br /&gt;&lt;br /&gt;• additional backlog-announce messages (piggy-backed with buffer messages, almost no overhead)&lt;br /&gt;&lt;br /&gt;• potential round-trip latency&lt;/td&gt;
+    &lt;td class=&quot;tg-top&quot;&gt;
+    • better resource utilisation with data skew in multiplexed connections &lt;br /&gt;&lt;br /&gt;
+    • improved checkpoint alignment&lt;br /&gt;&lt;br /&gt;
+    • reduced memory use (less data in lower network layers)&lt;/td&gt;
+    &lt;td class=&quot;tg-top&quot;&gt;
+    • additional credit-announce messages&lt;br /&gt;&lt;br /&gt;
+    • additional backlog-announce messages (piggy-backed with buffer messages, almost no overhead)&lt;br /&gt;&lt;br /&gt;
+    • potential round-trip latency&lt;/td&gt;
   &lt;/tr&gt;
   &lt;tr&gt;
-    &lt;td class=&quot;tg-0vnf&quot; colspan=&quot;2&quot;&gt;• backpressure appears earlier&lt;/td&gt;
+    &lt;td class=&quot;tg-center&quot; colspan=&quot;2&quot;&gt;• backpressure appears earlier&lt;/td&gt;
   &lt;/tr&gt;
 &lt;/table&gt;
 &lt;/center&gt;
 &lt;p&gt;&lt;br /&gt;&lt;/p&gt;
 
-&lt;blockquote&gt;
-  &lt;p&gt;&lt;em&gt;NOTE:&lt;/em&gt; If you need to turn off credit-based flow control, you can add this to your &lt;code&gt;flink-conf.yaml&lt;/code&gt;: &lt;code&gt;taskmanager.network.credit-model: false&lt;/code&gt;. 
-This parameter, however, is deprecated and will eventually be removed along with the non-credit-based flow control code.&lt;/p&gt;
-&lt;/blockquote&gt;
+&lt;div class=&quot;alert alert-info&quot;&gt;
+  &lt;p&gt;&lt;span class=&quot;label label-info&quot; style=&quot;display: inline-block&quot;&gt;&lt;span class=&quot;glyphicon glyphicon-info-sign&quot; aria-hidden=&quot;true&quot;&gt;&lt;/span&gt; Note&lt;/span&gt;
+If you need to turn off credit-based flow control, you can add this to your &lt;code&gt;flink-conf.yaml&lt;/code&gt;:&lt;/p&gt;
+
+  &lt;p&gt;&lt;code&gt;taskmanager.network.credit-model: false&lt;/code&gt;&lt;/p&gt;
+
+  &lt;p&gt;This parameter, however, is deprecated and will eventually be removed along with the non-credit-based flow control code.&lt;/p&gt;
+&lt;/div&gt;
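+
+&lt;p&gt;Related to this, the number of exclusive buffers per channel and floating buffers per gate can be tuned as well. A sketch of the corresponding &lt;code&gt;flink-conf.yaml&lt;/code&gt; entries (key names and defaults as of Flink 1.8; please verify them against the configuration documentation of your version):&lt;/p&gt;
+
+&lt;pre&gt;&lt;code class=&quot;language-yaml&quot;&gt;# exclusive buffers per remote input channel (default: 2)
+taskmanager.network.memory.buffers-per-channel: 2
+# floating buffers per input gate, shared among its channels (default: 8)
+taskmanager.network.memory.floating-buffers-per-gate: 8
+&lt;/code&gt;&lt;/pre&gt;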
 
 &lt;p&gt;&lt;br /&gt;&lt;/p&gt;
 
-&lt;h1 id=&quot;writing-records-into-network-buffers-and-reading-them-again&quot;&gt;Writing Records into Network Buffers and Reading them again&lt;/h1&gt;
+&lt;h2 id=&quot;writing-records-into-network-buffers-and-reading-them-again&quot;&gt;Writing Records into Network Buffers and Reading them again&lt;/h2&gt;
 
 &lt;p&gt;The following picture extends the slightly more high-level view from above with further details of the network stack and its surrounding components, from the collection of a record in your sending operator to the receiving operator getting it:
 &lt;br /&gt;&lt;/p&gt;
@@ -616,7 +634,7 @@ This parameter, however, is deprecated and will eventually be removed along with
 &lt;p&gt;On the receiver’s side, the lower network stack (Netty) is writing received buffers into the appropriate input channels. The (stream) task’s thread eventually reads from these queues and tries to deserialise the accumulated bytes into Java objects with the help of the &lt;a href=&quot;https://ci.apache.org/projects/flink/flink-docs-release-1.8/api/java/org/apache/flink/runtime/io/network/api/reader/RecordReader.html&quot;&gt;RecordReader&lt;/a&gt; and going through the &lt;a hr [...]
 &lt;br /&gt;&lt;/p&gt;
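+
+&lt;p&gt;The following sketch (hypothetical code, not Flink’s actual deserialiser) shows the essence of this step under the simplifying assumption of length-prefixed records: bytes accumulate across network buffers and a record is only emitted once all of its bytes have arrived, which also covers records spanning multiple buffers.&lt;/p&gt;
+
+&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;import java.io.ByteArrayOutputStream;
+import java.nio.ByteBuffer;
+
+// Simplified sketch of receiver-side deserialisation (hypothetical code, not
+// the actual Flink deserialiser): records are length-prefixed and may span
+// multiple network buffers, so bytes accumulate until a record is complete.
+class RecordDeserializerSketch {
+    private final ByteArrayOutputStream accumulated = new ByteArrayOutputStream();
+
+    // called for every network buffer the task thread takes from an input channel
+    void addBuffer(byte[] buffer) {
+        accumulated.write(buffer, 0, buffer.length);
+        drainFullRecords();
+    }
+
+    private void drainFullRecords() {
+        ByteBuffer bytes = ByteBuffer.wrap(accumulated.toByteArray());
+        while (bytes.remaining() &gt;= 4) {
+            bytes.mark();
+            int length = bytes.getInt();             // 4-byte length prefix
+            if (length &gt; bytes.remaining()) {
+                bytes.reset();                       // spanning record: needs more buffers
+                break;
+            }
+            byte[] record = new byte[length];
+            bytes.get(record);
+            System.out.println(new String(record));  // hand the record to the operator
+        }
+        byte[] rest = new byte[bytes.remaining()];   // keep the partial remainder
+        bytes.get(rest);
+        accumulated.reset();
+        accumulated.write(rest, 0, rest.length);
+    }
+}
+&lt;/code&gt;&lt;/pre&gt;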
 
-&lt;h2 id=&quot;flushing-buffers-to-netty&quot;&gt;Flushing Buffers to Netty&lt;/h2&gt;
+&lt;h3 id=&quot;flushing-buffers-to-netty&quot;&gt;Flushing Buffers to Netty&lt;/h3&gt;
 
 &lt;p&gt;In the picture above, the credit-based flow control mechanics actually sit inside the “Netty Server” (and “Netty Client”) components. The buffer the RecordWriter is writing to is always added to the result subpartition in an empty state and then gradually filled with (serialised) records. But when does Netty actually get the buffer? Obviously, it cannot take bytes whenever they become available since that would not only add substantial costs due to cross-thread communication  [...]
 
@@ -629,7 +647,7 @@ This parameter, however, is deprecated and will eventually be removed along with
 &lt;br /&gt;&lt;/li&gt;
 &lt;/ul&gt;
 
-&lt;h3 id=&quot;flush-after-buffer-full&quot;&gt;Flush after Buffer Full&lt;/h3&gt;
+&lt;h4 id=&quot;flush-after-buffer-full&quot;&gt;Flush after Buffer Full&lt;/h4&gt;
 
 &lt;p&gt;The RecordWriter works with a local serialisation buffer for the current record and will gradually write these bytes to one or more network buffers sitting at the appropriate result subpartition queue. Although a RecordWriter can work on multiple subpartitions, each subpartition has only one RecordWriter writing data to it. The Netty server, on the other hand, is reading from multiple result subpartitions and multiplexing the appropriate ones into a single channel as described a [...]
 &lt;br /&gt;&lt;/p&gt;
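+
+&lt;p&gt;A simplified sketch of this behaviour (hypothetical code, not Flink’s actual RecordWriter): serialised bytes gradually fill the current network buffer and, once it is full, the buffer is finished, queued at the subpartition, and the Netty server is notified if necessary.&lt;/p&gt;
+
+&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;import java.util.ArrayDeque;
+
+// Simplified sketch of flushing after a buffer is full (hypothetical code, not
+// the actual RecordWriter): serialised bytes gradually fill a fixed-size
+// network buffer; a full buffer is finished and queued for the Netty server.
+class SubpartitionWriterSketch {
+    private static final int BUFFER_SIZE = 32 * 1024;  // Flink default segment size: 32 KiB
+
+    private final ArrayDeque finishedBuffers = new ArrayDeque();
+    private byte[] currentBuffer = new byte[BUFFER_SIZE];
+    private int position = 0;
+
+    void writeSerializedRecord(byte[] recordBytes) {
+        int offset = 0;
+        int remaining = recordBytes.length;
+        while (remaining &gt; 0) {
+            int chunk = Math.min(BUFFER_SIZE - position, remaining);
+            System.arraycopy(recordBytes, offset, currentBuffer, position, chunk);
+            position += chunk;
+            offset += chunk;
+            remaining -= chunk;
+            if (position == BUFFER_SIZE) {
+                finishBuffer();                      // flush after buffer full
+            }
+        }
+    }
+
+    private void finishBuffer() {
+        boolean queueWasEmpty = finishedBuffers.isEmpty();
+        finishedBuffers.add(currentBuffer);
+        currentBuffer = new byte[BUFFER_SIZE];
+        position = 0;
+        if (queueWasEmpty) {
+            // with more finished buffers in the queue, we can assume Netty
+            // already got the notification (see footnote 4)
+            System.out.println(&quot;notify Netty server: data available&quot;);
+        }
+    }
+}
+&lt;/code&gt;&lt;/pre&gt;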
@@ -642,7 +660,7 @@ This parameter, however, is deprecated and will eventually be removed along with
 &lt;p&gt;&lt;sup&gt;4&lt;/sup&gt;We can assume it already got the notification if there are more finished buffers in the queue.
 &lt;br /&gt;&lt;/p&gt;
 
-&lt;h3 id=&quot;flush-after-buffer-timeout&quot;&gt;Flush after Buffer Timeout&lt;/h3&gt;
+&lt;h4 id=&quot;flush-after-buffer-timeout&quot;&gt;Flush after Buffer Timeout&lt;/h4&gt;
 
 &lt;p&gt;In order to support low-latency use cases, we cannot rely only on buffers being full to send data downstream. There may be cases where a certain communication channel does not have many records flowing through, and waiting for buffers to fill up would unnecessarily increase the latency of the few records you actually have. Therefore, a periodic process will flush whatever data is available down the stack: the output flusher. The periodic interval can be configured via &lt;a href=&quot;https://ci.apache. [...]
 &lt;br /&gt;&lt;/p&gt;
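+
+&lt;p&gt;For example, the buffer timeout can be set on the &lt;code&gt;StreamExecutionEnvironment&lt;/code&gt; of a DataStream job; a value of 0 flushes after every record while -1 disables the periodic flusher so that only full buffers are sent:&lt;/p&gt;
+
+&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
+
+public class BufferTimeoutExample {
+    public static void main(String[] args) throws Exception {
+        StreamExecutionEnvironment env =
+            StreamExecutionEnvironment.getExecutionEnvironment();
+
+        // flush output buffers at least every 10 ms (default: 100 ms);
+        // 0 flushes after every record, -1 only flushes once buffers are full
+        env.setBufferTimeout(10);
+
+        // ... define and execute the rest of the job as usual
+    }
+}
+&lt;/code&gt;&lt;/pre&gt;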
@@ -655,12 +673,12 @@ This parameter, however, is deprecated and will eventually be removed along with
 &lt;p&gt;&lt;sup&gt;5&lt;/sup&gt;Strictly speaking, the output flusher does not give any guarantees: it only sends a notification to Netty, which picks it up at will and as capacity permits. This also means that the output flusher has no effect if the channel is backpressured.
 &lt;br /&gt;&lt;/p&gt;
 
-&lt;h3 id=&quot;flush-after-special-event&quot;&gt;Flush after special event&lt;/h3&gt;
+&lt;h4 id=&quot;flush-after-special-event&quot;&gt;Flush after special event&lt;/h4&gt;
 
 &lt;p&gt;Some special events also trigger immediate flushes when sent through the RecordWriter. The most important ones are checkpoint barriers and end-of-partition events, which obviously should be propagated quickly instead of waiting for the output flusher to kick in.
 &lt;br /&gt;&lt;/p&gt;
 
-&lt;h3 id=&quot;further-remarks&quot;&gt;Further remarks&lt;/h3&gt;
+&lt;h4 id=&quot;further-remarks&quot;&gt;Further remarks&lt;/h4&gt;
 
 &lt;p&gt;In contrast to Flink &amp;lt; 1.5, please note that (a) network buffers are now placed in the subpartition queues directly and (b) we are not closing the buffer on each flush. This gives us a few advantages:&lt;/p&gt;
 
@@ -673,13 +691,13 @@ This parameter, however, is deprecated and will eventually be removed along with
 &lt;p&gt;However, you may notice increased CPU use and TCP packet rates during low load scenarios. This is because, with the changes, Flink will use any &lt;em&gt;available&lt;/em&gt; CPU cycles to try to maintain the desired latency. Once the load increases, this will self-adjust by buffers filling up more. High load scenarios are not affected and even get better throughput because of the reduced synchronisation overhead.
 &lt;br /&gt;&lt;/p&gt;
 
-&lt;h2 id=&quot;buffer-builder--buffer-consumer&quot;&gt;Buffer Builder &amp;amp; Buffer Consumer&lt;/h2&gt;
+&lt;h3 id=&quot;buffer-builder--buffer-consumer&quot;&gt;Buffer Builder &amp;amp; Buffer Consumer&lt;/h3&gt;
 
 &lt;p&gt;If you want to dig deeper into how the producer-consumer mechanics are implemented in Flink, please take a closer look at the &lt;a href=&quot;https://ci.apache.org/projects/flink/flink-docs-release-1.8/api/java/org/apache/flink/runtime/io/network/buffer/BufferBuilder.html&quot;&gt;BufferBuilder&lt;/a&gt; and &lt;a href=&quot;https://ci.apache.org/projects/flink/flink-docs-release-1.8/api/java/org/apache/flink/runtime/io/network/buffer/BufferConsumer.html&quot;&gt;BufferConsumer [...]
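+
+&lt;p&gt;As a stripped-down illustration of that producer-consumer split (hypothetical code, not the real classes): the producer appends bytes to a shared buffer and publishes its write position, while the consumer may repeatedly build readable slices of everything written so far, without the buffer being closed in between.&lt;/p&gt;
+
+&lt;pre&gt;&lt;code class=&quot;language-java&quot;&gt;import java.util.concurrent.atomic.AtomicInteger;
+
+// Stripped-down sketch of the BufferBuilder / BufferConsumer split
+// (hypothetical code, not the real classes): the producer appends bytes and
+// publishes its write position; the consumer reads everything up to that
+// position while the producer keeps writing to the very same buffer.
+class SharedBufferSketch {
+    private final byte[] data = new byte[32 * 1024];
+    private final AtomicInteger writerPosition = new AtomicInteger(0);
+    private int readerPosition = 0;                  // only used by the consumer
+
+    // producer side (the BufferBuilder role): append serialised bytes
+    void append(byte[] bytes) {
+        int pos = writerPosition.get();
+        System.arraycopy(bytes, 0, data, pos, bytes.length);
+        writerPosition.set(pos + bytes.length);      // publish the new position
+    }
+
+    // consumer side (the BufferConsumer role): slice of all unread bytes so far
+    byte[] build() {
+        int available = writerPosition.get();
+        byte[] slice = new byte[available - readerPosition];
+        System.arraycopy(data, readerPosition, slice, 0, slice.length);
+        readerPosition = available;
+        return slice;
+    }
+}
+&lt;/code&gt;&lt;/pre&gt;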
 
 &lt;p&gt;&lt;br /&gt;&lt;/p&gt;
 
-&lt;h1 id=&quot;latency-vs-throughput&quot;&gt;Latency vs. Throughput&lt;/h1&gt;
+&lt;h2 id=&quot;latency-vs-throughput&quot;&gt;Latency vs. Throughput&lt;/h2&gt;
 
 &lt;p&gt;Network buffers were introduced to get higher resource utilisation and higher throughput at the cost of having some records wait in buffers a little longer. Although an upper limit to this wait time can be given via the buffer timeout, you may be curious to find out more about the trade-off between these two dimensions: latency and throughput, as, obviously, you cannot maximise both at once. The following plot shows various values for the buffer timeout starting at 0 (flush with every record [...]
 &lt;br /&gt;&lt;/p&gt;
@@ -693,7 +711,7 @@ This parameter, however, is deprecated and will eventually be removed along with
 
 &lt;p&gt;&lt;br /&gt;&lt;/p&gt;
 
-&lt;h1 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h1&gt;
+&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;
 
 &lt;p&gt;Now you know about result partitions, the different network connections and scheduling types for both batch and streaming. You also know about credit-based flow control and how the network stack works internally, which you need in order to reason about network-related tuning parameters and about certain job behaviours. Future blog posts in this series will build upon this knowledge and go into more operational details including relevant metrics to look at, further network stack tuning, and com [...]