You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@flink.apache.org by rm...@apache.org on 2017/05/31 16:17:53 UTC

flink git commit: [FLINK-6766] [docs] Update documentation about async backends and incremental checkpoints

Repository: flink
Updated Branches:
  refs/heads/master 5982ea93f -> 88545130b


[FLINK-6766] [docs] Update documentation about async backends and incremental checkpoints

This closes #4011


Project: http://git-wip-us.apache.org/repos/asf/flink/repo
Commit: http://git-wip-us.apache.org/repos/asf/flink/commit/88545130
Tree: http://git-wip-us.apache.org/repos/asf/flink/tree/88545130
Diff: http://git-wip-us.apache.org/repos/asf/flink/diff/88545130

Branch: refs/heads/master
Commit: 88545130b7debb3fe39d675bc14041d5c07930a1
Parents: 5982ea9
Author: Stefan Richter <s....@data-artisans.com>
Authored: Mon May 29 17:18:19 2017 +0200
Committer: Robert Metzger <rm...@apache.org>
Committed: Wed May 31 18:17:31 2017 +0200

----------------------------------------------------------------------
 docs/monitoring/large_state_tuning.md | 16 ++++++++++++++++
 docs/ops/state_backends.md            | 23 +++++++++++++++++++++++
 2 files changed, 39 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/flink/blob/88545130/docs/monitoring/large_state_tuning.md
----------------------------------------------------------------------
diff --git a/docs/monitoring/large_state_tuning.md b/docs/monitoring/large_state_tuning.md
index a520106..9e1ecc7 100644
--- a/docs/monitoring/large_state_tuning.md
+++ b/docs/monitoring/large_state_tuning.md
@@ -138,6 +138,22 @@ Unfortunately, RocksDB's performance can vary with configuration, and there is l
 RocksDB properly. For example, the default configuration is tailored towards SSDs and performs suboptimal
 on spinning disks.
 
+**Incremental Checkpoints**
+
+Incremental checkpoints can dramatically reduce the checkpointing time in comparison to full checkpoints, at the cost of a (potentially) longer
+recovery time. The core idea is that incremental checkpoints only record all changes to the previous completed checkpoint, instead of
+producing a full, self-contained backup of the state backend. Like this, incremental checkpoints build upon previous checkpoints. Flink leverages
+RocksDB's internal backup mechanism in a way that is self-consolidating over time. As a result, the incremental checkpoint history in Flink
+does not grow indefinitely, and old checkpoints are eventually subsumed and pruned automatically. `
+
+While we strongly encourage the use of incremental checkpoints for large state, please note that this is a new feature and currently not enabled 
+by default. To enable this feature, users can instantiate a `RocksDBStateBackend` with the corresponding boolean flag in the constructor set to `true`, e.g.:
+
+{% highlight java %}
+    RocksDBStateBackend backend =
+        new RocksDBStateBackend(filebackend, true);
+{% endhighlight %}
+
 **Passing Options to RocksDB**
 
 {% highlight java %}

http://git-wip-us.apache.org/repos/asf/flink/blob/88545130/docs/ops/state_backends.md
----------------------------------------------------------------------
diff --git a/docs/ops/state_backends.md b/docs/ops/state_backends.md
index 656594d..781b2c5 100644
--- a/docs/ops/state_backends.md
+++ b/docs/ops/state_backends.md
@@ -56,6 +56,13 @@ that store the values, triggers, etc.
 Upon checkpoints, this state backend will snapshot the state and send it as part of the checkpoint acknowledgement messages to the
 JobManager (master), which stores it on its heap as well.
 
+The MemoryStateBackend can be configured to use asynchronous snapshots. While we strongly encourage the use of asynchronous snapshots to avoid blocking pipelines, please note that this is a new feature and currently not enabled 
+by default. To enable this feature, users can instantiate a `MemoryStateBackend` with the corresponding boolean flag in the constructor set to `true`, e.g.:
+
+{% highlight java %}
+    new MemoryStateBackend(MAX_MEM_STATE_SIZE, true);
+{% endhighlight %}
+
 Limitations of the MemoryStateBackend:
 
   - The size of each individual state is by default limited to 5 MB. This value can be increased in the constructor of the MemoryStateBackend.
@@ -74,6 +81,13 @@ The *FsStateBackend* is configured with a file system URL (type, address, path),
 
 The FsStateBackend holds in-flight data in the TaskManager's memory. Upon checkpointing, it writes state snapshots into files in the configured file system and directory. Minimal metadata is stored in the JobManager's memory (or, in high-availability mode, in the metadata checkpoint).
 
+The FsStateBackend can be configured to use asynchronous snapshots. While we strongly encourage the use of asynchronous snapshots to avoid blocking pipelines, please note that this is a new feature and currently not enabled 
+by default. To enable this feature, users can instantiate a `FsStateBackend` with the corresponding boolean flag in the constructor set to `true`, e.g.:
+
+{% highlight java %}
+    new FsStateBackend(path, true);
+{% endhighlight %}
+
 The FsStateBackend is encouraged for:
 
   - Jobs with large state, long windows, large key/value states.
@@ -88,6 +102,13 @@ that is (per default) stored in the TaskManager data directories. Upon checkpoin
 RocksDB data base will be checkpointed into the configured file system and directory. Minimal
 metadata is stored in the JobManager's memory (or, in high-availability mode, in the metadata checkpoint).
 
+The RocksDBStateBackend always performs asynchronous snapshots.
+
+Limitations of the RocksDBStateBackend:
+
+  - As RocksDB's JNI bridge API is based on byte[], the maximum supported size per key and per value is 2^31 bytes each. 
+  IMPORTANT: states that use merge operations in RocksDB (e.g. ListState) can silently accumulate value sizes > 2^31 bytes and will then fail on their next retrieval. This is currently a limitation of RocksDB JNI.
+
 The RocksDBStateBackend is encouraged for:
 
   - Jobs with very large state, long windows, large key/value states.
@@ -98,6 +119,8 @@ This allows keeping very large state, compared to the FsStateBackend that keeps
 This also means, however, that the maximum throughput that can be achieved will be lower with
 this state backend.
 
+RocksDBStateBackend is currently the only backend that offers incremental checkpoints (see [here]({{ site.baseurl }}/monitoring/large_state_tuning.html)). 
+
 ## Configuring a State Backend
 
 State backends can be configured per job. In addition, you can define a default state backend to be used when the