You are viewing a plain text version of this content. The canonical link for it is here.

Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2020/02/01 18:01:09 UTC

[GitHub] [flink] Myasuka commented on a change in pull request #10987: [FLINK-14495][docs] (EXTENDED) Add documentation for memory control of RocksDB state backend

Myasuka commented on a change in pull request #10987: [FLINK-14495][docs] (EXTENDED) Add documentation for memory control of RocksDB state backend
URL: https://github.com/apache/flink/pull/10987#discussion_r373792878
 
 

 ##########
 File path: docs/ops/state/state_backends.md
 ##########
 @@ -188,8 +188,145 @@ state.backend: filesystem
 state.checkpoints.dir: hdfs://namenode:40010/flink/checkpoints
 {% endhighlight %}
 
-#### RocksDB State Backend Config Options
 
-{% include generated/rocks_db_configuration.html %}
+# RocksDB State Backend Details
+
+*This section describes the RocksDB state backend in more detail.*
+
+### Incremental Checkpoints
+
+RocksDB supports *Incremental Checkpoints*, which can dramatically reduce the checkpointing time in comparison to full checkpoints.
+Instead of producing a full, self-contained backup of the state backend, incremental checkpoints only record the changes that happened since the latest completed checkpoint.
+
+An incremental checkpoint builds upon (typically multiple) previous checkpoints. Flink leverages RocksDB's internal compaction mechanism in a way that is self-consolidating over time. As a result, the incremental checkpoint history in Flink does not grow indefinitely, and old checkpoints are eventually subsumed and pruned automatically.
+
+Recovery time of incremental checkpoints may be longer or shorter comapared to full checkpoints. If your network bandwidth is the bottleneck, it may take a bit longer to restore from an incremental checkpoint, because it implies fetching more data (more deltas). Restoring from an incremental checkpoint is faster, if the bottleneck is your CPU or IOPs, because restoring from an incremental checkpoint means not re-building the local RocksDB tables from Flink's canonical key/value snapshot format(used in savepoints and full checkpoints).
+
+While we encourage the use of incremental checkpoints for large state, you need to enable this feature manually:
+  - Setting a default in your `flink-conf.yaml`: `state.backend.incremental: true`
 
 Review comment:
   I think we should describe these two options as a `either-or` not a `and` relation

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services