Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2019/04/17 13:40:50 UTC

[GitHub] [flink] StefanRRichter commented on a change in pull request #8185: [FLINK-12212] [docs] clarify that operator state is checkpointed asynchronously

URL: https://github.com/apache/flink/pull/8185#discussion_r276245794
 
 

 ##########
 File path: docs/ops/state/large_state_tuning.md
 ##########
 @@ -100,22 +100,18 @@ number of network buffers used per outgoing/incoming channel is limited and thus
 may be configured without affecting checkpoint times
 (see [network buffer configuration](../config.html#configuring-the-network-buffers)).
 
-## Make state checkpointing Asynchronous where possible
+## Asynchronous Checkpointing
 
 When state is *asynchronously* snapshotted, the checkpoints scale better than when the state is *synchronously* snapshotted.
-Especially in more complex streaming applications with multiple joins, Co-functions, or windows, this may have a profound
+Especially in more complex streaming applications with multiple joins, co-functions, or windows, this may have a profound
 impact.
 
-To get state to be snapshotted asynchronously, applications have to do two things:
+For state to be snapshotted asynchronsously, you need to use a state backend which supports asynchronous snapshotting.
+Starting from Flink 1.3, both RocksDB-based as well as heap-based state backends (`filesystem`) support asynchronous
+snapshotting and use it by default. This applies to to both managed operator state as well as managed keyed state.
 
-  1. Use state that is [managed by Flink](../../dev/stream/state/state.html): Managed state means that Flink provides the data
-     structure in which the state is stored. Currently, this is true for *keyed state*, which is abstracted behind the
-     interfaces like `ValueState`, `ListState`, `ReducingState`, ...
-
-  2. Use a state backend that supports asynchronous snapshots. In Flink 1.2, only the RocksDB state backend uses
-     fully asynchronous snapshots. Starting from Flink 1.3, heap-based state backends also support asynchronous snapshots.
-
-The above two points imply that large state should generally be kept as keyed state, not as operator state.
+<span class="label label-info">Note</span> *The combination RocksDB state backend / with incremental checkpoint / with heap-based timers currently does NOT support asynchronous snapshots for the timers state.
 
 Review comment:
  I would correct this: in fact, the combination of the RocksDB state backend with heap-based timers already leads to synchronous snapshots for the timer state, regardless of whether incremental checkpoints are enabled.
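
  To make the combination concrete, here is a minimal sketch of a `flink-conf.yaml` that selects it. The option names are given as I understand them from the Flink 1.8 configuration reference, so treat them as assumptions and verify them against the docs for your version.

  ```yaml
  # Assumed option names (Flink 1.8 era); verify before relying on them.

  # Keyed state lives in RocksDB.
  state.backend: rocksdb

  # Timers are kept on the JVM heap instead of in RocksDB.
  # With this setting the timer state is snapshotted synchronously.
  state.backend.rocksdb.timer-service.factory: HEAP

  # Incremental checkpoints on or off makes no difference for the point above.
  state.backend.incremental: false
  ```

  Setting the timer service factory to `ROCKSDB` instead would, under the same assumptions, store timers in RocksDB so that they are snapshotted together with the rest of the keyed state.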
