Posted to commits@flink.apache.org by se...@apache.org on 2020/02/03 14:16:05 UTC

[flink] branch master updated (bab25ae -> 436cd39)

This is an automated email from the ASF dual-hosted git repository.

sewen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/flink.git.


    from bab25ae  [FLINK-15789][kubernetes] Do not wrap the InterruptedException and throw a different one in ActionWatcher.await
     new 9ec3e1c  [FLINK-15694][docs] Remove HDFS Section of Configuration Page
     new afd88a3  [FLINK-15697][docs] Move python options from general configuration page to table config options page
     new 4c21029  [FLINK-15824][docs] Generalize CommonOption
     new 2ce0409  [hotfix][docs] Clean up deprecation/unused warnings in tests for ConfigOptionsDocGenerator
     new c8ade87  [FLINK-15824][docs] (follow-up) Simple improvements/cleanups on @SectionOption
     new afb11a9  [FLINK-15698][docs] Restructure the Configuration docs
     new d6a7b01  [FLINK-15695][docs] Remove outdated background sections from configuration docs
     new 45f534d  [FLINK-14495][docs] Add documentation for memory control of RocksDB state backend
     new 9afcf31  [FLINK-14495][docs] Reorganize sections of RocksDB documentation
     new 436cd39  [FLINK-14495][docs] Synchronize the Chinese document for state backend with the English version

The 10 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 ...figuration.html => all_jobmanager_section.html} |  14 +-
 ...n.html => all_taskmanager_network_section.html} |  54 ++-
 ...iguration.html => all_taskmanager_section.html} |  20 +-
 ....html => common_high_availability_section.html} |   6 -
 .../common_high_availability_zk_section.html       |  24 ++
 .../generated/common_host_port_section.html        |  66 ++++
 .../_includes/generated/common_memory_section.html | 102 +++++
 .../generated/common_miscellaneous_section.html    |  24 ++
 docs/_includes/generated/common_section.html       |  84 ----
 .../generated/common_state_backends_section.html   |  54 +++
 docs/_includes/generated/core_configuration.html   |  18 +
 .../generated/deprecated_file_sinks_section.html   |  24 ++
 ...tion.html => expert_class_loading_section.html} |  12 -
 .../generated/expert_fault_tolerance_section.html  |  60 +++
 .../expert_high_availability_section.html          |  18 +
 .../expert_high_availability_zk_section.html       |  84 ++++
 docs/_includes/generated/expert_rest_section.html  |  66 ++++
 .../generated/expert_rocksdb_section.html          |  36 ++
 .../generated/expert_scheduling_section.html       |  30 ++
 .../generated/expert_security_ssl_section.html     |  42 ++
 .../generated/expert_state_backends_section.html   |  30 ++
 .../generated/high_availability_configuration.html |  84 ++++
 .../generated/job_manager_configuration.html       |   2 +-
 .../netty_shuffle_environment_configuration.html   |  42 ++
 .../generated/resource_manager_configuration.html  |   6 -
 ...on.html => security_auth_kerberos_section.html} |   0
 ...guration.html => security_auth_zk_section.html} |   0
 .../generated/security_configuration.html          |  70 ++--
 ...onfiguration.html => security_ssl_section.html} |  60 ---
 .../generated/state_backend_rocksdb_section.html   |  42 ++
 .../generated/task_manager_configuration.html      |   6 -
 docs/_includes/generated/web_configuration.html    |  12 -
 docs/dev/table/config.md                           |   4 +
 docs/ops/config.md                                 | 436 +++++++++++++--------
 docs/ops/config.zh.md                              | 436 +++++++++++++--------
 docs/ops/state/large_state_tuning.md               | 100 ++---
 docs/ops/state/large_state_tuning.zh.md            | 108 ++---
 docs/ops/state/state_backends.md                   | 147 ++++++-
 docs/ops/state/state_backends.zh.md                | 142 ++++++-
 .../flink/annotation/docs/Documentation.java       |  57 ++-
 .../flink/configuration/CheckpointingOptions.java  |  19 +-
 .../apache/flink/configuration/ClusterOptions.java |   7 +
 .../apache/flink/configuration/CoreOptions.java    |  11 +-
 .../configuration/HeartbeatManagerOptions.java     |   3 +
 .../configuration/HighAvailabilityOptions.java     |  26 +-
 .../flink/configuration/JobManagerOptions.java     |  13 +-
 .../NettyShuffleEnvironmentOptions.java            |  19 +-
 .../configuration/ResourceManagerOptions.java      |   3 +-
 .../apache/flink/configuration/RestOptions.java    |  14 +
 .../flink/configuration/SecurityOptions.java       |  45 ++-
 .../flink/configuration/TaskManagerOptions.java    |  34 +-
 .../org/apache/flink/configuration/WebOptions.java |   2 +
 .../configuration/ConfigOptionsDocGenerator.java   |  58 ++-
 .../ConfigOptionsDocGeneratorTest.java             |  67 +++-
 .../ConfigOptionsDocsCompletenessITCase.java       | 118 +++---
 .../docs/configuration/data/TestCommonOptions.java |  11 +-
 .../contrib/streaming/state/RocksDBOptions.java    |  10 +
 57 files changed, 2235 insertions(+), 847 deletions(-)
 copy docs/_includes/generated/{job_manager_configuration.html => all_jobmanager_section.html} (87%)
 copy docs/_includes/generated/{netty_shuffle_environment_configuration.html => all_taskmanager_network_section.html} (67%)
 copy docs/_includes/generated/{task_manager_configuration.html => all_taskmanager_section.html} (86%)
 copy docs/_includes/generated/{high_availability_configuration.html => common_high_availability_section.html} (83%)
 create mode 100644 docs/_includes/generated/common_high_availability_zk_section.html
 create mode 100644 docs/_includes/generated/common_host_port_section.html
 create mode 100644 docs/_includes/generated/common_memory_section.html
 create mode 100644 docs/_includes/generated/common_miscellaneous_section.html
 delete mode 100644 docs/_includes/generated/common_section.html
 create mode 100644 docs/_includes/generated/common_state_backends_section.html
 create mode 100644 docs/_includes/generated/deprecated_file_sinks_section.html
 copy docs/_includes/generated/{core_configuration.html => expert_class_loading_section.html} (79%)
 create mode 100644 docs/_includes/generated/expert_fault_tolerance_section.html
 create mode 100644 docs/_includes/generated/expert_high_availability_section.html
 create mode 100644 docs/_includes/generated/expert_high_availability_zk_section.html
 create mode 100644 docs/_includes/generated/expert_rest_section.html
 create mode 100644 docs/_includes/generated/expert_rocksdb_section.html
 create mode 100644 docs/_includes/generated/expert_scheduling_section.html
 create mode 100644 docs/_includes/generated/expert_security_ssl_section.html
 create mode 100644 docs/_includes/generated/expert_state_backends_section.html
 copy docs/_includes/generated/{kerberos_configuration.html => security_auth_kerberos_section.html} (100%)
 copy docs/_includes/generated/{zoo_keeper_configuration.html => security_auth_zk_section.html} (100%)
 copy docs/_includes/generated/{security_configuration.html => security_ssl_section.html} (60%)
 create mode 100644 docs/_includes/generated/state_backend_rocksdb_section.html


[flink] 08/10: [FLINK-14495][docs] Add documentation for memory control of RocksDB state backend

Posted by se...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

sewen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/flink.git

commit 45f534d698b8993e3c2375612200c4f429ddc25a
Author: Yun Tang <my...@live.com>
AuthorDate: Mon Dec 9 21:17:49 2019 +0800

    [FLINK-14495][docs] Add documentation for memory control of RocksDB state backend
---
 docs/ops/state/large_state_tuning.md    | 76 ++++++++++++++++++++++++++++++---
 docs/ops/state/large_state_tuning.zh.md | 76 ++++++++++++++++++++++++++++++---
 docs/ops/state/state_backends.md        |  2 +
 docs/ops/state/state_backends.zh.md     |  2 +
 4 files changed, 142 insertions(+), 14 deletions(-)

diff --git a/docs/ops/state/large_state_tuning.md b/docs/ops/state/large_state_tuning.md
index 66c440b..e76a835 100644
--- a/docs/ops/state/large_state_tuning.md
+++ b/docs/ops/state/large_state_tuning.md
@@ -122,7 +122,7 @@ Unfortunately, RocksDB's performance can vary with configuration, and there is l
 RocksDB properly. For example, the default configuration is tailored towards SSDs and performs suboptimally
 on spinning disks.
 
-**Incremental Checkpoints**
+### Incremental Checkpoints
 
 Incremental checkpoints can dramatically reduce the checkpointing time in comparison to full checkpoints, at the cost of a (potentially) longer
 recovery time. The core idea is that incremental checkpoints only record the changes relative to the previously completed checkpoint, instead of
@@ -138,7 +138,7 @@ by default. To enable this feature, users can instantiate a `RocksDBStateBackend
         new RocksDBStateBackend(filebackend, true);
 {% endhighlight %}
 
-**RocksDB Timers**
+### RocksDB Timers
 
 For RocksDB, a user can choose whether timers are stored on the heap or inside RocksDB (default). Heap-based timers can have better performance for smaller numbers of
 timers, while storing timers inside RocksDB offers higher scalability as the number of timers in RocksDB can exceed the available main memory (spilling to disk).
@@ -149,7 +149,7 @@ Possible choices are `heap` (to store timers on the heap, default) and `rocksdb`
 <span class="label label-info">Note</span> *The combination of the RocksDB state backend with heap-based timers currently does NOT support asynchronous snapshots for the timers state.
 Other state like keyed state is still snapshotted asynchronously. Please note that this is not a regression from previous versions and will be resolved with `FLINK-10026`.*
 
-**Predefined Options**
+### Predefined Options
 
 Flink provides some predefined collections of options for RocksDB for different settings, and there are two ways
 to pass these predefined options to RocksDB:
@@ -162,7 +162,7 @@ found a set of options that work well and seem representative for certain worklo
 
 <span class="label label-info">Note</span> Predefined options set programmatically override the ones configured via `flink-conf.yaml`.
 
-**Passing Options Factory to RocksDB**
+### Passing Options Factory to RocksDB
 
 There are two ways to pass an options factory to RocksDB in Flink:
 
@@ -172,19 +172,20 @@ There existed two ways to pass options factory to RocksDB in Flink:
 
     {% highlight java %}
 
-    public class MyOptionsFactory implements ConfigurableOptionsFactory {
+    public class MyOptionsFactory implements ConfigurableRocksDBOptionsFactory {
 
         private static final long DEFAULT_SIZE = 256 * 1024 * 1024;  // 256 MB
         private long blockCacheSize = DEFAULT_SIZE;
 
         @Override
-        public DBOptions createDBOptions(DBOptions currentOptions) {
+        public DBOptions createDBOptions(DBOptions currentOptions, Collection<AutoCloseable> handlesToClose) {
             return currentOptions.setIncreaseParallelism(4)
                    .setUseFsync(false);
         }
 
         @Override
-        public ColumnFamilyOptions createColumnOptions(ColumnFamilyOptions currentOptions) {
+        public ColumnFamilyOptions createColumnOptions(
+            ColumnFamilyOptions currentOptions, Collection<AutoCloseable> handlesToClose) {
             return currentOptions.setTableFormatConfig(
                 new BlockBasedTableConfig()
                     .setBlockCacheSize(blockCacheSize)
@@ -210,6 +211,67 @@ and not from the JVM. Any memory you assign to RocksDB will have to be accounted
 of the TaskManagers by the same amount. Not doing that may result in YARN/Mesos/etc terminating the JVM processes for
 allocating more memory than configured.
 
+### Bounding RocksDB Memory Usage
+
+RocksDB allocates native memory outside of the JVM, which can cause the process to exceed its total memory budget.
+This is especially problematic in containerized environments such as Kubernetes, which kill processes that exceed their memory budgets.
+
+By default, Flink limits the total memory usage of all RocksDB instance(s) in a single slot by sharing one [cache](https://github.com/facebook/rocksdb/wiki/Block-Cache)
+and one [write buffer manager](https://github.com/facebook/rocksdb/wiki/Write-Buffer-Manager) among all instances in that slot.
+The shared cache places an upper limit on the [three components](https://github.com/facebook/rocksdb/wiki/Memory-usage-in-RocksDB) that use the majority of memory
+when RocksDB is deployed as a state backend: block cache, index and bloom filters, and MemTables.
+
+This feature is enabled by default together with managed memory. Flink uses the managed memory budget as the per-slot memory limit for the RocksDB state backend(s).
+
+Flink also provides two parameters to tune the memory fractions of the MemTables and of the index & filter blocks when bounding RocksDB memory usage (see the sketch after this list):
+  - `state.backend.rocksdb.memory.write-buffer-ratio`, by default `0.5`, meaning 50% of the given memory is used by the write buffer manager.
+  - `state.backend.rocksdb.memory.high-prio-pool-ratio`, by default `0.1`, meaning 10% of the given memory is reserved with high priority for index and filter blocks in the shared block cache.
+  We strongly suggest not setting this to zero, to prevent index and filter blocks from competing with data blocks for staying in the cache, which would cause performance issues.
+  Moreover, the L0-level filter and index blocks are pinned into the cache by default to mitigate performance problems;
+  for more details, please refer to the [RocksDB documentation](https://github.com/facebook/rocksdb/wiki/Block-Cache#caching-index-filter-and-compression-dictionary-blocks).
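+
+For illustration only, here is a minimal `flink-conf.yaml` sketch that sets both ratios explicitly; the values shown are just the defaults and should be tuned for your workload:
+
+{% highlight yaml %}
+state.backend.rocksdb.memory.write-buffer-ratio: 0.5
+state.backend.rocksdb.memory.high-prio-pool-ratio: 0.1
+{% endhighlight %}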
+
+<span class="label label-info">Note</span> When bounding RocksDB memory usage is enabled (the default),
+the shared `cache` and `write buffer manager` override any block cache and write buffer settings customized via `PredefinedOptions` and `OptionsFactory`.
+
+*Experts only*: To control memory manually instead of relying on managed memory, set `state.backend.rocksdb.memory.managed` to `false` and configure the memory via `ColumnFamilyOptions`.
+Alternatively, to save some manual calculation, use the `state.backend.rocksdb.memory.fixed-per-slot` option, which overrides `state.backend.rocksdb.memory.managed` when configured.
+With the latter method, please tune `taskmanager.memory.managed.size` or `taskmanager.memory.managed.fraction` down to `0`
+and increase `taskmanager.memory.task.off-heap.size` by "`taskmanager.numberOfTaskSlots` * `state.backend.rocksdb.memory.fixed-per-slot`" accordingly.
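+
+As a sketch only, assuming a hypothetical TaskManager with 4 slots and a made-up fixed budget of 256 MB per slot, the corresponding `flink-conf.yaml` adjustments could look like this:
+
+{% highlight yaml %}
+state.backend.rocksdb.memory.fixed-per-slot: 256mb
+taskmanager.memory.managed.size: 0
+taskmanager.numberOfTaskSlots: 4
+# off-heap size increased by numberOfTaskSlots * fixed-per-slot = 4 * 256mb,
+# on top of whatever off-heap memory was configured before
+taskmanager.memory.task.off-heap.size: 1024mb
+{% endhighlight %}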
+
+#### Tuning performance when bounding RocksDB memory usage
+
+You might see a performance regression, compared to the previous case without a memory limit, if you have too many states per slot.
+- If you observe this behavior, and you are either not running jobs in a containerized environment or do not care about exceeding the memory budget,
+the easiest way to eliminate the regression is to disable the memory bound for RocksDB, i.e. set `state.backend.rocksdb.memory.managed` to `false` (see the sketch after this list).
+Moreover, please refer to the [memory configuration migration guide](WIP) for how to stay backward compatible with the previous memory configuration.
+- Otherwise, you need to increase the memory available to RocksDB by tuning up `taskmanager.memory.managed.size` or `taskmanager.memory.managed.fraction`, or by increasing the total memory of the task manager.
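+
+A minimal sketch of this opt-out (assuming you accept unbounded native memory usage by RocksDB):
+
+{% highlight yaml %}
+state.backend.rocksdb.memory.managed: false
+{% endhighlight %}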
+
+*Experts only*: Apart from increasing the total memory, users can also tune the RocksDB options (e.g. arena block size, max background flush threads, etc.) via a `RocksDBOptionsFactory`:
+
+{% highlight java %}
+public class MyOptionsFactory implements ConfigurableRocksDBOptionsFactory {
+
+    @Override
+    public DBOptions createDBOptions(DBOptions currentOptions, Collection<AutoCloseable> handlesToClose) {
+        // increase the max background flush threads when we have many states in one operator,
+        // which means we would have many column families in one DB instance.
+        return currentOptions.setMaxBackgroundFlushes(4);
+    }
+
+    @Override
+    public ColumnFamilyOptions createColumnOptions(
+        ColumnFamilyOptions currentOptions, Collection<AutoCloseable> handlesToClose) {
+        // decrease the arena block size from default 8MB to 1MB. 
+        return currentOptions.setArenaBlockSize(1024 * 1024);
+    }
+
+    @Override
+    public RocksDBOptionsFactory configure(Configuration configuration) {
+        return this;
+    }
+}
+{% endhighlight %}
+
 ## Capacity Planning
 
 This section discusses how to decide how many resources should be used for a Flink job to run reliably.
diff --git a/docs/ops/state/large_state_tuning.zh.md b/docs/ops/state/large_state_tuning.zh.md
index b29fee5..f3a7964 100644
--- a/docs/ops/state/large_state_tuning.zh.md
+++ b/docs/ops/state/large_state_tuning.zh.md
@@ -122,7 +122,7 @@ Unfortunately, RocksDB's performance can vary with configuration, and there is l
 RocksDB properly. For example, the default configuration is tailored towards SSDs and performs suboptimally
 on spinning disks.
 
-**Incremental Checkpoints**
+### Incremental Checkpoints
 
 Incremental checkpoints can dramatically reduce the checkpointing time in comparison to full checkpoints, at the cost of a (potentially) longer
 recovery time. The core idea is that incremental checkpoints only record the changes relative to the previously completed checkpoint, instead of
@@ -138,7 +138,7 @@ by default. To enable this feature, users can instantiate a `RocksDBStateBackend
         new RocksDBStateBackend(filebackend, true);
 {% endhighlight %}
 
-**RocksDB Timers**
+### RocksDB Timers
 
 For RocksDB, a user can choose whether timers are stored on the heap or inside RocksDB (default). Heap-based timers can have better performance for smaller numbers of
 timers, while storing timers inside RocksDB offers higher scalability as the number of timers in RocksDB can exceed the available main memory (spilling to disk).
@@ -149,7 +149,7 @@ Possible choices are `heap` (to store timers on the heap, default) and `rocksdb`
 <span class="label label-info">Note</span> *The combination of the RocksDB state backend with heap-based timers currently does NOT support asynchronous snapshots for the timers state.
 Other state like keyed state is still snapshotted asynchronously. Please note that this is not a regression from previous versions and will be resolved with `FLINK-10026`.*
 
-**Predefined Options**
+### Predefined Options
 
 Flink provides some predefined collections of options for RocksDB for different settings, and there are two ways
 to pass these predefined options to RocksDB:
@@ -162,7 +162,7 @@ found a set of options that work well and seem representative for certain worklo
 
 <span class="label label-info">Note</span> Predefined options set programmatically override the ones configured via `flink-conf.yaml`.
 
-**Passing Options Factory to RocksDB**
+### Passing Options Factory to RocksDB
 
 There are two ways to pass an options factory to RocksDB in Flink:
 
@@ -172,19 +172,20 @@ There existed two ways to pass options factory to RocksDB in Flink:
 
     {% highlight java %}
 
-    public class MyOptionsFactory implements ConfigurableOptionsFactory {
+    public class MyOptionsFactory implements ConfigurableRocksDBOptionsFactory {
 
         private static final long DEFAULT_SIZE = 256 * 1024 * 1024;  // 256 MB
         private long blockCacheSize = DEFAULT_SIZE;
 
         @Override
-        public DBOptions createDBOptions(DBOptions currentOptions) {
+        public DBOptions createDBOptions(DBOptions currentOptions, Collection<AutoCloseable> handlesToClose) {
             return currentOptions.setIncreaseParallelism(4)
                    .setUseFsync(false);
         }
 
         @Override
-        public ColumnFamilyOptions createColumnOptions(ColumnFamilyOptions currentOptions) {
+        public ColumnFamilyOptions createColumnOptions(
+            ColumnFamilyOptions currentOptions, Collection<AutoCloseable> handlesToClose) {
             return currentOptions.setTableFormatConfig(
                 new BlockBasedTableConfig()
                     .setBlockCacheSize(blockCacheSize)
@@ -210,6 +211,67 @@ and not from the JVM. Any memory you assign to RocksDB will have to be accounted
 of the TaskManagers by the same amount. Not doing that may result in YARN/Mesos/etc terminating the JVM processes for
 allocating more memory than configured.
 
+### Bounding RocksDB Memory Usage
+
+RocksDB allocates native memory outside of the JVM, which can cause the process to exceed its total memory budget.
+This is especially problematic in containerized environments such as Kubernetes, which kill processes that exceed their memory budgets.
+
+By default, Flink limits the total memory usage of all RocksDB instance(s) in a single slot by sharing one [cache](https://github.com/facebook/rocksdb/wiki/Block-Cache)
+and one [write buffer manager](https://github.com/facebook/rocksdb/wiki/Write-Buffer-Manager) among all instances in that slot.
+The shared cache places an upper limit on the [three components](https://github.com/facebook/rocksdb/wiki/Memory-usage-in-RocksDB) that use the majority of memory
+when RocksDB is deployed as a state backend: block cache, index and bloom filters, and MemTables.
+
+This feature is enabled by default together with managed memory. Flink uses the managed memory budget as the per-slot memory limit for the RocksDB state backend(s).
+
+Flink also provides two parameters to tune the memory fractions of the MemTables and of the index & filter blocks when bounding RocksDB memory usage (see the sketch after this list):
+  - `state.backend.rocksdb.memory.write-buffer-ratio`, by default `0.5`, meaning 50% of the given memory is used by the write buffer manager.
+  - `state.backend.rocksdb.memory.high-prio-pool-ratio`, by default `0.1`, meaning 10% of the given memory is reserved with high priority for index and filter blocks in the shared block cache.
+  We strongly suggest not setting this to zero, to prevent index and filter blocks from competing with data blocks for staying in the cache, which would cause performance issues.
+  Moreover, the L0-level filter and index blocks are pinned into the cache by default to mitigate performance problems;
+  for more details, please refer to the [RocksDB documentation](https://github.com/facebook/rocksdb/wiki/Block-Cache#caching-index-filter-and-compression-dictionary-blocks).
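+
+For illustration only, here is a minimal `flink-conf.yaml` sketch that sets both ratios explicitly; the values shown are just the defaults and should be tuned for your workload:
+
+{% highlight yaml %}
+state.backend.rocksdb.memory.write-buffer-ratio: 0.5
+state.backend.rocksdb.memory.high-prio-pool-ratio: 0.1
+{% endhighlight %}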
+
+<span class="label label-info">Note</span> When bounding RocksDB memory usage is enabled (the default),
+the shared `cache` and `write buffer manager` override any block cache and write buffer settings customized via `PredefinedOptions` and `OptionsFactory`.
+
+*Experts only*: To control memory manually instead of relying on managed memory, set `state.backend.rocksdb.memory.managed` to `false` and configure the memory via `ColumnFamilyOptions`.
+Alternatively, to save some manual calculation, use the `state.backend.rocksdb.memory.fixed-per-slot` option, which overrides `state.backend.rocksdb.memory.managed` when configured.
+With the latter method, please tune `taskmanager.memory.managed.size` or `taskmanager.memory.managed.fraction` down to `0`
+and increase `taskmanager.memory.task.off-heap.size` by "`taskmanager.numberOfTaskSlots` * `state.backend.rocksdb.memory.fixed-per-slot`" accordingly.
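+
+As a sketch only, assuming a hypothetical TaskManager with 4 slots and a made-up fixed budget of 256 MB per slot, the corresponding `flink-conf.yaml` adjustments could look like this:
+
+{% highlight yaml %}
+state.backend.rocksdb.memory.fixed-per-slot: 256mb
+taskmanager.memory.managed.size: 0
+taskmanager.numberOfTaskSlots: 4
+# off-heap size increased by numberOfTaskSlots * fixed-per-slot = 4 * 256mb,
+# on top of whatever off-heap memory was configured before
+taskmanager.memory.task.off-heap.size: 1024mb
+{% endhighlight %}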
+
+#### Tuning performance when bounding RocksDB memory usage
+
+You might see a performance regression, compared to the previous case without a memory limit, if you have too many states per slot.
+- If you observe this behavior, and you are either not running jobs in a containerized environment or do not care about exceeding the memory budget,
+the easiest way to eliminate the regression is to disable the memory bound for RocksDB, i.e. set `state.backend.rocksdb.memory.managed` to `false` (see the sketch after this list).
+Moreover, please refer to the [memory configuration migration guide](WIP) for how to stay backward compatible with the previous memory configuration.
+- Otherwise, you need to increase the memory available to RocksDB by tuning up `taskmanager.memory.managed.size` or `taskmanager.memory.managed.fraction`, or by increasing the total memory of the task manager.
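+
+A minimal sketch of this opt-out (assuming you accept unbounded native memory usage by RocksDB):
+
+{% highlight yaml %}
+state.backend.rocksdb.memory.managed: false
+{% endhighlight %}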
+
+*Experts only*: Apart from increasing the total memory, users can also tune the RocksDB options (e.g. arena block size, max background flush threads, etc.) via a `RocksDBOptionsFactory`:
+
+{% highlight java %}
+public class MyOptionsFactory implements ConfigurableRocksDBOptionsFactory {
+
+    @Override
+    public DBOptions createDBOptions(DBOptions currentOptions, Collection<AutoCloseable> handlesToClose) {
+        // increase the max background flush threads when we have many states in one operator,
+        // which means we would have many column families in one DB instance.
+        return currentOptions.setMaxBackgroundFlushes(4);
+    }
+
+    @Override
+    public ColumnFamilyOptions createColumnOptions(
+        ColumnFamilyOptions currentOptions, Collection<AutoCloseable> handlesToClose) {
+        // decrease the arena block size from default 8MB to 1MB. 
+        return currentOptions.setArenaBlockSize(1024 * 1024);
+    }
+
+    @Override
+    public RocksDBOptionsFactory configure(Configuration configuration) {
+        return this;
+    }
+}
+{% endhighlight %}
+
 ## Capacity Planning
 
 This section discusses how to decide how many resources should be used for a Flink job to run reliably.
diff --git a/docs/ops/state/state_backends.md b/docs/ops/state/state_backends.md
index cd0a041d..1264aec 100644
--- a/docs/ops/state/state_backends.md
+++ b/docs/ops/state/state_backends.md
@@ -123,6 +123,8 @@ RocksDBStateBackend is currently the only backend that offers incremental checkp
 
 Certain RocksDB native metrics are available but disabled by default; you can find the full documentation [here]({{ site.baseurl }}/ops/config.html#rocksdb-native-metrics)
 
+The total memory of all RocksDB instance(s) per slot can also be bounded; please refer to the documentation [here](large_state_tuning.html#bounding-rocksdb-memory-usage) for details.
+
 ## Configuring a State Backend
 
 The default state backend, if you specify nothing, is the jobmanager. If you wish to establish a different default for all jobs on your cluster, you can do so by defining a new default state backend in **flink-conf.yaml**. The default state backend can be overridden on a per-job basis, as shown below.
diff --git a/docs/ops/state/state_backends.zh.md b/docs/ops/state/state_backends.zh.md
index 9960d32..eeaf247 100644
--- a/docs/ops/state/state_backends.zh.md
+++ b/docs/ops/state/state_backends.zh.md
@@ -119,6 +119,8 @@ RocksDBStateBackend 是目前唯一支持增量 CheckPoint 的 State Backend (
 
 可以使用一些 RocksDB 的本地指标(metrics),但默认是关闭的。你能在 [这里]({{ site.baseurl }}/zh/ops/config.html#rocksdb-native-metrics) 找到关于 RocksDB 本地指标的文档。
 
+The total memory of all RocksDB instance(s) per slot can also be bounded; please refer to the documentation [here](large_state_tuning.html#bounding-rocksdb-memory-usage) for details.
+
 ## 设置 State Backend
 
 如果没有明确指定,将使用 jobmanager 做为默认的 state backend。你能在 **flink-conf.yaml** 中为所有 Job 设置其他默认的 State Backend。


[flink] 01/10: [FLINK-15694][docs] Remove HDFS Section of Configuration Page

Posted by se...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

sewen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/flink.git

commit 9ec3e1c57dc4cabec1f6f04db2c29dc83b645dd5
Author: bdine <ba...@gmail.com>
AuthorDate: Wed Jan 29 22:20:51 2020 +0100

    [FLINK-15694][docs] Remove HDFS Section of Configuration Page
    
    This closes #15694
---
 docs/ops/config.md    | 16 ----------------
 docs/ops/config.zh.md | 16 ----------------
 2 files changed, 32 deletions(-)

diff --git a/docs/ops/config.md b/docs/ops/config.md
index 97dc6da..bd672a9 100644
--- a/docs/ops/config.md
+++ b/docs/ops/config.md
@@ -44,22 +44,6 @@ The configuration files for the TaskManagers can be different, Flink does not as
 
 ## Full Reference
 
-### HDFS
-
-<div class="alert alert-warning">
-  <strong>Note:</strong> These keys are deprecated and it is recommended to configure the Hadoop path with the environment variable <code>HADOOP_CONF_DIR</code> instead.
-</div>
-
-See also [how to configure Hadoop]({{ site.baseurl }}/ops/deployment/hadoop.html#configure-hadoop).
-
-These parameters configure the default HDFS used by Flink. Setups that do not specify an HDFS configuration have to specify the full path to HDFS files (`hdfs://address:port/path/to/files`) Files will also be written with default HDFS parameters (block size, replication factor).
-
-- `fs.hdfs.hadoopconf`: The absolute path to the Hadoop File System's (HDFS) configuration **directory** (OPTIONAL VALUE). Specifying this value allows programs to reference HDFS files using short URIs (`hdfs:///path/to/files`, without including the address and port of the NameNode in the file URI). Without this option, HDFS files can be accessed, but require fully qualified URIs like `hdfs://address:port/path/to/files`. This option also causes file writers to pick up the HDFS's default  [...]
-
-- `fs.hdfs.hdfsdefault`: The absolute path of Hadoop's own configuration file "hdfs-default.xml" (DEFAULT: null).
-
-- `fs.hdfs.hdfssite`: The absolute path of Hadoop's own configuration file "hdfs-site.xml" (DEFAULT: null).
-
 ### Core
 
 {% include generated/core_configuration.html %}
diff --git a/docs/ops/config.zh.md b/docs/ops/config.zh.md
index a0ec20d..8b7febc 100644
--- a/docs/ops/config.zh.md
+++ b/docs/ops/config.zh.md
@@ -44,22 +44,6 @@ The configuration files for the TaskManagers can be different, Flink does not as
 
 ## Full Reference
 
-### HDFS
-
-<div class="alert alert-warning">
-  <strong>Note:</strong> These keys are deprecated and it is recommended to configure the Hadoop path with the environment variable <code>HADOOP_CONF_DIR</code> instead.
-</div>
-
-See also [how to configure Hadoop]({{ site.baseurl }}/ops/deployment/hadoop.html#configure-hadoop).
-
-These parameters configure the default HDFS used by Flink. Setups that do not specify an HDFS configuration have to specify the full path to HDFS files (`hdfs://address:port/path/to/files`) Files will also be written with default HDFS parameters (block size, replication factor).
-
-- `fs.hdfs.hadoopconf`: The absolute path to the Hadoop File System's (HDFS) configuration **directory** (OPTIONAL VALUE). Specifying this value allows programs to reference HDFS files using short URIs (`hdfs:///path/to/files`, without including the address and port of the NameNode in the file URI). Without this option, HDFS files can be accessed, but require fully qualified URIs like `hdfs://address:port/path/to/files`. This option also causes file writers to pick up the HDFS's default  [...]
-
-- `fs.hdfs.hdfsdefault`: The absolute path of Hadoop's own configuration file "hdfs-default.xml" (DEFAULT: null).
-
-- `fs.hdfs.hdfssite`: The absolute path of Hadoop's own configuration file "hdfs-site.xml" (DEFAULT: null).
-
 ### Core
 
 {% include generated/core_configuration.html %}


[flink] 03/10: [FLINK-15824][docs] Generalize CommonOption

Posted by se...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

sewen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/flink.git

commit 4c2102922e85a65293bfb39981e4d334325023d4
Author: Chesnay Schepler <ch...@apache.org>
AuthorDate: Thu Jan 23 19:27:38 2020 +0100

    [FLINK-15824][docs] Generalize CommonOption
---
 .../flink/annotation/docs/Documentation.java       | 17 +++++--
 .../flink/configuration/CheckpointingOptions.java  | 12 +++--
 .../apache/flink/configuration/CoreOptions.java    |  4 +-
 .../configuration/HighAvailabilityOptions.java     |  8 ++-
 .../flink/configuration/JobManagerOptions.java     |  4 +-
 .../flink/configuration/SecurityOptions.java       |  8 ++-
 .../flink/configuration/TaskManagerOptions.java    | 12 +++--
 .../configuration/ConfigOptionsDocGenerator.java   | 58 ++++++++++++++++------
 .../ConfigOptionsDocGeneratorTest.java             |  2 +-
 .../ConfigOptionsDocsCompletenessITCase.java       |  2 +-
 .../docs/configuration/data/TestCommonOptions.java |  6 ++-
 11 files changed, 99 insertions(+), 34 deletions(-)

diff --git a/flink-annotations/src/main/java/org/apache/flink/annotation/docs/Documentation.java b/flink-annotations/src/main/java/org/apache/flink/annotation/docs/Documentation.java
index 7d1b181..a1e142a 100644
--- a/flink-annotations/src/main/java/org/apache/flink/annotation/docs/Documentation.java
+++ b/flink-annotations/src/main/java/org/apache/flink/annotation/docs/Documentation.java
@@ -41,21 +41,32 @@ public final class Documentation {
 	}
 
 	/**
-	 * Annotation used on config option fields to include them in the "Common Options" section.
+	 * Annotation used on config option fields to include them in specific sections. Sections are groups of options
+	 * that are aggregated across option classes, with each group being placed into a dedicated file.
 	 *
-	 * <p>The {@link CommonOption#position()} argument controls the position in the generated table, with lower values
+	 * <p>The {@link SectionOption#position()} argument controls the position in the generated table, with lower values
 	 * being placed at the top. Fields with the same position are sorted alphabetically by key.
 	 */
 	@Target(ElementType.FIELD)
 	@Retention(RetentionPolicy.RUNTIME)
 	@Internal
-	public @interface CommonOption {
+	public @interface SectionOption {
 		int POSITION_MEMORY = 10;
 		int POSITION_PARALLELISM_SLOTS = 20;
 		int POSITION_FAULT_TOLERANCE = 30;
 		int POSITION_HIGH_AVAILABILITY = 40;
 		int POSITION_SECURITY = 50;
 
+		String SECTION_COMMON = "common";
+
+		/**
+		 * The sections in the config docs where this option should be included.
+		 */
+		String[] sections() default {};
+
+		/**
+		 * The relative position of the option in its section.
+		 */
 		int position() default Integer.MAX_VALUE;
 	}
 
diff --git a/flink-core/src/main/java/org/apache/flink/configuration/CheckpointingOptions.java b/flink-core/src/main/java/org/apache/flink/configuration/CheckpointingOptions.java
index 30a2e0c..03284ca 100644
--- a/flink-core/src/main/java/org/apache/flink/configuration/CheckpointingOptions.java
+++ b/flink-core/src/main/java/org/apache/flink/configuration/CheckpointingOptions.java
@@ -31,7 +31,9 @@ public class CheckpointingOptions {
 	// ------------------------------------------------------------------------
 
 	/** The state backend to be used to store and checkpoint state. */
-	@Documentation.CommonOption(position = Documentation.CommonOption.POSITION_FAULT_TOLERANCE)
+	@Documentation.SectionOption(
+		sections = {Documentation.SectionOption.SECTION_COMMON},
+		position = Documentation.SectionOption.POSITION_FAULT_TOLERANCE)
 	public static final ConfigOption<String> STATE_BACKEND = ConfigOptions
 			.key("state.backend")
 			.noDefaultValue()
@@ -103,7 +105,9 @@ public class CheckpointingOptions {
 
 	/** The default directory for savepoints. Used by the state backends that write
 	 * savepoints to file systems (MemoryStateBackend, FsStateBackend, RocksDBStateBackend). */
-	@Documentation.CommonOption(position = Documentation.CommonOption.POSITION_FAULT_TOLERANCE)
+	@Documentation.SectionOption(
+		sections = {Documentation.SectionOption.SECTION_COMMON},
+		position = Documentation.SectionOption.POSITION_FAULT_TOLERANCE)
 	public static final ConfigOption<String> SAVEPOINT_DIRECTORY = ConfigOptions
 			.key("state.savepoints.dir")
 			.noDefaultValue()
@@ -113,7 +117,9 @@ public class CheckpointingOptions {
 
 	/** The default directory used for storing the data files and meta data of checkpoints in a Flink supported filesystem.
 	 * The storage path must be accessible from all participating processes/nodes(i.e. all TaskManagers and JobManagers).*/
-	@Documentation.CommonOption(position = Documentation.CommonOption.POSITION_FAULT_TOLERANCE)
+	@Documentation.SectionOption(
+		sections = {Documentation.SectionOption.SECTION_COMMON},
+		position = Documentation.SectionOption.POSITION_FAULT_TOLERANCE)
 	public static final ConfigOption<String> CHECKPOINTS_DIRECTORY = ConfigOptions
 			.key("state.checkpoints.dir")
 			.noDefaultValue()
diff --git a/flink-core/src/main/java/org/apache/flink/configuration/CoreOptions.java b/flink-core/src/main/java/org/apache/flink/configuration/CoreOptions.java
index 3420b93..2d7bc55 100644
--- a/flink-core/src/main/java/org/apache/flink/configuration/CoreOptions.java
+++ b/flink-core/src/main/java/org/apache/flink/configuration/CoreOptions.java
@@ -250,7 +250,9 @@ public class CoreOptions {
 	//  program
 	// ------------------------------------------------------------------------
 
-	@Documentation.CommonOption(position = Documentation.CommonOption.POSITION_PARALLELISM_SLOTS)
+	@Documentation.SectionOption(
+		sections = {Documentation.SectionOption.SECTION_COMMON},
+		position = Documentation.SectionOption.POSITION_PARALLELISM_SLOTS)
 	public static final ConfigOption<Integer> DEFAULT_PARALLELISM = ConfigOptions
 		.key("parallelism.default")
 		.defaultValue(1)
diff --git a/flink-core/src/main/java/org/apache/flink/configuration/HighAvailabilityOptions.java b/flink-core/src/main/java/org/apache/flink/configuration/HighAvailabilityOptions.java
index 825d8c7..b6c0f02 100644
--- a/flink-core/src/main/java/org/apache/flink/configuration/HighAvailabilityOptions.java
+++ b/flink-core/src/main/java/org/apache/flink/configuration/HighAvailabilityOptions.java
@@ -45,7 +45,9 @@ public class HighAvailabilityOptions {
 	 * To enable high-availability, set this mode to "ZOOKEEPER".
 	 * Can also be set to FQN of HighAvailability factory class.
 	 */
-	@Documentation.CommonOption(position = Documentation.CommonOption.POSITION_HIGH_AVAILABILITY)
+	@Documentation.SectionOption(
+		sections = {Documentation.SectionOption.SECTION_COMMON},
+		position = Documentation.SectionOption.POSITION_HIGH_AVAILABILITY)
 	public static final ConfigOption<String> HA_MODE =
 			key("high-availability")
 			.defaultValue("NONE")
@@ -67,7 +69,9 @@ public class HighAvailabilityOptions {
 	/**
 	 * File system path (URI) where Flink persists metadata in high-availability setups.
 	 */
-	@Documentation.CommonOption(position = Documentation.CommonOption.POSITION_HIGH_AVAILABILITY)
+	@Documentation.SectionOption(
+		sections = {Documentation.SectionOption.SECTION_COMMON},
+		position = Documentation.SectionOption.POSITION_HIGH_AVAILABILITY)
 	public static final ConfigOption<String> HA_STORAGE_PATH =
 			key("high-availability.storageDir")
 			.noDefaultValue()
diff --git a/flink-core/src/main/java/org/apache/flink/configuration/JobManagerOptions.java b/flink-core/src/main/java/org/apache/flink/configuration/JobManagerOptions.java
index c8412bc..ef49421 100644
--- a/flink-core/src/main/java/org/apache/flink/configuration/JobManagerOptions.java
+++ b/flink-core/src/main/java/org/apache/flink/configuration/JobManagerOptions.java
@@ -79,7 +79,9 @@ public class JobManagerOptions {
 	/**
 	 * JVM heap size for the JobManager with memory size.
 	 */
-	@Documentation.CommonOption(position = Documentation.CommonOption.POSITION_MEMORY)
+	@Documentation.SectionOption(
+		sections = {Documentation.SectionOption.SECTION_COMMON},
+		position = Documentation.SectionOption.POSITION_MEMORY)
 	public static final ConfigOption<String> JOB_MANAGER_HEAP_MEMORY =
 		key("jobmanager.heap.size")
 		.defaultValue("1024m")
diff --git a/flink-core/src/main/java/org/apache/flink/configuration/SecurityOptions.java b/flink-core/src/main/java/org/apache/flink/configuration/SecurityOptions.java
index 24f8c5f..5235dcf 100644
--- a/flink-core/src/main/java/org/apache/flink/configuration/SecurityOptions.java
+++ b/flink-core/src/main/java/org/apache/flink/configuration/SecurityOptions.java
@@ -106,7 +106,9 @@ public class SecurityOptions {
 	/**
 	 * Enable SSL for internal communication (akka rpc, netty data transport, blob server).
 	 */
-	@Documentation.CommonOption(position = Documentation.CommonOption.POSITION_SECURITY)
+	@Documentation.SectionOption(
+		sections = {Documentation.SectionOption.SECTION_COMMON},
+		position = Documentation.SectionOption.POSITION_SECURITY)
 	public static final ConfigOption<Boolean> SSL_INTERNAL_ENABLED =
 			key("security.ssl.internal.enabled")
 			.defaultValue(false)
@@ -117,7 +119,9 @@ public class SecurityOptions {
 	/**
 	 * Enable SSL for external REST endpoints.
 	 */
-	@Documentation.CommonOption(position = Documentation.CommonOption.POSITION_SECURITY)
+	@Documentation.SectionOption(
+		sections = {Documentation.SectionOption.SECTION_COMMON},
+		position = Documentation.SectionOption.POSITION_SECURITY)
 	public static final ConfigOption<Boolean> SSL_REST_ENABLED =
 			key("security.ssl.rest.enabled")
 			.defaultValue(false)
diff --git a/flink-core/src/main/java/org/apache/flink/configuration/TaskManagerOptions.java b/flink-core/src/main/java/org/apache/flink/configuration/TaskManagerOptions.java
index cfd535a..cf532a4 100644
--- a/flink-core/src/main/java/org/apache/flink/configuration/TaskManagerOptions.java
+++ b/flink-core/src/main/java/org/apache/flink/configuration/TaskManagerOptions.java
@@ -165,7 +165,9 @@ public class TaskManagerOptions {
 	/**
 	 * The config parameter defining the number of task slots of a task manager.
 	 */
-	@Documentation.CommonOption(position = Documentation.CommonOption.POSITION_PARALLELISM_SLOTS)
+	@Documentation.SectionOption(
+		sections = {Documentation.SectionOption.SECTION_COMMON},
+		position = Documentation.SectionOption.POSITION_PARALLELISM_SLOTS)
 	public static final ConfigOption<Integer> NUM_TASK_SLOTS =
 		key("taskmanager.numberOfTaskSlots")
 			.intType()
@@ -252,7 +254,9 @@ public class TaskManagerOptions {
 	/**
 	 * Total Process Memory size for the TaskExecutors.
 	 */
-	@Documentation.CommonOption(position = Documentation.CommonOption.POSITION_MEMORY)
+	@Documentation.SectionOption(
+		sections = {Documentation.SectionOption.SECTION_COMMON},
+		position = Documentation.SectionOption.POSITION_MEMORY)
 	public static final ConfigOption<MemorySize> TOTAL_PROCESS_MEMORY =
 		key("taskmanager.memory.process.size")
 			.memoryType()
@@ -266,7 +270,9 @@ public class TaskManagerOptions {
 	/**
 	 * Total Flink Memory size for the TaskExecutors.
 	 */
-	@Documentation.CommonOption(position = Documentation.CommonOption.POSITION_MEMORY)
+	@Documentation.SectionOption(
+		sections = {Documentation.SectionOption.SECTION_COMMON},
+		position = Documentation.SectionOption.POSITION_MEMORY)
 	public static final ConfigOption<MemorySize> TOTAL_FLINK_MEMORY =
 		key("taskmanager.memory.flink.size")
 			.memoryType()
diff --git a/flink-docs/src/main/java/org/apache/flink/docs/configuration/ConfigOptionsDocGenerator.java b/flink-docs/src/main/java/org/apache/flink/docs/configuration/ConfigOptionsDocGenerator.java
index eeea628..1694620 100644
--- a/flink-docs/src/main/java/org/apache/flink/docs/configuration/ConfigOptionsDocGenerator.java
+++ b/flink-docs/src/main/java/org/apache/flink/docs/configuration/ConfigOptionsDocGenerator.java
@@ -101,7 +101,7 @@ public class ConfigOptionsDocGenerator {
 	 * every {@link ConfigOption}.
 	 *
 	 * <p>One additional table is generated containing all {@link ConfigOption ConfigOptions} that are annotated with
-	 * {@link org.apache.flink.annotation.docs.Documentation.CommonOption}.
+	 * {@link Documentation.SectionOption}.
 	 *
 	 * @param args
 	 *  [0] output directory for the generated files
@@ -120,28 +120,56 @@ public class ConfigOptionsDocGenerator {
 
 	@VisibleForTesting
 	static void generateCommonSection(String rootDir, String outputDirectory, OptionsClassLocation[] locations, String pathPrefix) throws IOException, ClassNotFoundException {
-		List<OptionWithMetaInfo> commonOptions = new ArrayList<>(32);
+		List<OptionWithMetaInfo> allSectionOptions = new ArrayList<>(32);
 		for (OptionsClassLocation location : locations) {
-			commonOptions.addAll(findCommonOptions(rootDir, location.getModule(), location.getPackage(), pathPrefix));
+			allSectionOptions.addAll(findSectionOptions(rootDir, location.getModule(), location.getPackage(), pathPrefix));
 		}
-		commonOptions.sort((o1, o2) -> {
-			int position1 = o1.field.getAnnotation(Documentation.CommonOption.class).position();
-			int position2 = o2.field.getAnnotation(Documentation.CommonOption.class).position();
-			if (position1 == position2) {
-				return o1.option.key().compareTo(o2.option.key());
-			} else {
-				return Integer.compare(position1, position2);
+
+		Map<String, List<OptionWithMetaInfo>> optionsGroupedBySection = allSectionOptions.stream()
+			.flatMap(option -> {
+				final String[] sections = option.field.getAnnotation(Documentation.SectionOption.class).sections();
+				if (sections.length == 0) {
+					throw new RuntimeException(String.format(
+						"Option %s is annotated with %s but the list of sections is empty.",
+						option.option.key(),
+						Documentation.SectionOption.class.getSimpleName()));
+				}
+
+				return Arrays.stream(sections).map(section -> Tuple2.of(section, option));
+			})
+			.collect(Collectors.groupingBy(option -> option.f0, Collectors.mapping(option -> option.f1, Collectors.toList())));
+
+		optionsGroupedBySection.forEach(
+			(section, options) -> {
+				options.sort((o1, o2) -> {
+					int position1 = o1.field.getAnnotation(Documentation.SectionOption.class).position();
+					int position2 = o2.field.getAnnotation(Documentation.SectionOption.class).position();
+					if (position1 == position2) {
+						return o1.option.key().compareTo(o2.option.key());
+					} else {
+						return Integer.compare(position1, position2);
+					}
+				});
+
+				String sectionHtmlTable = toHtmlTable(options);
+				try {
+					Files.write(Paths.get(outputDirectory, getSectionFileName(section)), sectionHtmlTable.getBytes(StandardCharsets.UTF_8));
+				} catch (Exception e) {
+					throw new RuntimeException(e);
+				}
 			}
-		});
+		);
+	}
 
-		String commonHtmlTable = toHtmlTable(commonOptions);
-		Files.write(Paths.get(outputDirectory, COMMON_SECTION_FILE_NAME), commonHtmlTable.getBytes(StandardCharsets.UTF_8));
+	@VisibleForTesting
+	static String getSectionFileName(String section) {
+		return section + "_section.html";
 	}
 
-	private static Collection<OptionWithMetaInfo> findCommonOptions(String rootDir, String module, String packageName, String pathPrefix) throws IOException, ClassNotFoundException {
+	private static Collection<OptionWithMetaInfo> findSectionOptions(String rootDir, String module, String packageName, String pathPrefix) throws IOException, ClassNotFoundException {
 		Collection<OptionWithMetaInfo> commonOptions = new ArrayList<>(32);
 		processConfigOptions(rootDir, module, packageName, pathPrefix, optionsClass -> extractConfigOptions(optionsClass).stream()
-			.filter(optionWithMetaInfo -> optionWithMetaInfo.field.getAnnotation(Documentation.CommonOption.class) != null)
+			.filter(optionWithMetaInfo -> optionWithMetaInfo.field.getAnnotation(Documentation.SectionOption.class) != null)
 			.forEachOrdered(commonOptions::add));
 		return commonOptions;
 	}
diff --git a/flink-docs/src/test/java/org/apache/flink/docs/configuration/ConfigOptionsDocGeneratorTest.java b/flink-docs/src/test/java/org/apache/flink/docs/configuration/ConfigOptionsDocGeneratorTest.java
index 662afa4..38bd96b 100644
--- a/flink-docs/src/test/java/org/apache/flink/docs/configuration/ConfigOptionsDocGeneratorTest.java
+++ b/flink-docs/src/test/java/org/apache/flink/docs/configuration/ConfigOptionsDocGeneratorTest.java
@@ -445,7 +445,7 @@ public class ConfigOptionsDocGeneratorTest {
 			"    </tbody>\n" +
 			"</table>\n";
 
-		String output = FileUtils.readFile(Paths.get(outputDirectory, ConfigOptionsDocGenerator.COMMON_SECTION_FILE_NAME).toFile(), StandardCharsets.UTF_8.name());
+		String output = FileUtils.readFile(Paths.get(outputDirectory, ConfigOptionsDocGenerator.getSectionFileName("common")).toFile(), StandardCharsets.UTF_8.name());
 
 		assertEquals(expected, output);
 	}
diff --git a/flink-docs/src/test/java/org/apache/flink/docs/configuration/ConfigOptionsDocsCompletenessITCase.java b/flink-docs/src/test/java/org/apache/flink/docs/configuration/ConfigOptionsDocsCompletenessITCase.java
index 6497b46..bd6316e 100644
--- a/flink-docs/src/test/java/org/apache/flink/docs/configuration/ConfigOptionsDocsCompletenessITCase.java
+++ b/flink-docs/src/test/java/org/apache/flink/docs/configuration/ConfigOptionsDocsCompletenessITCase.java
@@ -66,7 +66,7 @@ public class ConfigOptionsDocsCompletenessITCase {
 	public void testCommonSectionCompleteness() throws IOException, ClassNotFoundException {
 		Map<String, List<DocumentedOption>> documentedOptions = parseDocumentedCommonOptions();
 		Map<String, List<ExistingOption>> existingOptions = findExistingOptions(
-			optionWithMetaInfo -> optionWithMetaInfo.field.getAnnotation(Documentation.CommonOption.class) != null);
+			optionWithMetaInfo -> optionWithMetaInfo.field.getAnnotation(Documentation.SectionOption.class) != null);
 
 		assertExistingOptionsAreWellDefined(existingOptions);
 
diff --git a/flink-docs/src/test/java/org/apache/flink/docs/configuration/data/TestCommonOptions.java b/flink-docs/src/test/java/org/apache/flink/docs/configuration/data/TestCommonOptions.java
index f0bf37c..865d9ec 100644
--- a/flink-docs/src/test/java/org/apache/flink/docs/configuration/data/TestCommonOptions.java
+++ b/flink-docs/src/test/java/org/apache/flink/docs/configuration/data/TestCommonOptions.java
@@ -27,7 +27,7 @@ import org.apache.flink.configuration.ConfigOptions;
  */
 public class TestCommonOptions {
 
-	@Documentation.CommonOption
+	@Documentation.SectionOption(sections = {Documentation.SectionOption.SECTION_COMMON})
 	public static final ConfigOption<Integer> COMMON_OPTION = ConfigOptions
 		.key("first.option.a")
 		.defaultValue(2)
@@ -38,7 +38,9 @@ public class TestCommonOptions {
 		.noDefaultValue()
 		.withDescription("This is the description for the generic option.");
 
-	@Documentation.CommonOption(position = 2)
+	@Documentation.SectionOption(
+		sections = {Documentation.SectionOption.SECTION_COMMON},
+		position = 2)
 	public static final ConfigOption<Integer> COMMON_POSITIONED_OPTION = ConfigOptions
 		.key("third.option.a")
 		.defaultValue(3)


[flink] 04/10: [hotfix][docs] Clean up deprecation/unused warnings in tests for ConfigOptionsDocGenerator

Posted by se...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

sewen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/flink.git

commit 2ce040976810d9bf766dd49f20d5ade66078f00f
Author: Stephan Ewen <se...@apache.org>
AuthorDate: Fri Jan 31 09:33:37 2020 +0100

    [hotfix][docs] Clean up deprecation/unused warnings in tests for ConfigOptionsDocGenerator
---
 .../configuration/ConfigOptionsDocGeneratorTest.java  | 19 ++++++++++++++++++-
 .../docs/configuration/data/TestCommonOptions.java    |  4 ++++
 2 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/flink-docs/src/test/java/org/apache/flink/docs/configuration/ConfigOptionsDocGeneratorTest.java b/flink-docs/src/test/java/org/apache/flink/docs/configuration/ConfigOptionsDocGeneratorTest.java
index 38bd96b..1e9731c 100644
--- a/flink-docs/src/test/java/org/apache/flink/docs/configuration/ConfigOptionsDocGeneratorTest.java
+++ b/flink-docs/src/test/java/org/apache/flink/docs/configuration/ConfigOptionsDocGeneratorTest.java
@@ -51,6 +51,7 @@ import static org.junit.Assert.assertEquals;
 /**
  * Tests for the {@link ConfigOptionsDocGenerator}.
  */
+@SuppressWarnings("unused")
 public class ConfigOptionsDocGeneratorTest {
 
 	@ClassRule
@@ -59,11 +60,13 @@ public class ConfigOptionsDocGeneratorTest {
 	static class TestConfigGroup {
 		public static ConfigOption<Integer> firstOption = ConfigOptions
 			.key("first.option.a")
+			.intType()
 			.defaultValue(2)
 			.withDescription("This is example description for the first option.");
 
 		public static ConfigOption<String> secondOption = ConfigOptions
 			.key("second.option.a")
+			.stringType()
 			.noDefaultValue()
 			.withDescription("This is long example description for the second option.");
 	}
@@ -71,7 +74,7 @@ public class ConfigOptionsDocGeneratorTest {
 	private enum TestEnum {
 		VALUE_1,
 		VALUE_2,
-		VALUE_3;
+		VALUE_3
 	}
 
 	static class TypeTestConfigGroup {
@@ -209,34 +212,40 @@ public class ConfigOptionsDocGeneratorTest {
 		// should end up in the default group
 		public static ConfigOption<Integer> option1 = ConfigOptions
 			.key("a.option")
+			.intType()
 			.defaultValue(2);
 
 		// should end up in group1, perfect key-prefix match
 		public static ConfigOption<String> option2 = ConfigOptions
 			.key("a.b.option")
+			.stringType()
 			.noDefaultValue();
 
 		// should end up in group1, full key-prefix match
 		public static ConfigOption<Integer> option3 = ConfigOptions
 			.key("a.b.c.option")
+			.intType()
 			.defaultValue(2);
 
 		// should end up in group1, full key-prefix match for group 1, partial match for group 2
 		// checks that the generator remembers the last encountered root node
 		public static ConfigOption<Integer> option4 = ConfigOptions
 			.key("a.b.c.e.option")
+			.intType()
 			.defaultValue(2);
 
 		// should end up in the default group, since no group exists with prefix "a.c"
 		// checks that the generator does not ignore components (like ignoring "c" to find a match "a.b")
 		public static ConfigOption<String> option5 = ConfigOptions
 			.key("a.c.b.option")
+			.stringType()
 			.noDefaultValue();
 
 		// should end up in group2, full key-prefix match for group 2
 		// checks that the longest matching group is assigned
 		public static ConfigOption<Integer> option6 = ConfigOptions
 			.key("a.b.c.d.option")
+			.intType()
 			.defaultValue(2);
 	}
 
@@ -266,21 +275,25 @@ public class ConfigOptionsDocGeneratorTest {
 	static class TestConfigMultipleSubGroup {
 		public static ConfigOption<Integer> firstOption = ConfigOptions
 			.key("first.option.a")
+			.intType()
 			.defaultValue(2)
 			.withDescription("This is example description for the first option.");
 
 		public static ConfigOption<String> secondOption = ConfigOptions
 			.key("second.option.a")
+			.stringType()
 			.noDefaultValue()
 			.withDescription("This is long example description for the second option.");
 
 		public static ConfigOption<Integer> thirdOption = ConfigOptions
 			.key("third.option.a")
+			.intType()
 			.defaultValue(2)
 			.withDescription("This is example description for the third option.");
 
 		public static ConfigOption<String> fourthOption = ConfigOptions
 			.key("fourth.option.a")
+			.stringType()
 			.noDefaultValue()
 			.withDescription("This is long example description for the fourth option.");
 	}
@@ -365,12 +378,14 @@ public class ConfigOptionsDocGeneratorTest {
 		@Documentation.OverrideDefault("default_1")
 		public static ConfigOption<Integer> firstOption = ConfigOptions
 			.key("first.option.a")
+			.intType()
 			.defaultValue(2)
 			.withDescription("This is example description for the first option.");
 
 		@Documentation.OverrideDefault("default_2")
 		public static ConfigOption<String> secondOption = ConfigOptions
 			.key("second.option.a")
+			.stringType()
 			.noDefaultValue()
 			.withDescription("This is long example description for the second option.");
 	}
@@ -453,12 +468,14 @@ public class ConfigOptionsDocGeneratorTest {
 	static class TestConfigGroupWithExclusion {
 		public static ConfigOption<Integer> firstOption = ConfigOptions
 			.key("first.option.a")
+			.intType()
 			.defaultValue(2)
 			.withDescription("This is example description for the first option.");
 
 		@Documentation.ExcludeFromDocumentation
 		public static ConfigOption<String> excludedOption = ConfigOptions
 			.key("excluded.option.a")
+			.stringType()
 			.noDefaultValue()
 			.withDescription("This should not be documented.");
 	}
diff --git a/flink-docs/src/test/java/org/apache/flink/docs/configuration/data/TestCommonOptions.java b/flink-docs/src/test/java/org/apache/flink/docs/configuration/data/TestCommonOptions.java
index 865d9ec..200b84d 100644
--- a/flink-docs/src/test/java/org/apache/flink/docs/configuration/data/TestCommonOptions.java
+++ b/flink-docs/src/test/java/org/apache/flink/docs/configuration/data/TestCommonOptions.java
@@ -25,16 +25,19 @@ import org.apache.flink.configuration.ConfigOptions;
 /**
  * Collection of test {@link ConfigOptions ConfigOptions}.
  */
+@SuppressWarnings("unused") // this class is only accessed reflectively
 public class TestCommonOptions {
 
 	@Documentation.SectionOption(sections = {Documentation.SectionOption.SECTION_COMMON})
 	public static final ConfigOption<Integer> COMMON_OPTION = ConfigOptions
 		.key("first.option.a")
+		.intType()
 		.defaultValue(2)
 		.withDescription("This is the description for the common option.");
 
 	public static final ConfigOption<String> GENERIC_OPTION = ConfigOptions
 		.key("second.option.a")
+		.stringType()
 		.noDefaultValue()
 		.withDescription("This is the description for the generic option.");
 
@@ -43,6 +46,7 @@ public class TestCommonOptions {
 		position = 2)
 	public static final ConfigOption<Integer> COMMON_POSITIONED_OPTION = ConfigOptions
 		.key("third.option.a")
+		.intType()
 		.defaultValue(3)
 		.withDescription("This is the description for the positioned common option.");
 }
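
For reference, the pattern applied throughout these test fixtures is the explicitly typed
ConfigOption builder chain. A minimal, self-contained sketch of that pattern (the class and
key name below are hypothetical, for illustration only):

    import org.apache.flink.configuration.ConfigOption;
    import org.apache.flink.configuration.ConfigOptions;

    public class ExampleOptions {
        // intType() declares the value type explicitly instead of
        // letting defaultValue(...) infer it
        public static final ConfigOption<Integer> RETRIES = ConfigOptions
            .key("example.retries")   // hypothetical key
            .intType()
            .defaultValue(3)
            .withDescription("How often to retry before giving up.");
    }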


[flink] 05/10: [FLINK-15824][docs] (follow-up) Simple improvements/cleanups on @SectionOption

Posted by se...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

sewen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/flink.git

commit c8ade873c841c083d2f705e1beb4bb55eb4311e0
Author: Stephan Ewen <se...@apache.org>
AuthorDate: Wed Jan 29 20:37:52 2020 +0100

    [FLINK-15824][docs] (follow-up) Simple improvements/cleanups on @SectionOption
    
      - Rename '@SectionOption' to '@Section' for brevity
    
      - Rename '@Section.sections()' to '@Section.value()'
        That way, the method name can be skipped when only sections (no positions) are specified,
        which should be the common case after the config documentation rework is complete.
    
      - Adjust test for @Section annotation
---
 .../flink/annotation/docs/Documentation.java       |   6 +-
 .../flink/configuration/CheckpointingOptions.java  |  18 ++--
 .../apache/flink/configuration/CoreOptions.java    |   6 +-
 .../configuration/HighAvailabilityOptions.java     |  12 +--
 .../flink/configuration/JobManagerOptions.java     |   6 +-
 .../flink/configuration/SecurityOptions.java       |  12 +--
 .../flink/configuration/TaskManagerOptions.java    |  18 ++--
 .../configuration/ConfigOptionsDocGenerator.java   |  12 +--
 .../ConfigOptionsDocGeneratorTest.java             |  48 +++++++--
 .../ConfigOptionsDocsCompletenessITCase.java       | 118 ++++++++++++---------
 .../docs/configuration/data/TestCommonOptions.java |   9 +-
 11 files changed, 161 insertions(+), 104 deletions(-)

diff --git a/flink-annotations/src/main/java/org/apache/flink/annotation/docs/Documentation.java b/flink-annotations/src/main/java/org/apache/flink/annotation/docs/Documentation.java
index a1e142a..e6c507f 100644
--- a/flink-annotations/src/main/java/org/apache/flink/annotation/docs/Documentation.java
+++ b/flink-annotations/src/main/java/org/apache/flink/annotation/docs/Documentation.java
@@ -44,13 +44,13 @@ public final class Documentation {
 	 * Annotation used on config option fields to include them in specific sections. Sections are groups of options
 	 * that are aggregated across option classes, with each group being placed into a dedicated file.
 	 *
-	 * <p>The {@link SectionOption#position()} argument controls the position in the generated table, with lower values
+	 * <p>The {@link Section#position()} argument controls the position in the generated table, with lower values
 	 * being placed at the top. Fields with the same position are sorted alphabetically by key.
 	 */
 	@Target(ElementType.FIELD)
 	@Retention(RetentionPolicy.RUNTIME)
 	@Internal
-	public @interface SectionOption {
+	public @interface Section {
 		int POSITION_MEMORY = 10;
 		int POSITION_PARALLELISM_SLOTS = 20;
 		int POSITION_FAULT_TOLERANCE = 30;
@@ -62,7 +62,7 @@ public final class Documentation {
 		/**
 		 * The sections in the config docs where this option should be included.
 		 */
-		String[] sections() default {};
+		String[] value() default {};
 
 		/**
 		 * The relative position of the option in its section.
diff --git a/flink-core/src/main/java/org/apache/flink/configuration/CheckpointingOptions.java b/flink-core/src/main/java/org/apache/flink/configuration/CheckpointingOptions.java
index 03284ca..566bac6 100644
--- a/flink-core/src/main/java/org/apache/flink/configuration/CheckpointingOptions.java
+++ b/flink-core/src/main/java/org/apache/flink/configuration/CheckpointingOptions.java
@@ -31,9 +31,9 @@ public class CheckpointingOptions {
 	// ------------------------------------------------------------------------
 
 	/** The state backend to be used to store and checkpoint state. */
-	@Documentation.SectionOption(
-		sections = {Documentation.SectionOption.SECTION_COMMON},
-		position = Documentation.SectionOption.POSITION_FAULT_TOLERANCE)
+	@Documentation.Section(
+		value = {Documentation.Section.SECTION_COMMON},
+		position = Documentation.Section.POSITION_FAULT_TOLERANCE)
 	public static final ConfigOption<String> STATE_BACKEND = ConfigOptions
 			.key("state.backend")
 			.noDefaultValue()
@@ -105,9 +105,9 @@ public class CheckpointingOptions {
 
 	/** The default directory for savepoints. Used by the state backends that write
 	 * savepoints to file systems (MemoryStateBackend, FsStateBackend, RocksDBStateBackend). */
-	@Documentation.SectionOption(
-		sections = {Documentation.SectionOption.SECTION_COMMON},
-		position = Documentation.SectionOption.POSITION_FAULT_TOLERANCE)
+	@Documentation.Section(
+		value = {Documentation.Section.SECTION_COMMON},
+		position = Documentation.Section.POSITION_FAULT_TOLERANCE)
 	public static final ConfigOption<String> SAVEPOINT_DIRECTORY = ConfigOptions
 			.key("state.savepoints.dir")
 			.noDefaultValue()
@@ -117,9 +117,9 @@ public class CheckpointingOptions {
 
 	/** The default directory used for storing the data files and meta data of checkpoints in a Flink supported filesystem.
 	 * The storage path must be accessible from all participating processes/nodes(i.e. all TaskManagers and JobManagers).*/
-	@Documentation.SectionOption(
-		sections = {Documentation.SectionOption.SECTION_COMMON},
-		position = Documentation.SectionOption.POSITION_FAULT_TOLERANCE)
+	@Documentation.Section(
+		value = {Documentation.Section.SECTION_COMMON},
+		position = Documentation.Section.POSITION_FAULT_TOLERANCE)
 	public static final ConfigOption<String> CHECKPOINTS_DIRECTORY = ConfigOptions
 			.key("state.checkpoints.dir")
 			.noDefaultValue()
diff --git a/flink-core/src/main/java/org/apache/flink/configuration/CoreOptions.java b/flink-core/src/main/java/org/apache/flink/configuration/CoreOptions.java
index 2d7bc55..2449b4e 100644
--- a/flink-core/src/main/java/org/apache/flink/configuration/CoreOptions.java
+++ b/flink-core/src/main/java/org/apache/flink/configuration/CoreOptions.java
@@ -250,9 +250,9 @@ public class CoreOptions {
 	//  program
 	// ------------------------------------------------------------------------
 
-	@Documentation.SectionOption(
-		sections = {Documentation.SectionOption.SECTION_COMMON},
-		position = Documentation.SectionOption.POSITION_PARALLELISM_SLOTS)
+	@Documentation.Section(
+		value = {Documentation.Section.SECTION_COMMON},
+		position = Documentation.Section.POSITION_PARALLELISM_SLOTS)
 	public static final ConfigOption<Integer> DEFAULT_PARALLELISM = ConfigOptions
 		.key("parallelism.default")
 		.defaultValue(1)
diff --git a/flink-core/src/main/java/org/apache/flink/configuration/HighAvailabilityOptions.java b/flink-core/src/main/java/org/apache/flink/configuration/HighAvailabilityOptions.java
index b6c0f02..0087878 100644
--- a/flink-core/src/main/java/org/apache/flink/configuration/HighAvailabilityOptions.java
+++ b/flink-core/src/main/java/org/apache/flink/configuration/HighAvailabilityOptions.java
@@ -45,9 +45,9 @@ public class HighAvailabilityOptions {
 	 * To enable high-availability, set this mode to "ZOOKEEPER".
 	 * Can also be set to FQN of HighAvailability factory class.
 	 */
-	@Documentation.SectionOption(
-		sections = {Documentation.SectionOption.SECTION_COMMON},
-		position = Documentation.SectionOption.POSITION_HIGH_AVAILABILITY)
+	@Documentation.Section(
+		value = {Documentation.Section.SECTION_COMMON},
+		position = Documentation.Section.POSITION_HIGH_AVAILABILITY)
 	public static final ConfigOption<String> HA_MODE =
 			key("high-availability")
 			.defaultValue("NONE")
@@ -69,9 +69,9 @@ public class HighAvailabilityOptions {
 	/**
 	 * File system path (URI) where Flink persists metadata in high-availability setups.
 	 */
-	@Documentation.SectionOption(
-		sections = {Documentation.SectionOption.SECTION_COMMON},
-		position = Documentation.SectionOption.POSITION_HIGH_AVAILABILITY)
+	@Documentation.Section(
+		value = {Documentation.Section.SECTION_COMMON},
+		position = Documentation.Section.POSITION_HIGH_AVAILABILITY)
 	public static final ConfigOption<String> HA_STORAGE_PATH =
 			key("high-availability.storageDir")
 			.noDefaultValue()
diff --git a/flink-core/src/main/java/org/apache/flink/configuration/JobManagerOptions.java b/flink-core/src/main/java/org/apache/flink/configuration/JobManagerOptions.java
index ef49421..47755e2 100644
--- a/flink-core/src/main/java/org/apache/flink/configuration/JobManagerOptions.java
+++ b/flink-core/src/main/java/org/apache/flink/configuration/JobManagerOptions.java
@@ -79,9 +79,9 @@ public class JobManagerOptions {
 	/**
 	 * JVM heap size for the JobManager with memory size.
 	 */
-	@Documentation.SectionOption(
-		sections = {Documentation.SectionOption.SECTION_COMMON},
-		position = Documentation.SectionOption.POSITION_MEMORY)
+	@Documentation.Section(
+		value = {Documentation.Section.SECTION_COMMON},
+		position = Documentation.Section.POSITION_MEMORY)
 	public static final ConfigOption<String> JOB_MANAGER_HEAP_MEMORY =
 		key("jobmanager.heap.size")
 		.defaultValue("1024m")
diff --git a/flink-core/src/main/java/org/apache/flink/configuration/SecurityOptions.java b/flink-core/src/main/java/org/apache/flink/configuration/SecurityOptions.java
index 5235dcf..8297cfd 100644
--- a/flink-core/src/main/java/org/apache/flink/configuration/SecurityOptions.java
+++ b/flink-core/src/main/java/org/apache/flink/configuration/SecurityOptions.java
@@ -106,9 +106,9 @@ public class SecurityOptions {
 	/**
 	 * Enable SSL for internal communication (akka rpc, netty data transport, blob server).
 	 */
-	@Documentation.SectionOption(
-		sections = {Documentation.SectionOption.SECTION_COMMON},
-		position = Documentation.SectionOption.POSITION_SECURITY)
+	@Documentation.Section(
+		value = {Documentation.Section.SECTION_COMMON},
+		position = Documentation.Section.POSITION_SECURITY)
 	public static final ConfigOption<Boolean> SSL_INTERNAL_ENABLED =
 			key("security.ssl.internal.enabled")
 			.defaultValue(false)
@@ -119,9 +119,9 @@ public class SecurityOptions {
 	/**
 	 * Enable SSL for external REST endpoints.
 	 */
-	@Documentation.SectionOption(
-		sections = {Documentation.SectionOption.SECTION_COMMON},
-		position = Documentation.SectionOption.POSITION_SECURITY)
+	@Documentation.Section(
+		value = {Documentation.Section.SECTION_COMMON},
+		position = Documentation.Section.POSITION_SECURITY)
 	public static final ConfigOption<Boolean> SSL_REST_ENABLED =
 			key("security.ssl.rest.enabled")
 			.defaultValue(false)
diff --git a/flink-core/src/main/java/org/apache/flink/configuration/TaskManagerOptions.java b/flink-core/src/main/java/org/apache/flink/configuration/TaskManagerOptions.java
index cf532a4..05ad75c 100644
--- a/flink-core/src/main/java/org/apache/flink/configuration/TaskManagerOptions.java
+++ b/flink-core/src/main/java/org/apache/flink/configuration/TaskManagerOptions.java
@@ -165,9 +165,9 @@ public class TaskManagerOptions {
 	/**
 	 * The config parameter defining the number of task slots of a task manager.
 	 */
-	@Documentation.SectionOption(
-		sections = {Documentation.SectionOption.SECTION_COMMON},
-		position = Documentation.SectionOption.POSITION_PARALLELISM_SLOTS)
+	@Documentation.Section(
+		value = {Documentation.Section.SECTION_COMMON},
+		position = Documentation.Section.POSITION_PARALLELISM_SLOTS)
 	public static final ConfigOption<Integer> NUM_TASK_SLOTS =
 		key("taskmanager.numberOfTaskSlots")
 			.intType()
@@ -254,9 +254,9 @@ public class TaskManagerOptions {
 	/**
 	 * Total Process Memory size for the TaskExecutors.
 	 */
-	@Documentation.SectionOption(
-		sections = {Documentation.SectionOption.SECTION_COMMON},
-		position = Documentation.SectionOption.POSITION_MEMORY)
+	@Documentation.Section(
+		value = {Documentation.Section.SECTION_COMMON},
+		position = Documentation.Section.POSITION_MEMORY)
 	public static final ConfigOption<MemorySize> TOTAL_PROCESS_MEMORY =
 		key("taskmanager.memory.process.size")
 			.memoryType()
@@ -270,9 +270,9 @@ public class TaskManagerOptions {
 	/**
 	 * Total Flink Memory size for the TaskExecutors.
 	 */
-	@Documentation.SectionOption(
-		sections = {Documentation.SectionOption.SECTION_COMMON},
-		position = Documentation.SectionOption.POSITION_MEMORY)
+	@Documentation.Section(
+		value = {Documentation.Section.SECTION_COMMON},
+		position = Documentation.Section.POSITION_MEMORY)
 	public static final ConfigOption<MemorySize> TOTAL_FLINK_MEMORY =
 		key("taskmanager.memory.flink.size")
 			.memoryType()
diff --git a/flink-docs/src/main/java/org/apache/flink/docs/configuration/ConfigOptionsDocGenerator.java b/flink-docs/src/main/java/org/apache/flink/docs/configuration/ConfigOptionsDocGenerator.java
index 1694620..7e5f9c4 100644
--- a/flink-docs/src/main/java/org/apache/flink/docs/configuration/ConfigOptionsDocGenerator.java
+++ b/flink-docs/src/main/java/org/apache/flink/docs/configuration/ConfigOptionsDocGenerator.java
@@ -101,7 +101,7 @@ public class ConfigOptionsDocGenerator {
 	 * every {@link ConfigOption}.
 	 *
 	 * <p>One additional table is generated containing all {@link ConfigOption ConfigOptions} that are annotated with
-	 * {@link Documentation.SectionOption}.
+	 * {@link Documentation.Section}.
 	 *
 	 * @param args
 	 *  [0] output directory for the generated files
@@ -127,12 +127,12 @@ public class ConfigOptionsDocGenerator {
 
 		Map<String, List<OptionWithMetaInfo>> optionsGroupedBySection = allSectionOptions.stream()
 			.flatMap(option -> {
-				final String[] sections = option.field.getAnnotation(Documentation.SectionOption.class).sections();
+				final String[] sections = option.field.getAnnotation(Documentation.Section.class).value();
 				if (sections.length == 0) {
 					throw new RuntimeException(String.format(
 						"Option %s is annotated with %s but the list of sections is empty.",
 						option.option.key(),
-						Documentation.SectionOption.class.getSimpleName()));
+						Documentation.Section.class.getSimpleName()));
 				}
 
 				return Arrays.stream(sections).map(section -> Tuple2.of(section, option));
@@ -142,8 +142,8 @@ public class ConfigOptionsDocGenerator {
 		optionsGroupedBySection.forEach(
 			(section, options) -> {
 				options.sort((o1, o2) -> {
-					int position1 = o1.field.getAnnotation(Documentation.SectionOption.class).position();
-					int position2 = o2.field.getAnnotation(Documentation.SectionOption.class).position();
+					int position1 = o1.field.getAnnotation(Documentation.Section.class).position();
+					int position2 = o2.field.getAnnotation(Documentation.Section.class).position();
 					if (position1 == position2) {
 						return o1.option.key().compareTo(o2.option.key());
 					} else {
@@ -169,7 +169,7 @@ public class ConfigOptionsDocGenerator {
 	private static Collection<OptionWithMetaInfo> findSectionOptions(String rootDir, String module, String packageName, String pathPrefix) throws IOException, ClassNotFoundException {
 		Collection<OptionWithMetaInfo> commonOptions = new ArrayList<>(32);
 		processConfigOptions(rootDir, module, packageName, pathPrefix, optionsClass -> extractConfigOptions(optionsClass).stream()
-			.filter(optionWithMetaInfo -> optionWithMetaInfo.field.getAnnotation(Documentation.SectionOption.class) != null)
+			.filter(optionWithMetaInfo -> optionWithMetaInfo.field.getAnnotation(Documentation.Section.class) != null)
 			.forEachOrdered(commonOptions::add));
 		return commonOptions;
 	}
diff --git a/flink-docs/src/test/java/org/apache/flink/docs/configuration/ConfigOptionsDocGeneratorTest.java b/flink-docs/src/test/java/org/apache/flink/docs/configuration/ConfigOptionsDocGeneratorTest.java
index 1e9731c..c0a88aa 100644
--- a/flink-docs/src/test/java/org/apache/flink/docs/configuration/ConfigOptionsDocGeneratorTest.java
+++ b/flink-docs/src/test/java/org/apache/flink/docs/configuration/ConfigOptionsDocGeneratorTest.java
@@ -34,9 +34,9 @@ import org.junit.ClassRule;
 import org.junit.Test;
 import org.junit.rules.TemporaryFolder;
 
+import java.io.File;
 import java.io.IOException;
 import java.nio.charset.StandardCharsets;
-import java.nio.file.Paths;
 import java.time.Duration;
 import java.util.Collections;
 import java.util.HashMap;
@@ -423,8 +423,8 @@ public class ConfigOptionsDocGeneratorTest {
 	}
 
 	@Test
-	public void testCommonOptions() throws IOException, ClassNotFoundException {
-		final String projectRootDir = System.getProperty("rootDir");
+	public void testSections() throws IOException, ClassNotFoundException {
+		final String projectRootDir = getProjectRootDir();
 		final String outputDirectory = TMP.newFolder().getAbsolutePath();
 
 		final OptionsClassLocation[] locations = new OptionsClassLocation[] {
@@ -434,7 +434,7 @@ public class ConfigOptionsDocGeneratorTest {
 		ConfigOptionsDocGenerator.generateCommonSection(projectRootDir, outputDirectory, locations, "src/test/java");
 		Formatter formatter = new HtmlFormatter();
 
-		String expected =
+		String expected1 =
 			"<table class=\"table table-bordered\">\n" +
 			"    <thead>\n" +
 			"        <tr>\n" +
@@ -460,9 +460,34 @@ public class ConfigOptionsDocGeneratorTest {
 			"    </tbody>\n" +
 			"</table>\n";
 
-		String output = FileUtils.readFile(Paths.get(outputDirectory, ConfigOptionsDocGenerator.getSectionFileName("common")).toFile(), StandardCharsets.UTF_8.name());
+		String expected2 =
+			"<table class=\"table table-bordered\">\n" +
+				"    <thead>\n" +
+				"        <tr>\n" +
+				"            <th class=\"text-left\" style=\"width: 20%\">Key</th>\n" +
+				"            <th class=\"text-left\" style=\"width: 15%\">Default</th>\n" +
+				"            <th class=\"text-left\" style=\"width: 10%\">Type</th>\n" +
+				"            <th class=\"text-left\" style=\"width: 55%\">Description</th>\n" +
+				"        </tr>\n" +
+				"    </thead>\n" +
+				"    <tbody>\n" +
+				"        <tr>\n" +
+				"            <td><h5>" + TestCommonOptions.COMMON_OPTION.key() + "</h5></td>\n" +
+				"            <td style=\"word-wrap: break-word;\">" + TestCommonOptions.COMMON_OPTION.defaultValue() + "</td>\n" +
+				"            <td>Integer</td>\n" +
+				"            <td>" + formatter.format(TestCommonOptions.COMMON_OPTION.description()) + "</td>\n" +
+				"        </tr>\n" +
+				"    </tbody>\n" +
+				"</table>\n";
+
+		final String fileName1 = ConfigOptionsDocGenerator.getSectionFileName(TestCommonOptions.SECTION_1);
+		final String fileName2 = ConfigOptionsDocGenerator.getSectionFileName(TestCommonOptions.SECTION_2);
 
-		assertEquals(expected, output);
+		final String output1 = FileUtils.readFile(new File(outputDirectory, fileName1), StandardCharsets.UTF_8.name());
+		final String output2 = FileUtils.readFile(new File(outputDirectory, fileName2), StandardCharsets.UTF_8.name());
+
+		assertEquals(expected1, output1);
+		assertEquals(expected2, output2);
 	}
 
 	static class TestConfigGroupWithExclusion {
@@ -509,4 +534,15 @@ public class ConfigOptionsDocGeneratorTest {
 
 		assertEquals(expectedTable, htmlTable);
 	}
+
+	static String getProjectRootDir() {
+		final String dirFromProperty = System.getProperty("rootDir");
+		if (dirFromProperty != null) {
+			return dirFromProperty;
+		}
+
+		// to make this work in the IDE use a default fallback
+		final String currentDir = System.getProperty("user.dir");
+		return new File(currentDir).getParent();
+	}
 }
diff --git a/flink-docs/src/test/java/org/apache/flink/docs/configuration/ConfigOptionsDocsCompletenessITCase.java b/flink-docs/src/test/java/org/apache/flink/docs/configuration/ConfigOptionsDocsCompletenessITCase.java
index bd6316e..49d03eb 100644
--- a/flink-docs/src/test/java/org/apache/flink/docs/configuration/ConfigOptionsDocsCompletenessITCase.java
+++ b/flink-docs/src/test/java/org/apache/flink/docs/configuration/ConfigOptionsDocsCompletenessITCase.java
@@ -19,6 +19,7 @@
 package org.apache.flink.docs.configuration;
 
 import org.apache.flink.annotation.docs.Documentation;
+import org.apache.flink.api.java.tuple.Tuple2;
 import org.apache.flink.configuration.ConfigOption;
 import org.apache.flink.configuration.description.Formatter;
 import org.apache.flink.configuration.description.HtmlFormatter;
@@ -36,10 +37,12 @@ import java.nio.file.Path;
 import java.nio.file.Paths;
 import java.util.ArrayList;
 import java.util.Collection;
+import java.util.Collections;
 import java.util.Iterator;
 import java.util.List;
 import java.util.Map;
 import java.util.Objects;
+import java.util.Optional;
 import java.util.function.Predicate;
 import java.util.stream.Collectors;
 import java.util.stream.Stream;
@@ -63,59 +66,64 @@ public class ConfigOptionsDocsCompletenessITCase {
 	private static final Formatter htmlFormatter = new HtmlFormatter();
 
 	@Test
-	public void testCommonSectionCompleteness() throws IOException, ClassNotFoundException {
-		Map<String, List<DocumentedOption>> documentedOptions = parseDocumentedCommonOptions();
-		Map<String, List<ExistingOption>> existingOptions = findExistingOptions(
-			optionWithMetaInfo -> optionWithMetaInfo.field.getAnnotation(Documentation.SectionOption.class) != null);
+	public void testCompleteness() throws IOException, ClassNotFoundException {
+		final Map<String, List<DocumentedOption>> documentedOptions = parseDocumentedOptions();
+		final Map<String, List<ExistingOption>> existingOptions = findExistingOptions(ignored -> true);
 
-		assertExistingOptionsAreWellDefined(existingOptions);
+		final Map<String, List<ExistingOption>> existingDeduplicated = checkWellDefinedAndDeduplicate(existingOptions);
 
-		compareDocumentedAndExistingOptions(documentedOptions, existingOptions);
+		compareDocumentedAndExistingOptions(documentedOptions, existingDeduplicated);
 	}
 
-	@Test
-	public void testFullReferenceCompleteness() throws IOException, ClassNotFoundException {
-		Map<String, List<DocumentedOption>> documentedOptions = parseDocumentedOptions();
-		Map<String, List<ExistingOption>> existingOptions = findExistingOptions(ignored -> true);
+	private static Map<String, List<ExistingOption>> checkWellDefinedAndDeduplicate(Map<String, List<ExistingOption>> allOptions) {
+		return allOptions.entrySet().stream()
+			.map((entry) -> {
+				final List<ExistingOption> existingOptions = entry.getValue();
+				final List<ExistingOption> consolidated;
 
-		assertExistingOptionsAreWellDefined(existingOptions);
+				if (existingOptions.stream().allMatch(option -> option.isSuffixOption)) {
+					consolidated = existingOptions;
+				}
+				else {
+					Optional<ExistingOption> deduped = existingOptions.stream()
+						.reduce((option1, option2) -> {
+							if (option1.equals(option2)) {
+								// we allow multiple instances of ConfigOptions with the same key if they are identical
+								return option1;
+							} else {
+								// found a ConfigOption pair with the same key that aren't equal
+								// we fail here outright as this is not a documentation-completeness problem
+								if (!option1.defaultValue.equals(option2.defaultValue)) {
+									String errorMessage = String.format(
+										"Ambiguous option %s due to distinct default values (%s (in %s) vs %s (in %s)).",
+										option1.key,
+										option1.defaultValue,
+										option1.containingClass.getSimpleName(),
+										option2.defaultValue,
+										option2.containingClass.getSimpleName());
+									throw new AssertionError(errorMessage);
+								} else {
+									String errorMessage = String.format(
+										"Ambiguous option %s due to distinct descriptions (%s vs %s).",
+										option1.key,
+										option1.containingClass.getSimpleName(),
+										option2.containingClass.getSimpleName());
+									throw new AssertionError(errorMessage);
+								}
+							}
+						});
+					consolidated = Collections.singletonList(deduped.get());
+				}
 
-		compareDocumentedAndExistingOptions(documentedOptions, existingOptions);
+				return new Tuple2<>(entry.getKey(), consolidated);
+			})
+			.collect(Collectors.toMap((t) -> t.f0, (t) -> t.f1));
 	}
 
-	private static void assertExistingOptionsAreWellDefined(Map<String, List<ExistingOption>> allOptions) {
-		allOptions.values().stream()
-			.filter(options -> !options.stream().allMatch(option -> option.isSuffixOption))
-			.forEach(options -> options.stream()
-				.reduce((option1, option2) -> {
-					if (option1.equals(option2)) {
-						// we allow multiple instances of ConfigOptions with the same key if they are identical
-						return option1;
-					} else {
-						// found a ConfigOption pair with the same key that aren't equal
-						// we fail here outright as this is not a documentation-completeness problem
-						if (!option1.defaultValue.equals(option2.defaultValue)) {
-							String errorMessage = String.format(
-								"Ambiguous option %s due to distinct default values (%s (in %s) vs %s (in %s)).",
-								option1.key,
-								option1.defaultValue,
-								option1.containingClass.getSimpleName(),
-								option2.defaultValue,
-								option2.containingClass.getSimpleName());
-							throw new AssertionError(errorMessage);
-						} else {
-							String errorMessage = String.format(
-								"Ambiguous option %s due to distinct descriptions (%s vs %s).",
-								option1.key,
-								option1.containingClass.getSimpleName(),
-								option2.containingClass.getSimpleName());
-							throw new AssertionError(errorMessage);
-						}
-					}
-				}));
-	}
+	private static void compareDocumentedAndExistingOptions(
+			Map<String, List<DocumentedOption>> documentedOptions,
+			Map<String, List<ExistingOption>> existingOptions) {
 
-	private static void compareDocumentedAndExistingOptions(Map<String, List<DocumentedOption>> documentedOptions, Map<String, List<ExistingOption>> existingOptions) {
 		final Collection<String> problems = new ArrayList<>(0);
 
 		// first check that all existing options are properly documented
@@ -130,7 +138,7 @@ public class ConfigOptionsDocsCompletenessITCase {
 					final Iterator<DocumentedOption> candidates = documentedState.iterator();
 
 					boolean matchFound = false;
-					while (candidates.hasNext() && !matchFound) {
+					while (candidates.hasNext()) {
 						DocumentedOption candidate = candidates.next();
 						if (supposedState.defaultValue.equals(candidate.defaultValue) && supposedState.description.equals(candidate.description)) {
 							matchFound = true;
@@ -138,6 +146,10 @@ public class ConfigOptionsDocsCompletenessITCase {
 						}
 					}
 
+					if (documentedState.isEmpty()) {
+						documentedOptions.remove(key);
+					}
+
 					if (!matchFound) {
 						problems.add(String.format(
 							"Documentation of %s in %s is outdated. Expected: default=(%s) description=(%s).",
@@ -169,15 +181,22 @@ public class ConfigOptionsDocsCompletenessITCase {
 	}
 
 	private static Map<String, List<DocumentedOption>> parseDocumentedCommonOptions() throws IOException {
-		Path commonSection = Paths.get(System.getProperty("rootDir"), "docs", "_includes", "generated", COMMON_SECTION_FILE_NAME);
+		final String rootDir = ConfigOptionsDocGeneratorTest.getProjectRootDir();
+
+		Path commonSection = Paths.get(rootDir, "docs", "_includes", "generated", COMMON_SECTION_FILE_NAME);
 		return parseDocumentedOptionsFromFile(commonSection).stream()
 			.collect(Collectors.groupingBy(option -> option.key, Collectors.toList()));
 	}
 
 	private static Map<String, List<DocumentedOption>> parseDocumentedOptions() throws IOException {
-		Path includeFolder = Paths.get(System.getProperty("rootDir"), "docs", "_includes", "generated").toAbsolutePath();
+		final String rootDir = ConfigOptionsDocGeneratorTest.getProjectRootDir();
+
+		Path includeFolder = Paths.get(rootDir, "docs", "_includes", "generated").toAbsolutePath();
 		return Files.list(includeFolder)
-			.filter(path -> path.getFileName().toString().contains("configuration"))
+			.filter((path) -> {
+				final String filename = path.getFileName().toString();
+				return filename.endsWith("configuration.html") || filename.endsWith("_section.html");
+			})
 			.flatMap(file -> {
 				try {
 					return parseDocumentedOptionsFromFile(file).stream();
@@ -216,10 +235,11 @@ public class ConfigOptionsDocsCompletenessITCase {
 	}
 
 	private static Map<String, List<ExistingOption>> findExistingOptions(Predicate<ConfigOptionsDocGenerator.OptionWithMetaInfo> predicate) throws IOException, ClassNotFoundException {
+		final String rootDir = ConfigOptionsDocGeneratorTest.getProjectRootDir();
 		final Collection<ExistingOption> existingOptions = new ArrayList<>();
 
 		for (OptionsClassLocation location : LOCATIONS) {
-			processConfigOptions(System.getProperty("rootDir"), location.getModule(), location.getPackage(), DEFAULT_PATH_PREFIX, optionsClass ->
+			processConfigOptions(rootDir, location.getModule(), location.getPackage(), DEFAULT_PATH_PREFIX, optionsClass ->
 				extractConfigOptions(optionsClass)
 					.stream()
 					.filter(predicate)
diff --git a/flink-docs/src/test/java/org/apache/flink/docs/configuration/data/TestCommonOptions.java b/flink-docs/src/test/java/org/apache/flink/docs/configuration/data/TestCommonOptions.java
index 200b84d..b5aef9e 100644
--- a/flink-docs/src/test/java/org/apache/flink/docs/configuration/data/TestCommonOptions.java
+++ b/flink-docs/src/test/java/org/apache/flink/docs/configuration/data/TestCommonOptions.java
@@ -28,7 +28,10 @@ import org.apache.flink.configuration.ConfigOptions;
 @SuppressWarnings("unused") // this class is only accessed reflectively
 public class TestCommonOptions {
 
-	@Documentation.SectionOption(sections = {Documentation.SectionOption.SECTION_COMMON})
+	public static final String SECTION_1 = "test_A";
+	public static final String SECTION_2 = "other";
+
+	@Documentation.Section({SECTION_1, SECTION_2})
 	public static final ConfigOption<Integer> COMMON_OPTION = ConfigOptions
 		.key("first.option.a")
 		.intType()
@@ -41,9 +44,7 @@ public class TestCommonOptions {
 		.noDefaultValue()
 		.withDescription("This is the description for the generic option.");
 
-	@Documentation.SectionOption(
-		sections = {Documentation.SectionOption.SECTION_COMMON},
-		position = 2)
+	@Documentation.Section(value = SECTION_1, position = 2)
 	public static final ConfigOption<Integer> COMMON_POSITIONED_OPTION = ConfigOptions
 		.key("third.option.a")
 		.intType()
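
With the rename in place, the single-section case needs no member name, because value() is the
annotation's default member. A minimal sketch of the new usage (the class, options, and keys below
are hypothetical, for illustration only; the section name "test_A" is taken from the test data above):

    import org.apache.flink.annotation.docs.Documentation;
    import org.apache.flink.configuration.ConfigOption;
    import org.apache.flink.configuration.ConfigOptions;

    public class ExampleSectionedOptions {
        // short form: section names go into the implicit value() member
        @Documentation.Section("test_A")
        public static final ConfigOption<String> SKETCH_OPTION = ConfigOptions
            .key("sketch.option")     // hypothetical key
            .stringType()
            .noDefaultValue()
            .withDescription("Included in the generated section file for 'test_A'.");

        // long form, needed only when a position is also specified
        @Documentation.Section(value = "test_A", position = 2)
        public static final ConfigOption<Integer> POSITIONED_SKETCH_OPTION = ConfigOptions
            .key("sketch.positioned.option")   // hypothetical key
            .intType()
            .defaultValue(1)
            .withDescription("Sorted into the table by its position value.");
    }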


[flink] 10/10: [FLINK-14495][docs] Synchronize the Chinese document for state backend with the English version

Posted by se...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

sewen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/flink.git

commit 436cd39ed02d4ba651d8173668dfa0075e7c5392
Author: Yu Li <li...@apache.org>
AuthorDate: Sat Feb 1 21:18:24 2020 +0800

    [FLINK-14495][docs] Synchronize the Chinese document for state backend with the English version
---
 docs/ops/state/state_backends.zh.md | 140 ++++++++++++++++++++++++++++++++++--
 1 file changed, 135 insertions(+), 5 deletions(-)

diff --git a/docs/ops/state/state_backends.zh.md b/docs/ops/state/state_backends.zh.md
index eeaf247..60718c2 100644
--- a/docs/ops/state/state_backends.zh.md
+++ b/docs/ops/state/state_backends.zh.md
@@ -36,7 +36,7 @@ under the License.
 * ToC
 {:toc}
 
-## Available State Backends
+# Available State Backends
 
 Flink ships with the following state backends out of the box:
 
@@ -175,18 +175,148 @@ env.setStateBackend(new FsStateBackend("hdfs://namenode:40010/flink/checkpoints"
 An excerpt of the configuration file is shown below:
 
 {% highlight yaml %}
-# The backend that will be used to store operator state checkpoints
+# The state backend used to store operator state snapshots
 
 state.backend: filesystem
 
 
-# Directory for storing checkpoints
+# Directory for storing snapshots
 
 state.checkpoints.dir: hdfs://namenode:40010/flink/checkpoints
 {% endhighlight %}
 
-#### RocksDB State Backend Config Options
+# Advanced RocksDB State Backend
 
-{% include generated/rocks_db_configuration.html %}
+*This section describes the RocksDBStateBackend in more detail.*
+
+### Incremental Checkpoints
+
+The RocksDBStateBackend supports *incremental checkpoints*. Instead of producing a full, self-contained backup of all data, an incremental checkpoint only records the changes since the previous completed checkpoint, which can dramatically reduce the time to complete a checkpoint.
+
+An incremental checkpoint builds upon (typically multiple) previous checkpoints. Because RocksDB's internal compaction mechanism merges the sst files, Flink's incremental checkpoints also periodically rebase, so the incremental chain does not grow indefinitely, and the files contained in old checkpoints gradually expire and are cleaned up automatically.
+
+Compared to recovery from a full checkpoint, recovery from an incremental checkpoint may take longer when network bandwidth is the bottleneck, because the sst files contained in an incremental checkpoint can overlap in data, which increases the amount that has to be downloaded. When CPU or IO is the bottleneck, recovery from an incremental checkpoint is faster, because it loads the sst files directly instead of parsing Flink's unified checkpoint format to rebuild the local RocksDB tables.
+
+While we recommend incremental checkpoints for large state, this is not the default checkpointing mechanism; you need to enable it manually:
+  - set `state.backend.incremental: true` in `flink-conf.yaml`, or
+  - configure it in code (overriding the default): `RocksDBStateBackend backend = new RocksDBStateBackend(filebackend, true);`
+
+### Memory Management
+
+Flink aims to control the memory consumption of the whole process, so that Flink TaskManagers have a well-behaved memory footprint: they should neither be killed in containerized environments (Docker/Kubernetes, Yarn, etc.) for exceeding their memory budget, nor under-utilize memory and lose performance through unnecessary spilling to disk or degraded cache hit rates.
+
+To achieve this, Flink by default configures RocksDB's usable memory to the TaskManager's per-slot managed memory budget. This should give most applications a good out-of-the-box experience: most applications do not need to tune the RocksDB configuration, and memory-related performance problems can be addressed by simply increasing Flink's managed memory.
+
+Alternatively, you can opt out of Flink's built-in memory management and allocate memory manually for each RocksDB ColumnFamily (each state of each operator corresponds to one ColumnFamily). This gives expert users a way to control RocksDB at a finer granularity, but it also means users must themselves ensure that the total memory consumption does not exceed the limits of the (especially containerized) environment. See [large state tuning]({{ site.baseurl }}/ops/state/large_state_tuning.html#tuning-rocksdb-memory) for some guidelines on performance tuning for large state.
+
+**Managed Memory for RocksDB**
+
+This feature is enabled by default and can be controlled via the `state.backend.rocksdb.memory.managed` configuration option.
+
+Flink does not directly control RocksDB's native memory allocations; instead, it configures RocksDB such that its memory usage matches exactly Flink's managed memory budget. This is done on a per-slot level (managed memory is accounted for per slot).
+
+To bound the total memory usage of the RocksDB instances, Flink uses a shared [cache](https://github.com/facebook/RocksDB/wiki/Block-cache) and [write buffer manager](https://github.com/facebook/rocksdb/wiki/write-buffer-manager) across all RocksDB instances in the same slot.
+The shared cache places an upper limit on the [three main sources](https://github.com/facebook/rocksdb/wiki/Memory-usage-in-rocksdb) of memory consumption in RocksDB (block cache, indexes and bloom filters, MemTables).
+
+Flink also provides two parameters to control the memory split between the *write path* (MemTables) and the *read path* (indexes and filters, read cache). When you see RocksDB performing badly due to lack of write buffer memory (frequent flushes) or cache misses on reads, these parameters can be used to redistribute memory between reads and writes.
+
+  - `state.backend.rocksdb.memory.write-buffer-ratio`, default `0.5`: 50% of the given memory is used by the write buffers.
+  - `state.backend.rocksdb.memory.high-prio-pool-ratio`, default `0.1`: 10% of the block cache memory is reserved with high priority for indexes and filters.
+  We strongly recommend not setting this value to zero, to prevent indexes and filters from being constantly evicted from the cache and causing performance issues. In addition, the L0-level filters and indexes are pinned in the cache by default to improve performance; see the [RocksDB documentation](https://github.com/facebook/rocksdb/wiki/Block-Cache#caching-index-filter-and-compression-dictionary-blocks) for more details.
+
+<span class="label label-info">Note</span> When this mechanism is enabled, it overrides any block cache and write buffer settings that users configure via [`PredefinedOptions`](#predefined-per-columnfamily-options) and [`OptionsFactory`](#passing-options-factory-to-rocksdb).
+
+<span class="label label-info">Note</span> *Expert users only*: To control memory manually, you can set `state.backend.rocksdb.memory.managed` to `false` and configure RocksDB via [`ColumnFamilyOptions`](#passing-options-factory-to-rocksdb).
+Alternatively, you can reuse the cache/write-buffer-manager mechanism described above, but set the memory size to a fixed amount that is independent of Flink's managed memory size (via the `state.backend.rocksdb.memory.fixed-per-slot` option).
+Note that in both cases, users need to ensure that enough memory is available to RocksDB outside the JVM.
+
+### Timers (Heap vs. RocksDB)
+
+Timers are used to schedule actions for later (in event time or processing time), such as firing a window or calling back a `ProcessFunction`.
+
+When the RocksDBStateBackend is selected, timers are also stored in RocksDB by default. This is a robust and scalable approach that allows applications to use very large numbers of timers. On the other hand, maintaining timers in RocksDB comes at a certain cost, which is why Flink also offers the option of storing timers on the JVM heap while using RocksDB for the remaining state. Heap-based timers can perform better when the number of timers is small.
+
+You can store timers on the heap by setting the `state.backend.rocksdb.timer-service.factory` configuration option to `heap` (instead of the default `rocksdb`).
+
+<span class="label label-info">Note</span> *The combination of the RocksDBStateBackend with heap-based timers currently does NOT support asynchronous snapshots of the timer state. Other state (such as keyed state) can still be snapshotted asynchronously.*
+
+### Enabling RocksDB Native Metrics
+
+You can optionally report RocksDB's native metrics through Flink's metrics system, and selectively specify which metrics to report.
+Please refer to the [configuration docs]({{ site.baseurl }}/ops/config.html#rocksdb-native-metrics) for more details.
+
+<div class="alert alert-warning">
+  <strong>Note:</strong> Enabling RocksDB's native metrics may have a negative impact on the performance of your application.
+</div>
+
+### Predefined Per-ColumnFamily Options
+
+<span class="label label-info">Note</span> With the introduction of [managed memory for RocksDB](#memory-management), this mechanism should be limited to *expert tuning* or *trouble shooting*.
+
+With *predefined options*, users can apply some predefined configurations on each RocksDB ColumnFamily, for example regarding memory use, threads, compaction settings, etc. Currently, each state of each operator is stored in a dedicated ColumnFamily in RocksDB.
+
+There are two ways to select the predefined options to apply:
+  - Set the option name in `flink-conf.yaml` via the `state.backend.rocksdb.predefined-options` configuration key.
+  - Set it programmatically: `RocksDBStateBackend.setPredefinedOptions(PredefinedOptions.SPINNING_DISK_OPTIMIZED_HIGH_MEM)`.
+
+The default value of this option is `DEFAULT`, which corresponds to `PredefinedOptions.DEFAULT`.
+
+### Passing Options Factory to RocksDB
+
+<span class="label label-info">Note</span> With the introduction of [managed memory for RocksDB](#memory-management), this mechanism should be limited to *expert tuning* or *trouble shooting*.
+
+You can also manually control RocksDB's options by configuring a `RocksDBOptionsFactory`. This mechanism gives you fine-grained control over the settings of the ColumnFamilies, for example regarding memory use, threads, compaction settings, etc. Currently, each state of each operator is stored in a dedicated ColumnFamily in RocksDB.
+
+There are two ways to pass a `RocksDBOptionsFactory` to the RocksDBStateBackend:
+
+  - Set the name of the factory implementation class in `flink-conf.yaml` via the `state.backend.rocksdb.options-factory` option.
+
+  - Set it programmatically, e.g. `RocksDBStateBackend.setOptions(new MyOptionsFactory());`.
+
+<span class="label label-info">Note</span> A `RocksDBOptionsFactory` that is set programmatically overrides the settings from the `flink-conf.yaml` configuration file, and the settings of the `RocksDBOptionsFactory` take precedence over the predefined options (`PredefinedOptions`).
+
+<span class="label label-info">Note</span> RocksDB is a native library that allocates memory directly from the process,
+not from the JVM. Any memory you assign to RocksDB has to be accounted for, typically by decreasing the JVM heap size
+of the TaskManager by the same amount. Not doing so may result in the JVM process being killed by resource management frameworks like YARN/Mesos for allocating more memory than requested.
+
+**Reading ColumnFamily Options from flink-conf.yaml**
+
+A `RocksDBOptionsFactory` that implements the `ConfigurableRocksDBOptionsFactory` interface can read its settings directly from the configuration file (`flink-conf.yaml`).
+
+The default value of `state.backend.rocksdb.options-factory` is `org.apache.flink.contrib.streaming.state.DefaultConfigurableOptionsFactory`, which by default picks up all of the options [defined here]({{ site.baseurl }}/ops/config.html#advanced-rocksdb-state-backends-options).
+Hence, you can configure the low-level ColumnFamily options simply by turning off managed memory for RocksDB and adding the desired settings to the configuration file.
+
+Below is an example of a custom `ConfigurableRocksDBOptionsFactory` (once implemented, set the fully-qualified name of your class via `state.backend.rocksdb.options-factory`).
+
+    {% highlight java %}
+
+    public class MyOptionsFactory implements ConfigurableRocksDBOptionsFactory {
+
+        private static final long DEFAULT_SIZE = 256 * 1024 * 1024;  // 256 MB
+        private long blockCacheSize = DEFAULT_SIZE;
+
+        @Override
+        public DBOptions createDBOptions(DBOptions currentOptions, Collection<AutoCloseable> handlesToClose) {
+            return currentOptions.setIncreaseParallelism(4)
+                   .setUseFsync(false);
+        }
+
+        @Override
+        public ColumnFamilyOptions createColumnOptions(
+            ColumnFamilyOptions currentOptions, Collection<AutoCloseable> handlesToClose) {
+            return currentOptions.setTableFormatConfig(
+                new BlockBasedTableConfig()
+                    .setBlockCacheSize(blockCacheSize)
+                    .setBlockSize(128 * 1024));            // 128 KB
+        }
+
+        @Override
+        public OptionsFactory configure(Configuration configuration) {
+            this.blockCacheSize =
+                configuration.getLong("my.custom.rocksdb.block.cache.size", DEFAULT_SIZE);
+            return this;
+        }
+    }
+    {% endhighlight %}
 
 {% top %}
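
The memory and timer options described above can also be tried out programmatically. A minimal
sketch, assuming these keys are set on a cluster Configuration exactly as they would appear in
flink-conf.yaml (the values shown are the documented defaults, except for the heap timer factory):

    import org.apache.flink.configuration.Configuration;

    public class RocksDBTuningSketch {
        public static void main(String[] args) {
            Configuration conf = new Configuration();
            // managed memory for RocksDB is enabled by default
            conf.setBoolean("state.backend.rocksdb.memory.managed", true);
            // split between the write path (MemTables) and the read path
            conf.setDouble("state.backend.rocksdb.memory.write-buffer-ratio", 0.5);
            conf.setDouble("state.backend.rocksdb.memory.high-prio-pool-ratio", 0.1);
            // store timers on the JVM heap instead of the default 'rocksdb'
            conf.setString("state.backend.rocksdb.timer-service.factory", "heap");
            System.out.println(conf);
        }
    }

In practice these keys belong in flink-conf.yaml; setting them on a Configuration object as above
only takes effect where that configuration is actually used to bootstrap the cluster.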


[flink] 09/10: [FLINK-14495][docs] Reorganize sections of RocksDB documentation

Posted by se...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

sewen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/flink.git

commit 9afcf31fdccae6f7df31e59cf314852cfb38afd6
Author: Stephan Ewen <se...@apache.org>
AuthorDate: Fri Jan 31 23:28:04 2020 +0100

    [FLINK-14495][docs] Reorganize sections of RocksDB documentation
    
    This moves most of the in-depth RocksDB content to the State Backends docs.
    The "tuning large state" docs now cover only actual tuning and troubleshooting.
---
 docs/ops/state/large_state_tuning.md    | 152 +++++++-----------------------
 docs/ops/state/large_state_tuning.zh.md | 160 +++++++-------------------------
 docs/ops/state/state_backends.md        | 145 ++++++++++++++++++++++++++++-
 3 files changed, 207 insertions(+), 250 deletions(-)

diff --git a/docs/ops/state/large_state_tuning.md b/docs/ops/state/large_state_tuning.md
index e76a835..ba98338 100644
--- a/docs/ops/state/large_state_tuning.md
+++ b/docs/ops/state/large_state_tuning.md
@@ -118,160 +118,70 @@ Other state like keyed state is still snapshotted asynchronously. Please note th
 The state storage workhorse of many large scale Flink streaming applications is the *RocksDB State Backend*.
 The backend scales well beyond main memory and reliably stores large [keyed state](../../dev/stream/state/state.html).
 
-Unfortunately, RocksDB's performance can vary with configuration, and there is little documentation on how to tune
-RocksDB properly. For example, the default configuration is tailored towards SSDs and performs suboptimal
-on spinning disks.
+RocksDB's performance can vary with configuration; this section outlines some best practices for tuning jobs that use the RocksDB State Backend.
 
 ### Incremental Checkpoints
 
-Incremental checkpoints can dramatically reduce the checkpointing time in comparison to full checkpoints, at the cost of a (potentially) longer
-recovery time. The core idea is that incremental checkpoints only record all changes to the previous completed checkpoint, instead of
-producing a full, self-contained backup of the state backend. Like this, incremental checkpoints build upon previous checkpoints. Flink leverages
-RocksDB's internal backup mechanism in a way that is self-consolidating over time. As a result, the incremental checkpoint history in Flink
-does not grow indefinitely, and old checkpoints are eventually subsumed and pruned automatically.
+When it comes to reducing the time that checkpoints take, activating incremental checkpoints should be one of the first considerations.
+Incremental checkpoints can dramatically reduce the checkpointing time in comparison to full checkpoints, because incremental checkpoints only record the changes compared to the previous completed checkpoint, instead of producing a full, self-contained backup of the state backend.
 
-While we strongly encourage the use of incremental checkpoints for large state, please note that this is a new feature and currently not enabled 
-by default. To enable this feature, users can instantiate a `RocksDBStateBackend` with the corresponding boolean flag in the constructor set to `true`, e.g.:
+See [Incremental Checkpoints in RocksDB]({{ site.baseurl }}/ops/state/state_backends.html#incremental-checkpoints) for more background information.
 
-{% highlight java %}
-    RocksDBStateBackend backend =
-        new RocksDBStateBackend(filebackend, true);
-{% endhighlight %}
+### Timers in RocksDB or on JVM Heap
 
-### RocksDB Timers
+Timers are stored in RocksDB by default, which is the more robust and scalable choice.
 
-For RocksDB, a user can chose whether timers are stored on the heap or inside RocksDB (default). Heap-based timers can have a better performance for smaller numbers of
-timers, while storing timers inside RocksDB offers higher scalability as the number of timers in RocksDB can exceed the available main memory (spilling to disk).
+When performance-tuning jobs that use only a few timers (no windows, no timers in a ProcessFunction), putting those timers on the heap can increase performance.
+Use this feature carefully, as heap-based timers may increase checkpointing times and naturally cannot scale beyond memory.
 
-When using RockDB as state backend, the type of timer storage can be selected through Flink's configuration via option key `state.backend.rocksdb.timer-service.factory`.
-Possible choices are `heap` (to store timers on the heap, default) and `rocksdb` (to store timers in RocksDB).
+See [this section]({{ site.baseurl }}/ops/state/state_backends.html#timers-heap-vs-rocksdb) for details on how to configure heap-based timers.
 
-<span class="label label-info">Note</span> *The combination RocksDB state backend with heap-based timers currently does NOT support asynchronous snapshots for the timers state.
-Other state like keyed state is still snapshotted asynchronously. Please note that this is not a regression from previous versions and will be resolved with `FLINK-10026`.*
+### Tuning RocksDB Memory
 
-### Predefined Options
+The performance of the RocksDB State Backend depends heavily on the amount of memory it has available. Adding memory can help performance a lot, as can adjusting which functions the memory goes to.
 
-Flink provides some predefined collections of option for RocksDB for different settings, and there existed two ways
-to pass these predefined options to RocksDB:
-  - Configure the predefined options through `flink-conf.yaml` via option key `state.backend.rocksdb.predefined-options`.
-    The default value of this option is `DEFAULT` which means `PredefinedOptions.DEFAULT`.
-  - Set the predefined options programmatically, e.g. `RocksDBStateBackend.setPredefinedOptions(PredefinedOptions.SPINNING_DISK_OPTIMIZED_HIGH_MEM)`.
+By default, the RocksDB State Backend uses Flink's managed memory budget for RocksDB's buffers and caches (`state.backend.rocksdb.memory.managed: true`). Please refer to the [RocksDB Memory Management]({{ site.baseurl }}/ops/state/state_backends.html#memory-management) section for background on how that mechanism works.
 
-We expect to accumulate more such profiles over time. Feel free to contribute such predefined option profiles when you
-found a set of options that work well and seem representative for certain workloads.
+To tune memory-related performance issues, the following steps may be helpful:
 
-<span class="label label-info">Note</span> Predefined options which set programmatically would override the one configured via `flink-conf.yaml`.
+  - The first step to try and increase performance should be to increase the amount of managed memory. This usually improves the situation a lot, without opening up the complexity of tuning low-level RocksDB options.
+    
+    Especially with large container/process sizes, much of the total memory can typically go to RocksDB, unless the application logic requires a lot of JVM heap itself. The default managed memory fraction *(0.4)* is conservative and can often be increased when using TaskManagers with multi-GB process sizes.
 
-### Passing Options Factory to RocksDB
+  - The number of write buffers in RocksDB depends on the number of states you have in your application (states across all operators in the pipeline). Each state corresponds to one ColumnFamily, which needs its own write buffers. Hence, applications with many states typically need more memory for the same performance.
 
-There existed two ways to pass options factory to RocksDB in Flink:
+  - You can try and compare the performance of RocksDB with managed memory to RocksDB with per-column-family memory by setting `state.backend.rocksdb.memory.managed: false`. Especially to test against a baseline (assuming no or generous container memory limits) or to test for regressions compared to earlier versions of Flink, this can be useful.
+  
+    Compared to the managed memory setup (constant memory pool), not using managed memory means that RocksDB allocates memory proportional to the number of states in the application (memory footprint changes with application changes). As a rule of thumb, the non-managed mode has (unless ColumnFamily options are applied) an upper bound of roughly "140MB * num-states-across-all-tasks * num-slots". Timers count as state as well!
 
-  - Configure options factory through `flink-conf.yaml`. You could set the options factory class name via option key `state.backend.rocksdb.options-factory`.
-    The default value for this option is `org.apache.flink.contrib.streaming.state.DefaultConfigurableOptionsFactory`, and all candidate configurable options are defined in `RocksDBConfigurableOptions`.
-    Moreover, you could also define your customized and configurable options factory class like below and pass the class name to `state.backend.rocksdb.options-factory`.
+  - If your application has many states and you see frequent MemTable flushes (write-side bottleneck), but you cannot give more memory you can increase the ratio of memory going to the write buffers (`state.backend.rocksdb.memory.write-buffer-ratio`). See [RocksDB Memory Management]({{ site.baseurl }}/ops/state/state_backends.html#memory-management) for details.
 
-    {% highlight java %}
+  - An advanced option (*expert mode*) to reduce the number of MemTable flushes in setups with many states, is to tune RocksDB's ColumnFamily options (arena block size, max background flush threads, etc.) via a `RocksDBOptionsFactory`:
 
+    {% highlight java %}
     public class MyOptionsFactory implements ConfigurableRocksDBOptionsFactory {
-
-        private static final long DEFAULT_SIZE = 256 * 1024 * 1024;  // 256 MB
-        private long blockCacheSize = DEFAULT_SIZE;
-
+    
         @Override
         public DBOptions createDBOptions(DBOptions currentOptions, Collection<AutoCloseable> handlesToClose) {
-            return currentOptions.setIncreaseParallelism(4)
-                   .setUseFsync(false);
+            // increase the max background flush threads when we have many states in one operator,
+            // which means we would have many column families in one DB instance.
+            return currentOptions.setMaxBackgroundFlushes(4);
         }
-
+    
         @Override
         public ColumnFamilyOptions createColumnOptions(
             ColumnFamilyOptions currentOptions, Collection<AutoCloseable> handlesToClose) {
-            return currentOptions.setTableFormatConfig(
-                new BlockBasedTableConfig()
-                    .setBlockCacheSize(blockCacheSize)
-                    .setBlockSize(128 * 1024));            // 128 KB
+            // decrease the arena block size from default 8MB to 1MB. 
+            return currentOptions.setArenaBlockSize(1024 * 1024);
         }
-
+    
         @Override
         public OptionsFactory configure(Configuration configuration) {
-            this.blockCacheSize =
-                configuration.getLong("my.custom.rocksdb.block.cache.size", DEFAULT_SIZE);
             return this;
         }
     }
     {% endhighlight %}
 
-  - Set the options factory programmatically, e.g. `RocksDBStateBackend.setOptions(new MyOptionsFactory());`
-
-<span class="label label-info">Note</span> Options factory which set programmatically would override the one configured via `flink-conf.yaml`,
-and options factory has a higher priority over the predefined options if ever configured or set.
-
-<span class="label label-info">Note</span> RocksDB is a native library that allocates memory directly from the process,
-and not from the JVM. Any memory you assign to RocksDB will have to be accounted for, typically by decreasing the JVM heap size
-of the TaskManagers by the same amount. Not doing that may result in YARN/Mesos/etc terminating the JVM processes for
-allocating more memory than configured.
-
-### Bounding RocksDB Memory Usage
-
-RocksDB allocates native memory outside of the JVM, which could lead the process to exceed the total memory budget.
-This can be especially problematic in containerized environments such as Kubernetes that kill processes who exceed their memory budgets.
-
-Flink limit total memory usage of RocksDB instance(s) per slot by leveraging shareable [cache](https://github.com/facebook/rocksdb/wiki/Block-Cache)
-and [write buffer manager](https://github.com/facebook/rocksdb/wiki/Write-Buffer-Manager) among all instances in a single slot by default.
-The shared cache will place an upper limit on the [three components](https://github.com/facebook/rocksdb/wiki/Memory-usage-in-RocksDB) that use the majority of memory
-when RocksDB is deployed as a state backend: block cache, index and bloom filters, and MemTables. 
-
-This feature is enabled by default along with managed memory. Flink will use the managed memory budget as the per-slot memory limit for RocksDB state backend(s).
-
-Flink also provides two parameters to tune the memory fraction of MemTable and index & filters along with the bounding RocksDB memory usage feature:
-  - `state.backend.rocksdb.memory.write-buffer-ratio`, by default `0.5`, which means 50% of the given memory would be used by write buffer manager.
-  - `state.backend.rocksdb.memory.high-prio-pool-ratio`, by default `0.1`, which means 10% of the given memory would be set as high priority for index and filters in shared block cache.
-  We strongly suggest not to set this to zero, to prevent index and filters from competing against data blocks for staying in cache and causing performance issues.
-  Moreover, the L0 level filter and index are pinned into the cache by default to mitigate performance problems,
-  more details please refer to the [RocksDB-documentation](https://github.com/facebook/rocksdb/wiki/Block-Cache#caching-index-filter-and-compression-dictionary-blocks).
-
-<span class="label label-info">Note</span> When bounded RocksDB memory usage is enabled by default,
-the shared `cache` and `write buffer manager` will override customized settings of block cache and write buffer via `PredefinedOptions` and `OptionsFactory`.
-
-*Experts only*: To control memory manually instead of using managed memory, user can set `state.backend.rocksdb.memory.managed` as `false` and control via `ColumnFamilyOptions`.
-Or to save some manual calculation, through the `state.backend.rocksdb.memory.fixed-per-slot` option which will override `state.backend.rocksdb.memory.managed` when configured.
-With the later method, please tune down `taskmanager.memory.managed.size` or `taskmanager.memory.managed.fraction` to `0` 
-and increase `taskmanager.memory.task.off-heap.size` by "`taskmanager.numberOfTaskSlots` * `state.backend.rocksdb.memory.fixed-per-slot`" accordingly.
-
-#### Tune performance when bounding RocksDB memory usage.
-
-There might existed performance regression compared with previous no-memory-limit case if you have too many states per slot.
-- If you observed this behavior and not running jobs in containerized environment or don't care about the over-limit memory usage.
-The easiest way to wipe out the performance regression is to disable memory bound for RocksDB, e.g. turn `state.backend.rocksdb.memory.managed` as `false`.
-Moreover, please refer to [memory configuration migration guide](WIP) to know how to keep backward compatibility to previous memory configuration.
-- Otherwise you need to increase the upper memory for RocksDB by tuning up `taskmanager.memory.managed.size` or `taskmanager.memory.managed.fraction`, or increasing the total memory for task manager.
-
-*Experts only*: Apart from increasing total memory, user could also tune RocksDB options (e.g. arena block size, max background flush threads, etc.) via `RocksDBOptionsFactory`:
-
-{% highlight java %}
-public class MyOptionsFactory implements ConfigurableRocksDBOptionsFactory {
-
-    @Override
-    public DBOptions createDBOptions(DBOptions currentOptions, Collection<AutoCloseable> handlesToClose) {
-        // increase the max background flush threads when we have many states in one operator,
-        // which means we would have many column families in one DB instance.
-        return currentOptions.setMaxBackgroundFlushes(4);
-    }
-
-    @Override
-    public ColumnFamilyOptions createColumnOptions(
-        ColumnFamilyOptions currentOptions, Collection<AutoCloseable> handlesToClose) {
-        // decrease the arena block size from default 8MB to 1MB. 
-        return currentOptions.setArenaBlockSize(1024 * 1024);
-    }
-
-    @Override
-    public OptionsFactory configure(Configuration configuration) {
-        return this;
-    }
-}
-{% endhighlight %}
-
 ## Capacity Planning
 
 This section discusses how to decide how many resources should be used for a Flink job to run reliably.
diff --git a/docs/ops/state/large_state_tuning.zh.md b/docs/ops/state/large_state_tuning.zh.md
index f3a7964..a78dc5a 100644
--- a/docs/ops/state/large_state_tuning.zh.md
+++ b/docs/ops/state/large_state_tuning.zh.md
@@ -49,7 +49,7 @@ The two numbers that are of particular interest when scaling up checkpoints are:
 
   - The time until operators start their checkpoint: This time is currently not exposed directly, but corresponds
     to:
-
+    
     `checkpoint_start_delay = end_to_end_duration - synchronous_duration - asynchronous_duration`
 
     When the time to trigger the checkpoint is constantly very high, it means that the *checkpoint barriers* need a long
@@ -118,160 +118,70 @@ Other state like keyed state is still snapshotted asynchronously. Please note th
 The state storage workhorse of many large scale Flink streaming applications is the *RocksDB State Backend*.
 The backend scales well beyond main memory and reliably stores large [keyed state](../../dev/stream/state/state.html).
 
-Unfortunately, RocksDB's performance can vary with configuration, and there is little documentation on how to tune
-RocksDB properly. For example, the default configuration is tailored towards SSDs and performs suboptimal
-on spinning disks.
+RocksDB's performance can vary with its configuration. This section outlines some best practices for tuning jobs that use the RocksDB State Backend.
 
 ### Incremental Checkpoints
 
-Incremental checkpoints can dramatically reduce the checkpointing time in comparison to full checkpoints, at the cost of a (potentially) longer
-recovery time. The core idea is that incremental checkpoints only record all changes to the previous completed checkpoint, instead of
-producing a full, self-contained backup of the state backend. Like this, incremental checkpoints build upon previous checkpoints. Flink leverages
-RocksDB's internal backup mechanism in a way that is self-consolidating over time. As a result, the incremental checkpoint history in Flink
-does not grow indefinitely, and old checkpoints are eventually subsumed and pruned automatically.
+When it comes to reducing the time that checkpoints take, activating incremental checkpoints should be one of the first considerations.
+Incremental checkpoints can dramatically reduce the checkpointing time in comparison to full checkpoints, because incremental checkpoints only record the changes compared to the previous completed checkpoint, instead of producing a full, self-contained backup of the state backend.
 
-While we strongly encourage the use of incremental checkpoints for large state, please note that this is a new feature and currently not enabled
-by default. To enable this feature, users can instantiate a `RocksDBStateBackend` with the corresponding boolean flag in the constructor set to `true`, e.g.:
+See [Incremental Checkpoints in RocksDB]({{ site.baseurl }}/ops/state/state_backends.html#incremental-checkpoints) for more background information.
 
-{% highlight java %}
-    RocksDBStateBackend backend =
-        new RocksDBStateBackend(filebackend, true);
-{% endhighlight %}
+### Timers in RocksDB or on JVM Heap
 
-### RocksDB Timers
+Timers are stored in RocksDB by default, which is the more robust and scalable choice.
 
-For RocksDB, a user can chose whether timers are stored on the heap or inside RocksDB (default). Heap-based timers can have a better performance for smaller numbers of
-timers, while storing timers inside RocksDB offers higher scalability as the number of timers in RocksDB can exceed the available main memory (spilling to disk).
+When performance-tuning jobs that have only a few timers (no windows, no timers used in a ProcessFunction), putting those timers on the heap can increase performance.
+Use this feature carefully, as heap-based timers may increase checkpointing times and naturally cannot scale beyond the available memory.
 
-When using RockDB as state backend, the type of timer storage can be selected through Flink's configuration via option key `state.backend.rocksdb.timer-service.factory`.
-Possible choices are `heap` (to store timers on the heap, default) and `rocksdb` (to store timers in RocksDB).
+See [this section]({{ site.baseurl }}/ops/state/state_backends.html#timers-heap-vs-rocksdb) for details on how to configure heap-based timers.
 
-<span class="label label-info">Note</span> *The combination RocksDB state backend with heap-based timers currently does NOT support asynchronous snapshots for the timers state.
-Other state like keyed state is still snapshotted asynchronously. Please note that this is not a regression from previous versions and will be resolved with `FLINK-10026`.*
+### Tuning RocksDB Memory
 
-### Predefined Options
+The performance of the RocksDB State Backend depends to a large extent on the amount of memory that it has available. To increase performance, adding memory can help a lot, as can adjusting which functions the memory goes to.
 
-Flink provides some predefined collections of option for RocksDB for different settings, and there existed two ways
-to pass these predefined options to RocksDB:
-  - Configure the predefined options through `flink-conf.yaml` via option key `state.backend.rocksdb.predefined-options`.
-    The default value of this option is `DEFAULT` which means `PredefinedOptions.DEFAULT`.
-  - Set the predefined options programmatically, e.g. `RocksDBStateBackend.setPredefinedOptions(PredefinedOptions.SPINNING_DISK_OPTIMIZED_HIGH_MEM)`.
+By default, the RocksDB State Backend uses Flink's managed memory budget for RocksDB's buffers and caches (`state.backend.rocksdb.memory.managed: true`). Please refer to the [RocksDB Memory Management]({{ site.baseurl }}/ops/state/state_backends.html#memory-management) section for background on how that mechanism works.
 
-We expect to accumulate more such profiles over time. Feel free to contribute such predefined option profiles when you
-found a set of options that work well and seem representative for certain workloads.
+To tune memory-related performance issues, the following steps may be helpful:
 
-<span class="label label-info">Note</span> Predefined options which set programmatically would override the one configured via `flink-conf.yaml`.
+  - The first step to try and increase performance should be to increase the amount of managed memory. This usually improves the situation a lot, without opening up the complexity of tuning low-level RocksDB options.
+    
+    Especially with large container/process sizes, much of the total memory can typically go to RocksDB, unless the application logic requires a lot of JVM heap itself. The default managed memory fraction *(0.4)* is conservative and can often be increased when using TaskManagers with multi-GB process sizes.
 
-### Passing Options Factory to RocksDB
+  - The number of write buffers in RocksDB depends on the number of states you have in your application (states across all operators in the pipeline). Each state corresponds to one ColumnFamily, which needs its own write buffers. Hence, applications with many states typically need more memory for the same performance.
 
-There existed two ways to pass options factory to RocksDB in Flink:
+  - You can compare the performance of RocksDB with managed memory to RocksDB with per-column-family memory by setting `state.backend.rocksdb.memory.managed: false`. This can be useful to test against a baseline (assuming no or generous container memory limits) or to test for regressions compared to earlier versions of Flink.
+  
+    Compared to the managed memory setup (constant memory pool), not using managed memory means that RocksDB allocates memory proportional to the number of states in the application (memory footprint changes with application changes). As a rule of thumb, the non-managed mode has (unless ColumnFamily options are applied) an upper bound of roughly "140MB * num-states-across-all-tasks * num-slots". Timers count as state as well!
 
-  - Configure options factory through `flink-conf.yaml`. You could set the options factory class name via option key `state.backend.rocksdb.options-factory`.
-    The default value for this option is `org.apache.flink.contrib.streaming.state.DefaultConfigurableOptionsFactory`, and all candidate configurable options are defined in `RocksDBConfigurableOptions`.
-    Moreover, you could also define your customized and configurable options factory class like below and pass the class name to `state.backend.rocksdb.options-factory`.
+  - If your application has many states and you see frequent MemTable flushes (write-side bottleneck), but you cannot give it more memory, you can increase the ratio of memory going to the write buffers (`state.backend.rocksdb.memory.write-buffer-ratio`). See [RocksDB Memory Management]({{ site.baseurl }}/ops/state/state_backends.html#memory-management) for details.
 
-    {% highlight java %}
+  - An advanced option (*expert mode*) to reduce the number of MemTable flushes in setups with many states is to tune RocksDB's ColumnFamily options (arena block size, max background flush threads, etc.) via a `RocksDBOptionsFactory`:
 
+    {% highlight java %}
     public class MyOptionsFactory implements ConfigurableRocksDBOptionsFactory {
-
-        private static final long DEFAULT_SIZE = 256 * 1024 * 1024;  // 256 MB
-        private long blockCacheSize = DEFAULT_SIZE;
-
+    
         @Override
         public DBOptions createDBOptions(DBOptions currentOptions, Collection<AutoCloseable> handlesToClose) {
-            return currentOptions.setIncreaseParallelism(4)
-                   .setUseFsync(false);
+            // increase the max background flush threads when we have many states in one operator,
+            // which means we would have many column families in one DB instance.
+            return currentOptions.setMaxBackgroundFlushes(4);
         }
-
+    
         @Override
         public ColumnFamilyOptions createColumnOptions(
             ColumnFamilyOptions currentOptions, Collection<AutoCloseable> handlesToClose) {
-            return currentOptions.setTableFormatConfig(
-                new BlockBasedTableConfig()
-                    .setBlockCacheSize(blockCacheSize)
-                    .setBlockSize(128 * 1024));            // 128 KB
+            // decrease the arena block size from the default 8MB to 1MB.
+            return currentOptions.setArenaBlockSize(1024 * 1024);
         }
-
+    
         @Override
         public OptionsFactory configure(Configuration configuration) {
-            this.blockCacheSize =
-                configuration.getLong("my.custom.rocksdb.block.cache.size", DEFAULT_SIZE);
             return this;
         }
     }
     {% endhighlight %}
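+
+    Such a factory can then be registered either programmatically, e.g. via `RocksDBStateBackend.setOptions(new MyOptionsFactory())`, or by setting its class name under the `state.backend.rocksdb.options-factory` configuration key.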
 
-  - Set the options factory programmatically, e.g. `RocksDBStateBackend.setOptions(new MyOptionsFactory());`
-
-<span class="label label-info">Note</span> Options factory which set programmatically would override the one configured via `flink-conf.yaml`,
-and options factory has a higher priority over the predefined options if ever configured or set.
-
-<span class="label label-info">Note</span> RocksDB is a native library that allocates memory directly from the process,
-and not from the JVM. Any memory you assign to RocksDB will have to be accounted for, typically by decreasing the JVM heap size
-of the TaskManagers by the same amount. Not doing that may result in YARN/Mesos/etc terminating the JVM processes for
-allocating more memory than configured.
-
-### Bounding RocksDB Memory Usage
-
-RocksDB allocates native memory outside of the JVM, which could lead the process to exceed the total memory budget.
-This can be especially problematic in containerized environments such as Kubernetes that kill processes who exceed their memory budgets.
-
-Flink limit total memory usage of RocksDB instance(s) per slot by leveraging shareable [cache](https://github.com/facebook/rocksdb/wiki/Block-Cache)
-and [write buffer manager](https://github.com/facebook/rocksdb/wiki/Write-Buffer-Manager) among all instances in a single slot by default.
-The shared cache will place an upper limit on the [three components](https://github.com/facebook/rocksdb/wiki/Memory-usage-in-RocksDB) that use the majority of memory
-when RocksDB is deployed as a state backend: block cache, index and bloom filters, and MemTables. 
-
-This feature is enabled by default along with managed memory. Flink will use the managed memory budget as the per-slot memory limit for RocksDB state backend(s).
-
-Flink also provides two parameters to tune the memory fraction of MemTable and index & filters along with the bounding RocksDB memory usage feature:
-  - `state.backend.rocksdb.memory.write-buffer-ratio`, by default `0.5`, which means 50% of the given memory would be used by write buffer manager.
-  - `state.backend.rocksdb.memory.high-prio-pool-ratio`, by default `0.1`, which means 10% of the given memory would be set as high priority for index and filters in shared block cache.
-  We strongly suggest not to set this to zero, to prevent index and filters from competing against data blocks for staying in cache and causing performance issues.
-  Moreover, the L0 level filter and index are pinned into the cache by default to mitigate performance problems,
-  more details please refer to the [RocksDB-documentation](https://github.com/facebook/rocksdb/wiki/Block-Cache#caching-index-filter-and-compression-dictionary-blocks).
-
-<span class="label label-info">Note</span> When bounded RocksDB memory usage is enabled by default,
-the shared `cache` and `write buffer manager` will override customized settings of block cache and write buffer via `PredefinedOptions` and `OptionsFactory`.
-
-*Experts only*: To control memory manually instead of using managed memory, user can set `state.backend.rocksdb.memory.managed` as `false` and control via `ColumnFamilyOptions`.
-Or to save some manual calculation, through the `state.backend.rocksdb.memory.fixed-per-slot` option which will override `state.backend.rocksdb.memory.managed` when configured.
-With the later method, please tune down `taskmanager.memory.managed.size` or `taskmanager.memory.managed.fraction` to `0` 
-and increase `taskmanager.memory.task.off-heap.size` by "`taskmanager.numberOfTaskSlots` * `state.backend.rocksdb.memory.fixed-per-slot`" accordingly.
-
-#### Tune performance when bounding RocksDB memory usage.
-
-There might existed performance regression compared with previous no-memory-limit case if you have too many states per slot.
-- If you observed this behavior and not running jobs in containerized environment or don't care about the over-limit memory usage.
-The easiest way to wipe out the performance regression is to disable memory bound for RocksDB, e.g. turn `state.backend.rocksdb.memory.managed` as `false`.
-Moreover, please refer to [memory configuration migration guide](WIP) to know how to keep backward compatibility to previous memory configuration.
-- Otherwise you need to increase the upper memory for RocksDB by tuning up `taskmanager.memory.managed.size` or `taskmanager.memory.managed.fraction`, or increasing the total memory for task manager.
-
-*Experts only*: Apart from increasing total memory, user could also tune RocksDB options (e.g. arena block size, max background flush threads, etc.) via `RocksDBOptionsFactory`:
-
-{% highlight java %}
-public class MyOptionsFactory implements ConfigurableRocksDBOptionsFactory {
-
-    @Override
-    public DBOptions createDBOptions(DBOptions currentOptions, Collection<AutoCloseable> handlesToClose) {
-        // increase the max background flush threads when we have many states in one operator,
-        // which means we would have many column families in one DB instance.
-        return currentOptions.setMaxBackgroundFlushes(4);
-    }
-
-    @Override
-    public ColumnFamilyOptions createColumnOptions(
-        ColumnFamilyOptions currentOptions, Collection<AutoCloseable> handlesToClose) {
-        // decrease the arena block size from default 8MB to 1MB. 
-        return currentOptions.setArenaBlockSize(1024 * 1024);
-    }
-
-    @Override
-    public OptionsFactory configure(Configuration configuration) {
-        return this;
-    }
-}
-{% endhighlight %}
-
 ## Capacity Planning
 
 This section discusses how to decide how many resources should be used for a Flink job to run reliably.
@@ -292,7 +202,7 @@ The basic rules of thumb for capacity planning are:
   - Temporary back pressure is usually okay, and an essential part of execution flow control during load spikes,
     during catch-up phases, or when external systems (that are written to in a sink) exhibit temporary slowdown.
 
-  - Certain operations (like large windows) result in a spiky load for their downstream operators:
+  - Certain operations (like large windows) result in a spiky load for their downstream operators: 
     In the case of windows, the downstream operators may have little to do while the window is being built,
     and have a load to do when the windows are emitted.
     The planning for the downstream parallelism needs to take into account how much the windows emit and how
@@ -308,10 +218,10 @@ executing the program with a low parallelism.
 
 ## Compression
 
-Flink offers optional compression (default: off) for all checkpoints and savepoints. Currently, compression always uses
+Flink offers optional compression (default: off) for all checkpoints and savepoints. Currently, compression always uses 
 the [snappy compression algorithm (version 1.1.4)](https://github.com/xerial/snappy-java) but we are planning to support
 custom compression algorithms in the future. Compression works on the granularity of key-groups in keyed state, i.e.
-each key-group can be decompressed individually, which is important for rescaling.
+each key-group can be decompressed individually, which is important for rescaling. 
 
 Compression can be activated through the `ExecutionConfig`:
 
diff --git a/docs/ops/state/state_backends.md b/docs/ops/state/state_backends.md
index 1264aec..5bc0afb 100644
--- a/docs/ops/state/state_backends.md
+++ b/docs/ops/state/state_backends.md
@@ -37,7 +37,7 @@ chosen **State Backend**.
 * ToC
 {:toc}
 
-## Available State Backends
+# Available State Backends
 
 Out of the box, Flink bundles these state backends:
 
@@ -125,7 +125,7 @@ Certain RocksDB native metrics are available but disabled by default, you can fi
 
 The total memory amount of RocksDB instance(s) per slot can also be bounded, please refer to documentation [here](large_state_tuning.html#bounding-rocksdb-memory-usage) for details.
 
-## Configuring a State Backend
+# Configuring a State Backend
 
 The default state backend, if you specify nothing, is the jobmanager. If you wish to establish a different default for all jobs on your cluster, you can do so by defining a new default state backend in **flink-conf.yaml**. The default state backend can be overridden on a per-job basis, as shown below.
 
@@ -188,8 +188,145 @@ state.backend: filesystem
 state.checkpoints.dir: hdfs://namenode:40010/flink/checkpoints
 {% endhighlight %}
 
-#### RocksDB State Backend Config Options
 
-{% include generated/rocks_db_configuration.html %}
+# RocksDB State Backend Details
+
+*This section describes the RocksDB state backend in more detail.*
+
+### Incremental Checkpoints
+
+RocksDB supports *Incremental Checkpoints*, which can dramatically reduce the checkpointing time in comparison to full checkpoints.
+Instead of producing a full, self-contained backup of the state backend, incremental checkpoints only record the changes that happened since the latest completed checkpoint.
+
+An incremental checkpoint builds upon (typically multiple) previous checkpoints. Flink leverages RocksDB's internal compaction mechanism in a way that is self-consolidating over time. As a result, the incremental checkpoint history in Flink does not grow indefinitely, and old checkpoints are eventually subsumed and pruned automatically.
+
+Recovery time of incremental checkpoints may be longer or shorter than for full checkpoints. If your network bandwidth is the bottleneck, it may take a bit longer to restore from an incremental checkpoint, because it implies fetching more data (more deltas). Restoring from an incremental checkpoint is faster if the bottleneck is your CPU or IOPs, because restoring from an incremental checkpoint means not re-building the local RocksDB tables from Flink's canonical key/value snapshot f [...]
+
+While we encourage the use of incremental checkpoints for large state, you need to enable this feature manually:
+  - Setting a default in your `flink-conf.yaml`: `state.backend.incremental: true` will enable incremental checkpoints, unless the application overrides this setting in the code.
+  - You can alternatively configure this directly in the code (overrides the config default): `RocksDBStateBackend backend = new RocksDBStateBackend(checkpointDirURI, true);`
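+
+For illustration, a job can enable incremental checkpoints programmatically as follows (a minimal sketch; the checkpoint directory URI is a placeholder):
+
+{% highlight java %}
+StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
+
+// the second constructor argument enables incremental checkpoints
+RocksDBStateBackend backend = new RocksDBStateBackend("hdfs:///flink/checkpoints", true);
+env.setStateBackend(backend);
+{% endhighlight %}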
+
+### Memory Management
+
+Flink aims to control the total process memory consumption to make sure that the Flink TaskManagers have a well-behaved memory footprint. That means staying within the limits enforced by the environment (Docker/Kubernetes, YARN, etc.) so as not to get killed for consuming too much memory, but also not under-utilizing memory (unnecessary spilling to disk, wasted caching opportunities, reduced performance).
+
+To achieve that, Flink by default configures RocksDB's memory allocation to the amount of managed memory of the TaskManager (or, more precisely, task slot). This should give a good out-of-the-box experience for most applications, meaning most applications should not need to tune any of the detailed RocksDB settings. The primary mechanism for improving memory-related performance issues would be to simply increase Flink's managed memory.
+
+Users can choose to deactivate that feature and let RocksDB allocate memory independently per ColumnFamily (one per state per operator). This ultimately offers expert users more fine-grained control over RocksDB, but means that users need to take care themselves that the overall memory consumption does not exceed the limits of the environment. See [large state tuning]({{ site.baseurl }}/ops/state/large_state_tuning.html#tuning-rocksdb-memory) for some guideline about large state performa [...]
+
+**Managed Memory for RocksDB**
+
+This feature is active by default and can be (de)activated via the `state.backend.rocksdb.memory.managed` configuration key.
+
+Flink does not directly manage RocksDB's native memory allocations, but configures RocksDB in a certain way to ensure it uses exactly as much memory as Flink has for its managed memory budget. This is done on a per-slot level (managed memory is accounted for per slot).
+
+To set the total memory usage of RocksDB instance(s), Flink leverages a shared [cache](https://github.com/facebook/rocksdb/wiki/Block-Cache)
+and [write buffer manager](https://github.com/facebook/rocksdb/wiki/Write-Buffer-Manager) among all instances in a single slot. 
+The shared cache will place an upper limit on the [three components](https://github.com/facebook/rocksdb/wiki/Memory-usage-in-RocksDB) that use the majority of memory
+in RocksDB: block cache, index and bloom filters, and MemTables.
+
+For advanced tuning, Flink also provides two parameters to control the division of memory between the *write path* (MemTable) and *read path* (index & filters, remaining cache). When you see that RocksDB performs badly due to lack of write buffer memory (frequent flushes) or cache misses, you can use these parameters to redistribute the memory.
+
+  - `state.backend.rocksdb.memory.write-buffer-ratio`, by default `0.5`, which means 50% of the given memory is used by the write buffer manager.
+  - `state.backend.rocksdb.memory.high-prio-pool-ratio`, by default `0.1`, which means 10% of the given memory is set aside as high-priority capacity for index and filter blocks in the shared block cache.
+  We strongly suggest not setting this to zero, to prevent index and filter blocks from competing with data blocks for staying in cache, which would cause performance issues.
+  Moreover, the L0-level filter and index blocks are pinned into the cache by default to mitigate performance problems;
+  for more details please refer to the [RocksDB documentation](https://github.com/facebook/rocksdb/wiki/Block-Cache#caching-index-filter-and-compression-dictionary-blocks).
+
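+For example, with 1 GB of managed memory per slot and the default ratios, roughly 512 MB would be budgeted for MemTables via the write buffer manager, and about 100 MB of the shared block cache would be reserved as high-priority space for index and filter blocks.
+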
+<span class="label label-info">Note</span> When the above described mechanism (`cache` and `write buffer manager`) is enabled, it will override any customized settings for block caches and write buffers done via [`PredefinedOptions`](#predefined-per-columnfamily-options) and [`OptionsFactory`](#passing-options-factory-to-rocksdb).
+
+<span class="label label-info">Note</span> *Expert Mode*: To control memory manually, you can set `state.backend.rocksdb.memory.managed` to `false` and configure RocksDB via [`ColumnFamilyOptions`](#passing-options-factory-to-rocksdb). Alternatively, you can use the above mentioned cache/buffer-manager mechanism, but set the memory size to a fixed amount independent of Flink's managed memory size (`state.backend.rocksdb.memory.fixed-per-slot` option). Note that in both cases, users need  [...]
+
+### Timers (Heap vs. RocksDB)
+
+Timers are used to schedule actions for later (event-time or processing-time), such as firing a window, or calling back a `ProcessFunction`.
+
+When selecting the RocksDB State Backend, timers are by default also stored in RocksDB. That is a robust and scalable choice that lets applications scale to many timers. However, maintaining timers in RocksDB can have a certain cost, which is why Flink provides the option to store timers on the JVM heap instead, even when RocksDB is used to store other states. Heap-based timers can have better performance when there is a smaller number of timers.
+
+Set the configuration option `state.backend.rocksdb.timer-service.factory` to `heap` (rather than the default, `rocksdb`) to store timers on heap.
+
+<span class="label label-info">Note</span> *The combination RocksDB state backend with heap-based timers currently does NOT support asynchronous snapshots for the timers state. Other state like keyed state is still snapshotted asynchronously.*
+
+### Enabling RocksDB Native Metrics
+
+You can optionally access RocksDB's native metrics through Flink's metrics system, by enabling certain metrics selectively.
+See [configuration docs]({{ site.baseurl }}/ops/config.html#rocksdb-native-metrics) for details.
+
+<div class="alert alert-warning">
+  <strong>Note:</strong> Enabling RocksDB's native metrics may have a negative performance impact on your application.
+</div>
+
+### Predefined Per-ColumnFamily Options
+
+<span class="label label-info">Note</span> With the introduction of [memory management for RocksDB](#memory-management) this mechanism should be mainly used for *expert tuning* or *trouble shooting*.
+
+With *Predefined Options*, users can apply some predefined config profiles on each RocksDB Column Family, configuring for example memory use, threads, and compaction settings. There is currently one Column Family for each state in each operator.
+
+There are two ways to select predefined options to be applied:
+  - Set the option's name in `flink-conf.yaml` via `state.backend.rocksdb.predefined-options`.
+  - Set the predefined options programmatically: `RocksDBStateBackend.setPredefinedOptions(PredefinedOptions.SPINNING_DISK_OPTIMIZED_HIGH_MEM)`.
+
+The default value for this option is `DEFAULT`, which translates to `PredefinedOptions.DEFAULT`.
+
+<span class="label label-info">Note</span> Predefined options set programmatically would override the ones configured via `flink-conf.yaml`.
+
+### Passing Options Factory to RocksDB
+
+<span class="label label-info">Note</span> With the introduction of [memory management for RocksDB](#memory-management) this mechanism should be mainly used for *expert tuning* or *trouble shooting*.
+
+To manually control RocksDB's options, you need to configure a `RocksDBOptionsFactory`. This mechanism gives you fine-grained control over the settings of the Column Families, for example memory use, threads, and compaction settings. There is currently one Column Family for each state in each operator.
+
+There are two ways to pass an OptionsFactory to the RocksDB State Backend:
+
+  - Configure options factory class name in the `flink-conf.yaml` via `state.backend.rocksdb.options-factory`.
+  
+  - Set the options factory programmatically, e.g. `RocksDBStateBackend.setOptions(new MyOptionsFactory());`
+
+<span class="label label-info">Note</span> Options factory which set programmatically would override the one configured via `flink-conf.yaml`,
+and options factory has a higher priority over the predefined options if ever configured or set.
+
+<span class="label label-info">Note</span> RocksDB is a native library that allocates memory directly from the process,
+and not from the JVM. Any memory you assign to RocksDB will have to be accounted for, typically by decreasing the JVM heap size
+of the TaskManagers by the same amount. Not doing that may result in YARN/Mesos/etc terminating the JVM processes for
+allocating more memory than configured.
+
+**Reading Column Family Options from flink-conf.yaml**
+
+When an `OptionsFactory` implements the `ConfigurableRocksDBOptionsFactory` interface, it can directly read settings from the configuration (`flink-conf.yaml`).
+
+The default value for `state.backend.rocksdb.options-factory` is in fact `org.apache.flink.contrib.streaming.state.DefaultConfigurableOptionsFactory`, which picks up all config options [defined here]({{ site.baseurl }}/ops/config.html#advanced-rocksdb-state-backends-options) by default. Hence, you can configure low-level Column Family options simply by turning off managed memory for RocksDB and putting the relevant entries in the configuration.
+
+Below is an example of how to define a custom `ConfigurableRocksDBOptionsFactory` (set its class name under `state.backend.rocksdb.options-factory`).
+
+    {% highlight java %}
+
+    public class MyOptionsFactory implements ConfigurableRocksDBOptionsFactory {
+
+        private static final long DEFAULT_SIZE = 256 * 1024 * 1024;  // 256 MB
+        private long blockCacheSize = DEFAULT_SIZE;
+
+        @Override
+        public DBOptions createDBOptions(DBOptions currentOptions, Collection<AutoCloseable> handlesToClose) {
+            return currentOptions.setIncreaseParallelism(4)
+                   .setUseFsync(false);
+        }
+
+        @Override
+        public ColumnFamilyOptions createColumnOptions(
+            ColumnFamilyOptions currentOptions, Collection<AutoCloseable> handlesToClose) {
+            return currentOptions.setTableFormatConfig(
+                new BlockBasedTableConfig()
+                    .setBlockCacheSize(blockCacheSize)
+                    .setBlockSize(128 * 1024));            // 128 KB
+        }
+
+        @Override
+        public RocksDBOptionsFactory configure(Configuration configuration) {
+            this.blockCacheSize =
+                configuration.getLong("my.custom.rocksdb.block.cache.size", DEFAULT_SIZE);
+            return this;
+        }
+    }
+    {% endhighlight %}
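+
+With the factory above in place, the block cache size can then be changed without recompiling, by setting the custom key `my.custom.rocksdb.block.cache.size` in `flink-conf.yaml` (its value is read as a plain number of bytes via `configuration.getLong(...)`).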
 
 {% top %}


[flink] 02/10: [FLINK-15697][docs] Move python options from general configuration page to table config options page

Posted by se...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

sewen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/flink.git

commit afd88a3c25d085d5d570ad5775c21cdef575a455
Author: Dawid Wysakowicz <dw...@apache.org>
AuthorDate: Wed Jan 29 09:20:57 2020 +0100

    [FLINK-15697][docs] Move python options from general configuration page to table config options page
    
    This closes #10957
---
 docs/dev/table/config.md | 4 ++++
 docs/ops/config.md       | 4 ----
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/docs/dev/table/config.md b/docs/dev/table/config.md
index dd312d6..a9e9b71 100644
--- a/docs/dev/table/config.md
+++ b/docs/dev/table/config.md
@@ -104,3 +104,7 @@ The following options can be used to tune the performance of the query execution
 The following options can be used to adjust the behavior of the query optimizer to get a better execution plan.
 
 {% include generated/optimizer_config_configuration.html %}
+
+### Python Options
+
+{% include generated/python_configuration.html %}
diff --git a/docs/ops/config.md b/docs/ops/config.md
index bd672a9..0a42ec7 100644
--- a/docs/ops/config.md
+++ b/docs/ops/config.md
@@ -225,10 +225,6 @@ You have to configure `jobmanager.archive.fs.dir` in order to archive terminated
 
 {% include generated/history_server_configuration.html %}
 
-### Python
-
-{% include generated/python_configuration.html %}
-
 ## Legacy
 
 - `mode`: Execution mode of Flink. Possible values are `legacy` and `new`. In order to start the legacy components, you have to specify `legacy` (DEFAULT: `new`).


[flink] 06/10: [FLINK-15698][docs] Restructure the Configuration docs

Posted by se...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

sewen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/flink.git

commit afb11a9a4f2d47f961b62d335e7773d3b11d28f2
Author: Stephan Ewen <se...@apache.org>
AuthorDate: Wed Jan 29 20:08:58 2020 +0100

    [FLINK-15698][docs] Restructure the Configuration docs
    
      - Grouping options by semantics and functionality, rather than by defined class.
      - Splitting between "normal/common options" and options that should only be necessary
        for trouble-shooting.
---
 ...figuration.html => all_jobmanager_section.html} |  14 +-
 ...n.html => all_taskmanager_network_section.html} |  54 ++-
 ...iguration.html => all_taskmanager_section.html} |  20 +-
 ....html => common_high_availability_section.html} |   6 -
 .../common_high_availability_zk_section.html       |  24 ++
 .../generated/common_host_port_section.html        |  66 ++++
 .../_includes/generated/common_memory_section.html | 102 ++++++
 .../generated/common_miscellaneous_section.html    |  24 ++
 docs/_includes/generated/common_section.html       |  84 -----
 .../generated/common_state_backends_section.html   |  54 +++
 docs/_includes/generated/core_configuration.html   |  18 +
 .../generated/deprecated_file_sinks_section.html   |  24 ++
 ...tion.html => expert_class_loading_section.html} |  12 -
 .../generated/expert_fault_tolerance_section.html  |  60 ++++
 .../expert_high_availability_section.html          |  18 +
 .../expert_high_availability_zk_section.html       |  84 +++++
 docs/_includes/generated/expert_rest_section.html  |  66 ++++
 .../generated/expert_rocksdb_section.html          |  36 ++
 .../generated/expert_scheduling_section.html       |  30 ++
 .../generated/expert_security_ssl_section.html     |  42 +++
 .../generated/expert_state_backends_section.html   |  30 ++
 .../generated/high_availability_configuration.html |  84 +++++
 .../generated/job_manager_configuration.html       |   2 +-
 .../netty_shuffle_environment_configuration.html   |  42 +++
 .../generated/resource_manager_configuration.html  |   6 -
 .../generated/security_auth_kerberos_section.html  |  36 ++
 .../generated/security_auth_zk_section.html        |  30 ++
 .../generated/security_configuration.html          |  70 ++--
 ...onfiguration.html => security_ssl_section.html} |  60 ----
 .../generated/state_backend_rocksdb_section.html   |  42 +++
 .../generated/task_manager_configuration.html      |   6 -
 docs/_includes/generated/web_configuration.html    |  12 -
 docs/ops/config.md                                 | 394 +++++++++++++++-----
 docs/ops/config.zh.md                              | 396 +++++++++++++++------
 .../flink/annotation/docs/Documentation.java       |  44 ++-
 .../flink/configuration/CheckpointingOptions.java  |  19 +-
 .../apache/flink/configuration/ClusterOptions.java |   7 +
 .../apache/flink/configuration/CoreOptions.java    |  13 +-
 .../configuration/HeartbeatManagerOptions.java     |   3 +
 .../configuration/HighAvailabilityOptions.java     |  30 +-
 .../flink/configuration/JobManagerOptions.java     |  15 +-
 .../NettyShuffleEnvironmentOptions.java            |  19 +-
 .../configuration/ResourceManagerOptions.java      |   3 +-
 .../apache/flink/configuration/RestOptions.java    |  14 +
 .../flink/configuration/SecurityOptions.java       |  49 ++-
 .../flink/configuration/TaskManagerOptions.java    |  40 ++-
 .../org/apache/flink/configuration/WebOptions.java |   2 +
 .../contrib/streaming/state/RocksDBOptions.java    |  10 +
 48 files changed, 1812 insertions(+), 504 deletions(-)

diff --git a/docs/_includes/generated/job_manager_configuration.html b/docs/_includes/generated/all_jobmanager_section.html
similarity index 87%
copy from docs/_includes/generated/job_manager_configuration.html
copy to docs/_includes/generated/all_jobmanager_section.html
index 76671e5..ab01a85 100644
--- a/docs/_includes/generated/job_manager_configuration.html
+++ b/docs/_includes/generated/all_jobmanager_section.html
@@ -22,7 +22,7 @@
         </tr>
         <tr>
             <td><h5>jobmanager.execution.failover-strategy</h5></td>
-            <td style="word-wrap: break-word;">"full"</td>
+            <td style="word-wrap: break-word;">region</td>
             <td>String</td>
             <td>This option specifies how the job computation recovers from task failures. Accepted values are:<ul><li>'full': Restarts all tasks to recover the job.</li><li>'region': Restarts all tasks that could be affected by the task failure. More details can be found <a href="../dev/task_failure_recovery.html#restart-pipelined-region-failover-strategy">here</a>.</li></ul></td>
         </tr>
@@ -62,17 +62,5 @@
             <td>Integer</td>
             <td>The max number of completed jobs that can be kept in the job store.</td>
         </tr>
-        <tr>
-            <td><h5>slot.idle.timeout</h5></td>
-            <td style="word-wrap: break-word;">50000</td>
-            <td>Long</td>
-            <td>The timeout in milliseconds for a idle slot in Slot Pool.</td>
-        </tr>
-        <tr>
-            <td><h5>slot.request.timeout</h5></td>
-            <td style="word-wrap: break-word;">300000</td>
-            <td>Long</td>
-            <td>The timeout in milliseconds for requesting a slot from Slot Pool.</td>
-        </tr>
     </tbody>
 </table>
diff --git a/docs/_includes/generated/netty_shuffle_environment_configuration.html b/docs/_includes/generated/all_taskmanager_network_section.html
similarity index 67%
copy from docs/_includes/generated/netty_shuffle_environment_configuration.html
copy to docs/_includes/generated/all_taskmanager_network_section.html
index 8a5b3ba..04a8566 100644
--- a/docs/_includes/generated/netty_shuffle_environment_configuration.html
+++ b/docs/_includes/generated/all_taskmanager_network_section.html
@@ -9,18 +9,6 @@
     </thead>
     <tbody>
         <tr>
-            <td><h5>taskmanager.data.port</h5></td>
-            <td style="word-wrap: break-word;">0</td>
-            <td>Integer</td>
-            <td>The task manager’s port used for data exchange operations.</td>
-        </tr>
-        <tr>
-            <td><h5>taskmanager.data.ssl.enabled</h5></td>
-            <td style="word-wrap: break-word;">true</td>
-            <td>Boolean</td>
-            <td>Enable SSL support for the taskmanager data transport. This is applicable only when the global flag for internal SSL (security.ssl.internal.enabled) is set to true</td>
-        </tr>
-        <tr>
             <td><h5>taskmanager.network.blocking-shuffle.compression.enabled</h5></td>
             <td style="word-wrap: break-word;">false</td>
             <td>Boolean</td>
@@ -51,6 +39,48 @@
             <td>Number of extra network buffers to use for each outgoing/incoming gate (result partition/input gate). In credit-based flow control mode, this indicates how many floating credits are shared among all the input channels. The floating buffers are distributed based on backlog (real-time output buffers in the subpartition) feedback, and can help relieve back-pressure caused by unbalanced data distribution among the subpartitions. This value should be increased in case of highe [...]
         </tr>
         <tr>
+            <td><h5>taskmanager.network.netty.client.connectTimeoutSec</h5></td>
+            <td style="word-wrap: break-word;">120</td>
+            <td>Integer</td>
+            <td>The Netty client connection timeout.</td>
+        </tr>
+        <tr>
+            <td><h5>taskmanager.network.netty.client.numThreads</h5></td>
+            <td style="word-wrap: break-word;">-1</td>
+            <td>Integer</td>
+            <td>The number of Netty client threads.</td>
+        </tr>
+        <tr>
+            <td><h5>taskmanager.network.netty.num-arenas</h5></td>
+            <td style="word-wrap: break-word;">-1</td>
+            <td>Integer</td>
+            <td>The number of Netty arenas.</td>
+        </tr>
+        <tr>
+            <td><h5>taskmanager.network.netty.sendReceiveBufferSize</h5></td>
+            <td style="word-wrap: break-word;">0</td>
+            <td>Integer</td>
+            <td>The Netty send and receive buffer size. This defaults to the system buffer size (cat /proc/sys/net/ipv4/tcp_[rw]mem) and is 4 MiB in modern Linux.</td>
+        </tr>
+        <tr>
+            <td><h5>taskmanager.network.netty.server.backlog</h5></td>
+            <td style="word-wrap: break-word;">0</td>
+            <td>Integer</td>
+            <td>The netty server connection backlog.</td>
+        </tr>
+        <tr>
+            <td><h5>taskmanager.network.netty.server.numThreads</h5></td>
+            <td style="word-wrap: break-word;">-1</td>
+            <td>Integer</td>
+            <td>The number of Netty server threads.</td>
+        </tr>
+        <tr>
+            <td><h5>taskmanager.network.netty.transport</h5></td>
+            <td style="word-wrap: break-word;">"auto"</td>
+            <td>String</td>
+            <td>The Netty transport type, either "nio" or "epoll". The "auto" means selecting the property mode automatically based on the platform. Note that the "epoll" mode can get better performance, less GC and have more advanced features which are only available on modern Linux.</td>
+        </tr>
+        <tr>
             <td><h5>taskmanager.network.request-backoff.initial</h5></td>
             <td style="word-wrap: break-word;">100</td>
             <td>Integer</td>
diff --git a/docs/_includes/generated/task_manager_configuration.html b/docs/_includes/generated/all_taskmanager_section.html
similarity index 86%
copy from docs/_includes/generated/task_manager_configuration.html
copy to docs/_includes/generated/all_taskmanager_section.html
index fed0f21..692c38b 100644
--- a/docs/_includes/generated/task_manager_configuration.html
+++ b/docs/_includes/generated/all_taskmanager_section.html
@@ -27,10 +27,16 @@
             <td>Time we wait for the timers in milliseconds to finish all pending timer threads when the stream task is cancelled.</td>
         </tr>
         <tr>
-            <td><h5>task.checkpoint.alignment.max-size</h5></td>
-            <td style="word-wrap: break-word;">-1</td>
-            <td>Long</td>
-            <td>The maximum number of bytes that a checkpoint alignment may buffer. If the checkpoint alignment buffers more than the configured amount of data, the checkpoint is aborted (skipped). A value of -1 indicates that there is no limit.</td>
+            <td><h5>taskmanager.data.port</h5></td>
+            <td style="word-wrap: break-word;">0</td>
+            <td>Integer</td>
+            <td>The task manager’s port used for data exchange operations.</td>
+        </tr>
+        <tr>
+            <td><h5>taskmanager.data.ssl.enabled</h5></td>
+            <td style="word-wrap: break-word;">true</td>
+            <td>Boolean</td>
+            <td>Enable SSL support for the taskmanager data transport. This is applicable only when the global flag for internal SSL (security.ssl.internal.enabled) is set to true</td>
         </tr>
         <tr>
             <td><h5>taskmanager.debug.memory.log</h5></td>
@@ -57,6 +63,12 @@
             <td>Whether to kill the TaskManager when the task thread throws an OutOfMemoryError.</td>
         </tr>
         <tr>
+            <td><h5>taskmanager.memory.segment-size</h5></td>
+            <td style="word-wrap: break-word;">32 kb</td>
+            <td>MemorySize</td>
+            <td>Size of memory buffers used by the network stack and the memory manager.</td>
+        </tr>
+        <tr>
             <td><h5>taskmanager.network.bind-policy</h5></td>
             <td style="word-wrap: break-word;">"ip"</td>
             <td>String</td>
diff --git a/docs/_includes/generated/high_availability_configuration.html b/docs/_includes/generated/common_high_availability_section.html
similarity index 83%
copy from docs/_includes/generated/high_availability_configuration.html
copy to docs/_includes/generated/common_high_availability_section.html
index bf30e91..4c06ca7 100644
--- a/docs/_includes/generated/high_availability_configuration.html
+++ b/docs/_includes/generated/common_high_availability_section.html
@@ -21,12 +21,6 @@
             <td>The ID of the Flink cluster, used to separate multiple Flink clusters from each other. Needs to be set for standalone clusters but is automatically inferred in YARN and Mesos.</td>
         </tr>
         <tr>
-            <td><h5>high-availability.jobmanager.port</h5></td>
-            <td style="word-wrap: break-word;">"0"</td>
-            <td>String</td>
-            <td>Optional port (range) used by the job manager in high-availability mode.</td>
-        </tr>
-        <tr>
             <td><h5>high-availability.storageDir</h5></td>
             <td style="word-wrap: break-word;">(none)</td>
             <td>String</td>
diff --git a/docs/_includes/generated/common_high_availability_zk_section.html b/docs/_includes/generated/common_high_availability_zk_section.html
new file mode 100644
index 0000000..18175fb
--- /dev/null
+++ b/docs/_includes/generated/common_high_availability_zk_section.html
@@ -0,0 +1,24 @@
+<table class="table table-bordered">
+    <thead>
+        <tr>
+            <th class="text-left" style="width: 20%">Key</th>
+            <th class="text-left" style="width: 15%">Default</th>
+            <th class="text-left" style="width: 10%">Type</th>
+            <th class="text-left" style="width: 55%">Description</th>
+        </tr>
+    </thead>
+    <tbody>
+        <tr>
+            <td><h5>high-availability.zookeeper.path.root</h5></td>
+            <td style="word-wrap: break-word;">"/flink"</td>
+            <td>String</td>
+            <td>The root path under which Flink stores its entries in ZooKeeper.</td>
+        </tr>
+        <tr>
+            <td><h5>high-availability.zookeeper.quorum</h5></td>
+            <td style="word-wrap: break-word;">(none)</td>
+            <td>String</td>
+            <td>The ZooKeeper quorum to use, when running Flink in a high-availability mode with ZooKeeper.</td>
+        </tr>
+    </tbody>
+</table>
diff --git a/docs/_includes/generated/common_host_port_section.html b/docs/_includes/generated/common_host_port_section.html
new file mode 100644
index 0000000..ce9b92f
--- /dev/null
+++ b/docs/_includes/generated/common_host_port_section.html
@@ -0,0 +1,66 @@
+<table class="table table-bordered">
+    <thead>
+        <tr>
+            <th class="text-left" style="width: 20%">Key</th>
+            <th class="text-left" style="width: 15%">Default</th>
+            <th class="text-left" style="width: 10%">Type</th>
+            <th class="text-left" style="width: 55%">Description</th>
+        </tr>
+    </thead>
+    <tbody>
+        <tr>
+            <td><h5>jobmanager.rpc.address</h5></td>
+            <td style="word-wrap: break-word;">(none)</td>
+            <td>String</td>
+            <td>The config parameter defining the network address to connect to for communication with the job manager. This value is only interpreted in setups where a single JobManager with static name or address exists (simple standalone setups, or container setups with dynamic service name resolution). It is not used in many high-availability setups, when a leader-election service (like ZooKeeper) is used to elect and discover the JobManager leader from potentially multiple standby J [...]
+        </tr>
+        <tr>
+            <td><h5>jobmanager.rpc.port</h5></td>
+            <td style="word-wrap: break-word;">6123</td>
+            <td>Integer</td>
+            <td>The config parameter defining the network port to connect to for communication with the job manager. Like jobmanager.rpc.address, this value is only interpreted in setups where a single JobManager with static name/address and port exists (simple standalone setups, or container setups with dynamic service name resolution). This config option is not used in many high-availability setups, when a leader-election service (like ZooKeeper) is used to elect and discover the JobMa [...]
+        </tr>
+        <tr>
+            <td><h5>rest.address</h5></td>
+            <td style="word-wrap: break-word;">(none)</td>
+            <td>String</td>
+            <td>The address that should be used by clients to connect to the server.</td>
+        </tr>
+        <tr>
+            <td><h5>rest.bind-address</h5></td>
+            <td style="word-wrap: break-word;">(none)</td>
+            <td>String</td>
+            <td>The address that the server binds itself.</td>
+        </tr>
+        <tr>
+            <td><h5>rest.bind-port</h5></td>
+            <td style="word-wrap: break-word;">"8081"</td>
+            <td>String</td>
+            <td>The port that the server binds itself. Accepts a list of ports (“50100,50101”), ranges (“50100-50200”) or a combination of both. It is recommended to set a range of ports to avoid collisions when multiple Rest servers are running on the same machine.</td>
+        </tr>
+        <tr>
+            <td><h5>rest.port</h5></td>
+            <td style="word-wrap: break-word;">8081</td>
+            <td>Integer</td>
+            <td>The port that the client connects to. If rest.bind-port has not been specified, then the REST server will bind to this port.</td>
+        </tr>
+        <tr>
+            <td><h5>taskmanager.data.port</h5></td>
+            <td style="word-wrap: break-word;">0</td>
+            <td>Integer</td>
+            <td>The task manager’s port used for data exchange operations.</td>
+        </tr>
+        <tr>
+            <td><h5>taskmanager.host</h5></td>
+            <td style="word-wrap: break-word;">(none)</td>
+            <td>String</td>
+            <td>The address of the network interface that the TaskManager binds to. This option can be used to define explicitly a binding address. Because different TaskManagers need different values for this option, usually it is specified in an additional non-shared TaskManager-specific config file.</td>
+        </tr>
+        <tr>
+            <td><h5>taskmanager.rpc.port</h5></td>
+            <td style="word-wrap: break-word;">"0"</td>
+            <td>String</td>
+            <td>The task manager’s IPC port. Accepts a list of ports (“50100,50101”), ranges (“50100-50200”) or a combination of both. It is recommended to set a range of ports to avoid collisions when multiple TaskManagers are running on the same machine.</td>
+        </tr>
+    </tbody>
+</table>
diff --git a/docs/_includes/generated/common_memory_section.html b/docs/_includes/generated/common_memory_section.html
new file mode 100644
index 0000000..713edff
--- /dev/null
+++ b/docs/_includes/generated/common_memory_section.html
@@ -0,0 +1,102 @@
+<table class="table table-bordered">
+    <thead>
+        <tr>
+            <th class="text-left" style="width: 20%">Key</th>
+            <th class="text-left" style="width: 15%">Default</th>
+            <th class="text-left" style="width: 10%">Type</th>
+            <th class="text-left" style="width: 55%">Description</th>
+        </tr>
+    </thead>
+    <tbody>
+        <tr>
+            <td><h5>taskmanager.memory.flink.size</h5></td>
+            <td style="word-wrap: break-word;">(none)</td>
+            <td>MemorySize</td>
+            <td>Total Flink Memory size for the TaskExecutors. This includes all the memory that a TaskExecutor consumes, except for JVM Metaspace and JVM Overhead. It consists of Framework Heap Memory, Task Heap Memory, Task Off-Heap Memory, Managed Memory, and Network Memory. See also 'taskmanager.memory.process.size' for total process memory size configuration.</td>
+        </tr>
+        <tr>
+            <td><h5>taskmanager.memory.framework.heap.size</h5></td>
+            <td style="word-wrap: break-word;">128 mb</td>
+            <td>MemorySize</td>
+            <td>Framework Heap Memory size for TaskExecutors. This is the size of JVM heap memory reserved for TaskExecutor framework, which will not be allocated to task slots.</td>
+        </tr>
+        <tr>
+            <td><h5>taskmanager.memory.framework.off-heap.size</h5></td>
+            <td style="word-wrap: break-word;">128 mb</td>
+            <td>MemorySize</td>
+            <td>Framework Off-Heap Memory size for TaskExecutors. This is the size of off-heap memory (JVM direct memory and native memory) reserved for TaskExecutor framework, which will not be allocated to task slots. The configured value will be fully counted when Flink calculates the JVM max direct memory size parameter.</td>
+        </tr>
+        <tr>
+            <td><h5>taskmanager.memory.jvm-metaspace.size</h5></td>
+            <td style="word-wrap: break-word;">96 mb</td>
+            <td>MemorySize</td>
+            <td>JVM Metaspace Size for the TaskExecutors.</td>
+        </tr>
+        <tr>
+            <td><h5>taskmanager.memory.jvm-overhead.fraction</h5></td>
+            <td style="word-wrap: break-word;">0.1</td>
+            <td>Float</td>
+            <td>Fraction of Total Process Memory to be reserved for JVM Overhead. This is off-heap memory reserved for JVM overhead, such as thread stack space, compile cache, etc. This includes native memory but not direct memory, and will not be counted when Flink calculates JVM max direct memory size parameter. The size of JVM Overhead is derived to make up the configured fraction of the Total Process Memory. If the derived size is less/greater than the configured min/max size, the mi [...]
+        </tr>
+        <tr>
+            <td><h5>taskmanager.memory.jvm-overhead.max</h5></td>
+            <td style="word-wrap: break-word;">1 gb</td>
+            <td>MemorySize</td>
+            <td>Max JVM Overhead size for the TaskExecutors. This is off-heap memory reserved for JVM overhead, such as thread stack space, compile cache, etc. This includes native memory but not direct memory, and will not be counted when Flink calculates JVM max direct memory size parameter. The size of JVM Overhead is derived to make up the configured fraction of the Total Process Memory. If the derived size is less/greater than the configured min/max size, the min/max size will be us [...]
+        </tr>
+        <tr>
+            <td><h5>taskmanager.memory.jvm-overhead.min</h5></td>
+            <td style="word-wrap: break-word;">192 mb</td>
+            <td>MemorySize</td>
+            <td>Min JVM Overhead size for the TaskExecutors. This is off-heap memory reserved for JVM overhead, such as thread stack space, compile cache, etc. This includes native memory but not direct memory, and will not be counted when Flink calculates JVM max direct memory size parameter. The size of JVM Overhead is derived to make up the configured fraction of the Total Process Memory. If the derived size is less/greater than the configured min/max size, the min/max size will be us [...]
+        </tr>
+        <tr>
+            <td><h5>taskmanager.memory.managed.fraction</h5></td>
+            <td style="word-wrap: break-word;">0.4</td>
+            <td>Float</td>
+            <td>Fraction of Total Flink Memory to be used as Managed Memory, if Managed Memory size is not explicitly specified.</td>
+        </tr>
+        <tr>
+            <td><h5>taskmanager.memory.managed.size</h5></td>
+            <td style="word-wrap: break-word;">(none)</td>
+            <td>MemorySize</td>
+            <td>Managed Memory size for TaskExecutors. This is the size of off-heap memory managed by the memory manager, reserved for sorting, hash tables, caching of intermediate results and RocksDB state backend. Memory consumers can either allocate memory from the memory manager in the form of MemorySegments, or reserve bytes from the memory manager and keep their memory usage within that boundary. If unspecified, it will be derived to make up the configured fraction of the Total Fli [...]
+        </tr>
+        <tr>
+            <td><h5>taskmanager.memory.network.fraction</h5></td>
+            <td style="word-wrap: break-word;">0.1</td>
+            <td>Float</td>
+            <td>Fraction of Total Flink Memory to be used as Network Memory. Network Memory is off-heap memory reserved for ShuffleEnvironment (e.g., network buffers). Network Memory size is derived to make up the configured fraction of the Total Flink Memory. If the derived size is less/greater than the configured min/max size, the min/max size will be used. The exact size of Network Memory can be explicitly specified by setting the min/max size to the same value.</td>
+        </tr>
+        <tr>
+            <td><h5>taskmanager.memory.network.max</h5></td>
+            <td style="word-wrap: break-word;">1 gb</td>
+            <td>MemorySize</td>
+            <td>Max Network Memory size for TaskExecutors. Network Memory is off-heap memory reserved for ShuffleEnvironment (e.g., network buffers). Network Memory size is derived to make up the configured fraction of the Total Flink Memory. If the derived size is less/greater than the configured min/max size, the min/max size will be used. The exact size of Network Memory can be explicitly specified by setting the min/max to the same value.</td>
+        </tr>
+        <tr>
+            <td><h5>taskmanager.memory.network.min</h5></td>
+            <td style="word-wrap: break-word;">64 mb</td>
+            <td>MemorySize</td>
+            <td>Min Network Memory size for TaskExecutors. Network Memory is off-heap memory reserved for ShuffleEnvironment (e.g., network buffers). Network Memory size is derived to make up the configured fraction of the Total Flink Memory. If the derived size is less/greater than the configured min/max size, the min/max size will be used. The exact size of Network Memory can be explicitly specified by setting the min/max to the same value.</td>
+        </tr>
+        <tr>
+            <td><h5>taskmanager.memory.process.size</h5></td>
+            <td style="word-wrap: break-word;">(none)</td>
+            <td>MemorySize</td>
+            <td>Total Process Memory size for the TaskExecutors. This includes all the memory that a TaskExecutor consumes, consisting of Total Flink Memory, JVM Metaspace, and JVM Overhead. On containerized setups, this should be set to the container memory. See also 'taskmanager.memory.flink.size' for total Flink memory size configuration.</td>
+        </tr>
+        <tr>
+            <td><h5>taskmanager.memory.task.heap.size</h5></td>
+            <td style="word-wrap: break-word;">(none)</td>
+            <td>MemorySize</td>
+            <td>Task Heap Memory size for TaskExecutors. This is the size of JVM heap memory reserved for tasks. If not specified, it will be derived as Total Flink Memory minus Framework Heap Memory, Task Off-Heap Memory, Managed Memory and Network Memory.</td>
+        </tr>
+        <tr>
+            <td><h5>taskmanager.memory.task.off-heap.size</h5></td>
+            <td style="word-wrap: break-word;">0 bytes</td>
+            <td>MemorySize</td>
+            <td>Task Off-Heap Memory size for TaskExecutors. This is the size of off-heap memory (JVM direct memory and native memory) reserved for tasks. The configured value will be fully counted when Flink calculates the JVM max direct memory size parameter.</td>
+        </tr>
+    </tbody>
+</table>
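
For illustration only (the sizes below are placeholders, not recommendations), the TaskManager memory options above are typically combined in flink-conf.yaml like this:

    # Size the whole TaskExecutor process; Flink derives the sub-pools from it.
    taskmanager.memory.process.size: 4g
    # Give 40% of Total Flink Memory to Managed Memory (the default fraction).
    taskmanager.memory.managed.fraction: 0.4
    # Pin Network Memory to an exact size by setting min and max to the same value.
    taskmanager.memory.network.min: 256mb
    taskmanager.memory.network.max: 256mb
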
diff --git a/docs/_includes/generated/common_miscellaneous_section.html b/docs/_includes/generated/common_miscellaneous_section.html
new file mode 100644
index 0000000..0a4e15a
--- /dev/null
+++ b/docs/_includes/generated/common_miscellaneous_section.html
@@ -0,0 +1,24 @@
+<table class="table table-bordered">
+    <thead>
+        <tr>
+            <th class="text-left" style="width: 20%">Key</th>
+            <th class="text-left" style="width: 15%">Default</th>
+            <th class="text-left" style="width: 10%">Type</th>
+            <th class="text-left" style="width: 55%">Description</th>
+        </tr>
+    </thead>
+    <tbody>
+        <tr>
+            <td><h5>fs.default-scheme</h5></td>
+            <td style="word-wrap: break-word;">(none)</td>
+            <td>String</td>
+            <td>The default filesystem scheme, used for paths that do not declare a scheme explicitly. May contain an authority, e.g. host:port in case of an HDFS NameNode.</td>
+        </tr>
+        <tr>
+            <td><h5>io.tmp.dirs</h5></td>
+            <td style="word-wrap: break-word;">'LOCAL_DIRS' on Yarn. '_FLINK_TMP_DIR' on Mesos. System.getProperty("java.io.tmpdir") in standalone.</td>
+            <td>String</td>
+            <td>Directories for temporary files, separated by ",", "|", or the system's java.io.File.pathSeparator.</td>
+        </tr>
+    </tbody>
+</table>
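
As a sketch (the HDFS address and paths are hypothetical), these two options might appear in flink-conf.yaml as:

    # Paths without an explicit scheme resolve against this filesystem.
    fs.default-scheme: hdfs://namenode-host:8020/
    # Multiple temporary directories, separated by commas.
    io.tmp.dirs: /data1/tmp,/data2/tmp
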
diff --git a/docs/_includes/generated/common_section.html b/docs/_includes/generated/common_section.html
deleted file mode 100644
index ac1d7d5..0000000
--- a/docs/_includes/generated/common_section.html
+++ /dev/null
@@ -1,84 +0,0 @@
-<table class="table table-bordered">
-    <thead>
-        <tr>
-            <th class="text-left" style="width: 20%">Key</th>
-            <th class="text-left" style="width: 15%">Default</th>
-            <th class="text-left" style="width: 10%">Type</th>
-            <th class="text-left" style="width: 55%">Description</th>
-        </tr>
-    </thead>
-    <tbody>
-        <tr>
-            <td><h5>jobmanager.heap.size</h5></td>
-            <td style="word-wrap: break-word;">"1024m"</td>
-            <td>String</td>
-            <td>JVM heap size for the JobManager.</td>
-        </tr>
-        <tr>
-            <td><h5>taskmanager.memory.flink.size</h5></td>
-            <td style="word-wrap: break-word;">(none)</td>
-            <td>MemorySize</td>
-            <td>Total Flink Memory size for the TaskExecutors. This includes all the memory that a TaskExecutor consumes, except for JVM Metaspace and JVM Overhead. It consists of Framework Heap Memory, Task Heap Memory, Task Off-Heap Memory, Managed Memory, and Network Memory. See also 'taskmanager.memory.process.size' for total process memory size configuration.</td>
-        </tr>
-        <tr>
-            <td><h5>taskmanager.memory.process.size</h5></td>
-            <td style="word-wrap: break-word;">(none)</td>
-            <td>MemorySize</td>
-            <td>Total Process Memory size for the TaskExecutors. This includes all the memory that a TaskExecutor consumes, consisting of Total Flink Memory, JVM Metaspace, and JVM Overhead. On containerized setups, this should be set to the container memory. See also 'taskmanager.memory.flink.size' for total Flink memory size configuration.</td>
-        </tr>
-        <tr>
-            <td><h5>parallelism.default</h5></td>
-            <td style="word-wrap: break-word;">1</td>
-            <td>Integer</td>
-            <td>Default parallelism for jobs.</td>
-        </tr>
-        <tr>
-            <td><h5>taskmanager.numberOfTaskSlots</h5></td>
-            <td style="word-wrap: break-word;">1</td>
-            <td>Integer</td>
-            <td>The number of parallel operator or user function instances that a single TaskManager can run. If this value is larger than 1, a single TaskManager takes multiple instances of a function or operator. That way, the TaskManager can utilize multiple CPU cores, but at the same time, the available memory is divided between the different operator or function instances. This value is typically proportional to the number of physical CPU cores that the TaskManager's machine has (e. [...]
-        </tr>
-        <tr>
-            <td><h5>state.backend</h5></td>
-            <td style="word-wrap: break-word;">(none)</td>
-            <td>String</td>
-            <td>The state backend to be used to store and checkpoint state.</td>
-        </tr>
-        <tr>
-            <td><h5>state.checkpoints.dir</h5></td>
-            <td style="word-wrap: break-word;">(none)</td>
-            <td>String</td>
-            <td>The default directory used for storing the data files and meta data of checkpoints in a Flink supported filesystem. The storage path must be accessible from all participating processes/nodes(i.e. all TaskManagers and JobManagers).</td>
-        </tr>
-        <tr>
-            <td><h5>state.savepoints.dir</h5></td>
-            <td style="word-wrap: break-word;">(none)</td>
-            <td>String</td>
-            <td>The default directory for savepoints. Used by the state backends that write savepoints to file systems (MemoryStateBackend, FsStateBackend, RocksDBStateBackend).</td>
-        </tr>
-        <tr>
-            <td><h5>high-availability</h5></td>
-            <td style="word-wrap: break-word;">"NONE"</td>
-            <td>String</td>
-            <td>Defines high-availability mode used for the cluster execution. To enable high-availability, set this mode to "ZOOKEEPER" or specify FQN of factory class.</td>
-        </tr>
-        <tr>
-            <td><h5>high-availability.storageDir</h5></td>
-            <td style="word-wrap: break-word;">(none)</td>
-            <td>String</td>
-            <td>File system path (URI) where Flink persists metadata in high-availability setups.</td>
-        </tr>
-        <tr>
-            <td><h5>security.ssl.internal.enabled</h5></td>
-            <td style="word-wrap: break-word;">false</td>
-            <td>Boolean</td>
-            <td>Turns on SSL for internal network communication. Optionally, specific components may override this through their own settings (rpc, data transport, REST, etc).</td>
-        </tr>
-        <tr>
-            <td><h5>security.ssl.rest.enabled</h5></td>
-            <td style="word-wrap: break-word;">false</td>
-            <td>Boolean</td>
-            <td>Turns on SSL for external communication via the REST endpoints.</td>
-        </tr>
-    </tbody>
-</table>
diff --git a/docs/_includes/generated/common_state_backends_section.html b/docs/_includes/generated/common_state_backends_section.html
new file mode 100644
index 0000000..0df38e9
--- /dev/null
+++ b/docs/_includes/generated/common_state_backends_section.html
@@ -0,0 +1,54 @@
+<table class="table table-bordered">
+    <thead>
+        <tr>
+            <th class="text-left" style="width: 20%">Key</th>
+            <th class="text-left" style="width: 15%">Default</th>
+            <th class="text-left" style="width: 10%">Type</th>
+            <th class="text-left" style="width: 55%">Description</th>
+        </tr>
+    </thead>
+    <tbody>
+        <tr>
+            <td><h5>state.backend</h5></td>
+            <td style="word-wrap: break-word;">(none)</td>
+            <td>String</td>
+            <td>The state backend to be used to store and checkpoint state.</td>
+        </tr>
+        <tr>
+            <td><h5>state.checkpoints.dir</h5></td>
+            <td style="word-wrap: break-word;">(none)</td>
+            <td>String</td>
+            <td>The default directory used for storing the data files and metadata of checkpoints in a Flink supported filesystem. The storage path must be accessible from all participating processes/nodes (i.e. all TaskManagers and JobManagers).</td>
+        </tr>
+        <tr>
+            <td><h5>state.savepoints.dir</h5></td>
+            <td style="word-wrap: break-word;">(none)</td>
+            <td>String</td>
+            <td>The default directory for savepoints. Used by the state backends that write savepoints to file systems (MemoryStateBackend, FsStateBackend, RocksDBStateBackend).</td>
+        </tr>
+        <tr>
+            <td><h5>state.backend.incremental</h5></td>
+            <td style="word-wrap: break-word;">false</td>
+            <td>Boolean</td>
+            <td>Option whether the state backend should create incremental checkpoints, if possible. For an incremental checkpoint, only a diff from the previous checkpoint is stored, rather than the complete checkpoint state. Some state backends may not support incremental checkpoints and ignore this option.</td>
+        </tr>
+        <tr>
+            <td><h5>state.backend.local-recovery</h5></td>
+            <td style="word-wrap: break-word;">false</td>
+            <td>Boolean</td>
+            <td>This option configures local recovery for this state backend. By default, local recovery is deactivated. Local recovery currently only covers keyed state backends. Currently, MemoryStateBackend does not support local recovery and ignores this option.</td>
+        </tr>
+        <tr>
+            <td><h5>state.checkpoints.num-retained</h5></td>
+            <td style="word-wrap: break-word;">1</td>
+            <td>Integer</td>
+            <td>The maximum number of completed checkpoints to retain.</td>
+        </tr>
+        <tr>
+            <td><h5>taskmanager.state.local.root-dirs</h5></td>
+            <td style="word-wrap: break-word;">(none)</td>
+            <td>String</td>
+            <td>The config parameter defining the root directories for storing file-based state for local recovery. Local recovery currently only covers keyed state backends. Currently, MemoryStateBackend does not support local recovery and ignores this option.</td>
+        </tr>
+    </tbody>
+</table>
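
A minimal checkpointing setup using the options above might look like this in flink-conf.yaml (the HDFS paths are placeholders):

    state.backend: rocksdb
    # Checkpoint data and metadata must be reachable from all TaskManagers and JobManagers.
    state.checkpoints.dir: hdfs:///flink/checkpoints
    state.savepoints.dir: hdfs:///flink/savepoints
    # Store only diffs between checkpoints; backends without support ignore this.
    state.backend.incremental: true
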
diff --git a/docs/_includes/generated/core_configuration.html b/docs/_includes/generated/core_configuration.html
index 60f0520..cf68ddf 100644
--- a/docs/_includes/generated/core_configuration.html
+++ b/docs/_includes/generated/core_configuration.html
@@ -27,6 +27,24 @@
             <td>Defines the class resolution strategy when loading classes from user code, meaning whether to first check the user code jar ("child-first") or the application classpath ("parent-first"). The default settings indicate to load classes first from the user code jar, which means that user code jars can include and load different dependencies than Flink uses (transitively).</td>
         </tr>
         <tr>
+            <td><h5>fs.default-scheme</h5></td>
+            <td style="word-wrap: break-word;">(none)</td>
+            <td>String</td>
+            <td>The default filesystem scheme, used for paths that do not declare a scheme explicitly. May contain an authority, e.g. host:port in case of an HDFS NameNode.</td>
+        </tr>
+        <tr>
+            <td><h5>fs.output.always-create-directory</h5></td>
+            <td style="word-wrap: break-word;">false</td>
+            <td>Boolean</td>
+            <td>File writers running with a parallelism larger than one create a directory for the output file path and put the different result files (one per parallel writer task) into that directory. If this option is set to "true", writers with a parallelism of 1 will also create a directory and place a single result file into it. If the option is set to "false", the writer will create the file directly at the output path, without creating a containing directory.</td>
+        </tr>
+        <tr>
+            <td><h5>fs.overwrite-files</h5></td>
+            <td style="word-wrap: break-word;">false</td>
+            <td>Boolean</td>
+            <td>Specifies whether file output writers should overwrite existing files by default. Set to "true" to overwrite by default, "false" otherwise.</td>
+        </tr>
+        <tr>
             <td><h5>io.tmp.dirs</h5></td>
             <td style="word-wrap: break-word;">'LOCAL_DIRS' on Yarn. '_FLINK_TMP_DIR' on Mesos. System.getProperty("java.io.tmpdir") in standalone.</td>
             <td>String</td>
diff --git a/docs/_includes/generated/deprecated_file_sinks_section.html b/docs/_includes/generated/deprecated_file_sinks_section.html
new file mode 100644
index 0000000..26e3d53
--- /dev/null
+++ b/docs/_includes/generated/deprecated_file_sinks_section.html
@@ -0,0 +1,24 @@
+<table class="table table-bordered">
+    <thead>
+        <tr>
+            <th class="text-left" style="width: 20%">Key</th>
+            <th class="text-left" style="width: 15%">Default</th>
+            <th class="text-left" style="width: 10%">Type</th>
+            <th class="text-left" style="width: 55%">Description</th>
+        </tr>
+    </thead>
+    <tbody>
+        <tr>
+            <td><h5>fs.output.always-create-directory</h5></td>
+            <td style="word-wrap: break-word;">false</td>
+            <td>Boolean</td>
+            <td>File writers running with a parallelism larger than one create a directory for the output file path and put the different result files (one per parallel writer task) into that directory. If this option is set to "true", writers with a parallelism of 1 will also create a directory and place a single result file into it. If the option is set to "false", the writer will create the file directly at the output path, without creating a containing directory.</td>
+        </tr>
+        <tr>
+            <td><h5>fs.overwrite-files</h5></td>
+            <td style="word-wrap: break-word;">false</td>
+            <td>Boolean</td>
+            <td>Specifies whether file output writers should overwrite existing files by default. Set to "true" to overwrite by default, "false" otherwise.</td>
+        </tr>
+    </tbody>
+</table>
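
Since this section documents deprecated cluster-wide defaults, a sketch is mostly of historical interest; both flags go into flink-conf.yaml:

    # Let parallelism-1 writers also create a directory for their single result file.
    fs.output.always-create-directory: true
    # Overwrite existing output files by default.
    fs.overwrite-files: true
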
diff --git a/docs/_includes/generated/core_configuration.html b/docs/_includes/generated/expert_class_loading_section.html
similarity index 79%
copy from docs/_includes/generated/core_configuration.html
copy to docs/_includes/generated/expert_class_loading_section.html
index 60f0520..1732e84 100644
--- a/docs/_includes/generated/core_configuration.html
+++ b/docs/_includes/generated/expert_class_loading_section.html
@@ -26,17 +26,5 @@
             <td>String</td>
             <td>Defines the class resolution strategy when loading classes from user code, meaning whether to first check the user code jar ("child-first") or the application classpath ("parent-first"). The default settings indicate to load classes first from the user code jar, which means that user code jars can include and load different dependencies than Flink uses (transitively).</td>
         </tr>
-        <tr>
-            <td><h5>io.tmp.dirs</h5></td>
-            <td style="word-wrap: break-word;">'LOCAL_DIRS' on Yarn. '_FLINK_TMP_DIR' on Mesos. System.getProperty("java.io.tmpdir") in standalone.</td>
-            <td>String</td>
-            <td>Directories for temporary files, separated by",", "|", or the system's java.io.File.pathSeparator.</td>
-        </tr>
-        <tr>
-            <td><h5>parallelism.default</h5></td>
-            <td style="word-wrap: break-word;">1</td>
-            <td>Integer</td>
-            <td>Default parallelism for jobs.</td>
-        </tr>
     </tbody>
 </table>
diff --git a/docs/_includes/generated/expert_fault_tolerance_section.html b/docs/_includes/generated/expert_fault_tolerance_section.html
new file mode 100644
index 0000000..520717b
--- /dev/null
+++ b/docs/_includes/generated/expert_fault_tolerance_section.html
@@ -0,0 +1,60 @@
+<table class="table table-bordered">
+    <thead>
+        <tr>
+            <th class="text-left" style="width: 20%">Key</th>
+            <th class="text-left" style="width: 15%">Default</th>
+            <th class="text-left" style="width: 10%">Type</th>
+            <th class="text-left" style="width: 55%">Description</th>
+        </tr>
+    </thead>
+    <tbody>
+        <tr>
+            <td><h5>cluster.registration.error-delay</h5></td>
+            <td style="word-wrap: break-word;">10000</td>
+            <td>Long</td>
+            <td>The pause made after a registration attempt caused an exception (other than timeout), in milliseconds.</td>
+        </tr>
+        <tr>
+            <td><h5>cluster.registration.initial-timeout</h5></td>
+            <td style="word-wrap: break-word;">100</td>
+            <td>Long</td>
+            <td>Initial registration timeout between cluster components in milliseconds.</td>
+        </tr>
+        <tr>
+            <td><h5>cluster.registration.max-timeout</h5></td>
+            <td style="word-wrap: break-word;">30000</td>
+            <td>Long</td>
+            <td>Maximum registration timeout between cluster components in milliseconds.</td>
+        </tr>
+        <tr>
+            <td><h5>cluster.registration.refused-registration-delay</h5></td>
+            <td style="word-wrap: break-word;">30000</td>
+            <td>Long</td>
+            <td>The pause made after the registration attempt was refused in milliseconds.</td>
+        </tr>
+        <tr>
+            <td><h5>cluster.services.shutdown-timeout</h5></td>
+            <td style="word-wrap: break-word;">30000</td>
+            <td>Long</td>
+            <td>The shutdown timeout for cluster services like executors in milliseconds.</td>
+        </tr>
+        <tr>
+            <td><h5>heartbeat.interval</h5></td>
+            <td style="word-wrap: break-word;">10000</td>
+            <td>Long</td>
+            <td>Time interval for requesting heartbeat from sender side.</td>
+        </tr>
+        <tr>
+            <td><h5>heartbeat.timeout</h5></td>
+            <td style="word-wrap: break-word;">50000</td>
+            <td>Long</td>
+            <td>Timeout for requesting and receiving heartbeat for both sender and receiver sides.</td>
+        </tr>
+        <tr>
+            <td><h5>jobmanager.execution.failover-strategy</h5></td>
+            <td style="word-wrap: break-word;">region</td>
+            <td>String</td>
+            <td>This option specifies how the job computation recovers from task failures. Accepted values are:<ul><li>'full': Restarts all tasks to recover the job.</li><li>'region': Restarts all tasks that could be affected by the task failure. More details can be found <a href="../dev/task_failure_recovery.html#restart-pipelined-region-failover-strategy">here</a>.</li></ul></td>
+        </tr>
+    </tbody>
+</table>
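
For illustration (the intervals are example values, not tuning advice), the heartbeat and failover options above combine as follows:

    # Send heartbeats every 10s; give up after 60s without an answer.
    heartbeat.interval: 10000
    heartbeat.timeout: 60000
    # Restart only the affected pipelined region instead of the whole job.
    jobmanager.execution.failover-strategy: region
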
diff --git a/docs/_includes/generated/expert_high_availability_section.html b/docs/_includes/generated/expert_high_availability_section.html
new file mode 100644
index 0000000..6ddbeb5
--- /dev/null
+++ b/docs/_includes/generated/expert_high_availability_section.html
@@ -0,0 +1,18 @@
+<table class="table table-bordered">
+    <thead>
+        <tr>
+            <th class="text-left" style="width: 20%">Key</th>
+            <th class="text-left" style="width: 15%">Default</th>
+            <th class="text-left" style="width: 10%">Type</th>
+            <th class="text-left" style="width: 55%">Description</th>
+        </tr>
+    </thead>
+    <tbody>
+        <tr>
+            <td><h5>high-availability.jobmanager.port</h5></td>
+            <td style="word-wrap: break-word;">"0"</td>
+            <td>String</td>
+            <td>Optional port (range) used by the job manager in high-availability mode.</td>
+        </tr>
+    </tbody>
+</table>
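
As a sketch (the range itself is arbitrary), pinning this port to a fixed range can help in firewalled setups:

    # Restrict the JobManager's HA port to a firewall-friendly range.
    high-availability.jobmanager.port: 50100-50120
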
diff --git a/docs/_includes/generated/expert_high_availability_zk_section.html b/docs/_includes/generated/expert_high_availability_zk_section.html
new file mode 100644
index 0000000..d7774e2
--- /dev/null
+++ b/docs/_includes/generated/expert_high_availability_zk_section.html
@@ -0,0 +1,84 @@
+<table class="table table-bordered">
+    <thead>
+        <tr>
+            <th class="text-left" style="width: 20%">Key</th>
+            <th class="text-left" style="width: 15%">Default</th>
+            <th class="text-left" style="width: 10%">Type</th>
+            <th class="text-left" style="width: 55%">Description</th>
+        </tr>
+    </thead>
+    <tbody>
+        <tr>
+            <td><h5>high-availability.zookeeper.client.acl</h5></td>
+            <td style="word-wrap: break-word;">"open"</td>
+            <td>String</td>
+            <td>Defines the ACL (open|creator) to be configured on the ZK node. The configuration value can be set to “creator” if the ZooKeeper server configuration has the “authProvider” property mapped to use SASLAuthenticationProvider and the cluster is configured to run in secure mode (Kerberos).</td>
+        </tr>
+        <tr>
+            <td><h5>high-availability.zookeeper.client.connection-timeout</h5></td>
+            <td style="word-wrap: break-word;">15000</td>
+            <td>Integer</td>
+            <td>Defines the connection timeout for ZooKeeper in ms.</td>
+        </tr>
+        <tr>
+            <td><h5>high-availability.zookeeper.client.max-retry-attempts</h5></td>
+            <td style="word-wrap: break-word;">3</td>
+            <td>Integer</td>
+            <td>Defines the number of connection retries before the client gives up.</td>
+        </tr>
+        <tr>
+            <td><h5>high-availability.zookeeper.client.retry-wait</h5></td>
+            <td style="word-wrap: break-word;">5000</td>
+            <td>Integer</td>
+            <td>Defines the pause between consecutive retries in ms.</td>
+        </tr>
+        <tr>
+            <td><h5>high-availability.zookeeper.client.session-timeout</h5></td>
+            <td style="word-wrap: break-word;">60000</td>
+            <td>Integer</td>
+            <td>Defines the session timeout for the ZooKeeper session in ms.</td>
+        </tr>
+        <tr>
+            <td><h5>high-availability.zookeeper.path.checkpoint-counter</h5></td>
+            <td style="word-wrap: break-word;">"/checkpoint-counter"</td>
+            <td>String</td>
+            <td>ZooKeeper root path (ZNode) for checkpoint counters.</td>
+        </tr>
+        <tr>
+            <td><h5>high-availability.zookeeper.path.checkpoints</h5></td>
+            <td style="word-wrap: break-word;">"/checkpoints"</td>
+            <td>String</td>
+            <td>ZooKeeper root path (ZNode) for completed checkpoints.</td>
+        </tr>
+        <tr>
+            <td><h5>high-availability.zookeeper.path.jobgraphs</h5></td>
+            <td style="word-wrap: break-word;">"/jobgraphs"</td>
+            <td>String</td>
+            <td>ZooKeeper root path (ZNode) for job graphs.</td>
+        </tr>
+        <tr>
+            <td><h5>high-availability.zookeeper.path.latch</h5></td>
+            <td style="word-wrap: break-word;">"/leaderlatch"</td>
+            <td>String</td>
+            <td>Defines the znode of the leader latch which is used to elect the leader.</td>
+        </tr>
+        <tr>
+            <td><h5>high-availability.zookeeper.path.leader</h5></td>
+            <td style="word-wrap: break-word;">"/leader"</td>
+            <td>String</td>
+            <td>Defines the znode of the leader which contains the URL to the leader and the current leader session ID.</td>
+        </tr>
+        <tr>
+            <td><h5>high-availability.zookeeper.path.mesos-workers</h5></td>
+            <td style="word-wrap: break-word;">"/mesos-workers"</td>
+            <td>String</td>
+            <td>The ZooKeeper root path for persisting the Mesos worker information.</td>
+        </tr>
+        <tr>
+            <td><h5>high-availability.zookeeper.path.running-registry</h5></td>
+            <td style="word-wrap: break-word;">"/running_job_registry/"</td>
+            <td>String</td>
+            <td></td>
+        </tr>
+    </tbody>
+</table>
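
A hedged tuning sketch for unstable ZooKeeper connections, using only the options listed above (values are illustrative):

    # Tolerate longer ZooKeeper hiccups before sessions expire.
    high-availability.zookeeper.client.session-timeout: 120000
    high-availability.zookeeper.client.retry-wait: 10000
    high-availability.zookeeper.client.max-retry-attempts: 5
    # Lock down znodes when ZooKeeper runs in secure (Kerberos/SASL) mode.
    high-availability.zookeeper.client.acl: creator
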
diff --git a/docs/_includes/generated/expert_rest_section.html b/docs/_includes/generated/expert_rest_section.html
new file mode 100644
index 0000000..ab1a70c
--- /dev/null
+++ b/docs/_includes/generated/expert_rest_section.html
@@ -0,0 +1,66 @@
+<table class="table table-bordered">
+    <thead>
+        <tr>
+            <th class="text-left" style="width: 20%">Key</th>
+            <th class="text-left" style="width: 15%">Default</th>
+            <th class="text-left" style="width: 10%">Type</th>
+            <th class="text-left" style="width: 55%">Description</th>
+        </tr>
+    </thead>
+    <tbody>
+        <tr>
+            <td><h5>rest.await-leader-timeout</h5></td>
+            <td style="word-wrap: break-word;">30000</td>
+            <td>Long</td>
+            <td>The time in ms that the client waits for the leader address, e.g., Dispatcher or WebMonitorEndpoint.</td>
+        </tr>
+        <tr>
+            <td><h5>rest.client.max-content-length</h5></td>
+            <td style="word-wrap: break-word;">104857600</td>
+            <td>Integer</td>
+            <td>The maximum content length in bytes that the client will handle.</td>
+        </tr>
+        <tr>
+            <td><h5>rest.connection-timeout</h5></td>
+            <td style="word-wrap: break-word;">15000</td>
+            <td>Long</td>
+            <td>The maximum time in ms for the client to establish a TCP connection.</td>
+        </tr>
+        <tr>
+            <td><h5>rest.idleness-timeout</h5></td>
+            <td style="word-wrap: break-word;">300000</td>
+            <td>Long</td>
+            <td>The maximum time in ms for a connection to stay idle before failing.</td>
+        </tr>
+        <tr>
+            <td><h5>rest.retry.delay</h5></td>
+            <td style="word-wrap: break-word;">3000</td>
+            <td>Long</td>
+            <td>The time in ms that the client waits between retries (See also `rest.retry.max-attempts`).</td>
+        </tr>
+        <tr>
+            <td><h5>rest.retry.max-attempts</h5></td>
+            <td style="word-wrap: break-word;">20</td>
+            <td>Integer</td>
+            <td>The number of retries the client will attempt if a retryable operation fails.</td>
+        </tr>
+        <tr>
+            <td><h5>rest.server.max-content-length</h5></td>
+            <td style="word-wrap: break-word;">104857600</td>
+            <td>Integer</td>
+            <td>The maximum content length in bytes that the server will handle.</td>
+        </tr>
+        <tr>
+            <td><h5>rest.server.numThreads</h5></td>
+            <td style="word-wrap: break-word;">4</td>
+            <td>Integer</td>
+            <td>The number of threads for the asynchronous processing of requests.</td>
+        </tr>
+        <tr>
+            <td><h5>rest.server.thread-priority</h5></td>
+            <td style="word-wrap: break-word;">5</td>
+            <td>Integer</td>
+            <td>Thread priority of the REST server's executor for processing asynchronous requests. Lowering the thread priority will give Flink's main components more CPU time, whereas increasing it will allocate more time for the REST server's processing.</td>
+        </tr>
+    </tbody>
+</table>
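
Illustrative only; a client on a slow or flaky network might relax the REST timeouts above like this:

    # Allow 30s to establish the TCP connection.
    rest.connection-timeout: 30000
    rest.retry.max-attempts: 10
    rest.retry.delay: 5000
    # Accept larger payloads (in bytes) on the server side.
    rest.server.max-content-length: 209715200
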
diff --git a/docs/_includes/generated/expert_rocksdb_section.html b/docs/_includes/generated/expert_rocksdb_section.html
new file mode 100644
index 0000000..d3cbfe1
--- /dev/null
+++ b/docs/_includes/generated/expert_rocksdb_section.html
@@ -0,0 +1,36 @@
+<table class="table table-bordered">
+    <thead>
+        <tr>
+            <th class="text-left" style="width: 20%">Key</th>
+            <th class="text-left" style="width: 15%">Default</th>
+            <th class="text-left" style="width: 10%">Type</th>
+            <th class="text-left" style="width: 55%">Description</th>
+        </tr>
+    </thead>
+    <tbody>
+        <tr>
+            <td><h5>state.backend.rocksdb.checkpoint.transfer.thread.num</h5></td>
+            <td style="word-wrap: break-word;">1</td>
+            <td>Integer</td>
+            <td>The number of threads (per stateful operator) used to transfer (download and upload) files in RocksDBStateBackend.</td>
+        </tr>
+        <tr>
+            <td><h5>state.backend.rocksdb.localdir</h5></td>
+            <td style="word-wrap: break-word;">(none)</td>
+            <td>String</td>
+            <td>The local directory (on the TaskManager) where RocksDB puts its files.</td>
+        </tr>
+        <tr>
+            <td><h5>state.backend.rocksdb.options-factory</h5></td>
+            <td style="word-wrap: break-word;">"org.apache.flink.contrib.streaming.state.DefaultConfigurableOptionsFactory"</td>
+            <td>String</td>
+            <td>The options factory class for RocksDB to create DBOptions and ColumnFamilyOptions. The default options factory is org.apache.flink.contrib.streaming.state.DefaultConfigurableOptionsFactory, and it reads the configured options provided in 'RocksDBConfigurableOptions'.</td>
+        </tr>
+        <tr>
+            <td><h5>state.backend.rocksdb.predefined-options</h5></td>
+            <td style="word-wrap: break-word;">"DEFAULT"</td>
+            <td>String</td>
+            <td>The predefined settings for RocksDB DBOptions and ColumnFamilyOptions by the Flink community. The currently supported predefined options are DEFAULT, SPINNING_DISK_OPTIMIZED, SPINNING_DISK_OPTIMIZED_HIGH_MEM or FLASH_SSD_OPTIMIZED. Note that user customized options and options from the OptionsFactory are applied on top of these predefined ones.</td>
+        </tr>
+    </tbody>
+</table>
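
A small sketch that applies one of the predefined profiles and speeds up checkpoint transfer (the local path is a placeholder):

    # Local working directory for RocksDB files on each TaskManager.
    state.backend.rocksdb.localdir: /data/flink/rocksdb
    # Start from a tuned profile; custom options are applied on top of it.
    state.backend.rocksdb.predefined-options: SPINNING_DISK_OPTIMIZED_HIGH_MEM
    # Use more threads per stateful operator to download/upload checkpoint files.
    state.backend.rocksdb.checkpoint.transfer.thread.num: 4
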
diff --git a/docs/_includes/generated/expert_scheduling_section.html b/docs/_includes/generated/expert_scheduling_section.html
new file mode 100644
index 0000000..a29ad69
--- /dev/null
+++ b/docs/_includes/generated/expert_scheduling_section.html
@@ -0,0 +1,30 @@
+<table class="table table-bordered">
+    <thead>
+        <tr>
+            <th class="text-left" style="width: 20%">Key</th>
+            <th class="text-left" style="width: 15%">Default</th>
+            <th class="text-left" style="width: 10%">Type</th>
+            <th class="text-left" style="width: 55%">Description</th>
+        </tr>
+    </thead>
+    <tbody>
+        <tr>
+            <td><h5>cluster.evenly-spread-out-slots</h5></td>
+            <td style="word-wrap: break-word;">false</td>
+            <td>Boolean</td>
+            <td>Enable the slot spread out allocation strategy. This strategy tries to spread out the slots evenly across all available <span markdown="span">`TaskExecutors`</span>.</td>
+        </tr>
+        <tr>
+            <td><h5>slot.idle.timeout</h5></td>
+            <td style="word-wrap: break-word;">50000</td>
+            <td>Long</td>
+            <td>The timeout in milliseconds for an idle slot in the Slot Pool.</td>
+        </tr>
+        <tr>
+            <td><h5>slot.request.timeout</h5></td>
+            <td style="word-wrap: break-word;">300000</td>
+            <td>Long</td>
+            <td>The timeout in milliseconds for requesting a slot from the Slot Pool.</td>
+        </tr>
+    </tbody>
+</table>
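
For example (a judgment call rather than a recommendation), one might spread slots out and fail slow slot requests earlier:

    # Prefer allocating slots evenly across all registered TaskExecutors.
    cluster.evenly-spread-out-slots: true
    # Fail pending slot requests after 2 minutes instead of the 5-minute default.
    slot.request.timeout: 120000
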
diff --git a/docs/_includes/generated/expert_security_ssl_section.html b/docs/_includes/generated/expert_security_ssl_section.html
new file mode 100644
index 0000000..9e5af5f
--- /dev/null
+++ b/docs/_includes/generated/expert_security_ssl_section.html
@@ -0,0 +1,42 @@
+<table class="table table-bordered">
+    <thead>
+        <tr>
+            <th class="text-left" style="width: 20%">Key</th>
+            <th class="text-left" style="width: 15%">Default</th>
+            <th class="text-left" style="width: 10%">Type</th>
+            <th class="text-left" style="width: 55%">Description</th>
+        </tr>
+    </thead>
+    <tbody>
+        <tr>
+            <td><h5>security.ssl.internal.close-notify-flush-timeout</h5></td>
+            <td style="word-wrap: break-word;">-1</td>
+            <td>Integer</td>
+            <td>The timeout (in ms) for flushing the `close_notify` that was triggered by closing a channel. If the `close_notify` was not flushed in the given timeout the channel will be closed forcibly. (-1 = use system default)</td>
+        </tr>
+        <tr>
+            <td><h5>security.ssl.internal.handshake-timeout</h5></td>
+            <td style="word-wrap: break-word;">-1</td>
+            <td>Integer</td>
+            <td>The timeout (in ms) during SSL handshake. (-1 = use system default)</td>
+        </tr>
+        <tr>
+            <td><h5>security.ssl.internal.session-cache-size</h5></td>
+            <td style="word-wrap: break-word;">-1</td>
+            <td>Integer</td>
+            <td>The size of the cache used for storing SSL session objects. According to https://github.com/netty/netty/issues/832, you should always set this to an appropriate number to not run into a bug with stalling IO threads during garbage collection. (-1 = use system default).</td>
+        </tr>
+        <tr>
+            <td><h5>security.ssl.internal.session-timeout</h5></td>
+            <td style="word-wrap: break-word;">-1</td>
+            <td>Integer</td>
+            <td>The timeout (in ms) for the cached SSL session objects. (-1 = use system default)</td>
+        </tr>
+        <tr>
+            <td><h5>security.ssl.provider</h5></td>
+            <td style="word-wrap: break-word;">"JDK"</td>
+            <td>String</td>
+            <td>The SSL engine provider to use for the ssl transport:<ul><li><span markdown="span">`JDK`</span>: default Java-based SSL engine</li><li><span markdown="span">`OPENSSL`</span>: openSSL-based SSL engine using system libraries</li></ul><span markdown="span">`OPENSSL`</span> is based on <a href="http://netty.io/wiki/forked-tomcat-native.html#wiki-h2-4">netty-tcnative</a> and comes in two flavours:<ul><li>dynamically linked: This will use your system's openSSL libraries (if com [...]
+        </tr>
+    </tbody>
+</table>
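
A hedged sketch switching the SSL engine and bounding the session cache, per the notes above (values illustrative):

    # netty-tcnative based engine; requires the native OpenSSL library to be available.
    security.ssl.provider: OPENSSL
    # Bound the session cache to avoid the stalling-IO issue from netty#832.
    security.ssl.internal.session-cache-size: 10000
    security.ssl.internal.handshake-timeout: 30000
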
diff --git a/docs/_includes/generated/expert_state_backends_section.html b/docs/_includes/generated/expert_state_backends_section.html
new file mode 100644
index 0000000..9d50be1
--- /dev/null
+++ b/docs/_includes/generated/expert_state_backends_section.html
@@ -0,0 +1,30 @@
+<table class="table table-bordered">
+    <thead>
+        <tr>
+            <th class="text-left" style="width: 20%">Key</th>
+            <th class="text-left" style="width: 15%">Default</th>
+            <th class="text-left" style="width: 10%">Type</th>
+            <th class="text-left" style="width: 55%">Description</th>
+        </tr>
+    </thead>
+    <tbody>
+        <tr>
+            <td><h5>state.backend.async</h5></td>
+            <td style="word-wrap: break-word;">true</td>
+            <td>Boolean</td>
+            <td>Option whether the state backend should use an asynchronous snapshot method where possible and configurable. Some state backends may not support asynchronous snapshots, or only support asynchronous snapshots, and ignore this option.</td>
+        </tr>
+        <tr>
+            <td><h5>state.backend.fs.memory-threshold</h5></td>
+            <td style="word-wrap: break-word;">1024</td>
+            <td>Integer</td>
+            <td>The minimum size of state data files. All state chunks smaller than that are stored inline in the root checkpoint metadata file.</td>
+        </tr>
+        <tr>
+            <td><h5>state.backend.fs.write-buffer-size</h5></td>
+            <td style="word-wrap: break-word;">4096</td>
+            <td>Integer</td>
+            <td>The default size of the write buffer for the checkpoint streams that write to file systems. The actual write buffer size is determined to be the maximum of the value of this option and option 'state.backend.fs.memory-threshold'.</td>
+        </tr>
+    </tbody>
+</table>
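
Illustrative values for the filesystem-backend knobs above (both are byte counts):

    # Store state chunks smaller than 100 KiB inline in the checkpoint metadata file.
    state.backend.fs.memory-threshold: 102400
    # Use a larger write buffer for checkpoint streams.
    state.backend.fs.write-buffer-size: 65536
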
diff --git a/docs/_includes/generated/high_availability_configuration.html b/docs/_includes/generated/high_availability_configuration.html
index bf30e91..86573d9 100644
--- a/docs/_includes/generated/high_availability_configuration.html
+++ b/docs/_includes/generated/high_availability_configuration.html
@@ -32,5 +32,89 @@
             <td>String</td>
             <td>File system path (URI) where Flink persists metadata in high-availability setups.</td>
         </tr>
+        <tr>
+            <td><h5>high-availability.zookeeper.client.acl</h5></td>
+            <td style="word-wrap: break-word;">"open"</td>
+            <td>String</td>
+            <td>Defines the ACL (open|creator) to be configured on the ZK node. The configuration value can be set to “creator” if the ZooKeeper server configuration has the “authProvider” property mapped to use SASLAuthenticationProvider and the cluster is configured to run in secure mode (Kerberos).</td>
+        </tr>
+        <tr>
+            <td><h5>high-availability.zookeeper.client.connection-timeout</h5></td>
+            <td style="word-wrap: break-word;">15000</td>
+            <td>Integer</td>
+            <td>Defines the connection timeout for ZooKeeper in ms.</td>
+        </tr>
+        <tr>
+            <td><h5>high-availability.zookeeper.client.max-retry-attempts</h5></td>
+            <td style="word-wrap: break-word;">3</td>
+            <td>Integer</td>
+            <td>Defines the number of connection retries before the client gives up.</td>
+        </tr>
+        <tr>
+            <td><h5>high-availability.zookeeper.client.retry-wait</h5></td>
+            <td style="word-wrap: break-word;">5000</td>
+            <td>Integer</td>
+            <td>Defines the pause between consecutive retries in ms.</td>
+        </tr>
+        <tr>
+            <td><h5>high-availability.zookeeper.client.session-timeout</h5></td>
+            <td style="word-wrap: break-word;">60000</td>
+            <td>Integer</td>
+            <td>Defines the session timeout for the ZooKeeper session in ms.</td>
+        </tr>
+        <tr>
+            <td><h5>high-availability.zookeeper.path.checkpoint-counter</h5></td>
+            <td style="word-wrap: break-word;">"/checkpoint-counter"</td>
+            <td>String</td>
+            <td>ZooKeeper root path (ZNode) for checkpoint counters.</td>
+        </tr>
+        <tr>
+            <td><h5>high-availability.zookeeper.path.checkpoints</h5></td>
+            <td style="word-wrap: break-word;">"/checkpoints"</td>
+            <td>String</td>
+            <td>ZooKeeper root path (ZNode) for completed checkpoints.</td>
+        </tr>
+        <tr>
+            <td><h5>high-availability.zookeeper.path.jobgraphs</h5></td>
+            <td style="word-wrap: break-word;">"/jobgraphs"</td>
+            <td>String</td>
+            <td>ZooKeeper root path (ZNode) for job graphs.</td>
+        </tr>
+        <tr>
+            <td><h5>high-availability.zookeeper.path.latch</h5></td>
+            <td style="word-wrap: break-word;">"/leaderlatch"</td>
+            <td>String</td>
+            <td>Defines the znode of the leader latch which is used to elect the leader.</td>
+        </tr>
+        <tr>
+            <td><h5>high-availability.zookeeper.path.leader</h5></td>
+            <td style="word-wrap: break-word;">"/leader"</td>
+            <td>String</td>
+            <td>Defines the znode of the leader which contains the URL to the leader and the current leader session ID.</td>
+        </tr>
+        <tr>
+            <td><h5>high-availability.zookeeper.path.mesos-workers</h5></td>
+            <td style="word-wrap: break-word;">"/mesos-workers"</td>
+            <td>String</td>
+            <td>The ZooKeeper root path for persisting the Mesos worker information.</td>
+        </tr>
+        <tr>
+            <td><h5>high-availability.zookeeper.path.root</h5></td>
+            <td style="word-wrap: break-word;">"/flink"</td>
+            <td>String</td>
+            <td>The root path under which Flink stores its entries in ZooKeeper.</td>
+        </tr>
+        <tr>
+            <td><h5>high-availability.zookeeper.path.running-registry</h5></td>
+            <td style="word-wrap: break-word;">"/running_job_registry/"</td>
+            <td>String</td>
+            <td></td>
+        </tr>
+        <tr>
+            <td><h5>high-availability.zookeeper.quorum</h5></td>
+            <td style="word-wrap: break-word;">(none)</td>
+            <td>String</td>
+            <td>The ZooKeeper quorum to use, when running Flink in a high-availability mode with ZooKeeper.</td>
+        </tr>
     </tbody>
 </table>
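
Putting the HA options together, a typical ZooKeeper-based setup (hosts and paths are placeholders) looks roughly like:

    high-availability: zookeeper
    high-availability.zookeeper.quorum: zk1:2181,zk2:2181,zk3:2181
    # Metadata storage must be a filesystem path reachable from all nodes.
    high-availability.storageDir: hdfs:///flink/ha/
    # Isolate this cluster's znodes under a dedicated root.
    high-availability.zookeeper.path.root: /flink
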
diff --git a/docs/_includes/generated/job_manager_configuration.html b/docs/_includes/generated/job_manager_configuration.html
index 76671e5..ef9c565 100644
--- a/docs/_includes/generated/job_manager_configuration.html
+++ b/docs/_includes/generated/job_manager_configuration.html
@@ -22,7 +22,7 @@
         </tr>
         <tr>
             <td><h5>jobmanager.execution.failover-strategy</h5></td>
-            <td style="word-wrap: break-word;">"full"</td>
+            <td style="word-wrap: break-word;">region</td>
             <td>String</td>
             <td>This option specifies how the job computation recovers from task failures. Accepted values are:<ul><li>'full': Restarts all tasks to recover the job.</li><li>'region': Restarts all tasks that could be affected by the task failure. More details can be found <a href="../dev/task_failure_recovery.html#restart-pipelined-region-failover-strategy">here</a>.</li></ul></td>
         </tr>
diff --git a/docs/_includes/generated/netty_shuffle_environment_configuration.html b/docs/_includes/generated/netty_shuffle_environment_configuration.html
index 8a5b3ba..901a40b 100644
--- a/docs/_includes/generated/netty_shuffle_environment_configuration.html
+++ b/docs/_includes/generated/netty_shuffle_environment_configuration.html
@@ -51,6 +51,48 @@
             <td>Number of extra network buffers to use for each outgoing/incoming gate (result partition/input gate). In credit-based flow control mode, this indicates how many floating credits are shared among all the input channels. The floating buffers are distributed based on backlog (real-time output buffers in the subpartition) feedback, and can help relieve back-pressure caused by unbalanced data distribution among the subpartitions. This value should be increased in case of highe [...]
         </tr>
         <tr>
+            <td><h5>taskmanager.network.netty.client.connectTimeoutSec</h5></td>
+            <td style="word-wrap: break-word;">120</td>
+            <td>Integer</td>
+            <td>The Netty client connection timeout.</td>
+        </tr>
+        <tr>
+            <td><h5>taskmanager.network.netty.client.numThreads</h5></td>
+            <td style="word-wrap: break-word;">-1</td>
+            <td>Integer</td>
+            <td>The number of Netty client threads.</td>
+        </tr>
+        <tr>
+            <td><h5>taskmanager.network.netty.num-arenas</h5></td>
+            <td style="word-wrap: break-word;">-1</td>
+            <td>Integer</td>
+            <td>The number of Netty arenas.</td>
+        </tr>
+        <tr>
+            <td><h5>taskmanager.network.netty.sendReceiveBufferSize</h5></td>
+            <td style="word-wrap: break-word;">0</td>
+            <td>Integer</td>
+            <td>The Netty send and receive buffer size. This defaults to the system buffer size (cat /proc/sys/net/ipv4/tcp_[rw]mem) and is 4 MiB on modern Linux.</td>
+        </tr>
+        <tr>
+            <td><h5>taskmanager.network.netty.server.backlog</h5></td>
+            <td style="word-wrap: break-word;">0</td>
+            <td>Integer</td>
+            <td>The netty server connection backlog.</td>
+        </tr>
+        <tr>
+            <td><h5>taskmanager.network.netty.server.numThreads</h5></td>
+            <td style="word-wrap: break-word;">-1</td>
+            <td>Integer</td>
+            <td>The number of Netty server threads.</td>
+        </tr>
+        <tr>
+            <td><h5>taskmanager.network.netty.transport</h5></td>
+            <td style="word-wrap: break-word;">"auto"</td>
+            <td>String</td>
+            <td>The Netty transport type, either "nio" or "epoll". "auto" selects the proper mode automatically based on the platform. Note that the "epoll" mode can achieve better performance and less GC, and offers advanced features that are only available on modern Linux.</td>
+        </tr>
+        <tr>
             <td><h5>taskmanager.network.request-backoff.initial</h5></td>
             <td style="word-wrap: break-word;">100</td>
             <td>Integer</td>
diff --git a/docs/_includes/generated/resource_manager_configuration.html b/docs/_includes/generated/resource_manager_configuration.html
index 0e7192f..c1955f8 100644
--- a/docs/_includes/generated/resource_manager_configuration.html
+++ b/docs/_includes/generated/resource_manager_configuration.html
@@ -21,12 +21,6 @@
             <td>Percentage of heap space to remove from Job Master containers (YARN / Mesos / Kubernetes), to compensate for other JVM memory usage.</td>
         </tr>
         <tr>
-            <td><h5>local.number-resourcemanager</h5></td>
-            <td style="word-wrap: break-word;">1</td>
-            <td>Integer</td>
-            <td>The number of resource managers start.</td>
-        </tr>
-        <tr>
             <td><h5>resourcemanager.job.timeout</h5></td>
             <td style="word-wrap: break-word;">"5 minutes"</td>
             <td>String</td>
diff --git a/docs/_includes/generated/security_auth_kerberos_section.html b/docs/_includes/generated/security_auth_kerberos_section.html
new file mode 100644
index 0000000..cf27917
--- /dev/null
+++ b/docs/_includes/generated/security_auth_kerberos_section.html
@@ -0,0 +1,36 @@
+<table class="table table-bordered">
+    <thead>
+        <tr>
+            <th class="text-left" style="width: 20%">Key</th>
+            <th class="text-left" style="width: 15%">Default</th>
+            <th class="text-left" style="width: 10%">Type</th>
+            <th class="text-left" style="width: 55%">Description</th>
+        </tr>
+    </thead>
+    <tbody>
+        <tr>
+            <td><h5>security.kerberos.login.contexts</h5></td>
+            <td style="word-wrap: break-word;">(none)</td>
+            <td>String</td>
+            <td>A comma-separated list of login contexts to provide the Kerberos credentials to (for example, `Client,KafkaClient` to use the credentials for ZooKeeper authentication and for Kafka authentication).</td>
+        </tr>
+        <tr>
+            <td><h5>security.kerberos.login.keytab</h5></td>
+            <td style="word-wrap: break-word;">(none)</td>
+            <td>String</td>
+            <td>Absolute path to a Kerberos keytab file that contains the user credentials.</td>
+        </tr>
+        <tr>
+            <td><h5>security.kerberos.login.principal</h5></td>
+            <td style="word-wrap: break-word;">(none)</td>
+            <td>String</td>
+            <td>Kerberos principal name associated with the keytab.</td>
+        </tr>
+        <tr>
+            <td><h5>security.kerberos.login.use-ticket-cache</h5></td>
+            <td style="word-wrap: break-word;">true</td>
+            <td>Boolean</td>
+            <td>Indicates whether to read from your Kerberos ticket cache.</td>
+        </tr>
+    </tbody>
+</table>
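
A keytab-based login sketch (the keytab path and principal are hypothetical):

    security.kerberos.login.keytab: /etc/security/keytabs/flink.keytab
    security.kerberos.login.principal: flink/worker@EXAMPLE.COM
    # Hand the credentials to ZooKeeper ("Client") and Kafka ("KafkaClient").
    security.kerberos.login.contexts: Client,KafkaClient
    security.kerberos.login.use-ticket-cache: false
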
diff --git a/docs/_includes/generated/security_auth_zk_section.html b/docs/_includes/generated/security_auth_zk_section.html
new file mode 100644
index 0000000..7aaaaff
--- /dev/null
+++ b/docs/_includes/generated/security_auth_zk_section.html
@@ -0,0 +1,30 @@
+<table class="table table-bordered">
+    <thead>
+        <tr>
+            <th class="text-left" style="width: 20%">Key</th>
+            <th class="text-left" style="width: 15%">Default</th>
+            <th class="text-left" style="width: 10%">Type</th>
+            <th class="text-left" style="width: 55%">Description</th>
+        </tr>
+    </thead>
+    <tbody>
+        <tr>
+            <td><h5>zookeeper.sasl.disable</h5></td>
+            <td style="word-wrap: break-word;">false</td>
+            <td>Boolean</td>
+            <td></td>
+        </tr>
+        <tr>
+            <td><h5>zookeeper.sasl.login-context-name</h5></td>
+            <td style="word-wrap: break-word;">"Client"</td>
+            <td>String</td>
+            <td></td>
+        </tr>
+        <tr>
+            <td><h5>zookeeper.sasl.service-name</h5></td>
+            <td style="word-wrap: break-word;">"zookeeper"</td>
+            <td>String</td>
+            <td></td>
+        </tr>
+    </tbody>
+</table>
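
These SASL options mirror the ZooKeeper client defaults; an explicit sketch, assuming a secured ensemble:

    # Must match the service name of the secured ZooKeeper ensemble.
    zookeeper.sasl.service-name: zookeeper
    # JAAS login context to use for the ZooKeeper client.
    zookeeper.sasl.login-context-name: Client
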
diff --git a/docs/_includes/generated/security_configuration.html b/docs/_includes/generated/security_configuration.html
index bdf49f2..caeaa33 100644
--- a/docs/_includes/generated/security_configuration.html
+++ b/docs/_includes/generated/security_configuration.html
@@ -9,6 +9,30 @@
     </thead>
     <tbody>
         <tr>
+            <td><h5>security.kerberos.login.contexts</h5></td>
+            <td style="word-wrap: break-word;">(none)</td>
+            <td>String</td>
+            <td>A comma-separated list of login contexts to provide the Kerberos credentials to (for example, `Client,KafkaClient` to use the credentials for ZooKeeper authentication and for Kafka authentication).</td>
+        </tr>
+        <tr>
+            <td><h5>security.kerberos.login.keytab</h5></td>
+            <td style="word-wrap: break-word;">(none)</td>
+            <td>String</td>
+            <td>Absolute path to a Kerberos keytab file that contains the user credentials.</td>
+        </tr>
+        <tr>
+            <td><h5>security.kerberos.login.principal</h5></td>
+            <td style="word-wrap: break-word;">(none)</td>
+            <td>String</td>
+            <td>Kerberos principal name associated with the keytab.</td>
+        </tr>
+        <tr>
+            <td><h5>security.kerberos.login.use-ticket-cache</h5></td>
+            <td style="word-wrap: break-word;">true</td>
+            <td>Boolean</td>
+            <td>Indicates whether to read from your Kerberos ticket cache.</td>
+        </tr>
+        <tr>
             <td><h5>security.ssl.algorithms</h5></td>
             <td style="word-wrap: break-word;">"TLS_RSA_WITH_AES_128_CBC_SHA"</td>
             <td>String</td>
@@ -81,24 +105,6 @@
             <td>The password to decrypt the truststore for Flink's internal endpoints (rpc, data transport, blob server).</td>
         </tr>
         <tr>
-            <td><h5>security.ssl.key-password</h5></td>
-            <td style="word-wrap: break-word;">(none)</td>
-            <td>String</td>
-            <td>The secret to decrypt the server key in the keystore.</td>
-        </tr>
-        <tr>
-            <td><h5>security.ssl.keystore</h5></td>
-            <td style="word-wrap: break-word;">(none)</td>
-            <td>String</td>
-            <td>The Java keystore file to be used by the flink endpoint for its SSL Key and Certificate.</td>
-        </tr>
-        <tr>
-            <td><h5>security.ssl.keystore-password</h5></td>
-            <td style="word-wrap: break-word;">(none)</td>
-            <td>String</td>
-            <td>The secret to decrypt the keystore file.</td>
-        </tr>
-        <tr>
             <td><h5>security.ssl.protocol</h5></td>
             <td style="word-wrap: break-word;">"TLSv1.2"</td>
             <td>String</td>
@@ -159,22 +165,28 @@
             <td>The password to decrypt the truststore for Flink's external REST endpoints.</td>
         </tr>
         <tr>
-            <td><h5>security.ssl.truststore</h5></td>
-            <td style="word-wrap: break-word;">(none)</td>
-            <td>String</td>
-            <td>The truststore file containing the public CA certificates to be used by flink endpoints to verify the peer’s certificate.</td>
+            <td><h5>security.ssl.verify-hostname</h5></td>
+            <td style="word-wrap: break-word;">true</td>
+            <td>Boolean</td>
+            <td>Flag to enable peer’s hostname verification during ssl handshake.</td>
         </tr>
         <tr>
-            <td><h5>security.ssl.truststore-password</h5></td>
-            <td style="word-wrap: break-word;">(none)</td>
+            <td><h5>zookeeper.sasl.disable</h5></td>
+            <td style="word-wrap: break-word;">false</td>
+            <td>Boolean</td>
+            <td></td>
+        </tr>
+        <tr>
+            <td><h5>zookeeper.sasl.login-context-name</h5></td>
+            <td style="word-wrap: break-word;">"Client"</td>
             <td>String</td>
-            <td>The secret to decrypt the truststore.</td>
+            <td></td>
         </tr>
         <tr>
-            <td><h5>security.ssl.verify-hostname</h5></td>
-            <td style="word-wrap: break-word;">true</td>
-            <td>Boolean</td>
-            <td>Flag to enable peer’s hostname verification during ssl handshake.</td>
+            <td><h5>zookeeper.sasl.service-name</h5></td>
+            <td style="word-wrap: break-word;">"zookeeper"</td>
+            <td>String</td>
+            <td></td>
         </tr>
     </tbody>
 </table>
diff --git a/docs/_includes/generated/security_configuration.html b/docs/_includes/generated/security_ssl_section.html
similarity index 60%
copy from docs/_includes/generated/security_configuration.html
copy to docs/_includes/generated/security_ssl_section.html
index bdf49f2..ae345d8 100644
--- a/docs/_includes/generated/security_configuration.html
+++ b/docs/_includes/generated/security_ssl_section.html
@@ -21,24 +21,12 @@
             <td>The sha1 fingerprint of the internal certificate. This further protects the internal communication to present the exact certificate used by Flink. This is necessary where one cannot use a private (self-signed) CA or where an internal firm-wide CA is required.</td>
         </tr>
         <tr>
-            <td><h5>security.ssl.internal.close-notify-flush-timeout</h5></td>
-            <td style="word-wrap: break-word;">-1</td>
-            <td>Integer</td>
-            <td>The timeout (in ms) for flushing the `close_notify` that was triggered by closing a channel. If the `close_notify` was not flushed in the given timeout the channel will be closed forcibly. (-1 = use system default)</td>
-        </tr>
-        <tr>
             <td><h5>security.ssl.internal.enabled</h5></td>
             <td style="word-wrap: break-word;">false</td>
             <td>Boolean</td>
             <td>Turns on SSL for internal network communication. Optionally, specific components may override this through their own settings (rpc, data transport, REST, etc).</td>
         </tr>
         <tr>
-            <td><h5>security.ssl.internal.handshake-timeout</h5></td>
-            <td style="word-wrap: break-word;">-1</td>
-            <td>Integer</td>
-            <td>The timeout (in ms) during SSL handshake. (-1 = use system default)</td>
-        </tr>
-        <tr>
             <td><h5>security.ssl.internal.key-password</h5></td>
             <td style="word-wrap: break-word;">(none)</td>
             <td>String</td>
@@ -57,18 +45,6 @@
            <td>The secret to decrypt the keystore file for Flink's internal endpoints (rpc, data transport, blob server).</td>
         </tr>
         <tr>
-            <td><h5>security.ssl.internal.session-cache-size</h5></td>
-            <td style="word-wrap: break-word;">-1</td>
-            <td>Integer</td>
-            <td>The size of the cache used for storing SSL session objects. According to https://github.com/netty/netty/issues/832, you should always set this to an appropriate number to not run into a bug with stalling IO threads during garbage collection. (-1 = use system default).</td>
-        </tr>
-        <tr>
-            <td><h5>security.ssl.internal.session-timeout</h5></td>
-            <td style="word-wrap: break-word;">-1</td>
-            <td>Integer</td>
-            <td>The timeout (in ms) for the cached SSL session objects. (-1 = use system default)</td>
-        </tr>
-        <tr>
             <td><h5>security.ssl.internal.truststore</h5></td>
             <td style="word-wrap: break-word;">(none)</td>
             <td>String</td>
@@ -81,36 +57,12 @@
             <td>The password to decrypt the truststore for Flink's internal endpoints (rpc, data transport, blob server).</td>
         </tr>
         <tr>
-            <td><h5>security.ssl.key-password</h5></td>
-            <td style="word-wrap: break-word;">(none)</td>
-            <td>String</td>
-            <td>The secret to decrypt the server key in the keystore.</td>
-        </tr>
-        <tr>
-            <td><h5>security.ssl.keystore</h5></td>
-            <td style="word-wrap: break-word;">(none)</td>
-            <td>String</td>
-            <td>The Java keystore file to be used by the flink endpoint for its SSL Key and Certificate.</td>
-        </tr>
-        <tr>
-            <td><h5>security.ssl.keystore-password</h5></td>
-            <td style="word-wrap: break-word;">(none)</td>
-            <td>String</td>
-            <td>The secret to decrypt the keystore file.</td>
-        </tr>
-        <tr>
             <td><h5>security.ssl.protocol</h5></td>
             <td style="word-wrap: break-word;">"TLSv1.2"</td>
             <td>String</td>
             <td>The SSL protocol version to be supported for the ssl transport. Note that it doesn’t support comma separated list.</td>
         </tr>
         <tr>
-            <td><h5>security.ssl.provider</h5></td>
-            <td style="word-wrap: break-word;">"JDK"</td>
-            <td>String</td>
-            <td>The SSL engine provider to use for the ssl transport:<ul><li><span markdown="span">`JDK`</span>: default Java-based SSL engine</li><li><span markdown="span">`OPENSSL`</span>: openSSL-based SSL engine using system libraries</li></ul><span markdown="span">`OPENSSL`</span> is based on <a href="http://netty.io/wiki/forked-tomcat-native.html#wiki-h2-4">netty-tcnative</a> and comes in two flavours:<ul><li>dynamically linked: This will use your system's openSSL libraries (if com [...]
-        </tr>
-        <tr>
             <td><h5>security.ssl.rest.authentication-enabled</h5></td>
             <td style="word-wrap: break-word;">false</td>
             <td>Boolean</td>
@@ -159,18 +111,6 @@
             <td>The password to decrypt the truststore for Flink's external REST endpoints.</td>
         </tr>
         <tr>
-            <td><h5>security.ssl.truststore</h5></td>
-            <td style="word-wrap: break-word;">(none)</td>
-            <td>String</td>
-            <td>The truststore file containing the public CA certificates to be used by flink endpoints to verify the peer’s certificate.</td>
-        </tr>
-        <tr>
-            <td><h5>security.ssl.truststore-password</h5></td>
-            <td style="word-wrap: break-word;">(none)</td>
-            <td>String</td>
-            <td>The secret to decrypt the truststore.</td>
-        </tr>
-        <tr>
             <td><h5>security.ssl.verify-hostname</h5></td>
             <td style="word-wrap: break-word;">true</td>
             <td>Boolean</td>
diff --git a/docs/_includes/generated/state_backend_rocksdb_section.html b/docs/_includes/generated/state_backend_rocksdb_section.html
new file mode 100644
index 0000000..974c5c1
--- /dev/null
+++ b/docs/_includes/generated/state_backend_rocksdb_section.html
@@ -0,0 +1,42 @@
+<table class="table table-bordered">
+    <thead>
+        <tr>
+            <th class="text-left" style="width: 20%">Key</th>
+            <th class="text-left" style="width: 15%">Default</th>
+            <th class="text-left" style="width: 10%">Type</th>
+            <th class="text-left" style="width: 55%">Description</th>
+        </tr>
+    </thead>
+    <tbody>
+        <tr>
+            <td><h5>state.backend.rocksdb.memory.fixed-per-slot</h5></td>
+            <td style="word-wrap: break-word;">(none)</td>
+            <td>MemorySize</td>
+            <td>The fixed total amount of memory, shared among all RocksDB instances per slot. This option overrides the 'state.backend.rocksdb.memory.managed' option when configured. If neither this option nor the 'state.backend.rocksdb.memory.managed' option are set, then each RocksDB column family state has its own memory caches (as controlled by the column family options).</td>
+        </tr>
+        <tr>
+            <td><h5>state.backend.rocksdb.memory.high-prio-pool-ratio</h5></td>
+            <td style="word-wrap: break-word;">0.1</td>
+            <td>Double</td>
+            <td>The fraction of cache memory that is reserved for high-priority data like index, filter, and compression dictionary blocks. This option only has an effect when 'state.backend.rocksdb.memory.managed' or 'state.backend.rocksdb.memory.fixed-per-slot' are configured.</td>
+        </tr>
+        <tr>
+            <td><h5>state.backend.rocksdb.memory.managed</h5></td>
+            <td style="word-wrap: break-word;">true</td>
+            <td>Boolean</td>
+            <td>If set, the RocksDB state backend will automatically configure itself to use the managed memory budget of the task slot, and divide the memory over write buffers, indexes, block caches, etc. That way, the three major uses of memory of RocksDB will be capped.</td>
+        </tr>
+        <tr>
+            <td><h5>state.backend.rocksdb.memory.write-buffer-ratio</h5></td>
+            <td style="word-wrap: break-word;">0.5</td>
+            <td>Double</td>
+            <td>The maximum amount of memory that write buffers may take, as a fraction of the total shared memory. This option only has an effect when 'state.backend.rocksdb.memory.managed' or 'state.backend.rocksdb.memory.fixed-per-slot' are configured.</td>
+        </tr>
+        <tr>
+            <td><h5>state.backend.rocksdb.timer-service.factory</h5></td>
+            <td style="word-wrap: break-word;">"ROCKSDB"</td>
+            <td>String</td>
+            <td>This determines the factory for timer service state implementation. Options are either HEAP (heap-based) or ROCKSDB for an implementation based on RocksDB.</td>
+        </tr>
+    </tbody>
+</table>
diff --git a/docs/_includes/generated/task_manager_configuration.html b/docs/_includes/generated/task_manager_configuration.html
index fed0f21..db02dee 100644
--- a/docs/_includes/generated/task_manager_configuration.html
+++ b/docs/_includes/generated/task_manager_configuration.html
@@ -27,12 +27,6 @@
             <td>Time we wait for the timers in milliseconds to finish all pending timer threads when the stream task is cancelled.</td>
         </tr>
         <tr>
-            <td><h5>task.checkpoint.alignment.max-size</h5></td>
-            <td style="word-wrap: break-word;">-1</td>
-            <td>Long</td>
-            <td>The maximum number of bytes that a checkpoint alignment may buffer. If the checkpoint alignment buffers more than the configured amount of data, the checkpoint is aborted (skipped). A value of -1 indicates that there is no limit.</td>
-        </tr>
-        <tr>
             <td><h5>taskmanager.debug.memory.log</h5></td>
             <td style="word-wrap: break-word;">false</td>
             <td>Boolean</td>
diff --git a/docs/_includes/generated/web_configuration.html b/docs/_includes/generated/web_configuration.html
index 11a6968..f377f51 100644
--- a/docs/_includes/generated/web_configuration.html
+++ b/docs/_includes/generated/web_configuration.html
@@ -15,12 +15,6 @@
             <td>Access-Control-Allow-Origin header for all responses from the web-frontend.</td>
         </tr>
         <tr>
-            <td><h5>web.address</h5></td>
-            <td style="word-wrap: break-word;">(none)</td>
-            <td>String</td>
-            <td>Address for runtime monitor web-frontend server.</td>
-        </tr>
-        <tr>
             <td><h5>web.backpressure.cleanup-interval</h5></td>
             <td style="word-wrap: break-word;">600000</td>
             <td>Integer</td>
@@ -69,12 +63,6 @@
             <td>Refresh interval for the web-frontend in milliseconds.</td>
         </tr>
         <tr>
-            <td><h5>web.ssl.enabled</h5></td>
-            <td style="word-wrap: break-word;">true</td>
-            <td>Boolean</td>
-            <td>Flag indicating whether to override SSL support for the JobManager Web UI.</td>
-        </tr>
-        <tr>
             <td><h5>web.submit.enable</h5></td>
             <td style="word-wrap: break-word;">true</td>
             <td>Boolean</td>
diff --git a/docs/ops/config.md b/docs/ops/config.md
index 0a42ec7..032faa5 100644
--- a/docs/ops/config.md
+++ b/docs/ops/config.md
@@ -23,213 +23,411 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-**For single-node setups Flink is ready to go out of the box and you don't need to change the default configuration to get started.**
+All configuration is done in `conf/flink-conf.yaml`, which is expected to be a flat collection of [YAML key value pairs](http://www.yaml.org/spec/1.2/spec.html) with format `key: value`.
+
+The configuration is parsed and evaluated when the Flink processes are started. Changes to the configuration file require restarting the relevant processes.
 
 The out of the box configuration will use your default Java installation. You can manually set the environment variable `JAVA_HOME` or the configuration key `env.java.home` in `conf/flink-conf.yaml` if you want to manually override the Java runtime to use.
 
-This page lists the most common options that are typically needed to set up a well performing (distributed) installation. In addition a full list of all available configuration parameters is listed here.
+* This will be replaced by the TOC
+{:toc}
 
-All configuration is done in `conf/flink-conf.yaml`, which is expected to be a flat collection of [YAML key value pairs](http://www.yaml.org/spec/1.2/spec.html) with format `key: value`.
+# Basic Setup
 
-The system and run scripts parse the config at startup time. Changes to the configuration file require restarting the Flink JobManager and TaskManagers.
+The default configuration supports starting a single-node Flink session cluster without any changes.
+The options in this section are the ones most commonly needed for a basic distributed Flink setup.
 
-The configuration files for the TaskManagers can be different, Flink does not assume uniform machines in the cluster.
+**Hostnames / Ports**
 
-* This will be replaced by the TOC
-{:toc}
+These options are only necessary for *standalone* application- or session deployments ([simple standalone]({{site.baseurl}}/ops/deployment/cluster_setup.html) or [Kubernetes]({{site.baseurl}}/ops/deployment/kubernetes.html)).
 
-## Common Options
+If you use Flink with [Yarn]({{site.baseurl}}/ops/deployment/yarn_setup.html), [Mesos]({{site.baseurl}}/ops/deployment/mesos.html), or the [*active* Kubernetes integration]({{site.baseurl}}/ops/deployment/native_kubernetes.html), the hostnames and ports are automatically discovered.
 
-{% include generated/common_section.html %}
+  - `rest.address`, `rest.port`: These are used by the client to connect to Flink. Set this to the hostname where the master (JobManager) runs, or to the hostname of the (Kubernetes) service in front of the Flink Master's REST interface.
 
-## Full Reference
+  - The `jobmanager.rpc.address` (defaults to *"localhost"*) and `jobmanager.rpc.port` (defaults to *6123*) config entries are used by the TaskManager to connect to the JobManager/ResourceManager. Set this to the hostname where the master (JobManager) runs, or to the hostname of the (Kubernetes internal) service for the Flink master (JobManager). This option is ignored on [setups with high-availability]({{site.baseurl}}/ops/jobmanager_high_availability.html) where the leader election mec [...]
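+
+For illustration, the same entries can be set on a client-side `Configuration` object (a minimal sketch; the hostname and port values are placeholders):
+
+{% highlight java %}
+import org.apache.flink.configuration.Configuration;
+
+Configuration conf = new Configuration();
+// Hostname/port of the service in front of the Flink Master's REST interface.
+conf.setString("rest.address", "flink-jobmanager.example.com");
+conf.setInteger("rest.port", 8081);
+{% endhighlight %}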
 
-### Core
+**Memory Sizes** 
 
-{% include generated/core_configuration.html %}
+The default memory sizes support simple streaming/batch applications, but are too low to yield good performance for more complex applications.
 
-### Execution
+  - `jobmanager.heap.size`: Sets the size of the *Flink Master* (JobManager / ResourceManager / Dispatcher) JVM heap.
+  - `taskmanager.memory.process.size`: Total size of the TaskManager process, including everything. Flink will subtract some memory for the JVM's own memory requirements (metaspace and others), and divide and configure the rest automatically between its components (network, managed memory, JVM Heap, etc.).
 
-{% include generated/deployment_configuration.html %}
-{% include generated/savepoint_config_configuration.html %}
-{% include generated/execution_configuration.html %}
+These values are configured as memory sizes, for example *1536m* or *2g*.
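+
+For example, as plain `Configuration` entries mirroring `flink-conf.yaml` (a sketch; the sizes are placeholders, not recommendations):
+
+{% highlight java %}
+Configuration conf = new Configuration();
+conf.setString("jobmanager.heap.size", "1600m");          // Flink Master JVM heap
+conf.setString("taskmanager.memory.process.size", "4g");  // total TaskManager process size
+{% endhighlight %}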
+
+**Parallelism**
+
+  - `taskmanager.numberOfTaskSlots`: The number of slots that a TaskManager offers *(default: 1)*. Each slot can take one task or pipeline.
+    Having multiple slots in a TaskManager can help amortize certain constant overheads (of the JVM, application libraries, or network connections) across parallel tasks or pipelines.
+
+     Running more, smaller TaskManagers with one slot each is a good starting point and leads to the best isolation between tasks. Dedicating the same resources to fewer larger TaskManagers with more slots can help to increase resource utilization, at the cost of weaker isolation between the tasks (more tasks share the same JVM).
+
+  - `parallelism.default`: The default parallelism used when no parallelism is specified anywhere *(default: 1)*.
+
+**Checkpointing**
+
+You can configure checkpointing directly in code within your Flink job or application. Putting these values here in the configuration defines them as defaults in case the application does not configure anything.
+
+  - `state.backend`: The state backend to use. This defines the data structure mechanism for taking snapshots. Common values are `filesystem` or `rocksdb`.
+  - `state.checkpoints.dir`: The directory to write checkpoints to. This takes a path URI like *s3://mybucket/flink-app/checkpoints* or *hdfs://namenode:port/flink/checkpoints*.
+  - `state.savepoints.dir`: The default directory for savepoints. Takes a path URI, similar to `state.checkpoints.dir`.
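+
+For example, a job could configure checkpointing in code rather than relying on these defaults (a minimal sketch; the interval and checkpoint URI are hypothetical):
+
+{% highlight java %}
+import org.apache.flink.runtime.state.filesystem.FsStateBackend;
+import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
+
+StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
+// Trigger a checkpoint every 10 seconds.
+env.enableCheckpointing(10_000);
+// Equivalent to 'state.backend: filesystem' with 'state.checkpoints.dir' set.
+env.setStateBackend(new FsStateBackend("s3://mybucket/flink-app/checkpoints"));
+{% endhighlight %}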
 
-### JobManager
+**Web UI**
 
-{% include generated/job_manager_configuration.html %}
+  - `web.submit.enable`: Enables uploading and starting jobs through the Flink UI *(true by default)*. Please note that even when this is disabled, session clusters still accept jobs through REST requests (HTTP calls). This flag only guards the feature to upload jobs in the UI.
+  - `web.upload.dir`: The directory where to store uploaded jobs. Only used when `web.submit.enable` is true.
 
-### Restart Strategies
+**Other**
 
-Configuration options to control Flink's restart behaviour in case of job failures.
+  - `io.tmp.dirs`: The directories where Flink puts local data. This defaults to the system temp directory (`java.io.tmpdir` property). If a list of directories is configured, Flink will rotate files across the directories.
+    
+    The data put in these directories includes, by default, the files created by RocksDB, spilled intermediate results (batch algorithms), and cached jar files.
+    
+    This data is NOT relied upon for persistence/recovery, but if this data gets deleted, it typically causes a heavyweight recovery operation. It is hence recommended to set this to a directory that is not automatically periodically purged.
+    
+    Yarn, Mesos, and Kubernetes setups automatically configure this value to the local working directories by default.
+
+----
+----
+
+# Common Setup Options
+
+*Common options to configure your Flink application or cluster.*
+
+### Hosts and Ports
+
+Options to configure hostnames and ports for the different Flink components.
+
+The JobManager hostname and port are only relevant for standalone setups without high-availability.
+In that setup, the config values are used by the TaskManagers to find (and connect to) the JobManager.
+In all highly-available setups, the TaskManagers discover the JobManager via the high-availability service (for example, ZooKeeper).
+
+Setups using resource orchestration frameworks (K8s, Yarn, Mesos) typically use the framework's service discovery facilities.
+
+You do not need to configure any TaskManager hosts and ports, unless the setup requires the use of specific port ranges or specific network interfaces to bind to.
+
+{% include generated/common_host_port_section.html %}
+
+### Fault Tolerance
+
+These configuration options control Flink's restart behaviour in case of failures during the execution. 
+By configuring these options in your `flink-conf.yaml`, you define the cluster's default restart strategy. 
+
+The default restart strategy will only take effect if no job-specific restart strategy has been configured via the `ExecutionConfig`.
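+
+For example, a job could override the cluster default via the `ExecutionConfig` (a sketch; the attempt count and delay are hypothetical):
+
+{% highlight java %}
+import org.apache.flink.api.common.restartstrategy.RestartStrategies;
+import org.apache.flink.api.common.time.Time;
+import java.util.concurrent.TimeUnit;
+
+// 'env' is a StreamExecutionEnvironment; retry up to 3 times, 10 seconds apart.
+env.setRestartStrategy(RestartStrategies.fixedDelayRestart(3, Time.of(10, TimeUnit.SECONDS)));
+{% endhighlight %}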
 
 {% include generated/restart_strategy_configuration.html %}
 
-#### Fixed Delay Restart Strategy
+**Fixed Delay Restart Strategy**
 
 {% include generated/fixed_delay_restart_strategy_configuration.html %}
 
-#### Failure Rate Restart Strategy
+**Failure Rate Restart Strategy**
 
 {% include generated/failure_rate_restart_strategy_configuration.html %}
 
-### TaskManager
+### Checkpoints and State Backends
 
-{% include generated/task_manager_configuration.html %}
+These options control the basic setup of state backends and checkpointing behavior.
 
-For *batch* jobs Flink allocates a fraction of 0.4 of the total flink memory (configured via taskmanager.memory.flink.size) for its managed memory. Managed memory helps Flink to run the batch operators efficiently. It prevents OutOfMemoryExceptions because Flink knows how much memory it can use to execute operations. If Flink runs out of managed memory, it utilizes disk space. Using managed memory, some operations can be performed directly on the raw data without having to deserialize th [...]
+The options are only relevant for jobs/applications executing in a continuous streaming fashion.
+Jobs/applications executing in a batch fashion do not use state backends and checkpoints, but different internal data structures that are optimized for batch processing.
 
-The default fraction for managed memory can be adjusted using the taskmanager.memory.managed.fraction parameter. An absolute value may be set using taskmanager.memory.managed.size (overrides the fraction parameter). If desired, the managed memory may be allocated outside the JVM heap. This may improve performance in setups with large memory sizes.
+{% include generated/common_state_backends_section.html %}
 
-{% include generated/task_manager_memory_configuration.html %}
+### High Availability
 
-### Distributed Coordination
+High-availability here refers to the ability of the master (JobManager) process to recover from failures.
 
-{% include generated/cluster_configuration.html %}
+The JobManager ensures consistency during recovery across TaskManagers. For the JobManager itself to recover consistently, an external service must store a minimal amount of recovery metadata (like "ID of last committed checkpoint"), as well as help to elect and lock which JobManager is the leader (to avoid split-brain situations).
 
-### Distributed Coordination (via Akka)
+{% include generated/common_high_availability_section.html %}
 
-{% include generated/akka_configuration.html %}
+**Options for high-availability setups with ZooKeeper**
 
-### REST
+{% include generated/common_high_availability_zk_section.html %}
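+
+A typical ZooKeeper-based HA configuration might contain entries like the following, shown here as `Configuration` calls for illustration (the quorum addresses and storage path are hypothetical):
+
+{% highlight java %}
+Configuration conf = new Configuration();
+conf.setString("high-availability", "zookeeper");
+conf.setString("high-availability.zookeeper.quorum", "zk-1:2181,zk-2:2181,zk-3:2181");
+// Durable storage for the recovery metadata; ZooKeeper only holds pointers to it.
+conf.setString("high-availability.storageDir", "hdfs:///flink/ha/");
+{% endhighlight %}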
 
-{% include generated/rest_configuration.html %}
+### Memory Configuration
 
-### Blob Server
+These configuration values control the way that TaskManagers and JobManagers use memory.
 
-{% include generated/blob_server_configuration.html %}
+Flink tries to shield users as much as possible from the complexity of configuring the JVM for data-intensive processing.
+In most cases, users should only need to set the values `taskmanager.memory.process.size` or `taskmanager.memory.flink.size` (depending on the setup), and possibly adjust the ratio of JVM heap and Managed Memory via `taskmanager.memory.managed.fraction`. The other options below can be used for performance tuning and for fixing memory-related errors.
 
-### Heartbeat Manager
+For a detailed explanation of how these options interact, see the [documentation on TaskManager memory configuration]({{site.baseurl}}/ops/memory_configuration.html).
 
-{% include generated/heartbeat_manager_configuration.html %}
+{% include generated/common_memory_section.html %}
 
-### SSL Settings
+### Miscellaneous Options
 
-{% include generated/security_configuration.html %}
+{% include generated/common_miscellaneous_section.html %}
 
-### Netty Shuffle Environment
+----
+----
 
-{% include generated/netty_shuffle_environment_configuration.html %}
+# Security
 
-### Network Communication (via Netty)
+Options for configuring Flink's security and secure interaction with external systems.
 
-These parameters allow for advanced tuning. The default values are sufficient when running concurrent high-throughput jobs on a large cluster.
+### SSL
 
-{% include generated/network_netty_configuration.html %}
+Flink's network connections can be secured via SSL. Please refer to the [SSL Setup Docs]({{site.baseurl}}/ops/security-ssl.html) for a detailed setup guide and background.
 
-### Web Frontend
+{% include generated/security_ssl_section.html %}
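+
+For example, turning on SSL for internal connectivity could look like this (a sketch; the keystore path and secrets are placeholders):
+
+{% highlight java %}
+Configuration conf = new Configuration();
+conf.setBoolean("security.ssl.internal.enabled", true);
+// For internal connectivity, keystore and truststore typically point to the same file.
+conf.setString("security.ssl.internal.keystore", "/path/to/conf/internal.keystore");
+conf.setString("security.ssl.internal.truststore", "/path/to/conf/internal.keystore");
+conf.setString("security.ssl.internal.keystore-password", "store_password");
+conf.setString("security.ssl.internal.truststore-password", "store_password");
+conf.setString("security.ssl.internal.key-password", "key_password");
+{% endhighlight %}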
 
-{% include generated/web_configuration.html %}
 
-### File Systems
+### Auth with External Systems
 
-{% include generated/file_system_configuration.html %}
+**ZooKeeper Authentication / Authorization**
 
-### Compiler/Optimizer
+These options are necessary when connecting to a secured ZooKeeper quorum.
 
-{% include generated/optimizer_configuration.html %}
+{% include generated/security_auth_zk_section.html %}
 
-### Runtime Algorithms
+**Kerberos-based Authentication / Authorization**
 
-{% include generated/algorithm_configuration.html %}
+Please refer to the [Flink and Kerberos Docs]({{site.baseurl}}/ops/security-kerberos.html) for a setup guide and a list of external systems to which Flink can authenticate itself via Kerberos.
 
-### Resource Manager
+{% include generated/security_auth_kerberos_section.html %}
 
-The configuration keys in this section are independent of the used resource management framework (YARN, Mesos, Standalone, ...)
+----
+----
 
-{% include generated/resource_manager_configuration.html %}
+# Resource Orchestration Frameworks
+
+This section contains options related to integrating Flink with resource orchestration frameworks, like Kubernetes, Yarn, Mesos, etc.
 
-### Shuffle Service
+Note that it is not always necessary to integrate Flink with the resource orchestration framework.
+For example, you can easily deploy Flink applications on Kubernetes without Flink knowing that it runs on Kubernetes (and without specifying any of the Kubernetes config options here). See [this setup guide]({{site.baseurl}}/ops/deployment/kubernetes.html) for an example.
 
-{% include generated/shuffle_service_configuration.html %}
+The options in this section are necessary for setups where Flink itself actively requests and releases resources from the orchestrators.
 
 ### YARN
 
 {% include generated/yarn_config_configuration.html %}
 
+### Kubernetes
+
+{% include generated/kubernetes_config_configuration.html %}
+
 ### Mesos
 
 {% include generated/mesos_configuration.html %}
 
-#### Mesos TaskManager
+**Mesos TaskManager**
 
 {% include generated/mesos_task_manager_configuration.html %}
 
-### Kubernetes
+----
+----
 
-{% include generated/kubernetes_config_configuration.html %}
+# State Backends
 
-### High Availability (HA)
+Please refer to the [State Backend Documentation]({{site.baseurl}}/ops/state/state_backends.html) for background on State Backends.
 
-{% include generated/high_availability_configuration.html %}
+### RocksDB State Backend
 
-#### ZooKeeper-based HA Mode
+These are the options commonly needed to configure the RocksDB state backend. See the [Advanced RocksDB Backend Section](#advanced-rocksdb-state-backends-options) for options necessary for advanced low-level configuration and troubleshooting.
 
-{% include generated/high_availability_zookeeper_configuration.html %}
+{% include generated/state_backend_rocksdb_section.html %}
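+
+For example, an application can select the RocksDB backend in code (a sketch; the checkpoint URI is hypothetical). With `state.backend.rocksdb.memory.managed` left at its default of `true`, the backend stays within the slot's managed memory budget:
+
+{% highlight java %}
+import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
+
+// The constructor may throw IOException; 'env' is a StreamExecutionEnvironment.
+RocksDBStateBackend backend = new RocksDBStateBackend("hdfs://namenode:port/flink/checkpoints");
+env.setStateBackend(backend);
+{% endhighlight %}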
 
-### ZooKeeper Security
+----
+----
 
-{% include generated/zoo_keeper_configuration.html %}
+# Metrics
 
-### Kerberos-based Security
+Please refer to the [metrics system documentation]({{site.baseurl}}/monitoring/metrics.html) for background on Flink's metrics infrastructure.
 
-{% include generated/kerberos_configuration.html %}
+{% include generated/metric_configuration.html %}
 
-### Environment
+### RocksDB Native Metrics
 
-{% include generated/environment_configuration.html %}
+Flink can report metrics from RocksDB's native code, for applications using the RocksDB state backend.
+The metrics here are scoped to the operators and then further broken down by column family; values are reported as unsigned longs. 
 
-### Pipeline
+<div class="alert alert-warning">
+  <strong>Note:</strong> Enabling RocksDB's native metrics may cause degraded performance and should be set carefully. 
+</div>
 
-{% include generated/pipeline_configuration.html %}
-{% include generated/stream_pipeline_configuration.html %}
+{% include generated/rocks_db_native_metric_configuration.html %}
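+
+Each native metric is switched on by its own boolean option. A sketch (assuming the option key below exists in your Flink version):
+
+{% highlight java %}
+Configuration conf = new Configuration();
+// Report RocksDB's estimated number of keys per column family.
+conf.setBoolean("state.backend.rocksdb.metrics.estimate-num-keys", true);
+{% endhighlight %}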
 
-### Checkpointing
+----
+----
 
-{% include generated/checkpointing_configuration.html %}
-{% include generated/execution_checkpointing_configuration.html %}
+# History Server
 
-### RocksDB State Backend
+The history server keeps information about completed jobs (graphs, runtimes, statistics). To enable it, you have to enable "job archiving" in the JobManager (`jobmanager.archive.fs.dir`).
 
-{% include generated/rocks_db_configuration.html %}
+See the [History Server Docs]({{site.baseurl}}/monitoring/historyserver.html) for details.
 
-### RocksDB Configurable Options
-Specific RocksDB configurable options, provided by Flink, to create a corresponding `ConfigurableOptionsFactory`.
-And the created one would be used as default `OptionsFactory` in `RocksDBStateBackend`
-unless user define a `OptionsFactory` and set via `RocksDBStateBackend.setOptions(optionsFactory)`
+{% include generated/history_server_configuration.html %}
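+
+A minimal pairing of the two directories (a sketch; the paths are placeholders):
+
+{% highlight java %}
+Configuration conf = new Configuration();
+// Where the JobManager archives completed jobs ...
+conf.setString("jobmanager.archive.fs.dir", "hdfs:///flink/completed-jobs/");
+// ... and where the history server looks for them.
+conf.setString("historyserver.archive.fs.dir", "hdfs:///flink/completed-jobs/");
+{% endhighlight %}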
 
-{% include generated/rocks_db_configurable_configuration.html %}
+----
+----
+
+# Experimental
+
+*Options for experimental features in Flink.*
 
 ### Queryable State
 
+*Queryable State* is an experimental feature that lets you access Flink's internal state like a key/value store.
+See the [Queryable State Docs]({{site.baseurl}}/dev/stream/state/queryable_state.html) for details.
+
 {% include generated/queryable_state_configuration.html %}
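+
+A client-side lookup could look like the following sketch (the job ID, key, and state descriptor are assumed to be defined elsewhere; the proxy is assumed to listen on the default port 9069):
+
+{% highlight java %}
+import java.util.concurrent.CompletableFuture;
+import org.apache.flink.api.common.state.ValueState;
+import org.apache.flink.api.common.typeinfo.BasicTypeInfo;
+import org.apache.flink.queryablestate.client.QueryableStateClient;
+
+// The constructor may throw UnknownHostException.
+QueryableStateClient client = new QueryableStateClient("proxy-host", 9069);
+CompletableFuture<ValueState<Long>> result =
+    client.getKvState(jobId, "query-name", key, BasicTypeInfo.STRING_TYPE_INFO, stateDescriptor);
+{% endhighlight %}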
 
-### Metrics
+----
+----
 
-{% include generated/metric_configuration.html %}
-
-### RocksDB Native Metrics
-Certain RocksDB native metrics may be forwarded to Flink's metrics reporter.
-All native metrics are scoped to operators and then further broken down by column family; values are reported as unsigned longs. 
+# Debugging & Expert Tuning
 
 <div class="alert alert-warning">
-  <strong>Note:</strong> Enabling native metrics may cause degraded performance and should be set carefully. 
+  The options below are meant for expert users and for fixing/debugging problems. Most setups should not need to configure these options.
 </div>
 
-{% include generated/rocks_db_native_metric_configuration.html %}
+### Class Loading
 
-### History Server
+Flink dynamically loads the code for jobs submitted to a session cluster. In addition, Flink tries to hide many dependencies in the classpath from the application. This helps to reduce dependency conflicts between the application code and the dependencies in the classpath.
 
-You have to configure `jobmanager.archive.fs.dir` in order to archive terminated jobs and add it to the list of monitored directories via `historyserver.archive.fs.dir` if you want to display them via the HistoryServer's web frontend.
+Please refer to the [Debugging Classloading Docs]({{site.baseurl}}/monitoring/debugging_classloading.html) for details.
 
-- `jobmanager.archive.fs.dir`: Directory to upload information about terminated jobs to. You have to add this directory to the list of monitored directories of the history server via `historyserver.archive.fs.dir`.
+{% include generated/expert_class_loading_section.html %}
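+
+For example, when debugging a dependency conflict, the resolution order can be flipped to parent-first (a sketch using the `classloader.resolve-order` option):
+
+{% highlight java %}
+Configuration conf = new Configuration();
+conf.setString("classloader.resolve-order", "parent-first");  // the default is "child-first"
+{% endhighlight %}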
 
-{% include generated/history_server_configuration.html %}
+### Advanced State Backends Options
+
+{% include generated/expert_state_backends_section.html %}
+
+### Advanced RocksDB State Backends Options
+
+Advanced options to tune RocksDB and RocksDB checkpoints.
+
+{% include generated/expert_rocksdb_section.html %}
+
+**RocksDB Configurable Options**
 
-## Legacy
+These options give fine-grained control over the behavior and resources of ColumnFamilies.
+With the introduction of `state.backend.rocksdb.memory.managed` and `state.backend.rocksdb.memory.fixed-per-slot` (Apache Flink 1.10), it should only be necessary to use the options here for advanced performance tuning. These options can also be specified in the application program via `RocksDBStateBackend.setOptions(OptionsFactory)`.
+
+{% include generated/rocks_db_configurable_configuration.html %}
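+
+A sketch of the programmatic route mentioned above (`backend` is a `RocksDBStateBackend`; the tuning values are illustrative only):
+
+{% highlight java %}
+import org.apache.flink.contrib.streaming.state.OptionsFactory;
+import org.rocksdb.ColumnFamilyOptions;
+import org.rocksdb.DBOptions;
+
+backend.setOptions(new OptionsFactory() {
+    @Override
+    public DBOptions createDBOptions(DBOptions currentOptions) {
+        return currentOptions.setMaxBackgroundJobs(4);
+    }
+
+    @Override
+    public ColumnFamilyOptions createColumnFamilyOptions(ColumnFamilyOptions currentOptions) {
+        return currentOptions.setWriteBufferSize(64 * 1024 * 1024);
+    }
+});
+{% endhighlight %}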
+
+### Advanced Fault Tolerance Options
+
+*These parameters can help with problems related to failover and to components erroneously considering each other as failed.*
+
+{% include generated/expert_fault_tolerance_section.html %}
+
+### Advanced Scheduling Options
+
+*These parameters can help with fine-tuning scheduling for specific situations.*
+
+{% include generated/expert_scheduling_section.html %}
+
+### Advanced High-availability Options
+
+{% include generated/expert_high_availability_section.html %}
+
+### Advanced High-availability ZooKeeper Options
+
+{% include generated/expert_high_availability_zk_section.html %}
+
+### Advanced SSL Security Options
+
+{% include generated/expert_security_ssl_section.html %}
+
+### Advanced Options for the REST endpoint and Client
+
+{% include generated/expert_rest_section.html %}
+
+### Advanced Options for Flink Web UI
+
+{% include generated/web_configuration.html %}
+
+### Full Flink Master Options
+
+**Master / JobManager**
+
+{% include generated/all_jobmanager_section.html %}
+
+**Blob Server**
+
+The Blob Server is a component in the Flink Master / JobManager. It is used for distribution of objects that are too large to be attached to an RPC message and that benefit from caching (like Jar files or large serialized code objects).
+
+{% include generated/blob_server_configuration.html %}
+
+**ResourceManager**
+
+These configuration keys control basic Resource Manager behavior, independent of the resource orchestration framework in use (YARN, Mesos, etc.).
+
+{% include generated/resource_manager_configuration.html %}
+
+### Full TaskManager Options
+
+{% include generated/all_taskmanager_section.html %}
+
+**Data Transport Network Stack**
+
+These options are for the network stack that handles the streaming and batch data exchanges between TaskManagers.
+
+{% include generated/all_taskmanager_network_section.html %}
+
+### RPC / Akka
+
+Flink uses Akka for RPC between components (JobManager/TaskManager/ResourceManager).
+Flink does not use Akka for data transport.
+
+{% include generated/akka_configuration.html %}
+
+----
+----
+
+# Startup Script Environment
+
+{% include generated/environment_configuration.html %}
+
+----
+----
+
+# Deprecated Options
+
+These options relate to parts of Flink that are not actively developed any more.
+These options may be removed in a future release.
+
+**DataSet API Optimizer**
+
+{% include generated/optimizer_configuration.html %}
+
+**DataSet API Runtime Algorithms**
+
+{% include generated/algorithm_configuration.html %}
+
+**DataSet File Sinks**
+
+{% include generated/deprecated_file_sinks_section.html %}
+
+----
+----
+
+# Backup
+
+#### Execution
+
+{% include generated/deployment_configuration.html %}
+{% include generated/savepoint_config_configuration.html %}
+{% include generated/execution_configuration.html %}
+
+#### Pipeline
+
+{% include generated/pipeline_configuration.html %}
+{% include generated/stream_pipeline_configuration.html %}
+
+#### Checkpointing
+
+{% include generated/execution_checkpointing_configuration.html %}
 
-- `mode`: Execution mode of Flink. Possible values are `legacy` and `new`. In order to start the legacy components, you have to specify `legacy` (DEFAULT: `new`).
+----
+----
 
-## Background
+# Background
 
 ### Configuring the Network Buffers
 
diff --git a/docs/ops/config.zh.md b/docs/ops/config.zh.md
index 8b7febc..6b70ec5 100644
--- a/docs/ops/config.zh.md
+++ b/docs/ops/config.zh.md
@@ -23,217 +23,411 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-**For single-node setups Flink is ready to go out of the box and you don't need to change the default configuration to get started.**
+All configuration is done in `conf/flink-conf.yaml`, which is expected to be a flat collection of [YAML key value pairs](http://www.yaml.org/spec/1.2/spec.html) with format `key: value`.
+
+The configuration is parsed and evaluated when the Flink processes are started. Changes to the configuration file require restarting the relevant processes.
 
 The out of the box configuration will use your default Java installation. You can manually set the environment variable `JAVA_HOME` or the configuration key `env.java.home` in `conf/flink-conf.yaml` if you want to manually override the Java runtime to use.
 
-This page lists the most common options that are typically needed to set up a well performing (distributed) installation. In addition a full list of all available configuration parameters is listed here.
+* This will be replaced by the TOC
+{:toc}
 
-All configuration is done in `conf/flink-conf.yaml`, which is expected to be a flat collection of [YAML key value pairs](http://www.yaml.org/spec/1.2/spec.html) with format `key: value`.
+# Basic Setup
 
-The system and run scripts parse the config at startup time. Changes to the configuration file require restarting the Flink JobManager and TaskManagers.
+The default configuration supports starting a single-node Flink session cluster without any changes.
+The options in this section are the ones most commonly needed for a basic distributed Flink setup.
 
-The configuration files for the TaskManagers can be different, Flink does not assume uniform machines in the cluster.
+**Hostnames / Ports**
 
-* This will be replaced by the TOC
-{:toc}
+These options are only necessary for *standalone* application- or session deployments ([simple standalone]({{site.baseurl}}/ops/deployment/cluster_setup.html) or [Kubernetes]({{site.baseurl}}/ops/deployment/kubernetes.html)).
 
-## Common Options
+If you use Flink with [Yarn]({{site.baseurl}}/ops/deployment/yarn_setup.html), [Mesos]({{site.baseurl}}/ops/deployment/mesos.html), or the [*active* Kubernetes integration]({{site.baseurl}}/ops/deployment/native_kubernetes.html), the hostnames and ports are automatically discovered.
 
-{% include generated/common_section.html %}
+  - `rest.address`, `rest.port`: These are used by the client to connect to Flink. Set this to the hostname where the master (JobManager) runs, or to the hostname of the (Kubernetes) service in front of the Flink Master's REST interface.
 
-## Full Reference
+  - The `jobmanager.rpc.address` (defaults to *"localhost"*) and `jobmanager.rpc.port` (defaults to *6123*) config entries are used by the TaskManager to connect to the JobManager/ResourceManager. Set this to the hostname where the master (JobManager) runs, or to the hostname of the (Kubernetes internal) service for the Flink master (JobManager). This option is ignored on [setups with high-availability]({{site.baseurl}}/ops/jobmanager_high_availability.html) where the leader election mec [...]
 
-### Core
+**Memory Sizes** 
 
-{% include generated/core_configuration.html %}
+The default memory sizes support simple streaming/batch applications, but are too low to yield good performance for more complex applications.
 
-### Execution
+  - `jobmanager.heap.size`: Sets the size of the *Flink Master* (JobManager / ResourceManager / Dispatcher) JVM heap.
+  - `taskmanager.memory.process.size`: Total size of the TaskManager process, including everything. Flink will subtract some memory for the JVM's own memory requirements (metaspace and others), and divide and configure the rest automatically between its components (network, managed memory, JVM Heap, etc.).
 
-{% include generated/deployment_configuration.html %}
-{% include generated/savepoint_config_configuration.html %}
-{% include generated/execution_configuration.html %}
+These values are configured as memory sizes, for example *1536m* or *2g*.
+
+**Parallelism**
+
+  - `taskmanager.numberOfTaskSlots`: The number of slots that a TaskManager offers *(default: 1)*. Each slot can take one task or pipeline.
+    Having multiple slots in a TaskManager can help amortize certain constant overheads (of the JVM, application libraries, or network connections) across parallel tasks or pipelines.
+
+     Running more, smaller TaskManagers with one slot each is a good starting point and leads to the best isolation between tasks. Dedicating the same resources to fewer larger TaskManagers with more slots can help to increase resource utilization, at the cost of weaker isolation between the tasks (more tasks share the same JVM).
+
+  - `parallelism.default`: The default parallelism used when no parallelism is specified anywhere *(default: 1)*.
+
+**Checkpointing**
+
+You can configure checkpointing directly in code within your Flink job or application. Putting these values here in the configuration defines them as defaults in case the application does not configure anything.
+
+  - `state.backend`: The state backend to use. This defines the data structure mechanism for taking snapshots. Common values are `filesystem` or `rocksdb`.
+  - `state.checkpoints.dir`: The directory to write checkpoints to. This takes a path URI like *s3://mybucket/flink-app/checkpoints* or *hdfs://namenode:port/flink/checkpoints*.
+  - `state.savepoints.dir`: The default directory for savepoints. Takes a path URI, similar to `state.checkpoints.dir`.
 
-### JobManager
+**Web UI**
 
-{% include generated/job_manager_configuration.html %}
+  - `web.submit.enable`: Enables uploading and starting jobs through the Flink UI *(true by default)*. Please note that even when this is disabled, session clusters still accept jobs through REST requests (HTTP calls). This flag only guards the feature to upload jobs in the UI.
+  - `web.upload.dir`: The directory where to store uploaded jobs. Only used when `web.submit.enable` is true.
 
-### Restart Strategies
+**Other**
 
-Configuration options to control Flink's restart behaviour in case of job failures.
+  - `io.tmp.dirs`: The directories where Flink puts local data. This defaults to the system temp directory (`java.io.tmpdir` property). If a list of directories is configured, Flink will rotate files across the directories.
+    
+    The data put in these directories includes, by default, the files created by RocksDB, spilled intermediate results (batch algorithms), and cached jar files.
+    
+    This data is NOT relied upon for persistence/recovery, but if this data gets deleted, it typically causes a heavyweight recovery operation. It is hence recommended to set this to a directory that is not automatically periodically purged.
+    
+    Yarn, Mesos, and Kubernetes setups automatically configure this value to the local working directories by default.
+
+----
+----
+
+# Common Setup Options
+
+*Common options to configure your Flink application or cluster.*
+
+### Hosts and Ports
+
+Options to configure hostnames and ports for the different Flink components.
+
+The JobManager hostname and port are only relevant for standalone setups without high-availability.
+In that setup, the config values are used by the TaskManagers to find (and connect to) the JobManager.
+In all highly-available setups, the TaskManagers discover the JobManager via the high-availability service (for example, ZooKeeper).
+
+Setups using resource orchestration frameworks (K8s, Yarn, Mesos) typically use the framework's service discovery facilities.
+
+You do not need to configure any TaskManager hosts and ports, unless the setup requires the use of specific port ranges or specific network interfaces to bind to.
+
+{% include generated/common_host_port_section.html %}
+
+### Fault Tolerance
+
+These configuration options control Flink's restart behaviour in case of failures during the execution. 
+By configuring these options in your `flink-conf.yaml`, you define the cluster's default restart strategy. 
+
+The default restart strategy will only take effect if no job-specific restart strategy has been configured via the `ExecutionConfig`.
 
 {% include generated/restart_strategy_configuration.html %}
 
-#### Fixed Delay Restart Strategy
+**Fixed Delay Restart Strategy**
 
 {% include generated/fixed_delay_restart_strategy_configuration.html %}
 
-#### Failure Rate Restart Strategy
+**Failure Rate Restart Strategy**
 
 {% include generated/failure_rate_restart_strategy_configuration.html %}
 
-### TaskManager
+### Checkpoints and State Backends
 
-{% include generated/task_manager_configuration.html %}
+These options control the basic setup of state backends and checkpointing behavior.
 
-For *batch* jobs Flink allocates a fraction of 0.4 of the total flink memory (configured via taskmanager.memory.flink.size) for its managed memory. Managed memory helps Flink to run the batch operators efficiently. It prevents OutOfMemoryExceptions because Flink knows how much memory it can use to execute operations. If Flink runs out of managed memory, it utilizes disk space. Using managed memory, some operations can be performed directly on the raw data without having to deserialize th [...]
+The options are only relevant for jobs/applications executing in a continuous streaming fashion.
+Jobs/applications executing in a batch fashion do not use state backends and checkpoints, but different internal data structures that are optimized for batch processing.
 
-The default fraction for managed memory can be adjusted using the taskmanager.memory.managed.fraction parameter. An absolute value may be set using taskmanager.memory.managed.size (overrides the fraction parameter). If desired, the managed memory may be allocated outside the JVM heap. This may improve performance in setups with large memory sizes.
+{% include generated/common_state_backends_section.html %}
 
-{% include generated/task_manager_memory_configuration.html %}
+### High Availability
 
-### Distributed Coordination
+High-availability here refers to the ability of the master (JobManager) process to recover from failures.
 
-{% include generated/cluster_configuration.html %}
+The JobManager ensures consistency during recovery across TaskManagers. For the JobManager itself to recover consistently, an external service must store a minimal amount of recovery metadata (like "ID of last committed checkpoint"), as well as help to elect and lock which JobManager is the leader (to avoid split-brain situations).
 
-### Distributed Coordination (via Akka)
+{% include generated/common_high_availability_section.html %}
 
-{% include generated/akka_configuration.html %}
+**Options for high-availability setups with ZooKeeper**
 
-### REST
+{% include generated/common_high_availability_zk_section.html %}
 
-{% include generated/rest_configuration.html %}
+### Memory Configuration
 
-### Blob Server
+These configuration values control the way that TaskManagers and JobManagers use memory.
 
-{% include generated/blob_server_configuration.html %}
+Flink tries to shield users as much as possible from the complexity of configuring the JVM for data-intensive processing.
+In most cases, users should only need to set the values `taskmanager.memory.process.size` or `taskmanager.memory.flink.size` (depending on the setup), and possibly adjust the ratio of JVM heap and Managed Memory via `taskmanager.memory.managed.fraction`. The other options below can be used for performance tuning and for fixing memory-related errors.
 
-### Heartbeat Manager
+For a detailed explanation of how these options interact, see the [documentation on TaskManager memory configuration]({{site.baseurl}}/ops/memory_configuration.html).
 
-{% include generated/heartbeat_manager_configuration.html %}
+{% include generated/common_memory_section.html %}
 
-### SSL Settings
+### Miscellaneous Options
 
-{% include generated/security_configuration.html %}
+{% include generated/common_miscellaneous_section.html %}
 
-### Netty Shuffle Environment
+----
+----
 
-{% include generated/netty_shuffle_environment_configuration.html %}
+# Security
 
-### Network Communication (via Netty)
+Options for configuring Flink's security and secure interaction with external systems.
 
-These parameters allow for advanced tuning. The default values are sufficient when running concurrent high-throughput jobs on a large cluster.
+### SSL
 
-{% include generated/network_netty_configuration.html %}
+Flink's network connections can be secured via SSL. Please refer to the [SSL Setup Docs]({{site.baseurl}}/ops/security-ssl.html) for a detailed setup guide and background.
 
-### Web Frontend
+{% include generated/security_ssl_section.html %}
 
-{% include generated/web_configuration.html %}
 
-### File Systems
+### Auth with External Systems
 
-{% include generated/file_system_configuration.html %}
+**ZooKeeper Authentication / Authorization**
 
-### Compiler/Optimizer
+These options are necessary when connecting to a secured ZooKeeper quorum.
 
-{% include generated/optimizer_configuration.html %}
+{% include generated/security_auth_zk_section.html %}
 
-### Runtime Algorithms
+**Kerberos-based Authentication / Authorization**
 
-{% include generated/algorithm_configuration.html %}
+Please refer to the [Flink and Kerberos Docs]({{site.baseurl}}/ops/security-kerberos.html) for a setup guide and a list of external systems to which Flink can authenticate itself via Kerberos.
 
-### Resource Manager
+{% include generated/security_auth_kerberos_section.html %}
 
-The configuration keys in this section are independent of the used resource management framework (YARN, Mesos, Standalone, ...)
+----
+----
 
-{% include generated/resource_manager_configuration.html %}
+# Resource Orchestration Frameworks
+
+This section contains options related to integrating Flink with resource orchestration frameworks, like Kubernetes, Yarn, Mesos, etc.
 
-### Shuffle Service
+Note that it is not always necessary to integrate Flink with the resource orchestration framework.
+For example, you can easily deploy Flink applications on Kubernetes without Flink knowing that it runs on Kubernetes (and without specifying any of the Kubernetes config options here). See [this setup guide]({{site.baseurl}}/ops/deployment/kubernetes.html) for an example.
 
-{% include generated/shuffle_service_configuration.html %}
+The options in this section are necessary for setups where Flink itself actively requests and releases resources from the orchestrators.
 
 ### YARN
 
 {% include generated/yarn_config_configuration.html %}
 
+### Kubernetes
+
+{% include generated/kubernetes_config_configuration.html %}
+
 ### Mesos
 
 {% include generated/mesos_configuration.html %}
 
-#### Mesos TaskManager
+**Mesos TaskManager**
 
 {% include generated/mesos_task_manager_configuration.html %}
 
-### Kubernetes
+----
+----
 
-{% include generated/kubernetes_config_configuration.html %}
+# State Backends
 
-### High Availability (HA)
+Please refer to the [State Backend Documentation]({{site.baseurl}}/ops/state/state_backends.html) for background on State Backends.
 
-{% include generated/high_availability_configuration.html %}
+### RocksDB State Backend
 
-#### ZooKeeper-based HA Mode
+These are the options commonly needed to configure the RocksDB state backend. See the [Advanced RocksDB Backend Section](#advanced-rocksdb-state-backends-options) for options necessary for advanced low-level configuration and troubleshooting.
 
-{% include generated/high_availability_zookeeper_configuration.html %}
+{% include generated/state_backend_rocksdb_section.html %}
 
-### ZooKeeper Security
+----
+----
 
-{% include generated/zoo_keeper_configuration.html %}
+# Metrics
 
-### Kerberos-based Security
+Please refer to the [metrics system documentation]({{site.baseurl}}/monitoring/metrics.html) for background on Flink's metrics infrastructure.
 
-{% include generated/kerberos_configuration.html %}
+{% include generated/metric_configuration.html %}
 
-### Environment
+### RocksDB Native Metrics
 
-{% include generated/environment_configuration.html %}
+Flink can report metrics from RocksDB's native code, for applications using the RocksDB state backend.
+The metrics here are scoped to the operators and then further broken down by column family; values are reported as unsigned longs. 
 
-### Pipeline
+<div class="alert alert-warning">
+  <strong>Note:</strong> Enabling RocksDB's native metrics may cause degraded performance and should be set carefully. 
+</div>
 
-{% include generated/pipeline_configuration.html %}
-{% include generated/stream_pipeline_configuration.html %}
+{% include generated/rocks_db_native_metric_configuration.html %}
 
-### Checkpointing
+----
+----
 
-{% include generated/checkpointing_configuration.html %}
-{% include generated/execution_checkpointing_configuration.html %}
+# History Server
 
-### RocksDB State Backend
+The history server keeps information about completed jobs (graphs, runtimes, statistics). To enable it, you have to enable "job archiving" in the JobManager (`jobmanager.archive.fs.dir`).
 
-{% include generated/rocks_db_configuration.html %}
+See the [History Server Docs]({{site.baseurl}}/monitoring/historyserver.html) for details.
 
-### RocksDB Configurable Options
-Specific RocksDB configurable options, provided by Flink, to create a corresponding `ConfigurableOptionsFactory`.
-And the created one would be used as default `OptionsFactory` in `RocksDBStateBackend`
-unless user define a `OptionsFactory` and set via `RocksDBStateBackend.setOptions(optionsFactory)`
+{% include generated/history_server_configuration.html %}
 
-{% include generated/rocks_db_configurable_configuration.html %}
+----
+----
+
+# Experimental
+
+*Options for experimental features in Flink.*
 
 ### Queryable State
 
-{% include generated/queryable_state_configuration.html %}
+*Queryable State* is an experimental feature that lets you access Flink's internal state like a key/value store.
+See the [Queryable State Docs]({{site.baseurl}}/dev/stream/state/queryable_state.html) for details.
 
-### Metrics
+{% include generated/queryable_state_configuration.html %}
 
-{% include generated/metric_configuration.html %}
+----
+----
 
-### RocksDB Native Metrics
-Certain RocksDB native metrics may be forwarded to Flink's metrics reporter.
-All native metrics are scoped to operators and then further broken down by column family; values are reported as unsigned longs.
+# Debugging & Expert Tuning
 
 <div class="alert alert-warning">
-  <strong>Note:</strong> Enabling native metrics may cause degraded performance and should be set carefully.
+  The options below are meant for expert users and for fixing/debugging problems. Most setups should not need to configure these options.
 </div>
 
-{% include generated/rocks_db_native_metric_configuration.html %}
+### Class Loading
 
-### History Server
+Flink dynamically loads the code for jobs submitted to a session cluster. In addition, Flink tries to hide many dependencies in the classpath from the application. This helps to reduce dependency conflicts between the application code and the dependencies in the classpath.
 
-You have to configure `jobmanager.archive.fs.dir` in order to archive terminated jobs and add it to the list of monitored directories via `historyserver.archive.fs.dir` if you want to display them via the HistoryServer's web frontend.
+Please refer to the [Debugging Classloading Docs]({{site.baseurl}}/monitoring/debugging_classloading.html) for details.
 
-- `jobmanager.archive.fs.dir`: Directory to upload information about terminated jobs to. You have to add this directory to the list of monitored directories of the history server via `historyserver.archive.fs.dir`.
+{% include generated/expert_class_loading_section.html %}
 
-{% include generated/history_server_configuration.html %}
+### Advanced State Backends Options
+
+{% include generated/expert_state_backends_section.html %}
+
+### Advanced RocksDB State Backends Options
+
+Advanced options to tune RocksDB and RocksDB checkpoints.
+
+{% include generated/expert_rocksdb_section.html %}
 
-### Python
+**RocksDB Configurable Options**
 
-{% include generated/python_configuration.html %}
+These options give fine-grained control over the behavior and resources of ColumnFamilies.
+With the introduction of `state.backend.rocksdb.memory.managed` and `state.backend.rocksdb.memory.fixed-per-slot` (Apache Flink 1.10), it should only be necessary to use the options here for advanced performance tuning. These options can also be specified in the application program via `RocksDBStateBackend.setOptions(OptionsFactory)`.
 
-## Legacy
+{% include generated/rocks_db_configurable_configuration.html %}
+
+### Advanced Fault Tolerance Options
+
+*These parameters can help with problems related to failover and to components erroneously considering each other as failed.*
+
+{% include generated/expert_fault_tolerance_section.html %}
+
+### Advanced Scheduling Options
+
+*These parameters can help with fine-tuning scheduling for specific situations.*
+
+{% include generated/expert_scheduling_section.html %}
+
+### Advanced High-availability Options
+
+{% include generated/expert_high_availability_section.html %}
+
+### Advanced High-availability ZooKeeper Options
+
+{% include generated/expert_high_availability_zk_section.html %}
+
+### Advanced SSL Security Options
+
+{% include generated/expert_security_ssl_section.html %}
+
+### Advanced Options for the REST endpoint and Client
+
+{% include generated/expert_rest_section.html %}
+
+### Advanced Options for Flink Web UI
+
+{% include generated/web_configuration.html %}
+
+### Full Flink Master Options
+
+**Master / JobManager**
+
+{% include generated/all_jobmanager_section.html %}
+
+**Blob Server**
+
+The Blob Server is a component in the Flink Master / JobManager. It is used for distribution of objects that are too large to be attached to an RPC message and that benefit from caching (like Jar files or large serialized code objects).
+
+{% include generated/blob_server_configuration.html %}
+
+**ResourceManager**
+
+These configuration keys control basic Resource Manager behavior, independent of the resource orchestration framework in use (YARN, Mesos, etc.).
+
+{% include generated/resource_manager_configuration.html %}
+
+### Full TaskManager Options
+
+{% include generated/all_taskmanager_section.html %}
+
+**Data Transport Network Stack**
+
+These options are for the network stack that handles the streaming and batch data exchanges between TaskManagers.
+
+{% include generated/all_taskmanager_network_section.html %}
+
+### RPC / Akka
+
+Flink uses Akka for RPC between components (JobManager/TaskManager/ResourceManager).
+Flink does not use Akka for data transport.
+
+{% include generated/akka_configuration.html %}
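The most commonly tuned key in this group is the RPC ask timeout (not shown in this diff, but part of the Akka option set); a sketch with an invented value:

```java
import org.apache.flink.configuration.Configuration;

public class RpcTimeoutSketch {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Give RPCs more headroom on a heavily loaded cluster
        // (equivalent to `akka.ask.timeout: 60 s` in flink-conf.yaml).
        conf.setString("akka.ask.timeout", "60 s");
        System.out.println(conf);
    }
}
```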
+
+----
+----
+
+# Startup Script Environment
+
+{% include generated/environment_configuration.html %}
+
+----
+----
+
+# Deprecated Options
+
+These options relate to parts of Flink that are no longer actively developed.
+They may be removed in a future release.
+
+**DataSet API Optimizer**
+
+{% include generated/optimizer_configuration.html %}
+
+**DataSet API Runtime Algorithms**
+
+{% include generated/algorithm_configuration.html %}
+
+**DataSet File Sinks**
+
+{% include generated/deprecated_file_sinks_section.html %}
+
+----
+----
+
+# Backup
+
+#### Execution
+
+{% include generated/deployment_configuration.html %}
+{% include generated/savepoint_config_configuration.html %}
+{% include generated/execution_configuration.html %}
+
+#### Pipeline
+
+{% include generated/pipeline_configuration.html %}
+{% include generated/stream_pipeline_configuration.html %}
+
+#### Checkpointing
+
+{% include generated/execution_checkpointing_configuration.html %}
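The `execution.checkpointing.*` options in this include roughly mirror the programmatic `CheckpointConfig` API; a minimal sketch of the code-side equivalent (interval and pause values invented):

```java
import org.apache.flink.streaming.api.CheckpointingMode;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class CheckpointingSketch {
    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Roughly `execution.checkpointing.interval: 10 s`
        // and `execution.checkpointing.mode: EXACTLY_ONCE`.
        env.enableCheckpointing(10_000L, CheckpointingMode.EXACTLY_ONCE);

        // Roughly `execution.checkpointing.min-pause: 5 s`.
        env.getCheckpointConfig().setMinPauseBetweenCheckpoints(5_000L);
    }
}
```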
 
-- `mode`: Execution mode of Flink. Possible values are `legacy` and `new`. In order to start the legacy components, you have to specify `legacy` (DEFAULT: `new`).
+----
+----
 
-## Background
+# Background
 
 ### Configuring the Network Buffers
 
@@ -325,7 +519,7 @@ When starting a Flink application, users can supply the default number of slots
 ### Configuration Runtime Environment Variables
 You have to set configuration options with the prefixes `containerized.master.env.` and `containerized.taskmanager.env.` in order to pass custom environment variables to the ApplicationMaster and TaskManager processes.
 
-- `containerized.master.env.`: Prefix for passing custom environment variables to Flink's master process.
+- `containerized.master.env.`: Prefix for passing custom environment variables to Flink's master process. 
    For example, to pass LD_LIBRARY_PATH as an environment variable to the ApplicationMaster, set containerized.master.env.LD_LIBRARY_PATH: "/usr/lib/native"
     in the flink-conf.yaml.
 - `containerized.taskmanager.env.`: Similar to the above, this configuration prefix allows setting custom environment variables for the workers (TaskManagers).
diff --git a/flink-annotations/src/main/java/org/apache/flink/annotation/docs/Documentation.java b/flink-annotations/src/main/java/org/apache/flink/annotation/docs/Documentation.java
index e6c507f..e7dbce3 100644
--- a/flink-annotations/src/main/java/org/apache/flink/annotation/docs/Documentation.java
+++ b/flink-annotations/src/main/java/org/apache/flink/annotation/docs/Documentation.java
@@ -51,13 +51,6 @@ public final class Documentation {
 	@Retention(RetentionPolicy.RUNTIME)
 	@Internal
 	public @interface Section {
-		int POSITION_MEMORY = 10;
-		int POSITION_PARALLELISM_SLOTS = 20;
-		int POSITION_FAULT_TOLERANCE = 30;
-		int POSITION_HIGH_AVAILABILITY = 40;
-		int POSITION_SECURITY = 50;
-
-		String SECTION_COMMON = "common";
 
 		/**
 		 * The sections in the config docs where this option should be included.
@@ -71,6 +64,43 @@ public final class Documentation {
 	}
 
 	/**
+	 * Constants for section names.
+	 */
+	public static final class Sections {
+
+		public static final String COMMON_HOST_PORT = "common_host_port";
+		public static final String COMMON_STATE_BACKENDS = "common_state_backends";
+		public static final String COMMON_HIGH_AVAILABILITY = "common_high_availability";
+		public static final String COMMON_HIGH_AVAILABILITY_ZOOKEEPER = "common_high_availability_zk";
+		public static final String COMMON_MEMORY = "common_memory";
+		public static final String COMMON_MISCELLANEOUS = "common_miscellaneous";
+
+		public static final String SECURITY_SSL = "security_ssl";
+		public static final String SECURITY_AUTH_KERBEROS = "security_auth_kerberos";
+		public static final String SECURITY_AUTH_ZOOKEEPER = "security_auth_zk";
+
+		public static final String STATE_BACKEND_ROCKSDB = "state_backend_rocksdb";
+
+		public static final String EXPERT_CLASS_LOADING = "expert_class_loading";
+		public static final String EXPERT_SCHEDULING = "expert_scheduling";
+		public static final String EXPERT_FAULT_TOLERANCE = "expert_fault_tolerance";
+		public static final String EXPERT_STATE_BACKENDS = "expert_state_backends";
+		public static final String EXPERT_REST = "expert_rest";
+		public static final String EXPERT_HIGH_AVAILABILITY = "expert_high_availability";
+		public static final String EXPERT_ZOOKEEPER_HIGH_AVAILABILITY = "expert_high_availability_zk";
+		public static final String EXPERT_SECURITY_SSL = "expert_security_ssl";
+		public static final String EXPERT_ROCKSDB = "expert_rocksdb";
+
+		public static final String ALL_JOB_MANAGER = "all_jobmanager";
+		public static final String ALL_TASK_MANAGER = "all_taskmanager";
+		public static final String ALL_TASK_MANAGER_NETWORK = "all_taskmanager_network";
+
+		public static final String DEPRECATED_FILE_SINKS = "deprecated_file_sinks";
+
+		private Sections() {}
+	}
+
+	/**
 	 * Annotation used on table config options for adding meta data labels.
 	 *
 	 * <p>The {@link TableOption#execMode()} argument indicates the execution mode the config works for
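To make the new `Sections` contract concrete, an annotated option might look as follows; the option itself is hypothetical, but the annotation shape (section constants plus an optional position) matches the real usages further down in this commit.

```java
import org.apache.flink.annotation.docs.Documentation;
import org.apache.flink.configuration.ConfigOption;
import org.apache.flink.configuration.ConfigOptions;

public class ExampleOptions {

    // Hypothetical option, for illustration only: it would be rendered
    // into both the common_memory and all_taskmanager generated tables,
    // sorted at position 5 within each.
    @Documentation.Section(
        value = {Documentation.Sections.COMMON_MEMORY, Documentation.Sections.ALL_TASK_MANAGER},
        position = 5)
    public static final ConfigOption<String> EXAMPLE_OPTION = ConfigOptions
        .key("example.option")
        .noDefaultValue()
        .withDescription("A hypothetical option that appears in two generated doc sections.");
}
```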
diff --git a/flink-core/src/main/java/org/apache/flink/configuration/CheckpointingOptions.java b/flink-core/src/main/java/org/apache/flink/configuration/CheckpointingOptions.java
index 566bac6..4e18eea 100644
--- a/flink-core/src/main/java/org/apache/flink/configuration/CheckpointingOptions.java
+++ b/flink-core/src/main/java/org/apache/flink/configuration/CheckpointingOptions.java
@@ -32,14 +32,15 @@ public class CheckpointingOptions {
 
 	/** The state backend to be used to store and checkpoint state. */
 	@Documentation.Section(
-		value = {Documentation.Section.SECTION_COMMON},
-		position = Documentation.Section.POSITION_FAULT_TOLERANCE)
+		value = Documentation.Sections.COMMON_STATE_BACKENDS,
+		position = 1)
 	public static final ConfigOption<String> STATE_BACKEND = ConfigOptions
 			.key("state.backend")
 			.noDefaultValue()
 			.withDescription("The state backend to be used to store and checkpoint state.");
 
 	/** The maximum number of completed checkpoints to retain.*/
+	@Documentation.Section(Documentation.Sections.COMMON_STATE_BACKENDS)
 	public static final ConfigOption<Integer> MAX_RETAINED_CHECKPOINTS = ConfigOptions
 			.key("state.checkpoints.num-retained")
 			.defaultValue(1)
@@ -50,6 +51,7 @@ public class CheckpointingOptions {
 	 *
 	 * <p>Some state backends may not support asynchronous snapshots, or only support
 	 * asynchronous snapshots, and ignore this option. */
+	@Documentation.Section(Documentation.Sections.EXPERT_STATE_BACKENDS)
 	public static final ConfigOption<Boolean> ASYNC_SNAPSHOTS = ConfigOptions
 			.key("state.backend.async")
 			.defaultValue(true)
@@ -63,6 +65,7 @@ public class CheckpointingOptions {
 	 *
 	 * <p>Some state backends may not support incremental checkpoints and ignore
 	 * this option.*/
+	@Documentation.Section(Documentation.Sections.COMMON_STATE_BACKENDS)
 	public static final ConfigOption<Boolean> INCREMENTAL_CHECKPOINTS = ConfigOptions
 			.key("state.backend.incremental")
 			.defaultValue(false)
@@ -78,6 +81,7 @@ public class CheckpointingOptions {
 	 * Currently, MemoryStateBackend does not support local recovery and ignore
 	 * this option.
 	 */
+	@Documentation.Section(Documentation.Sections.COMMON_STATE_BACKENDS)
 	public static final ConfigOption<Boolean> LOCAL_RECOVERY = ConfigOptions
 			.key("state.backend.local-recovery")
 			.defaultValue(false)
@@ -92,6 +96,7 @@ public class CheckpointingOptions {
 	 * Currently, MemoryStateBackend does not support local recovery and ignore
 	 * this option.
 	 */
+	@Documentation.Section(Documentation.Sections.COMMON_STATE_BACKENDS)
 	public static final ConfigOption<String> LOCAL_RECOVERY_TASK_MANAGER_STATE_ROOT_DIRS = ConfigOptions
 			.key("taskmanager.state.local.root-dirs")
 			.noDefaultValue()
@@ -106,8 +111,8 @@ public class CheckpointingOptions {
 	/** The default directory for savepoints. Used by the state backends that write
 	 * savepoints to file systems (MemoryStateBackend, FsStateBackend, RocksDBStateBackend). */
 	@Documentation.Section(
-		value = {Documentation.Section.SECTION_COMMON},
-		position = Documentation.Section.POSITION_FAULT_TOLERANCE)
+		value = Documentation.Sections.COMMON_STATE_BACKENDS,
+		position = 3)
 	public static final ConfigOption<String> SAVEPOINT_DIRECTORY = ConfigOptions
 			.key("state.savepoints.dir")
 			.noDefaultValue()
@@ -118,8 +123,8 @@ public class CheckpointingOptions {
 	/** The default directory used for storing the data files and meta data of checkpoints in a Flink supported filesystem.
 	 * The storage path must be accessible from all participating processes/nodes(i.e. all TaskManagers and JobManagers).*/
 	@Documentation.Section(
-		value = {Documentation.Section.SECTION_COMMON},
-		position = Documentation.Section.POSITION_FAULT_TOLERANCE)
+		value = Documentation.Sections.COMMON_STATE_BACKENDS,
+		position = 2)
 	public static final ConfigOption<String> CHECKPOINTS_DIRECTORY = ConfigOptions
 			.key("state.checkpoints.dir")
 			.noDefaultValue()
@@ -130,6 +135,7 @@ public class CheckpointingOptions {
 
 	/** The minimum size of state data files. All state chunks smaller than that
 	 * are stored inline in the root checkpoint metadata file. */
+	@Documentation.Section(Documentation.Sections.EXPERT_STATE_BACKENDS)
 	public static final ConfigOption<Integer> FS_SMALL_FILE_THRESHOLD = ConfigOptions
 			.key("state.backend.fs.memory-threshold")
 			.defaultValue(1024)
@@ -139,6 +145,7 @@ public class CheckpointingOptions {
 	/**
 	 * The default size of the write buffer for the checkpoint streams that write to file systems.
 	 */
+	@Documentation.Section(Documentation.Sections.EXPERT_STATE_BACKENDS)
 	public static final ConfigOption<Integer> FS_WRITE_BUFFER_SIZE = ConfigOptions
 		.key("state.backend.fs.write-buffer-size")
 		.defaultValue(4 * 1024)
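To connect these annotations back to user-facing behavior, here is a sketch of a job exercising the common state-backend options annotated above (the checkpoint path is invented):

```java
import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class StateBackendSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Roughly `state.backend: rocksdb`,
        // `state.checkpoints.dir: hdfs:///flink/checkpoints`
        // and `state.backend.incremental: true`
        // from the common_state_backends section.
        env.setStateBackend(new RocksDBStateBackend("hdfs:///flink/checkpoints", true));
    }
}
```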
diff --git a/flink-core/src/main/java/org/apache/flink/configuration/ClusterOptions.java b/flink-core/src/main/java/org/apache/flink/configuration/ClusterOptions.java
index b1462a9..de5479f 100644
--- a/flink-core/src/main/java/org/apache/flink/configuration/ClusterOptions.java
+++ b/flink-core/src/main/java/org/apache/flink/configuration/ClusterOptions.java
@@ -19,6 +19,7 @@
 package org.apache.flink.configuration;
 
 import org.apache.flink.annotation.PublicEvolving;
+import org.apache.flink.annotation.docs.Documentation;
 import org.apache.flink.configuration.description.Description;
 
 import static org.apache.flink.configuration.description.TextElement.code;
@@ -29,31 +30,37 @@ import static org.apache.flink.configuration.description.TextElement.code;
 @PublicEvolving
 public class ClusterOptions {
 
+	@Documentation.Section(Documentation.Sections.EXPERT_FAULT_TOLERANCE)
 	public static final ConfigOption<Long> INITIAL_REGISTRATION_TIMEOUT = ConfigOptions
 		.key("cluster.registration.initial-timeout")
 		.defaultValue(100L)
 		.withDescription("Initial registration timeout between cluster components in milliseconds.");
 
+	@Documentation.Section(Documentation.Sections.EXPERT_FAULT_TOLERANCE)
 	public static final ConfigOption<Long> MAX_REGISTRATION_TIMEOUT = ConfigOptions
 		.key("cluster.registration.max-timeout")
 		.defaultValue(30000L)
 		.withDescription("Maximum registration timeout between cluster components in milliseconds.");
 
+	@Documentation.Section(Documentation.Sections.EXPERT_FAULT_TOLERANCE)
 	public static final ConfigOption<Long> ERROR_REGISTRATION_DELAY = ConfigOptions
 		.key("cluster.registration.error-delay")
 		.defaultValue(10000L)
 		.withDescription("The pause made after an registration attempt caused an exception (other than timeout) in milliseconds.");
 
+	@Documentation.Section(Documentation.Sections.EXPERT_FAULT_TOLERANCE)
 	public static final ConfigOption<Long> REFUSED_REGISTRATION_DELAY = ConfigOptions
 		.key("cluster.registration.refused-registration-delay")
 		.defaultValue(30000L)
 		.withDescription("The pause made after the registration attempt was refused in milliseconds.");
 
+	@Documentation.Section(Documentation.Sections.EXPERT_FAULT_TOLERANCE)
 	public static final ConfigOption<Long> CLUSTER_SERVICES_SHUTDOWN_TIMEOUT = ConfigOptions
 		.key("cluster.services.shutdown-timeout")
 		.defaultValue(30000L)
 		.withDescription("The shutdown timeout for cluster services like executors in milliseconds.");
 
+	@Documentation.Section(Documentation.Sections.EXPERT_SCHEDULING)
 	public static final ConfigOption<Boolean> EVENLY_SPREAD_OUT_SLOTS_STRATEGY = ConfigOptions
 		.key("cluster.evenly-spread-out-slots")
 		.defaultValue(false)
diff --git a/flink-core/src/main/java/org/apache/flink/configuration/CoreOptions.java b/flink-core/src/main/java/org/apache/flink/configuration/CoreOptions.java
index 2449b4e..484f604 100644
--- a/flink-core/src/main/java/org/apache/flink/configuration/CoreOptions.java
+++ b/flink-core/src/main/java/org/apache/flink/configuration/CoreOptions.java
@@ -34,8 +34,7 @@ import static org.apache.flink.configuration.ConfigOptions.key;
  */
 @PublicEvolving
 @ConfigGroups(groups = {
-	@ConfigGroup(name = "Environment", keyPrefix = "env"),
-	@ConfigGroup(name = "FileSystem", keyPrefix = "fs")
+	@ConfigGroup(name = "Environment", keyPrefix = "env")
 })
 public class CoreOptions {
 
@@ -54,6 +53,7 @@ public class CoreOptions {
 	 *
 	 * <p>Exceptions to the rules are defined via {@link #ALWAYS_PARENT_FIRST_LOADER_PATTERNS}.
 	 */
+	@Documentation.Section(Documentation.Sections.EXPERT_CLASS_LOADING)
 	public static final ConfigOption<String> CLASSLOADER_RESOLVE_ORDER = ConfigOptions
 		.key("classloader.resolve-order")
 		.defaultValue("child-first")
@@ -92,6 +92,7 @@ public class CoreOptions {
 	 *         log bindings.</li>
 	 * </ul>
 	 */
+	@Documentation.Section(Documentation.Sections.EXPERT_CLASS_LOADING)
 	public static final ConfigOption<String> ALWAYS_PARENT_FIRST_LOADER_PATTERNS = ConfigOptions
 		.key("classloader.parent-first-patterns.default")
 		.defaultValue("java.;scala.;org.apache.flink.;com.esotericsoftware.kryo;org.apache.hadoop.;javax.annotation.;org.slf4j;org.apache.log4j;org.apache.logging;org.apache.commons.logging;ch.qos.logback;org.xml;javax.xml;org.apache.xerces;org.w3c")
@@ -101,6 +102,7 @@ public class CoreOptions {
 			" the fully qualified class name. This setting should generally not be modified. To add another pattern we" +
 			" recommend to use \"classloader.parent-first-patterns.additional\" instead.");
 
+	@Documentation.Section(Documentation.Sections.EXPERT_CLASS_LOADING)
 	public static final ConfigOption<String> ALWAYS_PARENT_FIRST_LOADER_PATTERNS_ADDITIONAL = ConfigOptions
 		.key("classloader.parent-first-patterns.additional")
 		.defaultValue("")
@@ -240,6 +242,7 @@ public class CoreOptions {
 	 * ",", "|", or the system's {@link java.io.File#pathSeparator}.
 	 */
 	@Documentation.OverrideDefault("'LOCAL_DIRS' on Yarn. '_FLINK_TMP_DIR' on Mesos. System.getProperty(\"java.io.tmpdir\") in standalone.")
+	@Documentation.Section(Documentation.Sections.COMMON_MISCELLANEOUS)
 	public static final ConfigOption<String> TMP_DIRS =
 		key("io.tmp.dirs")
 			.defaultValue(System.getProperty("java.io.tmpdir"))
@@ -250,9 +253,6 @@ public class CoreOptions {
 	//  program
 	// ------------------------------------------------------------------------
 
-	@Documentation.Section(
-		value = {Documentation.Section.SECTION_COMMON},
-		position = Documentation.Section.POSITION_PARALLELISM_SLOTS)
 	public static final ConfigOption<Integer> DEFAULT_PARALLELISM = ConfigOptions
 		.key("parallelism.default")
 		.defaultValue(1)
@@ -265,6 +265,7 @@ public class CoreOptions {
 	/**
 	 * The default filesystem scheme, used for paths that do not declare a scheme explicitly.
 	 */
+	@Documentation.Section(Documentation.Sections.COMMON_MISCELLANEOUS)
 	public static final ConfigOption<String> DEFAULT_FILESYSTEM_SCHEME = ConfigOptions
 			.key("fs.default-scheme")
 			.noDefaultValue()
@@ -274,6 +275,7 @@ public class CoreOptions {
 	/**
 	 * Specifies whether file output writers should overwrite existing files by default.
 	 */
+	@Documentation.Section(Documentation.Sections.DEPRECATED_FILE_SINKS)
 	public static final ConfigOption<Boolean> FILESYTEM_DEFAULT_OVERRIDE =
 		key("fs.overwrite-files")
 			.defaultValue(false)
@@ -283,6 +285,7 @@ public class CoreOptions {
 	/**
 	 * Specifies whether the file systems should always create a directory for the output, even with a parallelism of one.
 	 */
+	@Documentation.Section(Documentation.Sections.DEPRECATED_FILE_SINKS)
 	public static final ConfigOption<Boolean> FILESYSTEM_OUTPUT_ALWAYS_CREATE_DIRECTORY =
 		key("fs.output.always-create-directory")
 			.defaultValue(false)
diff --git a/flink-core/src/main/java/org/apache/flink/configuration/HeartbeatManagerOptions.java b/flink-core/src/main/java/org/apache/flink/configuration/HeartbeatManagerOptions.java
index 0fe4341..b0ccf0d 100644
--- a/flink-core/src/main/java/org/apache/flink/configuration/HeartbeatManagerOptions.java
+++ b/flink-core/src/main/java/org/apache/flink/configuration/HeartbeatManagerOptions.java
@@ -19,6 +19,7 @@
 package org.apache.flink.configuration;
 
 import org.apache.flink.annotation.PublicEvolving;
+import org.apache.flink.annotation.docs.Documentation;
 
 import static org.apache.flink.configuration.ConfigOptions.key;
 
@@ -29,12 +30,14 @@ import static org.apache.flink.configuration.ConfigOptions.key;
 public class HeartbeatManagerOptions {
 
 	/** Time interval for requesting heartbeat from sender side. */
+	@Documentation.Section(Documentation.Sections.EXPERT_FAULT_TOLERANCE)
 	public static final ConfigOption<Long> HEARTBEAT_INTERVAL =
 			key("heartbeat.interval")
 			.defaultValue(10000L)
 			.withDescription("Time interval for requesting heartbeat from sender side.");
 
 	/** Timeout for requesting and receiving heartbeat for both sender and receiver sides. */
+	@Documentation.Section(Documentation.Sections.EXPERT_FAULT_TOLERANCE)
 	public static final ConfigOption<Long> HEARTBEAT_TIMEOUT =
 			key("heartbeat.timeout")
 			.defaultValue(50000L)
diff --git a/flink-core/src/main/java/org/apache/flink/configuration/HighAvailabilityOptions.java b/flink-core/src/main/java/org/apache/flink/configuration/HighAvailabilityOptions.java
index 0087878..8930d00 100644
--- a/flink-core/src/main/java/org/apache/flink/configuration/HighAvailabilityOptions.java
+++ b/flink-core/src/main/java/org/apache/flink/configuration/HighAvailabilityOptions.java
@@ -19,8 +19,6 @@
 package org.apache.flink.configuration;
 
 import org.apache.flink.annotation.PublicEvolving;
-import org.apache.flink.annotation.docs.ConfigGroup;
-import org.apache.flink.annotation.docs.ConfigGroups;
 import org.apache.flink.annotation.docs.Documentation;
 import org.apache.flink.configuration.description.Description;
 
@@ -29,10 +27,6 @@ import static org.apache.flink.configuration.ConfigOptions.key;
 /**
  * The set of configuration options relating to high-availability settings.
  */
-@PublicEvolving
-@ConfigGroups(groups = {
-	@ConfigGroup(name = "HighAvailabilityZookeeper", keyPrefix = "high-availability.zookeeper")
-})
 public class HighAvailabilityOptions {
 
 	// ------------------------------------------------------------------------
@@ -45,9 +39,7 @@ public class HighAvailabilityOptions {
 	 * To enable high-availability, set this mode to "ZOOKEEPER".
 	 * Can also be set to FQN of HighAvailability factory class.
 	 */
-	@Documentation.Section(
-		value = {Documentation.Section.SECTION_COMMON},
-		position = Documentation.Section.POSITION_HIGH_AVAILABILITY)
+	@Documentation.Section(Documentation.Sections.COMMON_HIGH_AVAILABILITY)
 	public static final ConfigOption<String> HA_MODE =
 			key("high-availability")
 			.defaultValue("NONE")
@@ -59,6 +51,7 @@ public class HighAvailabilityOptions {
 	 * The ID of the Flink cluster, used to separate multiple Flink clusters
 	 * Needs to be set for standalone clusters, is automatically inferred in YARN and Mesos.
 	 */
+	@Documentation.Section(Documentation.Sections.COMMON_HIGH_AVAILABILITY)
 	public static final ConfigOption<String> HA_CLUSTER_ID =
 			key("high-availability.cluster-id")
 			.defaultValue("/default")
@@ -69,9 +62,7 @@ public class HighAvailabilityOptions {
 	/**
 	 * File system path (URI) where Flink persists metadata in high-availability setups.
 	 */
-	@Documentation.Section(
-		value = {Documentation.Section.SECTION_COMMON},
-		position = Documentation.Section.POSITION_HIGH_AVAILABILITY)
+	@Documentation.Section(Documentation.Sections.COMMON_HIGH_AVAILABILITY)
 	public static final ConfigOption<String> HA_STORAGE_PATH =
 			key("high-availability.storageDir")
 			.noDefaultValue()
@@ -85,6 +76,7 @@ public class HighAvailabilityOptions {
 	/**
 	 * Optional port (range) used by the job manager in high-availability mode.
 	 */
+	@Documentation.Section(Documentation.Sections.EXPERT_HIGH_AVAILABILITY)
 	public static final ConfigOption<String> HA_JOB_MANAGER_PORT_RANGE =
 			key("high-availability.jobmanager.port")
 			.defaultValue("0")
@@ -98,6 +90,7 @@ public class HighAvailabilityOptions {
 	/**
 	 * The ZooKeeper quorum to use, when running Flink in a high-availability mode with ZooKeeper.
 	 */
+	@Documentation.Section(Documentation.Sections.COMMON_HIGH_AVAILABILITY_ZOOKEEPER)
 	public static final ConfigOption<String> HA_ZOOKEEPER_QUORUM =
 			key("high-availability.zookeeper.quorum")
 			.noDefaultValue()
@@ -107,12 +100,14 @@ public class HighAvailabilityOptions {
 	/**
 	 * The root path under which Flink stores its entries in ZooKeeper.
 	 */
+	@Documentation.Section(Documentation.Sections.COMMON_HIGH_AVAILABILITY_ZOOKEEPER)
 	public static final ConfigOption<String> HA_ZOOKEEPER_ROOT =
 			key("high-availability.zookeeper.path.root")
 			.defaultValue("/flink")
 			.withDeprecatedKeys("recovery.zookeeper.path.root")
 			.withDescription("The root path under which Flink stores its entries in ZooKeeper.");
 
+	@Documentation.Section(Documentation.Sections.EXPERT_ZOOKEEPER_HIGH_AVAILABILITY)
 	public static final ConfigOption<String> HA_ZOOKEEPER_LATCH_PATH =
 			key("high-availability.zookeeper.path.latch")
 			.defaultValue("/leaderlatch")
@@ -120,12 +115,14 @@ public class HighAvailabilityOptions {
 			.withDescription("Defines the znode of the leader latch which is used to elect the leader.");
 
 	/** ZooKeeper root path (ZNode) for job graphs. */
+	@Documentation.Section(Documentation.Sections.EXPERT_ZOOKEEPER_HIGH_AVAILABILITY)
 	public static final ConfigOption<String> HA_ZOOKEEPER_JOBGRAPHS_PATH =
 			key("high-availability.zookeeper.path.jobgraphs")
 			.defaultValue("/jobgraphs")
 			.withDeprecatedKeys("recovery.zookeeper.path.jobgraphs")
 			.withDescription("ZooKeeper root path (ZNode) for job graphs");
 
+	@Documentation.Section(Documentation.Sections.EXPERT_ZOOKEEPER_HIGH_AVAILABILITY)
 	public static final ConfigOption<String> HA_ZOOKEEPER_LEADER_PATH =
 			key("high-availability.zookeeper.path.leader")
 			.defaultValue("/leader")
@@ -134,6 +131,7 @@ public class HighAvailabilityOptions {
 				" leader session ID.");
 
 	/** ZooKeeper root path (ZNode) for completed checkpoints. */
+	@Documentation.Section(Documentation.Sections.EXPERT_ZOOKEEPER_HIGH_AVAILABILITY)
 	public static final ConfigOption<String> HA_ZOOKEEPER_CHECKPOINTS_PATH =
 			key("high-availability.zookeeper.path.checkpoints")
 			.defaultValue("/checkpoints")
@@ -141,6 +139,7 @@ public class HighAvailabilityOptions {
 			.withDescription("ZooKeeper root path (ZNode) for completed checkpoints.");
 
 	/** ZooKeeper root path (ZNode) for checkpoint counters. */
+	@Documentation.Section(Documentation.Sections.EXPERT_ZOOKEEPER_HIGH_AVAILABILITY)
 	public static final ConfigOption<String> HA_ZOOKEEPER_CHECKPOINT_COUNTER_PATH =
 			key("high-availability.zookeeper.path.checkpoint-counter")
 			.defaultValue("/checkpoint-counter")
@@ -149,6 +148,7 @@ public class HighAvailabilityOptions {
 
 	/** ZooKeeper root path (ZNode) for Mesos workers. */
 	@PublicEvolving
+	@Documentation.Section(Documentation.Sections.EXPERT_ZOOKEEPER_HIGH_AVAILABILITY)
 	public static final ConfigOption<String> HA_ZOOKEEPER_MESOS_WORKERS_PATH =
 			key("high-availability.zookeeper.path.mesos-workers")
 			.defaultValue("/mesos-workers")
@@ -161,34 +161,40 @@ public class HighAvailabilityOptions {
 	//  ZooKeeper Client Settings
 	// ------------------------------------------------------------------------
 
+	@Documentation.Section(Documentation.Sections.EXPERT_ZOOKEEPER_HIGH_AVAILABILITY)
 	public static final ConfigOption<Integer> ZOOKEEPER_SESSION_TIMEOUT =
 			key("high-availability.zookeeper.client.session-timeout")
 			.defaultValue(60000)
 			.withDeprecatedKeys("recovery.zookeeper.client.session-timeout")
 			.withDescription("Defines the session timeout for the ZooKeeper session in ms.");
 
+	@Documentation.Section(Documentation.Sections.EXPERT_ZOOKEEPER_HIGH_AVAILABILITY)
 	public static final ConfigOption<Integer> ZOOKEEPER_CONNECTION_TIMEOUT =
 			key("high-availability.zookeeper.client.connection-timeout")
 			.defaultValue(15000)
 			.withDeprecatedKeys("recovery.zookeeper.client.connection-timeout")
 			.withDescription("Defines the connection timeout for ZooKeeper in ms.");
 
+	@Documentation.Section(Documentation.Sections.EXPERT_ZOOKEEPER_HIGH_AVAILABILITY)
 	public static final ConfigOption<Integer> ZOOKEEPER_RETRY_WAIT =
 			key("high-availability.zookeeper.client.retry-wait")
 			.defaultValue(5000)
 			.withDeprecatedKeys("recovery.zookeeper.client.retry-wait")
 			.withDescription("Defines the pause between consecutive retries in ms.");
 
+	@Documentation.Section(Documentation.Sections.EXPERT_ZOOKEEPER_HIGH_AVAILABILITY)
 	public static final ConfigOption<Integer> ZOOKEEPER_MAX_RETRY_ATTEMPTS =
 			key("high-availability.zookeeper.client.max-retry-attempts")
 			.defaultValue(3)
 			.withDeprecatedKeys("recovery.zookeeper.client.max-retry-attempts")
 			.withDescription("Defines the number of connection retries before the client gives up.");
 
+	@Documentation.Section(Documentation.Sections.EXPERT_ZOOKEEPER_HIGH_AVAILABILITY)
 	public static final ConfigOption<String> ZOOKEEPER_RUNNING_JOB_REGISTRY_PATH =
 			key("high-availability.zookeeper.path.running-registry")
 			.defaultValue("/running_job_registry/");
 
+	@Documentation.Section(Documentation.Sections.EXPERT_ZOOKEEPER_HIGH_AVAILABILITY)
 	public static final ConfigOption<String> ZOOKEEPER_CLIENT_ACL =
 			key("high-availability.zookeeper.client.acl")
 			.defaultValue("open")
diff --git a/flink-core/src/main/java/org/apache/flink/configuration/JobManagerOptions.java b/flink-core/src/main/java/org/apache/flink/configuration/JobManagerOptions.java
index 47755e2..efeb340 100644
--- a/flink-core/src/main/java/org/apache/flink/configuration/JobManagerOptions.java
+++ b/flink-core/src/main/java/org/apache/flink/configuration/JobManagerOptions.java
@@ -42,6 +42,7 @@ public class JobManagerOptions {
 	 * leader-election service (like ZooKeeper) is used to elect and discover the JobManager
 	 * leader from potentially multiple standby JobManagers.
 	 */
+	@Documentation.Section({Documentation.Sections.COMMON_HOST_PORT, Documentation.Sections.ALL_JOB_MANAGER})
 	public static final ConfigOption<String> ADDRESS =
 		key("jobmanager.rpc.address")
 		.noDefaultValue()
@@ -64,6 +65,7 @@ public class JobManagerOptions {
 	 * leader-election service (like ZooKeeper) is used to elect and discover the JobManager
 	 * leader from potentially multiple standby JobManagers.
 	 */
+	@Documentation.Section({Documentation.Sections.COMMON_HOST_PORT, Documentation.Sections.ALL_JOB_MANAGER})
 	public static final ConfigOption<Integer> PORT =
 		key("jobmanager.rpc.port")
 		.defaultValue(6123)
@@ -79,9 +81,7 @@ public class JobManagerOptions {
 	/**
 	 * JVM heap size for the JobManager with memory size.
 	 */
-	@Documentation.Section(
-		value = {Documentation.Section.SECTION_COMMON},
-		position = Documentation.Section.POSITION_MEMORY)
+	@Documentation.Section(Documentation.Sections.ALL_JOB_MANAGER)
 	public static final ConfigOption<String> JOB_MANAGER_HEAP_MEMORY =
 		key("jobmanager.heap.size")
 		.defaultValue("1024m")
@@ -100,6 +100,7 @@ public class JobManagerOptions {
 	/**
 	 * The maximum number of prior execution attempts kept in history.
 	 */
+	@Documentation.Section(Documentation.Sections.ALL_JOB_MANAGER)
 	public static final ConfigOption<Integer> MAX_ATTEMPTS_HISTORY_SIZE =
 		key("jobmanager.execution.attempts-history-size")
 			.defaultValue(16)
@@ -114,6 +115,8 @@ public class JobManagerOptions {
 	 * failover strategy would also restart failed tasks individually.
 	 * The new "region" strategy supersedes "individual" strategy and should always work.
 	 */
+	@Documentation.Section({Documentation.Sections.ALL_JOB_MANAGER, Documentation.Sections.EXPERT_FAULT_TOLERANCE})
+	@Documentation.OverrideDefault("region")
 	public static final ConfigOption<String> EXECUTION_FAILOVER_STRATEGY =
 		key("jobmanager.execution.failover-strategy")
 			.defaultValue("full")
@@ -132,6 +135,7 @@ public class JobManagerOptions {
 	/**
 	 * The location where the JobManager stores the archives of completed jobs.
 	 */
+	@Documentation.Section(Documentation.Sections.ALL_JOB_MANAGER)
 	public static final ConfigOption<String> ARCHIVE_DIR =
 		key("jobmanager.archive.fs.dir")
 			.noDefaultValue()
@@ -141,6 +145,7 @@ public class JobManagerOptions {
 	 * The job store cache size in bytes which is used to keep completed
 	 * jobs in memory.
 	 */
+	@Documentation.Section(Documentation.Sections.ALL_JOB_MANAGER)
 	public static final ConfigOption<Long> JOB_STORE_CACHE_SIZE =
 		key("jobstore.cache-size")
 		.defaultValue(50L * 1024L * 1024L)
@@ -149,6 +154,7 @@ public class JobManagerOptions {
 	/**
 	 * The time in seconds after which a completed job expires and is purged from the job store.
 	 */
+	@Documentation.Section(Documentation.Sections.ALL_JOB_MANAGER)
 	public static final ConfigOption<Long> JOB_STORE_EXPIRATION_TIME =
 		key("jobstore.expiration-time")
 		.defaultValue(60L * 60L)
@@ -157,6 +163,7 @@ public class JobManagerOptions {
 	/**
 	 * The max number of completed jobs that can be kept in the job store.
 	 */
+	@Documentation.Section(Documentation.Sections.ALL_JOB_MANAGER)
 	public static final ConfigOption<Integer> JOB_STORE_MAX_CAPACITY =
 		key("jobstore.max-capacity")
 			.defaultValue(Integer.MAX_VALUE)
@@ -165,6 +172,7 @@ public class JobManagerOptions {
 	/**
 	 * The timeout in milliseconds for requesting a slot from Slot Pool.
 	 */
+	@Documentation.Section(Documentation.Sections.EXPERT_SCHEDULING)
 	public static final ConfigOption<Long> SLOT_REQUEST_TIMEOUT =
 		key("slot.request.timeout")
 		.defaultValue(5L * 60L * 1000L)
@@ -173,6 +181,7 @@ public class JobManagerOptions {
 	/**
 	 * The timeout in milliseconds for an idle slot in Slot Pool.
 	 */
+	@Documentation.Section(Documentation.Sections.EXPERT_SCHEDULING)
 	public static final ConfigOption<Long> SLOT_IDLE_TIMEOUT =
 		key("slot.idle.timeout")
 			// default matches heartbeat.timeout so that sticky allocation is not lost on timeouts for local recovery
diff --git a/flink-core/src/main/java/org/apache/flink/configuration/NettyShuffleEnvironmentOptions.java b/flink-core/src/main/java/org/apache/flink/configuration/NettyShuffleEnvironmentOptions.java
index ca4c131..8670183 100644
--- a/flink-core/src/main/java/org/apache/flink/configuration/NettyShuffleEnvironmentOptions.java
+++ b/flink-core/src/main/java/org/apache/flink/configuration/NettyShuffleEnvironmentOptions.java
@@ -19,8 +19,6 @@
 package org.apache.flink.configuration;
 
 import org.apache.flink.annotation.PublicEvolving;
-import org.apache.flink.annotation.docs.ConfigGroup;
-import org.apache.flink.annotation.docs.ConfigGroups;
 import org.apache.flink.annotation.docs.Documentation;
 
 import static org.apache.flink.configuration.ConfigOptions.key;
@@ -29,7 +27,6 @@ import static org.apache.flink.configuration.ConfigOptions.key;
  * The set of configuration options relating to network stack.
  */
 @PublicEvolving
-@ConfigGroups(groups = @ConfigGroup(name = "NetworkNetty", keyPrefix = "taskmanager.network.netty"))
 public class NettyShuffleEnvironmentOptions {
 
 	// ------------------------------------------------------------------------
@@ -40,6 +37,7 @@ public class NettyShuffleEnvironmentOptions {
 	 * The default network port the task manager expects to receive transfer envelopes on. The {@code 0} means that
 	 * the TaskManager searches for a free port.
 	 */
+	@Documentation.Section({Documentation.Sections.COMMON_HOST_PORT, Documentation.Sections.ALL_TASK_MANAGER})
 	public static final ConfigOption<Integer> DATA_PORT =
 		key("taskmanager.data.port")
 			.defaultValue(0)
@@ -48,6 +46,7 @@ public class NettyShuffleEnvironmentOptions {
 	/**
 	 * Config parameter to override SSL support for taskmanager's data transport.
 	 */
+	@Documentation.Section(Documentation.Sections.ALL_TASK_MANAGER)
 	public static final ConfigOption<Boolean> DATA_SSL_ENABLED =
 		key("taskmanager.data.ssl.enabled")
 			.defaultValue(true)
@@ -61,6 +60,7 @@ public class NettyShuffleEnvironmentOptions {
 	 * IO bounded scenario when data compression ratio is high. Currently, shuffle data compression is an experimental
 	 * feature and the config option can be changed in the future.
 	 */
+	@Documentation.Section(Documentation.Sections.ALL_TASK_MANAGER_NETWORK)
 	public static final ConfigOption<Boolean> BLOCKING_SHUFFLE_COMPRESSION_ENABLED =
 		key("taskmanager.network.blocking-shuffle.compression.enabled")
 			.defaultValue(false)
@@ -82,6 +82,7 @@ public class NettyShuffleEnvironmentOptions {
 	 * Boolean flag to enable/disable more detailed metrics about inbound/outbound network queue
 	 * lengths.
 	 */
+	@Documentation.Section(Documentation.Sections.ALL_TASK_MANAGER_NETWORK)
 	public static final ConfigOption<Boolean> NETWORK_DETAILED_METRICS =
 		key("taskmanager.network.detailed-metrics")
 			.defaultValue(false)
@@ -141,6 +142,7 @@ public class NettyShuffleEnvironmentOptions {
 	 *
 	 * <p>Reasoning: 1 buffer for in-flight data in the subpartition + 1 buffer for parallel serialization.
 	 */
+	@Documentation.Section(Documentation.Sections.ALL_TASK_MANAGER_NETWORK)
 	public static final ConfigOption<Integer> NETWORK_BUFFERS_PER_CHANNEL =
 		key("taskmanager.network.memory.buffers-per-channel")
 			.defaultValue(2)
@@ -152,6 +154,7 @@ public class NettyShuffleEnvironmentOptions {
 	/**
 	 * Number of extra network buffers to use for each outgoing/incoming gate (result partition/input gate).
 	 */
+	@Documentation.Section(Documentation.Sections.ALL_TASK_MANAGER_NETWORK)
 	public static final ConfigOption<Integer> NETWORK_EXTRA_BUFFERS_PER_GATE =
 		key("taskmanager.network.memory.floating-buffers-per-gate")
 			.defaultValue(8)
@@ -173,6 +176,7 @@ public class NettyShuffleEnvironmentOptions {
 					"tasks have occupied all the buffers and the downstream tasks are waiting for the exclusive buffers. The timeout breaks" +
 					"the tie by failing the request of exclusive buffers and ask users to increase the number of total buffers.");
 
+	@Documentation.Section(Documentation.Sections.ALL_TASK_MANAGER_NETWORK)
 	public static final ConfigOption<String> NETWORK_BLOCKING_SHUFFLE_TYPE =
 		key("taskmanager.network.blocking-shuffle.type")
 			.defaultValue("file")
@@ -185,36 +189,42 @@ public class NettyShuffleEnvironmentOptions {
 	//  Netty Options
 	// ------------------------------------------------------------------------
 
+	@Documentation.Section(Documentation.Sections.ALL_TASK_MANAGER_NETWORK)
 	public static final ConfigOption<Integer> NUM_ARENAS =
 		key("taskmanager.network.netty.num-arenas")
 			.defaultValue(-1)
 			.withDeprecatedKeys("taskmanager.net.num-arenas")
 			.withDescription("The number of Netty arenas.");
 
+	@Documentation.Section(Documentation.Sections.ALL_TASK_MANAGER_NETWORK)
 	public static final ConfigOption<Integer> NUM_THREADS_SERVER =
 		key("taskmanager.network.netty.server.numThreads")
 			.defaultValue(-1)
 			.withDeprecatedKeys("taskmanager.net.server.numThreads")
 			.withDescription("The number of Netty server threads.");
 
+	@Documentation.Section(Documentation.Sections.ALL_TASK_MANAGER_NETWORK)
 	public static final ConfigOption<Integer> NUM_THREADS_CLIENT =
 		key("taskmanager.network.netty.client.numThreads")
 			.defaultValue(-1)
 			.withDeprecatedKeys("taskmanager.net.client.numThreads")
 			.withDescription("The number of Netty client threads.");
 
+	@Documentation.Section(Documentation.Sections.ALL_TASK_MANAGER_NETWORK)
 	public static final ConfigOption<Integer> CONNECT_BACKLOG =
 		key("taskmanager.network.netty.server.backlog")
 			.defaultValue(0) // default: 0 => Netty's default
 			.withDeprecatedKeys("taskmanager.net.server.backlog")
 			.withDescription("The netty server connection backlog.");
 
+	@Documentation.Section(Documentation.Sections.ALL_TASK_MANAGER_NETWORK)
 	public static final ConfigOption<Integer> CLIENT_CONNECT_TIMEOUT_SECONDS =
 		key("taskmanager.network.netty.client.connectTimeoutSec")
 			.defaultValue(120) // default: 120s = 2min
 			.withDeprecatedKeys("taskmanager.net.client.connectTimeoutSec")
 			.withDescription("The Netty client connection timeout.");
 
+	@Documentation.Section(Documentation.Sections.ALL_TASK_MANAGER_NETWORK)
 	public static final ConfigOption<Integer> SEND_RECEIVE_BUFFER_SIZE =
 		key("taskmanager.network.netty.sendReceiveBufferSize")
 			.defaultValue(0) // default: 0 => Netty's default
@@ -222,6 +232,7 @@ public class NettyShuffleEnvironmentOptions {
 			.withDescription("The Netty send and receive buffer size. This defaults to the system buffer size" +
 				" (cat /proc/sys/net/ipv4/tcp_[rw]mem) and is 4 MiB in modern Linux.");
 
+	@Documentation.Section(Documentation.Sections.ALL_TASK_MANAGER_NETWORK)
 	public static final ConfigOption<String> TRANSPORT_TYPE =
 		key("taskmanager.network.netty.transport")
 			.defaultValue("auto")
@@ -237,6 +248,7 @@ public class NettyShuffleEnvironmentOptions {
 	/**
 	 * Minimum backoff for partition requests of input channels.
 	 */
+	@Documentation.Section(Documentation.Sections.ALL_TASK_MANAGER_NETWORK)
 	public static final ConfigOption<Integer> NETWORK_REQUEST_BACKOFF_INITIAL =
 		key("taskmanager.network.request-backoff.initial")
 			.defaultValue(100)
@@ -246,6 +258,7 @@ public class NettyShuffleEnvironmentOptions {
 	/**
 	 * Maximum backoff for partition requests of input channels.
 	 */
+	@Documentation.Section(Documentation.Sections.ALL_TASK_MANAGER_NETWORK)
 	public static final ConfigOption<Integer> NETWORK_REQUEST_BACKOFF_MAX =
 		key("taskmanager.network.request-backoff.max")
 			.defaultValue(10000)
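A sketch of how the per-channel and floating-buffer options annotated above might be tuned for a high-fan-out job (values invented):

```java
import org.apache.flink.configuration.Configuration;

public class NetworkBufferSketch {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // More exclusive buffers per channel for higher throughput...
        conf.setInteger("taskmanager.network.memory.buffers-per-channel", 4);
        // ...and more floating buffers shared per gate.
        conf.setInteger("taskmanager.network.memory.floating-buffers-per-gate", 16);
        System.out.println(conf);
    }
}
```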
diff --git a/flink-core/src/main/java/org/apache/flink/configuration/ResourceManagerOptions.java b/flink-core/src/main/java/org/apache/flink/configuration/ResourceManagerOptions.java
index 869e97f..fde942b 100644
--- a/flink-core/src/main/java/org/apache/flink/configuration/ResourceManagerOptions.java
+++ b/flink-core/src/main/java/org/apache/flink/configuration/ResourceManagerOptions.java
@@ -36,8 +36,9 @@ public class ResourceManagerOptions {
 		.withDescription("Timeout for jobs which don't have a job manager as leader assigned.");
 
 	/**
-	 * The number of resource managers start.
+	 * This option is no longer used.
 	 */
+	@Deprecated
 	public static final ConfigOption<Integer> LOCAL_NUMBER_RESOURCE_MANAGER = ConfigOptions
 		.key("local.number-resourcemanager")
 		.defaultValue(1)
diff --git a/flink-core/src/main/java/org/apache/flink/configuration/RestOptions.java b/flink-core/src/main/java/org/apache/flink/configuration/RestOptions.java
index 733e5bf..457d0d2 100644
--- a/flink-core/src/main/java/org/apache/flink/configuration/RestOptions.java
+++ b/flink-core/src/main/java/org/apache/flink/configuration/RestOptions.java
@@ -19,6 +19,7 @@
 package org.apache.flink.configuration;
 
 import org.apache.flink.annotation.Internal;
+import org.apache.flink.annotation.docs.Documentation;
 import org.apache.flink.configuration.description.Description;
 
 import static org.apache.flink.configuration.ConfigOptions.key;
@@ -35,6 +36,7 @@ public class RestOptions {
 	/**
 	 * The address that the server binds itself to.
 	 */
+	@Documentation.Section(Documentation.Sections.COMMON_HOST_PORT)
 	public static final ConfigOption<String> BIND_ADDRESS =
 		key("rest.bind-address")
 			.noDefaultValue()
@@ -45,6 +47,7 @@ public class RestOptions {
 	/**
 	 * The port range that the server could bind itself to.
 	 */
+	@Documentation.Section(Documentation.Sections.COMMON_HOST_PORT)
 	public static final ConfigOption<String> BIND_PORT =
 		key("rest.bind-port")
 			.defaultValue("8081")
@@ -58,6 +61,7 @@ public class RestOptions {
 	/**
 	 * The address that should be used by clients to connect to the server.
 	 */
+	@Documentation.Section(Documentation.Sections.COMMON_HOST_PORT)
 	public static final ConfigOption<String> ADDRESS =
 		key("rest.address")
 			.noDefaultValue()
@@ -68,6 +72,7 @@ public class RestOptions {
 	 * The port that the REST client connects to and the REST server binds to if {@link #BIND_PORT}
 	 * has not been specified.
 	 */
+	@Documentation.Section(Documentation.Sections.COMMON_HOST_PORT)
 	public static final ConfigOption<Integer> PORT =
 		key(REST_PORT_KEY)
 			.defaultValue(8081)
@@ -81,6 +86,7 @@ public class RestOptions {
 	 * The time in ms that the client waits for the leader address, e.g., Dispatcher or
 	 * WebMonitorEndpoint.
 	 */
+	@Documentation.Section(Documentation.Sections.EXPERT_REST)
 	public static final ConfigOption<Long> AWAIT_LEADER_TIMEOUT =
 		key("rest.await-leader-timeout")
 			.defaultValue(30_000L)
@@ -91,6 +97,7 @@ public class RestOptions {
 	 * The number of retries the client will attempt if a retryable operations fails.
 	 * @see #RETRY_DELAY
 	 */
+	@Documentation.Section(Documentation.Sections.EXPERT_REST)
 	public static final ConfigOption<Integer> RETRY_MAX_ATTEMPTS =
 		key("rest.retry.max-attempts")
 			.defaultValue(20)
@@ -101,6 +108,7 @@ public class RestOptions {
 	 * The time in ms that the client waits between retries.
 	 * @see #RETRY_MAX_ATTEMPTS
 	 */
+	@Documentation.Section(Documentation.Sections.EXPERT_REST)
 	public static final ConfigOption<Long> RETRY_DELAY =
 		key("rest.retry.delay")
 			.defaultValue(3_000L)
@@ -110,6 +118,7 @@ public class RestOptions {
 	/**
 	 * The maximum time in ms for the client to establish a TCP connection.
 	 */
+	@Documentation.Section(Documentation.Sections.EXPERT_REST)
 	public static final ConfigOption<Long> CONNECTION_TIMEOUT =
 		key("rest.connection-timeout")
 			.defaultValue(15_000L)
@@ -118,6 +127,7 @@ public class RestOptions {
 	/**
 	 * The maximum time in ms for a connection to stay idle before failing.
 	 */
+	@Documentation.Section(Documentation.Sections.EXPERT_REST)
 	public static final ConfigOption<Long> IDLENESS_TIMEOUT =
 		key("rest.idleness-timeout")
 			.defaultValue(5L * 60L * 1_000L) // 5 minutes
@@ -126,6 +136,7 @@ public class RestOptions {
 	/**
 	 * The maximum content length that the server will handle.
 	 */
+	@Documentation.Section(Documentation.Sections.EXPERT_REST)
 	public static final ConfigOption<Integer> SERVER_MAX_CONTENT_LENGTH =
 		key("rest.server.max-content-length")
 			.defaultValue(104_857_600)
@@ -134,16 +145,19 @@ public class RestOptions {
 	/**
 	 * The maximum content length that the client will handle.
 	 */
+	@Documentation.Section(Documentation.Sections.EXPERT_REST)
 	public static final ConfigOption<Integer> CLIENT_MAX_CONTENT_LENGTH =
 		key("rest.client.max-content-length")
 			.defaultValue(104_857_600)
 			.withDescription("The maximum content length in bytes that the client will handle.");
 
+	@Documentation.Section(Documentation.Sections.EXPERT_REST)
 	public static final ConfigOption<Integer> SERVER_NUM_THREADS =
 		key("rest.server.numThreads")
 			.defaultValue(4)
 			.withDescription("The number of threads for the asynchronous processing of requests.");
 
+	@Documentation.Section(Documentation.Sections.EXPERT_REST)
 	public static final ConfigOption<Integer> SERVER_THREAD_PRIORITY = key("rest.server.thread-priority")
 		.defaultValue(Thread.NORM_PRIORITY)
 		.withDescription("Thread priority of the REST server's executor for processing asynchronous requests. " +
diff --git a/flink-core/src/main/java/org/apache/flink/configuration/SecurityOptions.java b/flink-core/src/main/java/org/apache/flink/configuration/SecurityOptions.java
index 8297cfd..0e4d153 100644
--- a/flink-core/src/main/java/org/apache/flink/configuration/SecurityOptions.java
+++ b/flink-core/src/main/java/org/apache/flink/configuration/SecurityOptions.java
@@ -18,9 +18,6 @@
 
 package org.apache.flink.configuration;
 
-import org.apache.flink.annotation.PublicEvolving;
-import org.apache.flink.annotation.docs.ConfigGroup;
-import org.apache.flink.annotation.docs.ConfigGroups;
 import org.apache.flink.annotation.docs.Documentation;
 import org.apache.flink.configuration.description.Description;
 
@@ -33,34 +30,33 @@ import static org.apache.flink.configuration.description.TextElement.text;
 /**
  * The set of configuration options relating to security.
  */
-@PublicEvolving
-@ConfigGroups(groups = {
-	@ConfigGroup(name = "Kerberos", keyPrefix = "security.kerberos"),
-	@ConfigGroup(name = "ZooKeeper", keyPrefix = "zookeeper")
-})
 public class SecurityOptions {
 
 	// ------------------------------------------------------------------------
 	//  Kerberos Options
 	// ------------------------------------------------------------------------
 
+	@Documentation.Section(Documentation.Sections.SECURITY_AUTH_KERBEROS)
 	public static final ConfigOption<String> KERBEROS_LOGIN_PRINCIPAL =
 		key("security.kerberos.login.principal")
 			.noDefaultValue()
 			.withDeprecatedKeys("security.principal")
 			.withDescription("Kerberos principal name associated with the keytab.");
 
+	@Documentation.Section(Documentation.Sections.SECURITY_AUTH_KERBEROS)
 	public static final ConfigOption<String> KERBEROS_LOGIN_KEYTAB =
 		key("security.kerberos.login.keytab")
 			.noDefaultValue()
 			.withDeprecatedKeys("security.keytab")
 			.withDescription("Absolute path to a Kerberos keytab file that contains the user credentials.");
 
+	@Documentation.Section(Documentation.Sections.SECURITY_AUTH_KERBEROS)
 	public static final ConfigOption<Boolean> KERBEROS_LOGIN_USETICKETCACHE =
 		key("security.kerberos.login.use-ticket-cache")
 			.defaultValue(true)
 			.withDescription("Indicates whether to read from your Kerberos ticket cache.");
 
+	@Documentation.Section(Documentation.Sections.SECURITY_AUTH_KERBEROS)
 	public static final ConfigOption<String> KERBEROS_LOGIN_CONTEXTS =
 		key("security.kerberos.login.contexts")
 			.noDefaultValue()
@@ -73,14 +69,17 @@ public class SecurityOptions {
 	//  ZooKeeper Security Options
 	// ------------------------------------------------------------------------
 
+	@Documentation.Section(Documentation.Sections.SECURITY_AUTH_ZOOKEEPER)
 	public static final ConfigOption<Boolean> ZOOKEEPER_SASL_DISABLE =
 		key("zookeeper.sasl.disable")
 			.defaultValue(false);
 
+	@Documentation.Section(Documentation.Sections.SECURITY_AUTH_ZOOKEEPER)
 	public static final ConfigOption<String> ZOOKEEPER_SASL_SERVICE_NAME =
 		key("zookeeper.sasl.service-name")
 			.defaultValue("zookeeper");
 
+	@Documentation.Section(Documentation.Sections.SECURITY_AUTH_ZOOKEEPER)
 	public static final ConfigOption<String> ZOOKEEPER_SASL_LOGIN_CONTEXT_NAME =
 		key("zookeeper.sasl.login-context-name")
 			.defaultValue("Client");
@@ -106,9 +105,7 @@ public class SecurityOptions {
 	/**
 	 * Enable SSL for internal communication (akka rpc, netty data transport, blob server).
 	 */
-	@Documentation.Section(
-		value = {Documentation.Section.SECTION_COMMON},
-		position = Documentation.Section.POSITION_SECURITY)
+	@Documentation.Section(Documentation.Sections.SECURITY_SSL)
 	public static final ConfigOption<Boolean> SSL_INTERNAL_ENABLED =
 			key("security.ssl.internal.enabled")
 			.defaultValue(false)
@@ -119,9 +116,7 @@ public class SecurityOptions {
 	/**
 	 * Enable SSL for external REST endpoints.
 	 */
-	@Documentation.Section(
-		value = {Documentation.Section.SECTION_COMMON},
-		position = Documentation.Section.POSITION_SECURITY)
+	@Documentation.Section(Documentation.Sections.SECURITY_SSL)
 	public static final ConfigOption<Boolean> SSL_REST_ENABLED =
 			key("security.ssl.rest.enabled")
 			.defaultValue(false)
@@ -130,6 +125,7 @@ public class SecurityOptions {
 	/**
 	 * Enable mututal SSL authentication for external REST endpoints.
 	 */
+	@Documentation.Section(Documentation.Sections.SECURITY_SSL)
 	public static final ConfigOption<Boolean> SSL_REST_AUTHENTICATION_ENABLED =
 		key("security.ssl.rest.authentication-enabled")
 			.defaultValue(false)
@@ -140,6 +136,7 @@ public class SecurityOptions {
 	/**
 	 * The Java keystore file containing the flink endpoint key and certificate.
 	 */
+	@Documentation.ExcludeFromDocumentation("The SSL Setup encourages separate configs for internal and REST security.")
 	public static final ConfigOption<String> SSL_KEYSTORE =
 		key("security.ssl.keystore")
 			.noDefaultValue()
@@ -148,6 +145,7 @@ public class SecurityOptions {
 	/**
 	 * Secret to decrypt the keystore file.
 	 */
+	@Documentation.ExcludeFromDocumentation("The SSL Setup encourages separate configs for internal and REST security.")
 	public static final ConfigOption<String> SSL_KEYSTORE_PASSWORD =
 		key("security.ssl.keystore-password")
 			.noDefaultValue()
@@ -156,6 +154,7 @@ public class SecurityOptions {
 	/**
 	 * Secret to decrypt the server key.
 	 */
+	@Documentation.ExcludeFromDocumentation("The SSL Setup encourages separate configs for internal and REST security.")
 	public static final ConfigOption<String> SSL_KEY_PASSWORD =
 		key("security.ssl.key-password")
 			.noDefaultValue()
@@ -164,6 +163,7 @@ public class SecurityOptions {
 	/**
 	 * The truststore file containing the public CA certificates to verify the ssl peers.
 	 */
+	@Documentation.ExcludeFromDocumentation("The SSL Setup encourages separate configs for internal and REST security.")
 	public static final ConfigOption<String> SSL_TRUSTSTORE =
 		key("security.ssl.truststore")
 			.noDefaultValue()
@@ -173,6 +173,7 @@ public class SecurityOptions {
 	/**
 	 * Secret to decrypt the truststore.
 	 */
+	@Documentation.ExcludeFromDocumentation("The SSL Setup encourages separate configs for internal and REST security.")
 	public static final ConfigOption<String> SSL_TRUSTSTORE_PASSWORD =
 		key("security.ssl.truststore-password")
 			.noDefaultValue()
@@ -183,6 +184,7 @@ public class SecurityOptions {
 	/**
 	 * For internal SSL, the Java keystore file containing the private key and certificate.
 	 */
+	@Documentation.Section(Documentation.Sections.SECURITY_SSL)
 	public static final ConfigOption<String> SSL_INTERNAL_KEYSTORE =
 			key("security.ssl.internal.keystore")
 					.noDefaultValue()
@@ -192,6 +194,7 @@ public class SecurityOptions {
 	/**
 	 * For internal SSL, the password to decrypt the keystore file containing the certificate.
 	 */
+	@Documentation.Section(Documentation.Sections.SECURITY_SSL)
 	public static final ConfigOption<String> SSL_INTERNAL_KEYSTORE_PASSWORD =
 			key("security.ssl.internal.keystore-password")
 					.noDefaultValue()
@@ -201,6 +204,7 @@ public class SecurityOptions {
 	/**
 	 * For internal SSL, the password to decrypt the private key.
 	 */
+	@Documentation.Section(Documentation.Sections.SECURITY_SSL)
 	public static final ConfigOption<String> SSL_INTERNAL_KEY_PASSWORD =
 			key("security.ssl.internal.key-password")
 					.noDefaultValue()
@@ -210,6 +214,7 @@ public class SecurityOptions {
 	/**
 	 * For internal SSL, the truststore file containing the public CA certificates to verify the ssl peers.
 	 */
+	@Documentation.Section(Documentation.Sections.SECURITY_SSL)
 	public static final ConfigOption<String> SSL_INTERNAL_TRUSTSTORE =
 			key("security.ssl.internal.truststore")
 					.noDefaultValue()
@@ -219,6 +224,7 @@ public class SecurityOptions {
 	/**
 	 * For internal SSL, the secret to decrypt the truststore.
 	 */
+	@Documentation.Section(Documentation.Sections.SECURITY_SSL)
 	public static final ConfigOption<String> SSL_INTERNAL_TRUSTSTORE_PASSWORD =
 			key("security.ssl.internal.truststore-password")
 					.noDefaultValue()
@@ -228,6 +234,7 @@ public class SecurityOptions {
 	/**
 	 * For internal SSL, the sha1 fingerprint of the internal certificate to verify the client.
 	 */
+	@Documentation.Section(Documentation.Sections.SECURITY_SSL)
 	public static final ConfigOption<String> SSL_INTERNAL_CERT_FINGERPRINT =
 		key("security.ssl.internal.cert.fingerprint")
 			.noDefaultValue()
@@ -240,6 +247,7 @@ public class SecurityOptions {
 	/**
 	 * For external (REST) SSL, the Java keystore file containing the private key and certificate.
 	 */
+	@Documentation.Section(Documentation.Sections.SECURITY_SSL)
 	public static final ConfigOption<String> SSL_REST_KEYSTORE =
 			key("security.ssl.rest.keystore")
 					.noDefaultValue()
@@ -249,6 +257,7 @@ public class SecurityOptions {
 	/**
 	 * For external (REST) SSL, the password to decrypt the keystore file containing the certificate.
 	 */
+	@Documentation.Section(Documentation.Sections.SECURITY_SSL)
 	public static final ConfigOption<String> SSL_REST_KEYSTORE_PASSWORD =
 			key("security.ssl.rest.keystore-password")
 					.noDefaultValue()
@@ -258,6 +267,7 @@ public class SecurityOptions {
 	/**
 	 * For external (REST) SSL, the password to decrypt the private key.
 	 */
+	@Documentation.Section(Documentation.Sections.SECURITY_SSL)
 	public static final ConfigOption<String> SSL_REST_KEY_PASSWORD =
 			key("security.ssl.rest.key-password")
 					.noDefaultValue()
@@ -267,6 +277,7 @@ public class SecurityOptions {
 	/**
 	 * For external (REST) SSL, the truststore file containing the public CA certificates to verify the ssl peers.
 	 */
+	@Documentation.Section(Documentation.Sections.SECURITY_SSL)
 	public static final ConfigOption<String> SSL_REST_TRUSTSTORE =
 			key("security.ssl.rest.truststore")
 					.noDefaultValue()
@@ -276,6 +287,7 @@ public class SecurityOptions {
 	/**
 	 * For external (REST) SSL, the secret to decrypt the truststore.
 	 */
+	@Documentation.Section(Documentation.Sections.SECURITY_SSL)
 	public static final ConfigOption<String> SSL_REST_TRUSTSTORE_PASSWORD =
 			key("security.ssl.rest.truststore-password")
 					.noDefaultValue()
@@ -285,6 +297,7 @@ public class SecurityOptions {
 	/**
 	 * For external (REST) SSL, the sha1 fingerprint of the rest client certificate to verify.
 	 */
+	@Documentation.Section(Documentation.Sections.SECURITY_SSL)
 	public static final ConfigOption<String> SSL_REST_CERT_FINGERPRINT =
 		key("security.ssl.rest.cert.fingerprint")
 			.noDefaultValue()
@@ -297,6 +310,7 @@ public class SecurityOptions {
 	/**
 	 * SSL protocol version to be supported.
 	 */
+	@Documentation.Section(Documentation.Sections.SECURITY_SSL)
 	public static final ConfigOption<String> SSL_PROTOCOL =
 		key("security.ssl.protocol")
 			.defaultValue("TLSv1.2")
@@ -308,6 +322,7 @@ public class SecurityOptions {
 	 *
 	 * <p>More options here - http://docs.oracle.com/javase/8/docs/technotes/guides/security/StandardNames.html#ciphersuites
 	 */
+	@Documentation.Section(Documentation.Sections.SECURITY_SSL)
 	public static final ConfigOption<String> SSL_ALGORITHMS =
 		key("security.ssl.algorithms")
 			.defaultValue("TLS_RSA_WITH_AES_128_CBC_SHA")
@@ -321,6 +336,7 @@ public class SecurityOptions {
 	/**
 	 * Flag to enable/disable hostname verification for the ssl connections.
 	 */
+	@Documentation.Section(Documentation.Sections.SECURITY_SSL)
 	public static final ConfigOption<Boolean> SSL_VERIFY_HOSTNAME =
 		key("security.ssl.verify-hostname")
 			.defaultValue(true)
@@ -329,6 +345,7 @@ public class SecurityOptions {
 	/**
 	 * SSL engine provider.
 	 */
+	@Documentation.Section(Documentation.Sections.EXPERT_SECURITY_SSL)
 	public static final ConfigOption<String> SSL_PROVIDER =
 		key("security.ssl.provider")
 			.defaultValue("JDK")
@@ -365,6 +382,7 @@ public class SecurityOptions {
 	/**
 	 * SSL session cache size.
 	 */
+	@Documentation.Section(Documentation.Sections.EXPERT_SECURITY_SSL)
 	public static final ConfigOption<Integer> SSL_INTERNAL_SESSION_CACHE_SIZE =
 		key("security.ssl.internal.session-cache-size")
 			.defaultValue(-1)
@@ -377,6 +395,7 @@ public class SecurityOptions {
 	/**
 	 * SSL session timeout.
 	 */
+	@Documentation.Section(Documentation.Sections.EXPERT_SECURITY_SSL)
 	public static final ConfigOption<Integer> SSL_INTERNAL_SESSION_TIMEOUT =
 		key("security.ssl.internal.session-timeout")
 			.defaultValue(-1)
@@ -386,6 +405,7 @@ public class SecurityOptions {
 	/**
 	 * SSL session timeout during handshakes.
 	 */
+	@Documentation.Section(Documentation.Sections.EXPERT_SECURITY_SSL)
 	public static final ConfigOption<Integer> SSL_INTERNAL_HANDSHAKE_TIMEOUT =
 		key("security.ssl.internal.handshake-timeout")
 			.defaultValue(-1)
@@ -395,6 +415,7 @@ public class SecurityOptions {
 	/**
 	 * SSL session timeout after flushing the <tt>close_notify</tt> message.
 	 */
+	@Documentation.Section(Documentation.Sections.EXPERT_SECURITY_SSL)
 	public static final ConfigOption<Integer> SSL_INTERNAL_CLOSE_NOTIFY_FLUSH_TIMEOUT =
 		key("security.ssl.internal.close-notify-flush-timeout")
 			.defaultValue(-1)
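
(Editorial aside, not part of the patch: every hunk above applies the same pattern, so a minimal, self-contained sketch may help when skimming the diff. The annotation, section constant, and builder calls are taken from the hunks; the holder class and description string are hypothetical.)

{% highlight java %}
import org.apache.flink.annotation.docs.Documentation;
import org.apache.flink.configuration.ConfigOption;

import static org.apache.flink.configuration.ConfigOptions.key;

// Hypothetical holder class illustrating the pattern this commit applies:
// tag a ConfigOption with one or more sections, and the docs generator emits
// the option into the matching generated/*_section.html include.
public class ExampleSslOptions {

	@Documentation.Section(Documentation.Sections.SECURITY_SSL)
	public static final ConfigOption<String> REST_KEYSTORE =
		key("security.ssl.rest.keystore")
			.noDefaultValue()
			.withDescription("The keystore file used by the external REST SSL endpoint.");
}
{% endhighlight %}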
diff --git a/flink-core/src/main/java/org/apache/flink/configuration/TaskManagerOptions.java b/flink-core/src/main/java/org/apache/flink/configuration/TaskManagerOptions.java
index 05ad75c..a5014c8 100644
--- a/flink-core/src/main/java/org/apache/flink/configuration/TaskManagerOptions.java
+++ b/flink-core/src/main/java/org/apache/flink/configuration/TaskManagerOptions.java
@@ -72,6 +72,7 @@ public class TaskManagerOptions {
 	/**
 	 * Whether to kill the TaskManager when the task thread throws an OutOfMemoryError.
 	 */
+	@Documentation.Section(Documentation.Sections.ALL_TASK_MANAGER)
 	public static final ConfigOption<Boolean> KILL_ON_OUT_OF_MEMORY =
 			key("taskmanager.jvm-exit-on-oom")
 			.booleanType()
@@ -96,6 +97,7 @@ public class TaskManagerOptions {
 	 * The config parameter defining the task manager's hostname.
 	 * Overrides {@link #HOST_BIND_POLICY} automatic address binding.
 	 */
+	@Documentation.Section({Documentation.Sections.COMMON_HOST_PORT, Documentation.Sections.ALL_TASK_MANAGER})
 	public static final ConfigOption<String> HOST =
 		key("taskmanager.host")
 			.stringType()
@@ -109,6 +111,7 @@ public class TaskManagerOptions {
 	 * The default network port range on which the task manager expects incoming IPC connections. The value {@code "0"} means that
 	 * the TaskManager searches for a free port.
 	 */
+	@Documentation.Section({Documentation.Sections.COMMON_HOST_PORT, Documentation.Sections.ALL_TASK_MANAGER})
 	public static final ConfigOption<String> RPC_PORT =
 		key("taskmanager.rpc.port")
 			.stringType()
@@ -121,6 +124,7 @@ public class TaskManagerOptions {
 	 * The initial registration backoff between two consecutive registration attempts. The backoff
 	 * is doubled for each new registration attempt until it reaches the maximum registration backoff.
 	 */
+	@Documentation.Section(Documentation.Sections.ALL_TASK_MANAGER)
 	public static final ConfigOption<Duration> INITIAL_REGISTRATION_BACKOFF =
 		key("taskmanager.registration.initial-backoff")
 			.durationType()
@@ -132,6 +136,7 @@ public class TaskManagerOptions {
 	/**
 	 * The maximum registration backoff between two consecutive registration attempts.
 	 */
+	@Documentation.Section(Documentation.Sections.ALL_TASK_MANAGER)
 	public static final ConfigOption<Duration> REGISTRATION_MAX_BACKOFF =
 		key("taskmanager.registration.max-backoff")
 			.durationType()
@@ -143,6 +148,7 @@ public class TaskManagerOptions {
 	/**
 	 * The backoff after a registration has been refused by the job manager before retrying to connect.
 	 */
+	@Documentation.Section(Documentation.Sections.ALL_TASK_MANAGER)
 	public static final ConfigOption<Duration> REFUSED_REGISTRATION_BACKOFF =
 		key("taskmanager.registration.refused-backoff")
 			.durationType()
@@ -154,6 +160,7 @@ public class TaskManagerOptions {
 	 * Defines the timeout for the TaskManager registration. If the duration is
 	 * exceeded without a successful registration, then the TaskManager terminates.
 	 */
+	@Documentation.Section(Documentation.Sections.ALL_TASK_MANAGER)
 	public static final ConfigOption<Duration> REGISTRATION_TIMEOUT =
 		key("taskmanager.registration.timeout")
 			.durationType()
@@ -165,9 +172,7 @@ public class TaskManagerOptions {
 	/**
 	 * The config parameter defining the number of task slots of a task manager.
 	 */
-	@Documentation.Section(
-		value = {Documentation.Section.SECTION_COMMON},
-		position = Documentation.Section.POSITION_PARALLELISM_SLOTS)
+	@Documentation.Section(Documentation.Sections.ALL_TASK_MANAGER)
 	public static final ConfigOption<Integer> NUM_TASK_SLOTS =
 		key("taskmanager.numberOfTaskSlots")
 			.intType()
@@ -179,6 +184,7 @@ public class TaskManagerOptions {
 				" is typically proportional to the number of physical CPU cores that the TaskManager's machine has" +
 				" (e.g., equal to the number of cores, or half the number of cores).");
 
+	@Documentation.Section(Documentation.Sections.ALL_TASK_MANAGER)
 	public static final ConfigOption<Boolean> DEBUG_MEMORY_LOG =
 		key("taskmanager.debug.memory.log")
 			.booleanType()
@@ -186,6 +192,7 @@ public class TaskManagerOptions {
 			.withDeprecatedKeys("taskmanager.debug.memory.startLogThread")
 			.withDescription("Flag indicating whether to start a thread, which repeatedly logs the memory usage of the JVM.");
 
+	@Documentation.Section(Documentation.Sections.ALL_TASK_MANAGER)
 	public static final ConfigOption<Long> DEBUG_MEMORY_USAGE_LOG_INTERVAL_MS =
 		key("taskmanager.debug.memory.log-interval")
 			.longType()
@@ -200,6 +207,7 @@ public class TaskManagerOptions {
 	/**
 	 * Size of memory buffers used by the network stack and the memory manager.
 	 */
+	@Documentation.Section(Documentation.Sections.ALL_TASK_MANAGER)
 	public static final ConfigOption<MemorySize> MEMORY_SEGMENT_SIZE =
 			key("taskmanager.memory.segment-size")
 			.memoryType()
@@ -210,6 +218,7 @@ public class TaskManagerOptions {
 	 * The config parameter for automatically defining the TaskManager's binding address,
 	 * if {@link #HOST} configuration option is not set.
 	 */
+	@Documentation.Section(Documentation.Sections.ALL_TASK_MANAGER)
 	public static final ConfigOption<String> HOST_BIND_POLICY =
 		key("taskmanager.network.bind-policy")
 			.stringType()
@@ -254,9 +263,7 @@ public class TaskManagerOptions {
 	/**
 	 * Total Process Memory size for the TaskExecutors.
 	 */
-	@Documentation.Section(
-		value = {Documentation.Section.SECTION_COMMON},
-		position = Documentation.Section.POSITION_MEMORY)
+	@Documentation.Section(Documentation.Sections.COMMON_MEMORY)
 	public static final ConfigOption<MemorySize> TOTAL_PROCESS_MEMORY =
 		key("taskmanager.memory.process.size")
 			.memoryType()
@@ -270,9 +277,7 @@ public class TaskManagerOptions {
 	/**
 	 * Total Flink Memory size for the TaskExecutors.
 	 */
-	@Documentation.Section(
-		value = {Documentation.Section.SECTION_COMMON},
-		position = Documentation.Section.POSITION_MEMORY)
+	@Documentation.Section(Documentation.Sections.COMMON_MEMORY)
 	public static final ConfigOption<MemorySize> TOTAL_FLINK_MEMORY =
 		key("taskmanager.memory.flink.size")
 			.memoryType()
@@ -286,6 +291,7 @@ public class TaskManagerOptions {
 	/**
 	 * Framework Heap Memory size for TaskExecutors.
 	 */
+	@Documentation.Section(Documentation.Sections.COMMON_MEMORY)
 	public static final ConfigOption<MemorySize> FRAMEWORK_HEAP_MEMORY =
 		key("taskmanager.memory.framework.heap.size")
 			.memoryType()
@@ -296,6 +302,7 @@ public class TaskManagerOptions {
 	/**
 	 * Framework Off-Heap Memory size for TaskExecutors.
 	 */
+	@Documentation.Section(Documentation.Sections.COMMON_MEMORY)
 	public static final ConfigOption<MemorySize> FRAMEWORK_OFF_HEAP_MEMORY =
 		key("taskmanager.memory.framework.off-heap.size")
 			.memoryType()
@@ -308,6 +315,7 @@ public class TaskManagerOptions {
 	/**
 	 * Task Heap Memory size for TaskExecutors.
 	 */
+	@Documentation.Section(Documentation.Sections.COMMON_MEMORY)
 	public static final ConfigOption<MemorySize> TASK_HEAP_MEMORY =
 		key("taskmanager.memory.task.heap.size")
 			.memoryType()
@@ -319,6 +327,7 @@ public class TaskManagerOptions {
 	/**
 	 * Task Off-Heap Memory size for TaskExecutors.
 	 */
+	@Documentation.Section(Documentation.Sections.COMMON_MEMORY)
 	public static final ConfigOption<MemorySize> TASK_OFF_HEAP_MEMORY =
 		key("taskmanager.memory.task.off-heap.size")
 			.memoryType()
@@ -330,6 +339,7 @@ public class TaskManagerOptions {
 	/**
 	 * Managed Memory size for TaskExecutors.
 	 */
+	@Documentation.Section(Documentation.Sections.COMMON_MEMORY)
 	public static final ConfigOption<MemorySize> MANAGED_MEMORY_SIZE =
 		key("taskmanager.memory.managed.size")
 			.memoryType()
@@ -345,6 +355,7 @@ public class TaskManagerOptions {
 	/**
 	 * Fraction of Total Flink Memory to be used as Managed Memory, if {@link #MANAGED_MEMORY_SIZE} is not specified.
 	 */
+	@Documentation.Section(Documentation.Sections.COMMON_MEMORY)
 	public static final ConfigOption<Float> MANAGED_MEMORY_FRACTION =
 		key("taskmanager.memory.managed.fraction")
 			.floatType()
@@ -355,6 +366,7 @@ public class TaskManagerOptions {
 	/**
 	 * Min Network Memory size for TaskExecutors.
 	 */
+	@Documentation.Section(Documentation.Sections.COMMON_MEMORY)
 	public static final ConfigOption<MemorySize> NETWORK_MEMORY_MIN =
 		key("taskmanager.memory.network.min")
 			.memoryType()
@@ -369,6 +381,7 @@ public class TaskManagerOptions {
 	/**
 	 * Max Network Memory size for TaskExecutors.
 	 */
+	@Documentation.Section(Documentation.Sections.COMMON_MEMORY)
 	public static final ConfigOption<MemorySize> NETWORK_MEMORY_MAX =
 		key("taskmanager.memory.network.max")
 			.memoryType()
@@ -383,6 +396,7 @@ public class TaskManagerOptions {
 	/**
 	 * Fraction of Total Flink Memory to be used as Network Memory.
 	 */
+	@Documentation.Section(Documentation.Sections.COMMON_MEMORY)
 	public static final ConfigOption<Float> NETWORK_MEMORY_FRACTION =
 		key("taskmanager.memory.network.fraction")
 			.floatType()
@@ -397,6 +411,7 @@ public class TaskManagerOptions {
 	/**
 	 * JVM Metaspace Size for the TaskExecutors.
 	 */
+	@Documentation.Section(Documentation.Sections.COMMON_MEMORY)
 	public static final ConfigOption<MemorySize> JVM_METASPACE =
 		key("taskmanager.memory.jvm-metaspace.size")
 			.memoryType()
@@ -406,6 +421,7 @@ public class TaskManagerOptions {
 	/**
 	 * Min JVM Overhead size for the TaskExecutors.
 	 */
+	@Documentation.Section(Documentation.Sections.COMMON_MEMORY)
 	public static final ConfigOption<MemorySize> JVM_OVERHEAD_MIN =
 		key("taskmanager.memory.jvm-overhead.min")
 			.memoryType()
@@ -420,6 +436,7 @@ public class TaskManagerOptions {
 	/**
 	 * Max JVM Overhead size for the TaskExecutors.
 	 */
+	@Documentation.Section(Documentation.Sections.COMMON_MEMORY)
 	public static final ConfigOption<MemorySize> JVM_OVERHEAD_MAX =
 		key("taskmanager.memory.jvm-overhead.max")
 			.memoryType()
@@ -434,6 +451,7 @@ public class TaskManagerOptions {
 	/**
 	 * Fraction of Total Process Memory to be reserved for JVM Overhead.
 	 */
+	@Documentation.Section(Documentation.Sections.COMMON_MEMORY)
 	public static final ConfigOption<Float> JVM_OVERHEAD_FRACTION =
 		key("taskmanager.memory.jvm-overhead.fraction")
 			.floatType()
@@ -454,6 +472,7 @@ public class TaskManagerOptions {
 	 * Time interval in milliseconds between two successive task cancellation
 	 * attempts.
 	 */
+	@Documentation.Section(Documentation.Sections.ALL_TASK_MANAGER)
 	public static final ConfigOption<Long> TASK_CANCELLATION_INTERVAL =
 			key("task.cancellation.interval")
 			.longType()
@@ -466,6 +485,7 @@ public class TaskManagerOptions {
 	 * leads to a fatal TaskManager error. A value of <code>0</code> deactivates
 	 * the watch dog.
 	 */
+	@Documentation.Section(Documentation.Sections.ALL_TASK_MANAGER)
 	public static final ConfigOption<Long> TASK_CANCELLATION_TIMEOUT =
 			key("task.cancellation.timeout")
 			.longType()
@@ -477,6 +497,7 @@ public class TaskManagerOptions {
 	 * This configures how long (in milliseconds) we wait for all pending timer threads
 	 * to finish when the stream task is cancelled.
 	 */
+	@Documentation.Section(Documentation.Sections.ALL_TASK_MANAGER)
 	public static final ConfigOption<Long> TASK_CANCELLATION_TIMEOUT_TIMERS = ConfigOptions
 			.key("task.cancellation.timers.timeout")
 			.longType()
@@ -492,6 +513,7 @@ public class TaskManagerOptions {
 	 *
 	 * <p>The default value of {@code -1} indicates that there is no limit.
 	 */
+	@Documentation.ExcludeFromDocumentation("With flow control, there is no alignment spilling any more")
 	public static final ConfigOption<Long> TASK_CHECKPOINT_ALIGNMENT_BYTES_LIMIT =
 			key("task.checkpoint.alignment.max-size")
 			.longType()
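
(Aside: the registration options above describe an exponential backoff, where the initial backoff is doubled on each failed attempt until it reaches the maximum. A minimal sketch of that documented policy follows; the values are illustrative and this is not Flink's actual registration code.)

{% highlight java %}
import java.time.Duration;

// Sketch of the documented registration backoff: double the initial backoff
// per failed attempt, capped at the configured maximum.
public final class RegistrationBackoffSketch {

	static Duration backoffForAttempt(Duration initial, Duration max, int failedAttempts) {
		long millis = initial.toMillis() << Math.min(failedAttempts, 30); // bound the shift
		return (millis <= 0 || millis >= max.toMillis()) ? max : Duration.ofMillis(millis);
	}

	public static void main(String[] args) {
		// With an assumed 100 ms initial and 30 s maximum backoff, attempts
		// back off as 100 ms, 200 ms, 400 ms, ... until the 30 s cap.
		for (int attempt = 0; attempt < 12; attempt++) {
			System.out.println(attempt + " -> "
				+ backoffForAttempt(Duration.ofMillis(100), Duration.ofSeconds(30), attempt));
		}
	}
}
{% endhighlight %}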
diff --git a/flink-core/src/main/java/org/apache/flink/configuration/WebOptions.java b/flink-core/src/main/java/org/apache/flink/configuration/WebOptions.java
index d7a41b1..762f376 100644
--- a/flink-core/src/main/java/org/apache/flink/configuration/WebOptions.java
+++ b/flink-core/src/main/java/org/apache/flink/configuration/WebOptions.java
@@ -32,6 +32,7 @@ public class WebOptions {
 	/**
 	 * Config parameter defining the runtime monitor web-frontend server address.
 	 */
+	@Deprecated
 	public static final ConfigOption<String> ADDRESS =
 		key("web.address")
 			.noDefaultValue()
@@ -71,6 +72,7 @@ public class WebOptions {
 	/**
 	 * Config parameter to override SSL support for the JobManager Web UI.
 	 */
+	@Deprecated
 	public static final ConfigOption<Boolean> SSL_ENABLED =
 		key("web.ssl.enabled")
 			.defaultValue(true)
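
(Aside: this hunk deprecates options with a plain Java-level @Deprecated marker, while earlier hunks keep old key names resolvable via withDeprecatedKeys. A hedged sketch of both mechanisms on a hypothetical option:)

{% highlight java %}
import org.apache.flink.configuration.ConfigOption;

import static org.apache.flink.configuration.ConfigOptions.key;

public class ExampleDeprecation {

	// Java-level deprecation, as applied to WebOptions.ADDRESS above: the field
	// still works but warns at compile time. The config-level withDeprecatedKeys
	// keeps the old key readable, as with
	// "taskmanager.debug.memory.startLogThread" earlier in this patch.
	@Deprecated
	public static final ConfigOption<Boolean> LEGACY_FLAG =
		key("example.new-flag")
			.defaultValue(false)
			.withDeprecatedKeys("example.old-flag");
}
{% endhighlight %}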
diff --git a/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBOptions.java b/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBOptions.java
index c147c38..932d036 100644
--- a/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBOptions.java
+++ b/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBOptions.java
@@ -18,6 +18,7 @@
 
 package org.apache.flink.contrib.streaming.state;
 
+import org.apache.flink.annotation.docs.Documentation;
 import org.apache.flink.configuration.ConfigOption;
 import org.apache.flink.configuration.ConfigOptions;
 import org.apache.flink.configuration.MemorySize;
@@ -35,6 +36,7 @@ import static org.apache.flink.contrib.streaming.state.RocksDBStateBackend.Prior
 public class RocksDBOptions {
 
 	/** The local directory (on the TaskManager) where RocksDB puts its files. */
+	@Documentation.Section(Documentation.Sections.EXPERT_ROCKSDB)
 	public static final ConfigOption<String> LOCAL_DIRECTORIES = ConfigOptions
 		.key("state.backend.rocksdb.localdir")
 		.noDefaultValue()
@@ -44,6 +46,7 @@ public class RocksDBOptions {
 	/**
 	 * Choice of timer service implementation.
 	 */
+	@Documentation.Section(Documentation.Sections.STATE_BACKEND_ROCKSDB)
 	public static final ConfigOption<String> TIMER_SERVICE_FACTORY = ConfigOptions
 		.key("state.backend.rocksdb.timer-service.factory")
 		.defaultValue(ROCKSDB.name())
@@ -54,6 +57,7 @@ public class RocksDBOptions {
 	/**
 	 * The number of threads used to transfer (download and upload) files in RocksDBStateBackend.
 	 */
+	@Documentation.Section(Documentation.Sections.EXPERT_ROCKSDB)
 	public static final ConfigOption<Integer> CHECKPOINT_TRANSFER_THREAD_NUM = ConfigOptions
 		.key("state.backend.rocksdb.checkpoint.transfer.thread.num")
 		.defaultValue(1)
@@ -75,6 +79,7 @@ public class RocksDBOptions {
 	/**
 	 * The predefined settings for RocksDB DBOptions and ColumnFamilyOptions by Flink community.
 	 */
+	@Documentation.Section(Documentation.Sections.EXPERT_ROCKSDB)
 	public static final ConfigOption<String> PREDEFINED_OPTIONS = ConfigOptions
 		.key("state.backend.rocksdb.predefined-options")
 		.defaultValue(DEFAULT.name())
@@ -86,6 +91,7 @@ public class RocksDBOptions {
 	/**
 	 * The options factory class for RocksDB to create DBOptions and ColumnFamilyOptions.
 	 */
+	@Documentation.Section(Documentation.Sections.EXPERT_ROCKSDB)
 	public static final ConfigOption<String> OPTIONS_FACTORY = ConfigOptions
 		.key("state.backend.rocksdb.options-factory")
 		.defaultValue(DefaultConfigurableOptionsFactory.class.getName())
@@ -93,6 +99,7 @@ public class RocksDBOptions {
 				"The default options factory is %s, and it would read the configured options which provided in 'RocksDBConfigurableOptions'.",
 				DefaultConfigurableOptionsFactory.class.getName()));
 
+	@Documentation.Section(Documentation.Sections.STATE_BACKEND_ROCKSDB)
 	public static final ConfigOption<Boolean> USE_MANAGED_MEMORY = ConfigOptions
 		.key("state.backend.rocksdb.memory.managed")
 		.booleanType()
@@ -101,6 +108,7 @@ public class RocksDBOptions {
 			"managed memory budget of the task slot, and divide the memory over write buffers, indexes, " +
 			"block caches, etc. That way, the three major uses of memory of RocksDB will be capped.");
 
+	@Documentation.Section(Documentation.Sections.STATE_BACKEND_ROCKSDB)
 	public static final ConfigOption<MemorySize> FIX_PER_SLOT_MEMORY_SIZE = ConfigOptions
 		.key("state.backend.rocksdb.memory.fixed-per-slot")
 		.memoryType()
@@ -111,6 +119,7 @@ public class RocksDBOptions {
 			"are set, then each RocksDB column family state has its own memory caches (as controlled by the column " +
 			"family options).", USE_MANAGED_MEMORY.key(), USE_MANAGED_MEMORY.key()));
 
+	@Documentation.Section(Documentation.Sections.STATE_BACKEND_ROCKSDB)
 	public static final ConfigOption<Double> WRITE_BUFFER_RATIO = ConfigOptions
 		.key("state.backend.rocksdb.memory.write-buffer-ratio")
 		.doubleType()
@@ -121,6 +130,7 @@ public class RocksDBOptions {
 			USE_MANAGED_MEMORY.key(),
 			FIX_PER_SLOT_MEMORY_SIZE.key()));
 
+	@Documentation.Section(Documentation.Sections.STATE_BACKEND_ROCKSDB)
 	public static final ConfigOption<Double> HIGH_PRIORITY_POOL_RATIO = ConfigOptions
 		.key("state.backend.rocksdb.memory.high-prio-pool-ratio")
 		.doubleType()
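
(Aside: a short usage sketch for the memory options introduced above, set programmatically on a Configuration. The option fields are from this diff; setBoolean/setDouble are the standard Configuration typed setters, and the values are illustrative, not recommendations.)

{% highlight java %}
import org.apache.flink.configuration.Configuration;
import org.apache.flink.contrib.streaming.state.RocksDBOptions;

public class RocksDbMemoryConfigExample {

	public static void main(String[] args) {
		Configuration config = new Configuration();

		// Let RocksDB live within the slot's managed memory budget.
		config.setBoolean(RocksDBOptions.USE_MANAGED_MEMORY, true);

		// Of that budget, give half to write buffers and a tenth to
		// high-priority data such as index and filter blocks.
		config.setDouble(RocksDBOptions.WRITE_BUFFER_RATIO, 0.5);
		config.setDouble(RocksDBOptions.HIGH_PRIORITY_POOL_RATIO, 0.1);
	}
}
{% endhighlight %}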


[flink] 07/10: [FLINK-15695][docs] Remove outdated background sections from configuration docs

Posted by se...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

sewen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/flink.git

commit d6a7b01d2424bdf81fce124948bd33ea05a6ce3b
Author: Stephan Ewen <se...@apache.org>
AuthorDate: Fri Jan 31 10:26:05 2020 +0100

    [FLINK-15695][docs] Remove outdated background sections from configuration docs
---
 docs/ops/config.md    | 114 ++++++--------------------------------------------
 docs/ops/config.zh.md | 114 ++++++--------------------------------------------
 2 files changed, 24 insertions(+), 204 deletions(-)

diff --git a/docs/ops/config.md b/docs/ops/config.md
index 032faa5..ff07d75 100644
--- a/docs/ops/config.md
+++ b/docs/ops/config.md
@@ -59,7 +59,7 @@ These value are configured as memory sizes, for example *1536m* or *2g*.
 **Parallelism**
 
   - `taskmanager.numberOfTaskSlots`: The number of slots that a TaskManager offers *(default: 1)*. Each slot can take one task or pipeline.
-    Having multiple slots in a TaskManager can help amortize certain constant overheads (of the JVM, application libraries, or network connections) across parallel tasks or pipelines.
+    Having multiple slots in a TaskManager can help amortize certain constant overheads (of the JVM, application libraries, or network connections) across parallel tasks or pipelines. See the [Task Slots and Resources]({{site.baseurl}}/concepts/runtime.html#task-slots-and-resources) concepts section for details.
 
      Running more smaller TaskManagers with one slot each is a good starting point and leads to the best isolation between tasks. Dedicating the same resources to fewer larger TaskManagers with more slots can help to increase resource utilization, at the cost of weaker isolation between the tasks (more tasks share the same JVM).
 
@@ -380,10 +380,20 @@ Flink does not use Akka for data transport.
 ----
 ----
 
-# Startup Script Environment
+# JVM and Logging Options
 
 {% include generated/environment_configuration.html %}
 
+# Forwarding Environment Variables
+
+You can configure environment variables to be set on the Flink Master and TaskManager processes started on Yarn/Mesos.
+
+  - `containerized.master.env.`: Prefix for passing custom environment variables to Flink's master process.
+   For example, to pass LD_LIBRARY_PATH as an env variable to the Master, set containerized.master.env.LD_LIBRARY_PATH: "/usr/lib/native"
+    in the flink-conf.yaml.
+
+  - `containerized.taskmanager.env.`: Similar to the above, this configuration prefix allows setting custom environment variables for the workers (TaskManagers).
+
 ----
 ----
 
@@ -424,104 +434,4 @@ These options may be removed in a future release.
 
 {% include generated/execution_checkpointing_configuration.html %}
 
-----
-----
-
-# Background
-
-### Configuring the Network Buffers
-
-If you ever see the Exception `java.io.IOException: Insufficient number of network buffers`, you
-need to adapt the amount of memory used for network buffers in order for your program to run on your
-task managers.
-
-Network buffers are a critical resource for the communication layers. They are used to buffer
-records before transmission over a network, and to buffer incoming data before dissecting it into
-records and handing them to the application. A sufficient number of network buffers is critical to
-achieve a good throughput.
-
-<div class="alert alert-info">
-Since Flink 1.3, you may follow the idiom "more is better" without any penalty on the latency (we
-prevent excessive buffering in each outgoing and incoming channel, i.e. *buffer bloat*, by limiting
-the actual number of buffers used by each channel).
-</div>
-
-In general, configure the task manager to have enough buffers that each logical network connection
-you expect to be open at the same time has a dedicated buffer. A logical network connection exists
-for each point-to-point exchange of data over the network, which typically happens at
-repartitioning or broadcasting steps (shuffle phase). In those, each parallel task inside the
-TaskManager has to be able to talk to all other parallel tasks.
-
-<div class="alert alert-warning">
-  <strong>Note:</strong> Since Flink 1.5, network buffers will always be allocated off-heap, i.e. outside of the JVM heap. This way, we can pass these buffers directly to the underlying network stack layers.
-</div>
-
-#### Setting Memory Fractions
-
-Previously, the number of network buffers was set manually which became a quite error-prone task
-(see below). Since Flink 1.3, it is possible to define a fraction of memory that is being used for
-network buffers with the following configuration parameters:
-
-- `taskmanager.memory.network.fraction`: Fraction of JVM memory to use for network buffers (DEFAULT: 0.1),
-- `taskmanager.memory.network.min`: Minimum memory size for network buffers (DEFAULT: 64MB),
-- `taskmanager.memory.network.max`: Maximum memory size for network buffers (DEFAULT: 1GB), and
-- `taskmanager.memory.segment-size`: Size of memory buffers used by the memory manager and the
-network stack in bytes (DEFAULT: 32KB).
-
-#### Setting the Number of Network Buffers directly
-
-<div class="alert alert-warning">
-  <strong>Note:</strong> This way of configuring the amount of memory used for network buffers is deprecated. Please consider using the method above by defining a fraction of memory to use.
-</div>
-
-The required number of buffers on a task manager is
-*total-degree-of-parallelism* (number of targets) \* *intra-node-parallelism* (number of sources in one task manager) \* *n*
-with *n* being a constant that defines how many repartitioning-/broadcasting steps you expect to be
-active at the same time. Since the *intra-node-parallelism* is typically the number of cores, and
-more than 4 repartitioning or broadcasting channels are rarely active in parallel, it frequently
-boils down to
-
-{% highlight plain %}
-#slots-per-TM^2 * #TMs * 4
-{% endhighlight %}
-
-Where `#slots per TM` are the [number of slots per TaskManager](#configuring-taskmanager-processing-slots) and `#TMs` are the total number of task managers.
-
-To support, for example, a cluster of 20 8-slot machines, you should use roughly 5000 network
-buffers for optimal throughput.
-
-Each network buffer has by default a size of 32 KiBytes. In the example above, the system would thus
-allocate roughly 300 MiBytes for network buffers.
-
-The number and size of network buffers can be configured with the following parameters:
-
-- `taskmanager.network.numberOfBuffers`, and
-- `taskmanager.memory.segment-size`.
-
-### Configuring Temporary I/O Directories
-
-Although Flink aims to process as much data in main memory as possible, it is not uncommon that more data needs to be processed than memory is available. Flink's runtime is designed to write temporary data to disk to handle these situations.
-
-The `io.tmp.dirs` parameter specifies a list of directories into which Flink writes temporary files. The paths of the directories need to be separated by ':' (colon character). Flink will concurrently write (or read) one temporary file to (from) each configured directory. This way, temporary I/O can be evenly distributed over multiple independent I/O devices such as hard disks to improve performance. To leverage fast I/O devices (e.g., SSD, RAID, NAS), it is possible to specify a directo [...]
-
-If the `io.tmp.dirs` parameter is not explicitly specified, Flink writes temporary data to the temporary directory of the operating system, such as */tmp* in Linux systems.
-
-### Configuring TaskManager processing slots
-
-Flink executes a program in parallel by splitting it into subtasks and scheduling these subtasks to processing slots.
-
-Each Flink TaskManager provides processing slots in the cluster. The number of slots is typically proportional to the number of available CPU cores __of each__ TaskManager. As a general recommendation, the number of available CPU cores is a good default for `taskmanager.numberOfTaskSlots`.
-
-When starting a Flink application, users can supply the default number of slots to use for that job. The command line value therefore is called `-p` (for parallelism). In addition, it is possible to [set the number of slots in the programming APIs]({{site.baseurl}}/dev/parallel.html) for the whole application and for individual operators.
-
-<img src="{{ site.baseurl }}/fig/slots_parallelism.svg" class="img-responsive" />
-
-### Configuration Runtime Environment Variables
-You have to set config with prefix `containerized.master.env.` and `containerized.taskmanager.env.` in order to set redefined environment variable in ApplicationMaster and TaskManager.
-
-- `containerized.master.env.`: Prefix for passing custom environment variables to Flink's master process. 
-   For example for passing LD_LIBRARY_PATH as an env variable to the AppMaster, set containerized.master.env.LD_LIBRARY_PATH: "/usr/lib/native"
-    in the flink-conf.yaml.
-- `containerized.taskmanager.env.`: Similar to the above, this configuration prefix allows setting custom environment variables for the workers (TaskManagers).
-
 {% top %}
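
(Aside: the prefixed keys in the new "Forwarding Environment Variables" section above can also be set programmatically. A minimal sketch, with an illustrative library path; Configuration.setString is the standard untyped setter.)

{% highlight java %}
import org.apache.flink.configuration.Configuration;

public class EnvForwardingExample {

	public static void main(String[] args) {
		Configuration conf = new Configuration();
		// Forward LD_LIBRARY_PATH to the master and the workers (TaskManagers).
		conf.setString("containerized.master.env.LD_LIBRARY_PATH", "/usr/lib/native");
		conf.setString("containerized.taskmanager.env.LD_LIBRARY_PATH", "/usr/lib/native");
	}
}
{% endhighlight %}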
diff --git a/docs/ops/config.zh.md b/docs/ops/config.zh.md
index 6b70ec5..99d7092 100644
--- a/docs/ops/config.zh.md
+++ b/docs/ops/config.zh.md
@@ -59,7 +59,7 @@ These value are configured as memory sizes, for example *1536m* or *2g*.
 **Parallelism**
 
   - `taskmanager.numberOfTaskSlots`: The number of slots that a TaskManager offers *(default: 1)*. Each slot can take one task or pipeline.
-    Having multiple slots in a TaskManager can help amortize certain constant overheads (of the JVM, application libraries, or network connections) across parallel tasks or pipelines.
+    Having multiple slots in a TaskManager can help amortize certain constant overheads (of the JVM, application libraries, or network connections) across parallel tasks or pipelines. See the [Task Slots and Resources]({{site.baseurl}}/concepts/runtime.html#task-slots-and-resources) concepts section for details.
 
      Running more smaller TaskManagers with one slot each is a good starting point and leads to the best isolation between tasks. Dedicating the same resources to fewer larger TaskManagers with more slots can help to increase resource utilization, at the cost of weaker isolation between the tasks (more tasks share the same JVM).
 
@@ -380,10 +380,20 @@ Flink does not use Akka for data transport.
 ----
 ----
 
-# Startup Script Environment
+# JVM and Logging Options
 
 {% include generated/environment_configuration.html %}
 
+# Forwarding Environment Variables
+
+You can configure environment variables to be set on the Flink Master and TaskManager processes started on Yarn/Mesos.
+
+  - `containerized.master.env.`: Prefix for passing custom environment variables to Flink's master process.
+   For example, to pass LD_LIBRARY_PATH as an env variable to the Master, set containerized.master.env.LD_LIBRARY_PATH: "/usr/lib/native"
+    in the flink-conf.yaml.
+
+  - `containerized.taskmanager.env.`: Similar to the above, this configuration prefix allows setting custom environment variables for the workers (TaskManagers).
+
 ----
 ----
 
@@ -424,104 +434,4 @@ These options may be removed in a future release.
 
 {% include generated/execution_checkpointing_configuration.html %}
 
-----
-----
-
-# Background
-
-### Configuring the Network Buffers
-
-If you ever see the Exception `java.io.IOException: Insufficient number of network buffers`, you
-need to adapt the amount of memory used for network buffers in order for your program to run on your
-task managers.
-
-Network buffers are a critical resource for the communication layers. They are used to buffer
-records before transmission over a network, and to buffer incoming data before dissecting it into
-records and handing them to the application. A sufficient number of network buffers is critical to
-achieve a good throughput.
-
-<div class="alert alert-info">
-Since Flink 1.3, you may follow the idiom "more is better" without any penalty on the latency (we
-prevent excessive buffering in each outgoing and incoming channel, i.e. *buffer bloat*, by limiting
-the actual number of buffers used by each channel).
-</div>
-
-In general, configure the task manager to have enough buffers that each logical network connection
-you expect to be open at the same time has a dedicated buffer. A logical network connection exists
-for each point-to-point exchange of data over the network, which typically happens at
-repartitioning or broadcasting steps (shuffle phase). In those, each parallel task inside the
-TaskManager has to be able to talk to all other parallel tasks.
-
-<div class="alert alert-warning">
-  <strong>Note:</strong> Since Flink 1.5, network buffers will always be allocated off-heap, i.e. outside of the JVM heap. This way, we can pass these buffers directly to the underlying network stack layers.
-</div>
-
-#### Setting Memory Fractions
-
-Previously, the number of network buffers was set manually which became a quite error-prone task
-(see below). Since Flink 1.3, it is possible to define a fraction of memory that is being used for
-network buffers with the following configuration parameters:
-
-- `taskmanager.memory.network.fraction`: Fraction of JVM memory to use for network buffers (DEFAULT: 0.1),
-- `taskmanager.memory.network.min`: Minimum memory size for network buffers (DEFAULT: 64MB),
-- `taskmanager.memory.network.max`: Maximum memory size for network buffers (DEFAULT: 1GB), and
-- `taskmanager.memory.segment-size`: Size of memory buffers used by the memory manager and the
-network stack in bytes (DEFAULT: 32KB).
-
-#### Setting the Number of Network Buffers directly
-
-<div class="alert alert-warning">
-  <strong>Note:</strong> This way of configuring the amount of memory used for network buffers is deprecated. Please consider using the method above by defining a fraction of memory to use.
-</div>
-
-The required number of buffers on a task manager is
-*total-degree-of-parallelism* (number of targets) \* *intra-node-parallelism* (number of sources in one task manager) \* *n*
-with *n* being a constant that defines how many repartitioning-/broadcasting steps you expect to be
-active at the same time. Since the *intra-node-parallelism* is typically the number of cores, and
-more than 4 repartitioning or broadcasting channels are rarely active in parallel, it frequently
-boils down to
-
-{% highlight plain %}
-#slots-per-TM^2 * #TMs * 4
-{% endhighlight %}
-
-Where `#slots per TM` are the [number of slots per TaskManager](#configuring-taskmanager-processing-slots) and `#TMs` are the total number of task managers.
-
-To support, for example, a cluster of 20 8-slot machines, you should use roughly 5000 network
-buffers for optimal throughput.
-
-Each network buffer has by default a size of 32 KiBytes. In the example above, the system would thus
-allocate roughly 300 MiBytes for network buffers.
-
-The number and size of network buffers can be configured with the following parameters:
-
-- `taskmanager.network.numberOfBuffers`, and
-- `taskmanager.memory.segment-size`.
-
-### Configuring Temporary I/O Directories
-
-Although Flink aims to process as much data in main memory as possible, it is not uncommon that more data needs to be processed than memory is available. Flink's runtime is designed to write temporary data to disk to handle these situations.
-
-The `io.tmp.dirs` parameter specifies a list of directories into which Flink writes temporary files. The paths of the directories need to be separated by ':' (colon character). Flink will concurrently write (or read) one temporary file to (from) each configured directory. This way, temporary I/O can be evenly distributed over multiple independent I/O devices such as hard disks to improve performance. To leverage fast I/O devices (e.g., SSD, RAID, NAS), it is possible to specify a directo [...]
-
-If the `io.tmp.dirs` parameter is not explicitly specified, Flink writes temporary data to the temporary directory of the operating system, such as */tmp* in Linux systems.
-
-### Configuring TaskManager processing slots
-
-Flink executes a program in parallel by splitting it into subtasks and scheduling these subtasks to processing slots.
-
-Each Flink TaskManager provides processing slots in the cluster. The number of slots is typically proportional to the number of available CPU cores __of each__ TaskManager. As a general recommendation, the number of available CPU cores is a good default for `taskmanager.numberOfTaskSlots`.
-
-When starting a Flink application, users can supply the default number of slots to use for that job. The command line value therefore is called `-p` (for parallelism). In addition, it is possible to [set the number of slots in the programming APIs]({{site.baseurl}}/dev/parallel.html) for the whole application and for individual operators.
-
-<img src="{{ site.baseurl }}/fig/slots_parallelism.svg" class="img-responsive" />
-
-### Configuration Runtime Environment Variables
-You have to set config with prefix `containerized.master.env.` and `containerized.taskmanager.env.` in order to set redefined environment variable in ApplicationMaster and TaskManager.
-
-- `containerized.master.env.`: Prefix for passing custom environment variables to Flink's master process. 
-   For example for passing LD_LIBRARY_PATH as an env variable to the AppMaster, set containerized.master.env.LD_LIBRARY_PATH: "/usr/lib/native"
-    in the flink-conf.yaml.
-- `containerized.taskmanager.env.`: Similar to the above, this configuration prefix allows setting custom environment variables for the workers (TaskManagers).
-
 {% top %}