Posted to commits@flink.apache.org by ja...@apache.org on 2020/05/19 07:46:21 UTC

[flink] 01/02: [FLINK-17353][docs] Fix Broken links in Flink docs master

This is an automated email from the ASF dual-hosted git repository.

jark pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/flink.git

commit 4dc3bdaa460f35ea6e0fa31ce9068a7dba54316d
Author: yangyichao-mango <10...@qq.com>
AuthorDate: Sun May 17 15:04:28 2020 +0800

    [FLINK-17353][docs] Fix Broken links in Flink docs master
    
    This closes #12196
---
 docs/concepts/flink-architecture.zh.md             | 132 +++++++++++
 docs/dev/connectors/elasticsearch.md               |   2 +-
 docs/dev/connectors/elasticsearch.zh.md            |   2 +-
 docs/dev/stream/state/checkpointing.md             |   4 +-
 docs/dev/stream/state/checkpointing.zh.md          |   4 +-
 docs/dev/table/common.zh.md                        |   2 +-
 docs/dev/user_defined_functions.zh.md              | 241 +++++++++++++++++++++
 .../flink-operations-playground.md                 |   4 +-
 .../flink-operations-playground.zh.md              |   4 +-
 .../walkthroughs/python_table_api.zh.md            |   2 +-
 docs/index.md                                      |   7 +-
 docs/index.zh.md                                   |   5 +-
 docs/internals/task_lifecycle.md                   |   2 +-
 docs/internals/task_lifecycle.zh.md                |   2 +-
 docs/monitoring/metrics.zh.md                      |   2 +-
 docs/ops/config.md                                 |   2 +-
 docs/ops/config.zh.md                              |   2 +-
 docs/ops/memory/mem_migration.zh.md                |   6 +-
 docs/ops/memory/mem_trouble.zh.md                  |   4 +-
 docs/ops/memory/mem_tuning.zh.md                   |   3 +-
 docs/ops/python_shell.zh.md                        |   2 +-
 docs/ops/state/savepoints.md                       |   2 +-
 docs/ops/state/savepoints.zh.md                    |   2 +-
 23 files changed, 406 insertions(+), 32 deletions(-)

diff --git a/docs/concepts/flink-architecture.zh.md b/docs/concepts/flink-architecture.zh.md
new file mode 100644
index 0000000..8414943
--- /dev/null
+++ b/docs/concepts/flink-architecture.zh.md
@@ -0,0 +1,132 @@
+---
+title: Flink Architecture
+nav-id: flink-architecture
+nav-pos: 4
+nav-title: Flink Architecture
+nav-parent_id: concepts
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+* This will be replaced by the TOC
+{:toc}
+
+## Flink Applications and Flink Sessions
+
+`TODO: expand this section`
+
+{% top %}
+
+## Anatomy of a Flink Cluster
+
+`TODO: expand this section, especially about components of the Flink Master and
+container environments`
+
+The Flink runtime consists of two types of processes:
+
+  - The *Flink Master* coordinates the distributed execution. It schedules
+    tasks, coordinates checkpoints, coordinates recovery on failures, etc.
+
+    There is always at least one *Flink Master*. A high-availability setup
+    might have multiple *Flink Masters*, one of which is always the
+    *leader*, and the others are *standby*.
+
+  - The *TaskManagers* (also called *workers*) execute the *tasks* (or more
+    specifically, the subtasks) of a dataflow, and buffer and exchange the data
+    *streams*.
+
+    There must always be at least one TaskManager.
+
+The Flink Master and TaskManagers can be started in various ways: directly on
+the machines as a [standalone cluster]({% link
+ops/deployment/cluster_setup.md %}), in containers, or managed by resource
+frameworks like [YARN]({% link ops/deployment/yarn_setup.md
+%}) or [Mesos]({% link ops/deployment/mesos.md %}).
+TaskManagers connect to Flink Masters, announcing themselves as available, and
+are assigned work.
+
+The *client* is not part of the runtime and program execution, but is used to
+prepare and send a dataflow to the Flink Master.  After that, the client can
+disconnect, or stay connected to receive progress reports. The client runs
+either as part of the Java/Scala program that triggers the execution, or in the
+command line process `./bin/flink run ...`.
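+
+As a minimal sketch (assuming the DataStream API; the elements and job name
+are illustrative), the `execute()` call below is the point where the client
+hands the dataflow to the Flink Master:
+
+{% highlight java %}
+import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
+
+public class ClientExample {
+  public static void main(String[] args) throws Exception {
+    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
+
+    env.fromElements(1, 2, 3)
+       .filter(i -> i > 1)
+       .print();
+
+    // Builds the dataflow graph and submits it to the Flink Master; the
+    // client may then disconnect or stay attached for progress reports.
+    env.execute("client-example");
+  }
+}
+{% endhighlight %}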
+
+<img src="{{ site.baseurl }}/fig/processes.svg" alt="The processes involved in executing a Flink dataflow" class="offset" width="80%" />
+
+{% top %}
+
+## Tasks and Operator Chains
+
+For distributed execution, Flink *chains* operator subtasks together into
+*tasks*. Each task is executed by one thread.  Chaining operators together into
+tasks is a useful optimization: it reduces the overhead of thread-to-thread
+handover and buffering, and increases overall throughput while decreasing
+latency.  The chaining behavior can be configured; see the [chaining docs]({%
+link dev/stream/operators/index.md %}#task-chaining-and-resource-groups) for
+details.
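+
+A sketch of the per-operator chaining controls mentioned above, assuming the
+DataStream API (`stream`, `env`, and the functions are illustrative):
+
+{% highlight java %}
+stream
+    .map(new MyMapFunction())
+    .startNewChain()        // begin a new chain at this map operator
+    .filter(new MyFilterFunction())
+    .disableChaining()      // never chain this filter with other operators
+    .print();
+
+// Chaining can also be disabled for the whole job:
+env.disableOperatorChaining();
+{% endhighlight %}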
+
+The sample dataflow in the figure below is executed with five subtasks, and
+hence with five parallel threads.
+
+<img src="{{ site.baseurl }}/fig/tasks_chains.svg" alt="Operator chaining into Tasks" class="offset" width="80%" />
+
+{% top %}
+
+## Task Slots and Resources
+
+Each worker (TaskManager) is a *JVM process*, and may execute one or more
+subtasks in separate threads.  To control how many tasks a worker accepts, a
+worker has so-called **task slots** (at least one).
+
+Each *task slot* represents a fixed subset of resources of the TaskManager. A
+TaskManager with three slots, for example, will dedicate 1/3 of its managed
+memory to each slot. Slotting the resources means that a subtask will not
+compete with subtasks from other jobs for managed memory, but instead has a
+certain amount of reserved managed memory. Note that no CPU isolation happens
+here; currently slots only separate the managed memory of tasks.
+
+By adjusting the number of task slots, users can define how subtasks are
+isolated from each other.  Having one slot per TaskManager means that each task
+group runs in a separate JVM (which can be started in a separate container, for
+example). Having multiple slots means more subtasks share the same JVM. Tasks
+in the same JVM share TCP connections (via multiplexing) and heartbeat
+messages. They may also share data sets and data structures, thus reducing the
+per-task overhead.
+
+<img src="{{ site.baseurl }}/fig/tasks_slots.svg" alt="A TaskManager with Task Slots and Tasks" class="offset" width="80%" />
+
+By default, Flink allows subtasks to share slots even if they are subtasks of
+different tasks, so long as they are from the same job. The result is that one
+slot may hold an entire pipeline of the job. Allowing this *slot sharing* has
+two main benefits:
+
+  - A Flink cluster needs exactly as many task slots as the highest parallelism
+    used in the job.  No need to calculate how many tasks (with varying
+    parallelism) a program contains in total.
+
+  - It is easier to get better resource utilization. Without slot sharing, the
+    non-intensive *source/map()* subtasks would block as many resources as the
+    resource-intensive *window* subtasks.  With slot sharing, increasing the
+    base parallelism in our example from two to six yields full utilization of
+    the slotted resources, while making sure that the heavy subtasks are fairly
+    distributed among the TaskManagers.
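+
+Slot sharing can also be steered from user code. A sketch, assuming the
+DataStream API (the group name "heavy" is illustrative):
+
+{% highlight java %}
+// Operators are in the slot sharing group "default" unless set otherwise.
+// Moving the heavy window work into its own group keeps it out of the
+// slots used by the lightweight source/map() subtasks.
+env.fromElements("a", "b", "c")
+   .keyBy(value -> value)
+   .window(TumblingProcessingTimeWindows.of(Time.seconds(5)))
+   .reduce((a, b) -> a + b)
+   .slotSharingGroup("heavy");
+{% endhighlight %}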
+
+<img src="{{ site.baseurl }}/fig/slot_sharing.svg" alt="TaskManagers with shared Task Slots" class="offset" width="80%" />
+
+{% top %}
diff --git a/docs/dev/connectors/elasticsearch.md b/docs/dev/connectors/elasticsearch.md
index 5bc1404..4b8b2da 100644
--- a/docs/dev/connectors/elasticsearch.md
+++ b/docs/dev/connectors/elasticsearch.md
@@ -317,7 +317,7 @@ time of checkpoints. This effectively assures that all requests before the
 checkpoint was triggered have been successfully acknowledged by Elasticsearch, before
 proceeding to process more records sent to the sink.
 
-More details on checkpoints and fault tolerance are in the [fault tolerance docs]({{site.baseurl}}/internals/stream_checkpointing.html).
+More details on checkpoints and fault tolerance are in the [fault tolerance docs]({{site.baseurl}}/training/fault_tolerance.html).
 
 To use fault tolerant Elasticsearch Sinks, checkpointing of the topology needs to be enabled at the execution environment:
 
diff --git a/docs/dev/connectors/elasticsearch.zh.md b/docs/dev/connectors/elasticsearch.zh.md
index 5921954..640f4d6 100644
--- a/docs/dev/connectors/elasticsearch.zh.md
+++ b/docs/dev/connectors/elasticsearch.zh.md
@@ -317,7 +317,7 @@ time of checkpoints. This effectively assures that all requests before the
 checkpoint was triggered have been successfully acknowledged by Elasticsearch, before
 proceeding to process more records sent to the sink.
 
-More details on checkpoints and fault tolerance are in the [fault tolerance docs]({{site.baseurl}}/internals/stream_checkpointing.html).
+More details on checkpoints and fault tolerance are in the [fault tolerance docs]({{site.baseurl}}/zh/training/fault_tolerance.html).
 
 To use fault tolerant Elasticsearch Sinks, checkpointing of the topology needs to be enabled at the execution environment:
 
diff --git a/docs/dev/stream/state/checkpointing.md b/docs/dev/stream/state/checkpointing.md
index c193fc3..f5fef89 100644
--- a/docs/dev/stream/state/checkpointing.md
+++ b/docs/dev/stream/state/checkpointing.md
@@ -32,7 +32,7 @@ any type of more elaborate operation.
 In order to make state fault tolerant, Flink needs to **checkpoint** the state. Checkpoints allow Flink to recover state and positions
 in the streams to give the application the same semantics as a failure-free execution.
 
-The [documentation on streaming fault tolerance]({{ site.baseurl }}/internals/stream_checkpointing.html) describes in detail the technique behind Flink's streaming fault tolerance mechanism.
+The [documentation on streaming fault tolerance]({{ site.baseurl }}/training/fault_tolerance.html) describes in detail the technique behind Flink's streaming fault tolerance mechanism.
 
 
 ## Prerequisites
@@ -173,7 +173,7 @@ Some more parameters and/or defaults may be set via `conf/flink-conf.yaml` (see
 
 ## Selecting a State Backend
 
-Flink's [checkpointing mechanism]({{ site.baseurl }}/internals/stream_checkpointing.html) stores consistent snapshots
+Flink's [checkpointing mechanism]({{ site.baseurl }}/training/fault_tolerance.html) stores consistent snapshots
 of all the state in timers and stateful operators, including connectors, windows, and any [user-defined state](state.html).
 Where the checkpoints are stored (e.g., JobManager memory, file system, database) depends on the configured
 **State Backend**. 
diff --git a/docs/dev/stream/state/checkpointing.zh.md b/docs/dev/stream/state/checkpointing.zh.md
index d4aa989..c940ada 100644
--- a/docs/dev/stream/state/checkpointing.zh.md
+++ b/docs/dev/stream/state/checkpointing.zh.md
@@ -29,7 +29,7 @@ Flink 中的每个方法或算子都能够是**有状态的**(阅读 [working
 Stateful functions store data while processing individual elements/events, making state a key building block for any type of more elaborate operation.
 To make state fault tolerant, Flink needs to **checkpoint** the state. Checkpoints allow Flink to recover state and positions in the streams, giving the application the same semantics as a failure-free execution.
 
-The [fault tolerance documentation]({{ site.baseurl }}/zh/internals/stream_checkpointing.html) describes the internal workings of Flink's streaming fault tolerance mechanism.
+The [fault tolerance documentation]({{ site.baseurl }}/zh/training/fault_tolerance.html) describes the internal workings of Flink's streaming fault tolerance mechanism.
 
 
 ## Prerequisites
@@ -165,7 +165,7 @@ env.get_checkpoint_config().set_prefer_checkpoint_for_recovery(True)
 
 ## Selecting a State Backend
 
-Flink's [checkpointing mechanism]({{ site.baseurl }}/zh/internals/stream_checkpointing.html) takes snapshots of the timers and stateful operators and stores them,
+Flink's [checkpointing mechanism]({{ site.baseurl }}/zh/training/fault_tolerance.html) takes snapshots of the timers and stateful operators and stores them,
 including connectors, windows, and any [user-defined state](state.html).
 Where the checkpoints are stored (e.g., JobManager memory, file system, database) depends on the configured **State Backend**.
 
diff --git a/docs/dev/table/common.zh.md b/docs/dev/table/common.zh.md
index bd36b1e..c10f2d3 100644
--- a/docs/dev/table/common.zh.md
+++ b/docs/dev/table/common.zh.md
@@ -561,7 +561,7 @@ revenue = orders \
 
 Flink SQL is based on [Apache Calcite](https://calcite.apache.org), which implements the SQL standard. SQL queries are specified as regular strings.
 
-The [SQL]({{ site.baseurl }}/zh/dev/table/sql.html) document describes Flink's SQL support for streaming and batch tables.
+The [SQL]({{ site.baseurl }}/zh/dev/table/sql/index.html) document describes Flink's SQL support for streaming and batch tables.
 
 The following example shows how to specify a query and return the result as a `Table` object.
 
diff --git a/docs/dev/user_defined_functions.zh.md b/docs/dev/user_defined_functions.zh.md
new file mode 100644
index 0000000..bdfbe54
--- /dev/null
+++ b/docs/dev/user_defined_functions.zh.md
@@ -0,0 +1,241 @@
+---
+title: 'User-Defined Functions'
+nav-id: user_defined_function
+nav-parent_id: streaming
+nav-pos: 4
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+Most operations require a user-defined function. This section lists different
+ways of specifying them. We also cover `Accumulators`, which can be used to
+gain insights into your Flink application.
+
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+
+## Implementing an interface
+
+The most basic way is to implement one of the provided interfaces:
+
+{% highlight java %}
+class MyMapFunction implements MapFunction<String, Integer> {
+  public Integer map(String value) { return Integer.parseInt(value); }
+}
+data.map(new MyMapFunction());
+{% endhighlight %}
+
+## Anonymous classes
+
+You can pass a function as an anonymous class:
+{% highlight java %}
+data.map(new MapFunction<String, Integer> () {
+  public Integer map(String value) { return Integer.parseInt(value); }
+});
+{% endhighlight %}
+
+## Java 8 Lambdas
+
+Flink also supports Java 8 Lambdas in the Java API.
+
+{% highlight java %}
+data.filter(s -> s.startsWith("http://"));
+{% endhighlight %}
+
+{% highlight java %}
+data.reduce((i1,i2) -> i1 + i2);
+{% endhighlight %}
+
+## Rich functions
+
+All transformations that require a user-defined function can
+instead take as argument a *rich* function. For example, instead of
+
+{% highlight java %}
+class MyMapFunction implements MapFunction<String, Integer> {
+  public Integer map(String value) { return Integer.parseInt(value); }
+}
+{% endhighlight %}
+
+you can write
+
+{% highlight java %}
+class MyMapFunction extends RichMapFunction<String, Integer> {
+  public Integer map(String value) { return Integer.parseInt(value); }
+}
+{% endhighlight %}
+
+and pass the function as usual to a `map` transformation:
+
+{% highlight java %}
+data.map(new MyMapFunction());
+{% endhighlight %}
+
+Rich functions can also be defined as an anonymous class:
+{% highlight java %}
+data.map(new RichMapFunction<String, Integer>() {
+  public Integer map(String value) { return Integer.parseInt(value); }
+});
+{% endhighlight %}
+
+</div>
+<div data-lang="scala" markdown="1">
+
+
+## Lambda Functions
+
+As already seen in previous examples, all operations accept lambda functions for describing
+the operation:
+{% highlight scala %}
+val data: DataSet[String] = // [...]
+data.filter { _.startsWith("http://") }
+{% endhighlight %}
+
+{% highlight scala %}
+val data: DataSet[Int] = // [...]
+data.reduce { (i1,i2) => i1 + i2 }
+// or
+data.reduce { _ + _ }
+{% endhighlight %}
+
+## Rich functions
+
+All transformations that take as argument a lambda function can
+instead take as argument a *rich* function. For example, instead of
+
+{% highlight scala %}
+data.map { x => x.toInt }
+{% endhighlight %}
+
+you can write
+
+{% highlight scala %}
+class MyMapFunction extends RichMapFunction[String, Int] {
+  def map(in: String): Int = { in.toInt }
+}
+{% endhighlight %}
+
+and pass the function to a `map` transformation:
+
+{% highlight scala %}
+data.map(new MyMapFunction())
+{% endhighlight %}
+
+Rich functions can also be defined as an anonymous class:
+{% highlight scala %}
+data.map(new RichMapFunction[String, Int] {
+  def map(in: String): Int = { in.toInt }
+})
+{% endhighlight %}
+</div>
+
+</div>
+
+Rich functions provide, in addition to the user-defined function (map,
+reduce, etc.), four methods: `open`, `close`, `getRuntimeContext`, and
+`setRuntimeContext`. These are useful for parameterizing the function
+(see [Passing Parameters to Functions]({{ site.baseurl }}/dev/batch/index.html#passing-parameters-to-functions)),
+creating and finalizing local state, accessing broadcast variables (see
+[Broadcast Variables]({{ site.baseurl }}/dev/batch/index.html#broadcast-variables)), and for accessing runtime
+information such as accumulators and counters (see
+[Accumulators and Counters](#accumulators--counters)), and information
+on iterations (see [Iterations]({{ site.baseurl }}/dev/batch/iterations.html)).
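+
+A minimal sketch of a rich function that uses `open` and `getRuntimeContext`
+(the parameter name "limit" is illustrative):
+
+{% highlight java %}
+class MyFilter extends RichFilterFunction<Integer> {
+  private int limit;
+
+  @Override
+  public void open(Configuration parameters) {
+    // Runs once before the first record: read parameters, set up local state.
+    limit = parameters.getInteger("limit", 0);
+  }
+
+  @Override
+  public boolean filter(Integer value) {
+    return value > limit;
+  }
+}
+{% endhighlight %}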
+
+{% top %}
+
+## Accumulators & Counters
+
+Accumulators are simple constructs with an **add operation** and a **final accumulated result**,
+which is available after the job has ended.
+
+The most straightforward accumulator is a **counter**: you can increment it using the
+```Accumulator.add(V value)``` method. At the end of the job, Flink will sum up (merge) all partial
+results and send the result to the client. Accumulators are useful during debugging or if you
+want to quickly find out more about your data.
+
+Flink currently has the following **built-in accumulators**. Each of them implements the
+{% gh_link /flink-core/src/main/java/org/apache/flink/api/common/accumulators/Accumulator.java "Accumulator" %}
+interface.
+
+- {% gh_link /flink-core/src/main/java/org/apache/flink/api/common/accumulators/IntCounter.java "__IntCounter__" %},
+  {% gh_link /flink-core/src/main/java/org/apache/flink/api/common/accumulators/LongCounter.java "__LongCounter__" %}
+  and {% gh_link /flink-core/src/main/java/org/apache/flink/api/common/accumulators/DoubleCounter.java "__DoubleCounter__" %}:
+  See below for an example using a counter.
+- {% gh_link /flink-core/src/main/java/org/apache/flink/api/common/accumulators/Histogram.java "__Histogram__" %}:
+  A histogram implementation for a discrete number of bins. Internally it is just a map from Integer
+  to Integer. You can use this to compute distributions of values, e.g. the distribution of
+  words-per-line for a word count program.
+
+__How to use accumulators:__
+
+First you have to create an accumulator object (here a counter) in the user-defined transformation
+function where you want to use it.
+
+{% highlight java %}
+private IntCounter numLines = new IntCounter();
+{% endhighlight %}
+
+Second, you have to register the accumulator object, typically in the ```open()``` method of the
+*rich* function. Here you also define the name.
+
+{% highlight java %}
+getRuntimeContext().addAccumulator("num-lines", this.numLines);
+{% endhighlight %}
+
+You can now use the accumulator anywhere in the operator function, including in the ```open()``` and
+```close()``` methods.
+
+{% highlight java %}
+this.numLines.add(1);
+{% endhighlight %}
+
+The overall result will be stored in the ```JobExecutionResult``` object, which is
+returned from the `execute()` method of the execution environment
+(currently this only works if the execution waits for the
+completion of the job).
+
+{% highlight java %}
+myJobExecutionResult.getAccumulatorResult("num-lines")
+{% endhighlight %}
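+
+Putting the steps together, a minimal sketch of a rich function that counts
+lines using the "num-lines" accumulator from the snippets above:
+
+{% highlight java %}
+class LineCounter extends RichFlatMapFunction<String, String> {
+  private final IntCounter numLines = new IntCounter();
+
+  @Override
+  public void open(Configuration parameters) {
+    // Register the accumulator under its job-wide name.
+    getRuntimeContext().addAccumulator("num-lines", this.numLines);
+  }
+
+  @Override
+  public void flatMap(String value, Collector<String> out) {
+    this.numLines.add(1);  // one partial count per processed line
+    out.collect(value);
+  }
+}
+{% endhighlight %}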
+
+All accumulators share a single namespace per job. Thus you can use the same accumulator in
+different operator functions of your job. Flink will internally merge all accumulators with the same
+name.
+
+A note on accumulators and iterations: Currently the result of accumulators is only available after
+the overall job has ended. We plan to also make the result of the previous iteration available in the
+next iteration. You can use
+{% gh_link /flink-java/src/main/java/org/apache/flink/api/java/operators/IterativeDataSet.java#L98 "Aggregators" %}
+to compute per-iteration statistics and base the termination of iterations on such statistics.
+
+__Custom accumulators:__
+
+To implement your own accumulator, you simply have to write your implementation of the Accumulator
+interface. Feel free to create a pull request if you think your custom accumulator should be shipped
+with Flink.
+
+You have the choice to implement either
+{% gh_link /flink-core/src/main/java/org/apache/flink/api/common/accumulators/Accumulator.java "Accumulator" %}
+or {% gh_link /flink-core/src/main/java/org/apache/flink/api/common/accumulators/SimpleAccumulator.java "SimpleAccumulator" %}.
+
+```Accumulator<V,R>``` is most flexible: it defines a type ```V``` for the value to add, and a
+result type ```R``` for the final result. E.g. for a histogram, ```V``` is a number and ```R``` is
+a histogram. ```SimpleAccumulator``` is for the cases where both types are the same, e.g. for counters.
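+
+A hedged sketch of a custom accumulator implementing ```SimpleAccumulator```
+(the class is illustrative; the overridden methods are the interface contract):
+
+{% highlight java %}
+public class MaxAccumulator implements SimpleAccumulator<Long> {
+  private long max = Long.MIN_VALUE;
+
+  @Override
+  public void add(Long value) { max = Math.max(max, value); }
+
+  @Override
+  public Long getLocalValue() { return max; }
+
+  @Override
+  public void resetLocal() { max = Long.MIN_VALUE; }
+
+  @Override
+  public void merge(Accumulator<Long, Long> other) {
+    // Flink calls this when combining partial results from parallel instances.
+    max = Math.max(max, other.getLocalValue());
+  }
+
+  @Override
+  public MaxAccumulator clone() {
+    MaxAccumulator result = new MaxAccumulator();
+    result.max = this.max;
+    return result;
+  }
+}
+{% endhighlight %}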
+
+{% top %}
diff --git a/docs/getting-started/docker-playgrounds/flink-operations-playground.md b/docs/getting-started/docker-playgrounds/flink-operations-playground.md
index 1e2a569..6d9f409 100644
--- a/docs/getting-started/docker-playgrounds/flink-operations-playground.md
+++ b/docs/getting-started/docker-playgrounds/flink-operations-playground.md
@@ -316,7 +316,7 @@ docker-compose up -d taskmanager
 
 When the Master is notified about the new TaskManager, it schedules the tasks of the 
 recovering Job to the newly available TaskSlots. Upon restart, the tasks recover their state from
-the last successful [checkpoint]({{ site.baseurl }}/internals/stream_checkpointing.html) that was taken
+the last successful [checkpoint]({{ site.baseurl }}/training/fault_tolerance.html) that was taken
 before the failure and switch to the `RUNNING` state.
 
 The Job will quickly process the full backlog of input events (accumulated during the outage) 
@@ -806,7 +806,7 @@ You might have noticed that the *Click Event Count* application was always start
 and `--event-time` program arguments. By omitting these in the command of the *client* container in the 
 `docker-compose.yaml`, you can change the behavior of the Job.
 
-* `--checkpointing` enables [checkpoint]({{ site.baseurl }}/internals/stream_checkpointing.html), 
+* `--checkpointing` enables [checkpointing]({{ site.baseurl }}/training/fault_tolerance.html), 
 which is Flink's fault-tolerance mechanism. If you run without it and go through 
 [failure and recovery](#observing-failure--recovery), you will see that data is actually 
 lost.
diff --git a/docs/getting-started/docker-playgrounds/flink-operations-playground.zh.md b/docs/getting-started/docker-playgrounds/flink-operations-playground.zh.md
index 1e2a569..6d9f409 100644
--- a/docs/getting-started/docker-playgrounds/flink-operations-playground.zh.md
+++ b/docs/getting-started/docker-playgrounds/flink-operations-playground.zh.md
@@ -316,7 +316,7 @@ docker-compose up -d taskmanager
 
 When the Master is notified about the new TaskManager, it schedules the tasks of the 
 recovering Job to the newly available TaskSlots. Upon restart, the tasks recover their state from
-the last successful [checkpoint]({{ site.baseurl }}/internals/stream_checkpointing.html) that was taken
+the last successful [checkpoint]({{ site.baseurl }}/training/fault_tolerance.html) that was taken
 before the failure and switch to the `RUNNING` state.
 
 The Job will quickly process the full backlog of input events (accumulated during the outage) 
@@ -806,7 +806,7 @@ You might have noticed that the *Click Event Count* application was always start
 and `--event-time` program arguments. By omitting these in the command of the *client* container in the 
 `docker-compose.yaml`, you can change the behavior of the Job.
 
-* `--checkpointing` enables [checkpoint]({{ site.baseurl }}/internals/stream_checkpointing.html), 
+* `--checkpointing` enables [checkpointing]({{ site.baseurl }}/training/fault_tolerance.html), 
 which is Flink's fault-tolerance mechanism. If you run without it and go through 
 [failure and recovery](#observing-failure--recovery), you will see that data is actually 
 lost.
diff --git a/docs/getting-started/walkthroughs/python_table_api.zh.md b/docs/getting-started/walkthroughs/python_table_api.zh.md
index a82ceb3..34a0170 100644
--- a/docs/getting-started/walkthroughs/python_table_api.zh.md
+++ b/docs/getting-started/walkthroughs/python_table_api.zh.md
@@ -28,7 +28,7 @@ under the License.
 
 In this tutorial, we will start from scratch and show how to create a Flink Python project and run a Python Table API program.
 
-For the requirements on the Python execution environment, please refer to the Python Table API [installation guide]({{ site.baseurl }}/dev/dev/table/python/installation.html).
+For the requirements on the Python execution environment, please refer to the Python Table API [installation guide]({{ site.baseurl }}/dev/table/python/installation.html).
 
 ## Creating a Python Table API Project
 
diff --git a/docs/index.md b/docs/index.md
index 9f0acae..09e7f7e 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -36,9 +36,10 @@ Apache Flink is an open source platform for distributed stream and batch data pr
 * **Docker Playgrounds**: Set up a sandboxed Flink environment in just a few minutes to explore and play with Flink.
   * [Run and manage Flink streaming applications](./getting-started/docker-playgrounds/flink-operations-playground.html)
 
-* **Concepts**: Learn about Flink's basic concepts to better understand the documentation.
-  * [Dataflow Programming Model](concepts/programming-model.html)
-  * [Distributed Runtime](concepts/runtime.html)
+* **Concepts**: Learn about Flink's concepts to better understand the documentation.
+  * [Stateful Stream Processing](concepts/stateful-stream-processing.html)
+  * [Timely Stream Processing](concepts/timely-stream-processing.html)
+  * [Flink Architecture](concepts/flink-architecture.html)
   * [Glossary](concepts/glossary.html)
 
 ## API References
diff --git a/docs/index.zh.md b/docs/index.zh.md
index 2315024..76e6d45 100644
--- a/docs/index.zh.md
+++ b/docs/index.zh.md
@@ -38,8 +38,9 @@ Apache Flink 是一个分布式流批一体化的开源平台。Flink 的核心
   * [Run and manage Flink streaming applications](./getting-started/docker-playgrounds/flink-operations-playground.html)
 
 * **Concepts**: Learning Flink's basic concepts helps you better understand the documentation.
-  * [Dataflow Programming Model](concepts/programming-model.html)
-  * [Distributed Runtime](concepts/runtime.html)
+  * [Stateful Stream Processing](concepts/stateful-stream-processing.html)
+  * [Timely Stream Processing](concepts/timely-stream-processing.html)
+  * [Flink Architecture](concepts/flink-architecture.html)
   * [Glossary](concepts/glossary.html)
 
 ## API References
diff --git a/docs/internals/task_lifecycle.md b/docs/internals/task_lifecycle.md
index 44f847f..4d5c485 100644
--- a/docs/internals/task_lifecycle.md
+++ b/docs/internals/task_lifecycle.md
@@ -92,7 +92,7 @@ operator is opened and before it is closed. The responsibility of this method is
 to the specified [state backend]({{ site.baseurl }}/ops/state/state_backends.html) from where it will be retrieved when 
 the job resumes execution after a failure. Below we include a brief description of Flink's checkpointing mechanism, 
 and for a more detailed discussion on the principles around checkpointing in Flink please read the corresponding documentation: 
-[Data Streaming Fault Tolerance]({{ site.baseurl }}/internals/stream_checkpointing.html).
+[Data Streaming Fault Tolerance]({{ site.baseurl }}/training/fault_tolerance.html).
 
 ## Task Lifecycle
 
diff --git a/docs/internals/task_lifecycle.zh.md b/docs/internals/task_lifecycle.zh.md
index 7de935b..bc5cccb 100644
--- a/docs/internals/task_lifecycle.zh.md
+++ b/docs/internals/task_lifecycle.zh.md
@@ -92,7 +92,7 @@ operator is opened and before it is closed. The responsibility of this method is
 to the specified [state backend]({{ site.baseurl }}/ops/state/state_backends.html) from where it will be retrieved when 
 the job resumes execution after a failure. Below we include a brief description of Flink's checkpointing mechanism, 
 and for a more detailed discussion on the principles around checkpointing in Flink please read the corresponding documentation: 
-[Data Streaming Fault Tolerance]({{ site.baseurl }}/internals/stream_checkpointing.html).
+[Data Streaming Fault Tolerance]({{ site.baseurl }}/training/fault_tolerance.html).
 
 ## Task Lifecycle
 
diff --git a/docs/monitoring/metrics.zh.md b/docs/monitoring/metrics.zh.md
index 04f6fd9..29c6e70 100644
--- a/docs/monitoring/metrics.zh.md
+++ b/docs/monitoring/metrics.zh.md
@@ -29,7 +29,7 @@ Flink exposes a metric system that allows gathering and exposing metrics to exte
 
 ## Registering metrics
 
-You can access the metric system from any user function that extends [RichFunction]({{ site.baseurl }}/dev/api_concepts.html#rich-functions) by calling `getRuntimeContext().getMetricGroup()`.
+You can access the metric system from any user function that extends [RichFunction]({{ site.baseurl }}/zh/dev/user_defined_functions.html#rich-functions) by calling `getRuntimeContext().getMetricGroup()`.
 This method returns a `MetricGroup` object on which you can create and register new metrics.
 
 ### Metric types
diff --git a/docs/ops/config.md b/docs/ops/config.md
index 950d65b..f33d03f 100644
--- a/docs/ops/config.md
+++ b/docs/ops/config.md
@@ -158,7 +158,7 @@ In most cases, users should only need to set the values `taskmanager.memory.proc
 
 For a detailed explanation of how these options interact,
 see the documentation on [TaskManager]({{site.baseurl}}/ops/memory/mem_setup_tm.html) and
-[JobManager]({{site.baseurl}}/ops/memory/mem_setup_jm.html) memory configurations.
+[JobManager]({{site.baseurl}}/ops/memory/mem_setup_master.html) memory configurations.
 
 {% include generated/common_memory_section.html %}
 
diff --git a/docs/ops/config.zh.md b/docs/ops/config.zh.md
index 1d3151d..80244fd 100644
--- a/docs/ops/config.zh.md
+++ b/docs/ops/config.zh.md
@@ -158,7 +158,7 @@ In most cases, users should only need to set the values `taskmanager.memory.proc
 
 For a detailed explanation of how these options interact,
 see the documentation on [TaskManager]({{site.baseurl}}/ops/memory/mem_setup_tm.html) and
-[JobManager]({{site.baseurl}}/ops/memory/mem_setup_jm.html) memory configurations.
+[JobManager]({{site.baseurl}}/ops/memory/mem_setup_master.html) memory configurations.
 
 {% include generated/common_memory_section.html %}
 
diff --git a/docs/ops/memory/mem_migration.zh.md b/docs/ops/memory/mem_migration.zh.md
index 7c8a525..72af546 100644
--- a/docs/ops/memory/mem_migration.zh.md
+++ b/docs/ops/memory/mem_migration.zh.md
@@ -119,7 +119,7 @@ Flink 自带的[默认 flink-conf.yaml](#flink-confyaml-中的默认配置) 文
 
 Although the network memory configuration options have not changed much, we still recommend verifying the resulting configuration.
 The size of the network memory can be affected by changes to other memory components; for example, when the total memory changes, the network memory derived from its fraction may change as well.
-See the [detailed memory model](mem_detail.html).
+See the [detailed memory model](mem_setup.html).
 
 The container cut-off configuration options (`containerized.heap-cutoff-ratio` and `containerized.heap-cutoff-min`) no longer take effect for the process.
 
@@ -153,7 +153,7 @@ Flink 在 Mesos 上还有另一个具有同样语义的配置参数 `mesos.resou
 or [FsStateBackend](../state/state_backends.html#fsstatebackend)), it likewise needs JVM heap memory.
 
 Flink now always reserves a portion of the JVM heap for framework use ([`taskmanager.memory.framework.heap.size`](../config.html#taskmanager-memory-framework-heap-size)).
-See [framework memory](mem_detail.html#框架内存).
+See [framework memory](mem_setup.html#框架内存).
 
 ## Managed Memory
 
@@ -201,7 +201,7 @@ Flink 现在总是会预留一部分 JVM 堆内存供框架使用([`taskmanage
 * Task off-heap memory ([`taskmanager.memory.task.off-heap.size`](../config.html#taskmanager-memory-task-off-heap-size))
 * Framework off-heap memory ([`taskmanager.memory.framework.off-heap.size`](../config.html#taskmanager-memory-framework-off-heap-size))
 * JVM Metaspace ([`taskmanager.memory.jvm-metaspace.size`](../config.html#taskmanager-memory-jvm-metaspace-size))
-* JVM overhead (see the [detailed memory model](mem_detail.html))
+* JVM overhead (see the [detailed memory model](mem_setup_tm.html#detailed-memory-model))
 
 <span class="label label-info">Note</span> The JobManager process still reserves container cut-off memory; the related configuration options remain effective for the JobManager as before.
 
diff --git a/docs/ops/memory/mem_trouble.zh.md b/docs/ops/memory/mem_trouble.zh.md
index 0ecb5ad..52e08e8 100644
--- a/docs/ops/memory/mem_trouble.zh.md
+++ b/docs/ops/memory/mem_trouble.zh.md
@@ -28,14 +28,14 @@ under the License.
 ## IllegalConfigurationException
 
 If you encounter an *IllegalConfigurationException* thrown from *TaskExecutorProcessUtils*, it usually indicates that your configuration contains an invalid value (e.g., a negative memory size or a fraction greater than 1) or conflicting options.
-Based on the exception message, check the part of the [detailed memory model](mem_detail.html) that covers the failing memory component.
+Based on the exception message, check the part of the [detailed memory model](../config.html#memory-configuration) that covers the failing memory component.
 
 ## OutOfMemoryError: Java heap space
 
 This exception indicates that the JVM heap is too small.
 You can enlarge the JVM heap by increasing the [total memory](mem_setup.html#配置总内存) or the [task heap memory](mem_setup.html#任务算子堆内存).
 
-<span class="label label-info">Note</span> You can also increase the [framework heap memory](mem_detail.html#框架内存). This is an advanced option that should only be changed if you are sure the Flink framework itself needs more memory.
+<span class="label label-info">Note</span> You can also increase the [framework heap memory](mem_setup_tm.html#框架内存). This is an advanced option that should only be changed if you are sure the Flink framework itself needs more memory.
 
 ## OutOfMemoryError: Direct buffer memory
 
diff --git a/docs/ops/memory/mem_tuning.zh.md b/docs/ops/memory/mem_tuning.zh.md
index 9fb950d..eac6331 100644
--- a/docs/ops/memory/mem_tuning.zh.md
+++ b/docs/ops/memory/mem_tuning.zh.md
@@ -30,7 +30,7 @@ under the License.
 ## Memory Configuration for Standalone Deployment
 
 In [standalone deployment](../deployment/cluster_setup.html), we are usually more concerned with the memory used by the Flink application itself.
-It is recommended to configure the [total Flink memory](mem_setup.html#配置总内存) ([`taskmanager.memory.flink.size`](../config.html#taskmanager-memory-flink-size)) or its [components](mem_detail.html).
+It is recommended to configure the [total Flink memory](mem_setup.html#配置总内存) ([`taskmanager.memory.flink.size`](../config.html#taskmanager-memory-flink-size)) or its JobManager counterpart ([`jobmanager.memory.flink.size`](../config.html#jobmanager-memory-flink-size)).
 In addition, if you run into [insufficient Metaspace problems](mem_trouble.html#outofmemoryerror-metaspace), you can adjust the size of the *JVM Metaspace*.
 
 In this case there is usually no need to configure the *total process memory*, because neither Flink nor the deployment environment limits the *JVM overhead*; it only depends on the physical resources of the machine.
@@ -41,7 +41,6 @@ under the License.
 This configuration option specifies the total memory allocated to the Flink *JVM process*, i.e., the size of the container to request.
 
 <span class="label label-info">Note</span> If the *total Flink memory* is configured, Flink automatically adds the JVM-related memory components and requests a container sized to the derived *total process memory*.
-See the [detailed memory model](mem_detail.html).
 
 <div class="alert alert-warning">
   <strong>Note:</strong> If Flink or user code allocates unmanaged off-heap (native) memory beyond the container size, the deployment environment may kill a container that exceeds its memory budget, causing the job to fail.
diff --git a/docs/ops/python_shell.zh.md b/docs/ops/python_shell.zh.md
index e5f2a6c..2f561c7 100644
--- a/docs/ops/python_shell.zh.md
+++ b/docs/ops/python_shell.zh.md
@@ -27,7 +27,7 @@ Flink附带了一个集成的交互式Python Shell。
 To install Flink locally, see the [local installation](deployment/local.html) page.
 You can also build Flink from source; see the [Building Flink from Source](../flinkDev/building.html) page.
 
-<span class="label label-info">Note</span> The Python Shell runs the "python" command. For the requirements on the Python execution environment, please refer to the Python Table API [installation guide]({{ site.baseurl }}/dev/dev/table/python/installation.html).
+<span class="label label-info">Note</span> The Python Shell runs the "python" command. For the requirements on the Python execution environment, please refer to the Python Table API [installation guide]({{ site.baseurl }}/dev/table/python/installation.html).
 
 You can install PyFlink via PyPi and then use the Python Shell:
 
diff --git a/docs/ops/state/savepoints.md b/docs/ops/state/savepoints.md
index c235344..d1e07f2 100644
--- a/docs/ops/state/savepoints.md
+++ b/docs/ops/state/savepoints.md
@@ -27,7 +27,7 @@ under the License.
 
 ## What is a Savepoint? How is a Savepoint different from a Checkpoint?
 
-A Savepoint is a consistent image of the execution state of a streaming job, created via Flink's [checkpointing mechanism]({{ site.baseurl }}/internals/stream_checkpointing.html). You can use Savepoints to stop-and-resume, fork,
+A Savepoint is a consistent image of the execution state of a streaming job, created via Flink's [checkpointing mechanism]({{ site.baseurl }}/training/fault_tolerance.html). You can use Savepoints to stop-and-resume, fork,
 or update your Flink jobs. Savepoints consist of two parts: a directory with (typically large) binary files on stable storage (e.g. HDFS, S3, ...) and a (relatively small) meta data file. The files on stable storage represent the net data of the job's execution state
 image. The meta data file of a Savepoint contains (primarily) pointers to all files on stable storage that are part of the Savepoint, in form of absolute paths.
 
diff --git a/docs/ops/state/savepoints.zh.md b/docs/ops/state/savepoints.zh.md
index 6bdb9df..b8c52f7 100644
--- a/docs/ops/state/savepoints.zh.md
+++ b/docs/ops/state/savepoints.zh.md
@@ -27,7 +27,7 @@ under the License.
 
 ## What is a Savepoint? How is a Savepoint different from a Checkpoint?
 
-A Savepoint is a consistent image of the execution state of a streaming job, created via Flink's [checkpointing mechanism]({{ site.baseurl }}/zh/internals/stream_checkpointing.html). You can use Savepoints to stop-and-resume, fork, or update your Flink jobs. Savepoints consist of two parts: a directory with (typically large) binary files on stable storage (e.g. HDFS, S3, ...) and a (relatively small) metadata file. The files on stable storage represent the net data of the job's execution state image. The metadata file of a Savepoint contains (primarily) pointers, in the form of absolute paths, to all files on stable storage that are part of the Savepoint.
+A Savepoint is a consistent image of the execution state of a streaming job, created via Flink's [checkpointing mechanism]({{ site.baseurl }}/zh/training/fault_tolerance.html). You can use Savepoints to stop-and-resume, fork, or update your Flink jobs. Savepoints consist of two parts: a directory with (typically large) binary files on stable storage (e.g. HDFS, S3, ...) and a (relatively small) metadata file. The files on stable storage represent the net data of the job's execution state image. The metadata file of a Savepoint contains (primarily) pointers, in the form of absolute paths, to all files on stable storage that are part of the Savepoint.
 
 <div class="alert alert-warning">
 <strong>Note:</strong> To allow upgrades between programs and Flink versions, make sure to review the following section on <a href="#分配算子-id">assigning operator IDs</a>.