You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2020/05/14 08:06:51 UTC

[GitHub] [flink] morsapaes commented on a change in pull request #12057: [FLINK-16210][docs] Extending the "Flink Architecture" section.

morsapaes commented on a change in pull request #12057:
URL: https://github.com/apache/flink/pull/12057#discussion_r424946648



##########
File path: docs/concepts/flink-architecture.md
##########
@@ -61,13 +59,53 @@ frameworks like [YARN]({{ site.baseurl }}{% link ops/deployment/yarn_setup.md
 TaskManagers connect to Flink Masters, announcing themselves as available, and
 are assigned work.
 
-The *client* is not part of the runtime and program execution, but is used to
-prepare and send a dataflow to the Flink Master.  After that, the client can
-disconnect, or stay connected to receive progress reports. The client runs
-either as part of the Java/Scala program that triggers the execution, or in the
-command line process `./bin/flink run ...`.
+### Flink Master
+
+The _Flink Master_ coordinates the distributed execution of Flink Applications:
+it decides when to schedule the next task (or set of tasks), reacts to finished
+tasks or execution failures, coordinates checkpoints, coordinates recovery on
+failures, among others. This process consists of three different components:
+
+  * **ResourceManager** 
 
-<img src="{{ site.baseurl }}/fig/processes.svg" alt="The processes involved in executing a Flink dataflow" class="offset" width="80%" />
+    The _ResourceManager_ is responsible for resource de-/allocation and
+    provisioning in a Flink cluster — it manages TaskManager slots, Flink’s
+    smallest resource processing unit (see [Flink Workers](#flink-workers)).
+    Flink implements multiple ResourceManagers for different environments and
+    resource providers such as YARN, Mesos, Kubernetes and standalone
+    deployments. In a standalone setup, the ResourceManager can only distribute
+    the slots of available TaskManagers and cannot start new TaskManagers on
+    its own.  
+
+  * **Dispatcher** 
+
+    The _Dispatcher_ provides a REST interface to submit Flink applications for
+    execution and starts a new JobManager component for each submitted job. It
+    also runs the Flink WebUI to provide information about job executions.
+
+  * **JobManager** 
+
+    The _JobManager_ is responsible for managing the execution of a single
+    [JobGraph]({{ site.baseurl }}/concepts/glossary.html#logical-graph).
+    Multiple jobs can run simultaneously in a Flink cluster, each having its
+    own JobManager.
+
+There is always at least one Flink Master. A high-availability setup might have
+multiple Flink Masters, one of which is always the *leader*, and the others are
+*standby* (see [High Availability (HA)]({{ site.baseurl
+}}/ops/jobmanager_high_availability.html)).
+
+### Flink Workers
+
+The *TaskManagers* (also called *workers*) execute the tasks (or more
+specifically, the subtasks) of a dataflow, and buffer and exchange the data

Review comment:
       This was a carryover from the base that was already there. Agree and will remove.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org