You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2020/12/07 14:07:13 UTC

[GitHub] [flink] tillrohrmann commented on a change in pull request #14258: [FLINK-20356][docs] Updated Mesos documentation.

tillrohrmann commented on a change in pull request #14258:
URL: https://github.com/apache/flink/pull/14258#discussion_r537496569



##########
File path: docs/deployment/resource-providers/mesos.md
##########
@@ -26,226 +26,217 @@ under the License.
 * This will be replaced by the TOC
 {:toc}
 
-## Background
+## Getting Started
 
-The Mesos implementation consists of two components: The Application Master and
-the Worker. The workers are simple TaskManagers which are parameterized by the environment
-set up by the application master. The most sophisticated component of the Mesos
-implementation is the application master. The application master currently hosts
-the following components:
+This *Getting Started* section guides you through setting up a fully functional Flink Cluster on Mesos.
 
-### Mesos Scheduler
+### Introduction
 
-The scheduler is responsible for registering the framework with Mesos,
-requesting resources, and launching worker nodes. The scheduler continuously
-needs to report back to Mesos to ensure the framework is in a healthy state. To
-verify the health of the cluster, the scheduler monitors the spawned workers and
-marks them as failed and restarts them if necessary.
+[Apache Mesos](http://mesos.apache.org/) is another resource provider supported by 
+Apache Flink. Flink utilizes the worker's provided by Mesos to run its TaskManagers.
+Apache Flink provides the script `bin/mesos-appmaster.sh` to create the Flink 
+on Mesos cluster.
 
-Flink's Mesos scheduler itself is currently not highly available. However, it
-persists all necessary information about its state (e.g. configuration, list of
-workers) in Zookeeper. In the presence of a failure, it relies on an external
-system to bring up a new scheduler. The scheduler will then register with Mesos
-again and go through the reconciliation phase. In the reconciliation phase, the
-scheduler receives a list of running workers nodes. It matches these against the
-recovered information from Zookeeper and makes sure to bring back the cluster in
-the state before the failure.
 
-### Artifact Server
+### Preparation
 
-The artifact server is responsible for providing resources to the worker
-nodes. The resources can be anything from the Flink binaries to shared secrets
-or configuration files. For instance, in non-containerized environments, the
-artifact server will provide the Flink binaries. What files will be served
-depends on the configuration overlay used.
+Flink on Mesos expects a Mesos cluster to be around. It also requires the Flink binaries being deployed
+ontothe the Mesos master. Additionally, Hadoop needs to be installed on the very same machine.

Review comment:
       ```suggestion
   onto the the Mesos master. Additionally, Hadoop needs to be installed on the very same machine.
   ```

##########
File path: docs/deployment/resource-providers/mesos.md
##########
@@ -26,226 +26,217 @@ under the License.
 * This will be replaced by the TOC
 {:toc}
 
-## Background
+## Getting Started
 
-The Mesos implementation consists of two components: The Application Master and
-the Worker. The workers are simple TaskManagers which are parameterized by the environment
-set up by the application master. The most sophisticated component of the Mesos
-implementation is the application master. The application master currently hosts
-the following components:
+This *Getting Started* section guides you through setting up a fully functional Flink Cluster on Mesos.
 
-### Mesos Scheduler
+### Introduction
 
-The scheduler is responsible for registering the framework with Mesos,
-requesting resources, and launching worker nodes. The scheduler continuously
-needs to report back to Mesos to ensure the framework is in a healthy state. To
-verify the health of the cluster, the scheduler monitors the spawned workers and
-marks them as failed and restarts them if necessary.
+[Apache Mesos](http://mesos.apache.org/) is another resource provider supported by 
+Apache Flink. Flink utilizes the worker's provided by Mesos to run its TaskManagers.
+Apache Flink provides the script `bin/mesos-appmaster.sh` to create the Flink 
+on Mesos cluster.
 
-Flink's Mesos scheduler itself is currently not highly available. However, it
-persists all necessary information about its state (e.g. configuration, list of
-workers) in Zookeeper. In the presence of a failure, it relies on an external
-system to bring up a new scheduler. The scheduler will then register with Mesos
-again and go through the reconciliation phase. In the reconciliation phase, the
-scheduler receives a list of running workers nodes. It matches these against the
-recovered information from Zookeeper and makes sure to bring back the cluster in
-the state before the failure.
 
-### Artifact Server
+### Preparation
 
-The artifact server is responsible for providing resources to the worker
-nodes. The resources can be anything from the Flink binaries to shared secrets
-or configuration files. For instance, in non-containerized environments, the
-artifact server will provide the Flink binaries. What files will be served
-depends on the configuration overlay used.
+Flink on Mesos expects a Mesos cluster to be around. It also requires the Flink binaries being deployed
+ontothe the Mesos master. Additionally, Hadoop needs to be installed on the very same machine.
 
-### Flink's JobManager and Web Interface
+Flink provides `bin/mesos-appmaster.sh` to create a Flink on Mesos cluster. It will instantiate a 
+JobManager process on the Mesos master. The Mesos workers will be utilized to run the TaskManager 

Review comment:
       I think `bin/mesos-appmaster.sh` won't start a process on the Mesos master. Instead it will start a process using the `MesosSessionClusterEntrypoint` wherever this script is called.

##########
File path: docs/deployment/resource-providers/mesos.md
##########
@@ -26,226 +26,217 @@ under the License.
 * This will be replaced by the TOC
 {:toc}
 
-## Background
+## Getting Started
 
-The Mesos implementation consists of two components: The Application Master and
-the Worker. The workers are simple TaskManagers which are parameterized by the environment
-set up by the application master. The most sophisticated component of the Mesos
-implementation is the application master. The application master currently hosts
-the following components:
+This *Getting Started* section guides you through setting up a fully functional Flink Cluster on Mesos.
 
-### Mesos Scheduler
+### Introduction
 
-The scheduler is responsible for registering the framework with Mesos,
-requesting resources, and launching worker nodes. The scheduler continuously
-needs to report back to Mesos to ensure the framework is in a healthy state. To
-verify the health of the cluster, the scheduler monitors the spawned workers and
-marks them as failed and restarts them if necessary.
+[Apache Mesos](http://mesos.apache.org/) is another resource provider supported by 
+Apache Flink. Flink utilizes the worker's provided by Mesos to run its TaskManagers.
+Apache Flink provides the script `bin/mesos-appmaster.sh` to create the Flink 

Review comment:
       Technically, `mesos-appmaster.sh` starts the Mesos application master (JobManager with Mesos integration). The process runs wherever you call it (also outside of the Mesos cluster).

##########
File path: docs/deployment/resource-providers/mesos.md
##########
@@ -26,226 +26,217 @@ under the License.
 * This will be replaced by the TOC
 {:toc}
 
-## Background
+## Getting Started
 
-The Mesos implementation consists of two components: The Application Master and
-the Worker. The workers are simple TaskManagers which are parameterized by the environment
-set up by the application master. The most sophisticated component of the Mesos
-implementation is the application master. The application master currently hosts
-the following components:
+This *Getting Started* section guides you through setting up a fully functional Flink Cluster on Mesos.
 
-### Mesos Scheduler
+### Introduction
 
-The scheduler is responsible for registering the framework with Mesos,
-requesting resources, and launching worker nodes. The scheduler continuously
-needs to report back to Mesos to ensure the framework is in a healthy state. To
-verify the health of the cluster, the scheduler monitors the spawned workers and
-marks them as failed and restarts them if necessary.
+[Apache Mesos](http://mesos.apache.org/) is another resource provider supported by 
+Apache Flink. Flink utilizes the worker's provided by Mesos to run its TaskManagers.
+Apache Flink provides the script `bin/mesos-appmaster.sh` to create the Flink 
+on Mesos cluster.
 
-Flink's Mesos scheduler itself is currently not highly available. However, it
-persists all necessary information about its state (e.g. configuration, list of
-workers) in Zookeeper. In the presence of a failure, it relies on an external
-system to bring up a new scheduler. The scheduler will then register with Mesos
-again and go through the reconciliation phase. In the reconciliation phase, the
-scheduler receives a list of running workers nodes. It matches these against the
-recovered information from Zookeeper and makes sure to bring back the cluster in
-the state before the failure.
 
-### Artifact Server
+### Preparation
 
-The artifact server is responsible for providing resources to the worker
-nodes. The resources can be anything from the Flink binaries to shared secrets
-or configuration files. For instance, in non-containerized environments, the
-artifact server will provide the Flink binaries. What files will be served
-depends on the configuration overlay used.
+Flink on Mesos expects a Mesos cluster to be around. It also requires the Flink binaries being deployed
+ontothe the Mesos master. Additionally, Hadoop needs to be installed on the very same machine.
 
-### Flink's JobManager and Web Interface
+Flink provides `bin/mesos-appmaster.sh` to create a Flink on Mesos cluster. It will instantiate a 
+JobManager process on the Mesos master. The Mesos workers will be utilized to run the TaskManager 
+processes.
 
-The JobManager and the web interface provide a central point for monitoring,
-job submission, and other client interaction with the cluster
-(see [FLIP-6](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65147077)).
-
-### Startup script and configuration overlays
-
-The startup script provide a way to configure and start the application
-master. All further configuration is then inherited by the workers nodes. This
-is achieved using configuration overlays. Configuration overlays provide a way
-to infer configuration from environment variables and config files which are
-shipped to the worker nodes.
-
-
-## DC/OS
-
-This section refers to [DC/OS](https://dcos.io) which is a Mesos distribution
-with a sophisticated application management layer. It comes pre-installed with
-Marathon, a service to supervise applications and maintain their state in case
-of failures.
-
-If you don't have a running DC/OS cluster, please follow the
-[instructions on how to install DC/OS on the official website](https://dcos.io/install/).
-
-Once you have a DC/OS cluster, you may install Flink through the DC/OS
-Universe. In the search prompt, just search for Flink. Alternatively, you can use the DC/OS CLI:
-
-    dcos package install flink
-
-Further information can be found in the
-[DC/OS examples documentation](https://github.com/dcos/examples/tree/master/1.8/flink).
-
-
-## Mesos without DC/OS
-
-You can also run Mesos without DC/OS.
-
-### Installing Mesos
-
-Please follow the [instructions on how to setup Mesos on the official website](http://mesos.apache.org/getting-started/).
-
-After installation you have to configure the set of master and agent nodes by creating the files `MESOS_HOME/etc/mesos/masters` and `MESOS_HOME/etc/mesos/slaves`.
-These files contain in each row a single hostname on which the respective component will be started (assuming SSH access to these nodes).
-
-Next you have to create `MESOS_HOME/etc/mesos/mesos-master-env.sh` or use the template found in the same directory.
-In this file, you have to define
-
-    export MESOS_work_dir=WORK_DIRECTORY
-
-and it is recommended to uncommment
-
-    export MESOS_log_dir=LOGGING_DIRECTORY
-
-
-In order to configure the Mesos agents, you have to create `MESOS_HOME/etc/mesos/mesos-agent-env.sh` or use the template found in the same directory.
-You have to configure
-
-    export MESOS_master=MASTER_HOSTNAME:MASTER_PORT
-
-and uncomment
-
-    export MESOS_log_dir=LOGGING_DIRECTORY
-    export MESOS_work_dir=WORK_DIRECTORY
-
-#### Mesos Library
-
-In order to run Java applications with Mesos you have to export `MESOS_NATIVE_JAVA_LIBRARY=MESOS_HOME/lib/libmesos.so` on Linux.
-Under Mac OS X you have to export `MESOS_NATIVE_JAVA_LIBRARY=MESOS_HOME/lib/libmesos.dylib`.
-
-#### Deploying Mesos
-
-In order to start your mesos cluster, use the deployment script `MESOS_HOME/sbin/mesos-start-cluster.sh`.
-In order to stop your mesos cluster, use the deployment script `MESOS_HOME/sbin/mesos-stop-cluster.sh`.
-More information about the deployment scripts can be found [here](http://mesos.apache.org/documentation/latest/deploy-scripts/).
-
-### Installing Marathon
-
-Optionally, you may also [install Marathon](https://mesosphere.github.io/marathon/docs/) which enables you to run Flink in [high availability (HA) mode](#high-availability).
-
-### Pre-installing Flink vs Docker/Mesos containers
-
-You may install Flink on all of your Mesos Master and Agent nodes.
-You can also pull the binaries from the Flink web site during deployment and apply your custom configuration before launching the application master.
-A more convenient and easier to maintain approach is to use Docker containers to manage the Flink binaries and configuration.
-
-This is controlled via the following configuration entries:
-
-    mesos.resourcemanager.tasks.container.type: mesos _or_ docker
-
-If set to 'docker', specify the image name:
-
-    mesos.resourcemanager.tasks.container.image.name: image_name
+For `bin/mesos-appmaster.sh` to work, you have to set the two variables `HADOOP_CLASSPATH` and 
+`MESOS_NATIVE_JAVA_LIBRARY`:
 
+{% highlight bash %}
+export HADOOP_CLASSPATH=$(hadoop classpath)
+export MESOS_NATIVE_JAVA_LIBRARY=/path/to/lib/libmesos.so
+{% endhighlight %}
 
-### Flink session cluster on Mesos
+`MESOS_NATIVE_JAVA_LIBRARY` needs to point to Mesos' native Java library. The library name `libmesos.so` 
+used above refers to Mesos' Linux library. Running Mesos on MacOS would require you to use 
+`libmesos.dylib` instead.
 
-A Flink session cluster is executed as a long-running Mesos Deployment. Note that you can run multiple Flink jobs on a session cluster. Each job needs to be submitted to the cluster after the cluster has been deployed.
+### Starting a Flink Session on Mesos
 
-In the `/bin` directory of the Flink distribution, you find two startup scripts
-which manage the Flink processes in a Mesos cluster:
+Connect to the Mesos workers, change into Flink's home directory and call `bin/mesos-appmaster.sh`:
 
-1. `mesos-appmaster.sh`
-   This starts the Mesos application master which will register the Mesos scheduler.
-   It is also responsible for starting up the worker nodes.
+{% highlight bash %}
+# (0) set required environment variables
+export HADOOP_CLASSPATH=$(hadoop classpath)
+export MESOS_NATIVE_JAVA_LIBRARY=/path/to/lib/libmesos.so
 
-2. `mesos-taskmanager.sh`
-   The entry point for the Mesos worker processes.
-   You don't need to explicitly execute this script.
-   It is automatically launched by the Mesos worker node to bring up a new TaskManager.
+# (1) create Flink on Mesos cluster
+./bin/mesos-appmaster.sh \
+    -Dmesos.master=$MESOS_MASTER:5050 \
+    -Djobmanager.rpc.address=$MESOS_MASTER \
+    -Dmesos.resourcemanager.framework.user=$FLINK_USER \
+    -Dmesos.resourcemanager.tasks.cpus=6
+{% endhighlight %}
 
-In order to run the `mesos-appmaster.sh` script you have to define `mesos.master` in the `flink-conf.yaml` or pass it via `-Dmesos.master=...` to the Java process.
+The call above uses two variables not introduced, yet, as they depend on the cluster:
+* `MESOS_MASTER` refers to the Mesos master's IP address or hostname. It's important to not use `localhost` 
+  or `127.0.0.1` as the corresponding parameters are being shared with the Mesos cluster and the TaskManagers.
+* `FLINK_USER` refers to the user that owns the Mesos master's Flink installation directory (see Mesos' 
+documentation on [specifying a user](http://mesos.apache.org/documentation/latest/fetcher/#specifying-a-user-name)
+for further details).
 
-When executing `mesos-appmaster.sh`, it will create a job manager on the machine where you executed the script.
-In contrast to that, the task managers will be run as Mesos tasks in the Mesos cluster.
+The Flink on Mesos cluster is now deployed in [Session Mode]({% link deployment/index.md %}#session-mode).
+Note that you can run multiple Flink jobs on a Session cluster. Each job needs to be submitted to the 
+cluster. TaskManagers are deployed on the Mesos workers as needed. Keep in mind that you can only run as 
+many jobs as the Mesos cluster allows in terms of resources provided by the Mesos workers. Play around 
+with Flink's parameters to find the right resource utilization for your needs.
 
-### Flink job cluster on Mesos
+Check out [Flink's Mesos configuration]({% link deployment/config.md %}#mesos) to further influence 
+the resources Flink on Mesos is going to allocate.
 
-A Flink job cluster is a dedicated cluster which runs a single job.
-There is no extra job submission needed.
+## Deployment Modes Supported by Flink on Mesos
 
-In the `/bin` directory of the Flink distribution, you find one startup script
-which manage the Flink processes in a Mesos cluster:
+For production use, we recommend deploying Flink Applications in the 
+[Per-Job Mode]({% link deployment/index.md %}#per-job-mode), as it provides a better isolation 
+for each job.
 
-1. `mesos-appmaster-job.sh`
-   This starts the Mesos application master which will register the Mesos scheduler, retrieve the job graph and then launch the task managers accordingly.
+### Application Mode
 
-In order to run the `mesos-appmaster-job.sh` script you have to define `mesos.master` and `internal.jobgraph-path` in the `flink-conf.yaml`
-or pass it via `-Dmesos.master=... -Dinterval.jobgraph-path=...` to the Java process.
+Flink on Mesos does not support Application Mode.
 
-The job graph file may be generated like this way:
+### Per-Job Cluster Mode
 
+A job which is executed in Per-Job Cluster Mode spins up a dedicated Flink cluster that is only 
+used for that specific job. No extra job submission is needed. `bin/mesos-appmaster-job.sh` is used 
+as the startup script. It will start a Flink cluster for a dedicated job which is passed as a 
+JobGraph file. This file can be created by applying the following code to your Job source code:
 {% highlight java %}
 final JobGraph jobGraph = env.getStreamGraph().getJobGraph();
 final String jobGraphFilename = "job.graph";
 File jobGraphFile = new File(jobGraphFilename);
 try (FileOutputStream output = new FileOutputStream(jobGraphFile);
-	ObjectOutputStream obOutput = new ObjectOutputStream(output)){
-	obOutput.writeObject(jobGraph);
+    ObjectOutputStream obOutput = new ObjectOutputStream(output)){
+    obOutput.writeObject(jobGraph);
 }
 {% endhighlight %}
 
-<span class="label label-info">Note</span> Make sure that all Mesos processes have the user code jar on the classpath. There are two ways:
-
-1. One way is putting them in the `lib/` directory, which will result in the user code jar being loaded by the system classloader.
-1. The other way is creating a `usrlib/` directory in the parent directory of `lib/` and putting the user code jar in the `usrlib/` directory.
-After launching a job cluster via `bin/mesos-appmaster-job.sh ...`, the user code jar will be loaded by the user code classloader.
-
-#### General configuration
-
-It is possible to completely parameterize a Mesos application through Java properties passed to the Mesos application master.
-This also allows to specify general Flink configuration parameters.
-For example:
-
-    bin/mesos-appmaster.sh \
-        -Dmesos.master=master.foobar.org:5050 \
-        -Djobmanager.memory.process.size=1472m \
-        -Djobmanager.rpc.port=6123 \
-        -Drest.port=8081 \
-        -Dtaskmanager.memory.process.size=3500m \
-        -Dtaskmanager.numberOfTaskSlots=2 \
-        -Dparallelism.default=10
-
-### High Availability
-
-You will need to run a service like Marathon or Apache Aurora which takes care of restarting the JobManager process in case of node or process failures.
-In addition, Zookeeper needs to be configured like described in the [High Availability section of the Flink docs]({% link deployment/ha/index.md %}).
+Flink on Mesos Per-Job cluster can be started in the following way:
+{% highlight bash %}
+# (0) set required environment variables
+export HADOOP_CLASSPATH=$(hadoop classpath)
+export MESOS_NATIVE_JAVA_LIBRARY=/path/to/lib/libmesos.so
+
+# (1) create Per-Job Flink on Mesos cluster
+./bin/mesos-appmaster-job.sh \
+    -Dmesos.master=$MESOS_MASTER:5050 \
+    -Djobmanager.rpc.address=$MESOS_MASTER \
+    -Dmesos.resourcemanager.framework.user=$FLINK_USER \
+    -Dinternal.jobgraph-path=$JOB_GRAPH_FILE
+{% endhighlight %} 
+
+`JOB_GRAPH_FILE` in the command above refers to the path of the uploaded JobGraph file defining the 
+job that shall be executed on the Per-Job Flink cluster. The meaning of `MESOS_MASTER` and `FLINK_USER` 
+are described in the [Getting Started](#starting-a-flink-session-on-mesos) guide of this page.
+
+### Session Mode
+
+The [Getting Started](#starting-a-flink-session-on-mesos) guide at the top of this page describes 
+deploying Flink in Session Mode.
+
+## Flink on Mesos Reference
+
+### Flink on Mesos Architecture
+
+The Flink on Mesos implementation consists of two components: The application master and the workers. 
+The workers are simple TaskManagers parameterized by the environment which is set up through the 
+application master. The most sophisticated component of the Flink on Mesos implementation is the 
+application master. The application master currently hosts the following components:
+- **Mesos Scheduler**: The Scheduler is responsible for registering a framework with Mesos, requesting 
+  resources, and launching worker nodes. The Scheduler continuously needs to report back to Mesos to 
+  ensure the framework is in a healthy state. To verify the health of the cluster, the Scheduler 
+  monitors the spawned workers, marks them as failed and restarts them if necessary.
+
+  Flink's Mesos Scheduler itself is currently not highly available. However, it persists all necessary 
+  information about its state (e.g. configuration, list of workers) in [ZooKeeper](#high-availability-on-mesos). 
+  In the presence of a failure, it relies on an external system to bring up a new Scheduler (see the 
+  [Marathon subsection](#marathon) for further details). The Scheduler will then register with Mesos 
+  again and go through the reconciliation phase. In the reconciliation phase, the Scheduler receives 
+  a list of running workers nodes. It matches these against the recovered information from ZooKeeper 
+  and makes sure to bring back the cluster in the state before the failure.
+- **Artifact Server**: The Artifact Server is responsible for providing resources to the worker nodes. 
+  The resources can be anything from the Flink binaries to shared secrets or configuration files. 
+  For instance, in non-containerized environments, the Artifact Server will provide the Flink binaries. 
+  What files will be served depends on the configuration overlay used.
+
+Flink's Mesos startup scripts `bin/mesos-appmaster.sh` and `bin/mesos-appmaster-job.sh` provide a way 
+to configure and start the application master. The worker nodes inherit all further configuration. 
+They are deployed through `bin/mesos-taskmanager.sh`. The configuration inheritance is achieved using 
+configuration overlays. Configuration overlays provide a way to infer a configuration from environment 
+variables and config files which are shipped to the worker nodes.
+
+See [Mesos Architecture](http://mesos.apache.org/documentation/latest/architecture/) for a more details 
+on how frameworks are handled by Mesos.
+
+### Deploying User Libraries
+
+User libraries can be passed to the Mesos workers by placing them in Flink's `lib/` folder. This way, 
+they will be picked by Mesos' Fetcher and copied over into the worker's sandbox folders. Alternatively, 
+Docker containerization can be used as described in [Installing Flink on the Workers](#installing-flink-on-the-workers).
+
+### Installing Flink on the Workers
+
+Flink on Mesos offers two ways to distribute the Flink and user binaries within the Mesos cluster:
+1. **Using Mesos' Artifact Server**: The Artifact Server provides the resources which are moved by 
+   [Mesos' Fetcher](http://mesos.apache.org/documentation/latest/fetcher/) into the Mesos worker's 
+   [sandbox folders](http://mesos.apache.org/documentation/latest/sandbox/). It can be explicitly 
+   specified by setting [mesos.resourcemanager.tasks.container.type]({% link deployment/config.md %}#mesos-resourcemanager-tasks-container-type) 
+   to `mesos`. This is the default option and is used in the example commands of this page.
+2. **Using Docker containerization**: This enables the user to provide user libraries and other 
+   customizations as part of a Docker image. Docker utilization can be enabled by setting 
+   [mesos.resourcemanager.tasks.container.type]({% link deployment/config.md %}#mesos-resourcemanager-tasks-container-type) 
+   to `docker` and by providing the image name through [mesos.resourcemanager.tasks.container.image.name]({% link deployment/config.md %}#mesos-resourcemanager-tasks-container-image-name).
+
+### High Availability on Mesos
+
+You will need to run a service like Marathon or Apache Aurora which takes care of restarting the 
+JobManager process in case of node or process failures. In addition, Zookeeper needs to be configured 
+as described in the [High Availability section of the Flink docs]({% link deployment/ha/index.md %}).
 
 #### Marathon
 
-Marathon needs to be set up to launch the `bin/mesos-appmaster.sh` script.
-In particular, it should also adjust any configuration parameters for the Flink cluster.
+Marathon needs to be set up to launch the `bin/mesos-appmaster.sh` script. In particular, it should 
+also adjust any configuration parameters for the Flink cluster.
 
 Here is an example configuration for Marathon:
+{% highlight javascript %}
+{
+    "id": "flink",
+    "cmd": "/opt/flink-1.11.2/bin/mesos-appmaster.sh -Dmesos.resourcemanager.framework.user=root -Dmesos.resourcemanager.tasks.taskmanager-cmd=/opt/flink-1.11.2/bin/mesos-taskmanager.sh -Dmesos.master=master:5050 -Djobmanager.memory.process.size=1472m -Dtaskmanager.memory.process.size=3500m -Dtaskmanager.numberOfTaskSlots=2 -Dparallelism.default=2",
+    "cpus": 2,
+    "mem": 1024,
+    "disk": 0,
+    "instances": 1,
+    "env": {
+        "MESOS_NATIVE_JAVA_LIBRARY": "/usr/lib/libmesos.so",
+        "HADOOP_CLASSPATH": "/opt/hadoop-2.10.1/etc/hadoop:/opt/hadoop-2.10.1/share/hadoop/common/lib/*:/opt/hadoop-2.10.1/share/hadoop/common/*:/opt/hadoop-2.10.1/share/hadoop/hdfs:/opt/hadoop-2.10.1/share/hadoop/hdfs/lib/*:/opt/hadoop-2.10.1/share/hadoop/hdfs/*:/opt/hadoop-2.10.1/share/hadoop/yarn:/opt/hadoop-2.10.1/share/hadoop/yarn/lib/*:/opt/hadoop-2.10.1/share/hadoop/yarn/*:/opt/hadoop-2.10.1/share/hadoop/mapreduce/lib/*:/opt/hadoop-2.10.1/share/hadoop/mapreduce/*:/contrib/capacity-scheduler/*.jar"
+    },
+    "healthChecks": [
+        {
+            "protocol": "HTTP",
+            "path": "/",
+            "port": 8081,
+            "gracePeriodSeconds": 300,
+            "intervalSeconds": 60,
+            "timeoutSeconds": 20,
+            "maxConsecutiveFailures": 3
+        }
+    ],
+    "user": "root"
+}
+Flink is installed into `/opt/flink-1.11.2` for this example having `root` as the owner of the Flink 

Review comment:
       I think we can refer to the current version via
   ```suggestion
   Flink is installed into `/opt/flink-{{site.version}}` for this example having `root` as the owner of the Flink 
   ```
   
   The same applies to the Marathon configuration provided above.

##########
File path: docs/deployment/resource-providers/mesos.md
##########
@@ -26,226 +26,217 @@ under the License.
 * This will be replaced by the TOC
 {:toc}
 
-## Background
+## Getting Started
 
-The Mesos implementation consists of two components: The Application Master and
-the Worker. The workers are simple TaskManagers which are parameterized by the environment
-set up by the application master. The most sophisticated component of the Mesos
-implementation is the application master. The application master currently hosts
-the following components:
+This *Getting Started* section guides you through setting up a fully functional Flink Cluster on Mesos.
 
-### Mesos Scheduler
+### Introduction
 
-The scheduler is responsible for registering the framework with Mesos,
-requesting resources, and launching worker nodes. The scheduler continuously
-needs to report back to Mesos to ensure the framework is in a healthy state. To
-verify the health of the cluster, the scheduler monitors the spawned workers and
-marks them as failed and restarts them if necessary.
+[Apache Mesos](http://mesos.apache.org/) is another resource provider supported by 
+Apache Flink. Flink utilizes the worker's provided by Mesos to run its TaskManagers.
+Apache Flink provides the script `bin/mesos-appmaster.sh` to create the Flink 
+on Mesos cluster.
 
-Flink's Mesos scheduler itself is currently not highly available. However, it
-persists all necessary information about its state (e.g. configuration, list of
-workers) in Zookeeper. In the presence of a failure, it relies on an external
-system to bring up a new scheduler. The scheduler will then register with Mesos
-again and go through the reconciliation phase. In the reconciliation phase, the
-scheduler receives a list of running workers nodes. It matches these against the
-recovered information from Zookeeper and makes sure to bring back the cluster in
-the state before the failure.
 
-### Artifact Server
+### Preparation
 
-The artifact server is responsible for providing resources to the worker
-nodes. The resources can be anything from the Flink binaries to shared secrets
-or configuration files. For instance, in non-containerized environments, the
-artifact server will provide the Flink binaries. What files will be served
-depends on the configuration overlay used.
+Flink on Mesos expects a Mesos cluster to be around. It also requires the Flink binaries being deployed
+ontothe the Mesos master. Additionally, Hadoop needs to be installed on the very same machine.
 
-### Flink's JobManager and Web Interface
+Flink provides `bin/mesos-appmaster.sh` to create a Flink on Mesos cluster. It will instantiate a 
+JobManager process on the Mesos master. The Mesos workers will be utilized to run the TaskManager 
+processes.
 
-The JobManager and the web interface provide a central point for monitoring,
-job submission, and other client interaction with the cluster
-(see [FLIP-6](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65147077)).
-
-### Startup script and configuration overlays
-
-The startup script provide a way to configure and start the application
-master. All further configuration is then inherited by the workers nodes. This
-is achieved using configuration overlays. Configuration overlays provide a way
-to infer configuration from environment variables and config files which are
-shipped to the worker nodes.
-
-
-## DC/OS
-
-This section refers to [DC/OS](https://dcos.io) which is a Mesos distribution
-with a sophisticated application management layer. It comes pre-installed with
-Marathon, a service to supervise applications and maintain their state in case
-of failures.
-
-If you don't have a running DC/OS cluster, please follow the
-[instructions on how to install DC/OS on the official website](https://dcos.io/install/).
-
-Once you have a DC/OS cluster, you may install Flink through the DC/OS
-Universe. In the search prompt, just search for Flink. Alternatively, you can use the DC/OS CLI:
-
-    dcos package install flink
-
-Further information can be found in the
-[DC/OS examples documentation](https://github.com/dcos/examples/tree/master/1.8/flink).
-
-
-## Mesos without DC/OS
-
-You can also run Mesos without DC/OS.
-
-### Installing Mesos
-
-Please follow the [instructions on how to setup Mesos on the official website](http://mesos.apache.org/getting-started/).
-
-After installation you have to configure the set of master and agent nodes by creating the files `MESOS_HOME/etc/mesos/masters` and `MESOS_HOME/etc/mesos/slaves`.
-These files contain in each row a single hostname on which the respective component will be started (assuming SSH access to these nodes).
-
-Next you have to create `MESOS_HOME/etc/mesos/mesos-master-env.sh` or use the template found in the same directory.
-In this file, you have to define
-
-    export MESOS_work_dir=WORK_DIRECTORY
-
-and it is recommended to uncommment
-
-    export MESOS_log_dir=LOGGING_DIRECTORY
-
-
-In order to configure the Mesos agents, you have to create `MESOS_HOME/etc/mesos/mesos-agent-env.sh` or use the template found in the same directory.
-You have to configure
-
-    export MESOS_master=MASTER_HOSTNAME:MASTER_PORT
-
-and uncomment
-
-    export MESOS_log_dir=LOGGING_DIRECTORY
-    export MESOS_work_dir=WORK_DIRECTORY
-
-#### Mesos Library
-
-In order to run Java applications with Mesos you have to export `MESOS_NATIVE_JAVA_LIBRARY=MESOS_HOME/lib/libmesos.so` on Linux.
-Under Mac OS X you have to export `MESOS_NATIVE_JAVA_LIBRARY=MESOS_HOME/lib/libmesos.dylib`.
-
-#### Deploying Mesos
-
-In order to start your mesos cluster, use the deployment script `MESOS_HOME/sbin/mesos-start-cluster.sh`.
-In order to stop your mesos cluster, use the deployment script `MESOS_HOME/sbin/mesos-stop-cluster.sh`.
-More information about the deployment scripts can be found [here](http://mesos.apache.org/documentation/latest/deploy-scripts/).
-
-### Installing Marathon
-
-Optionally, you may also [install Marathon](https://mesosphere.github.io/marathon/docs/) which enables you to run Flink in [high availability (HA) mode](#high-availability).
-
-### Pre-installing Flink vs Docker/Mesos containers
-
-You may install Flink on all of your Mesos Master and Agent nodes.
-You can also pull the binaries from the Flink web site during deployment and apply your custom configuration before launching the application master.
-A more convenient and easier to maintain approach is to use Docker containers to manage the Flink binaries and configuration.
-
-This is controlled via the following configuration entries:
-
-    mesos.resourcemanager.tasks.container.type: mesos _or_ docker
-
-If set to 'docker', specify the image name:
-
-    mesos.resourcemanager.tasks.container.image.name: image_name
+For `bin/mesos-appmaster.sh` to work, you have to set the two variables `HADOOP_CLASSPATH` and 
+`MESOS_NATIVE_JAVA_LIBRARY`:
 
+{% highlight bash %}
+export HADOOP_CLASSPATH=$(hadoop classpath)
+export MESOS_NATIVE_JAVA_LIBRARY=/path/to/lib/libmesos.so
+{% endhighlight %}
 
-### Flink session cluster on Mesos
+`MESOS_NATIVE_JAVA_LIBRARY` needs to point to Mesos' native Java library. The library name `libmesos.so` 
+used above refers to Mesos' Linux library. Running Mesos on MacOS would require you to use 
+`libmesos.dylib` instead.
 
-A Flink session cluster is executed as a long-running Mesos Deployment. Note that you can run multiple Flink jobs on a session cluster. Each job needs to be submitted to the cluster after the cluster has been deployed.
+### Starting a Flink Session on Mesos
 
-In the `/bin` directory of the Flink distribution, you find two startup scripts
-which manage the Flink processes in a Mesos cluster:
+Connect to the Mesos workers, change into Flink's home directory and call `bin/mesos-appmaster.sh`:
 
-1. `mesos-appmaster.sh`
-   This starts the Mesos application master which will register the Mesos scheduler.
-   It is also responsible for starting up the worker nodes.
+{% highlight bash %}
+# (0) set required environment variables
+export HADOOP_CLASSPATH=$(hadoop classpath)
+export MESOS_NATIVE_JAVA_LIBRARY=/path/to/lib/libmesos.so
 
-2. `mesos-taskmanager.sh`
-   The entry point for the Mesos worker processes.
-   You don't need to explicitly execute this script.
-   It is automatically launched by the Mesos worker node to bring up a new TaskManager.
+# (1) create Flink on Mesos cluster
+./bin/mesos-appmaster.sh \
+    -Dmesos.master=$MESOS_MASTER:5050 \
+    -Djobmanager.rpc.address=$MESOS_MASTER \
+    -Dmesos.resourcemanager.framework.user=$FLINK_USER \
+    -Dmesos.resourcemanager.tasks.cpus=6
+{% endhighlight %}
 
-In order to run the `mesos-appmaster.sh` script you have to define `mesos.master` in the `flink-conf.yaml` or pass it via `-Dmesos.master=...` to the Java process.
+The call above uses two variables not introduced, yet, as they depend on the cluster:
+* `MESOS_MASTER` refers to the Mesos master's IP address or hostname. It's important to not use `localhost` 
+  or `127.0.0.1` as the corresponding parameters are being shared with the Mesos cluster and the TaskManagers.
+* `FLINK_USER` refers to the user that owns the Mesos master's Flink installation directory (see Mesos' 
+documentation on [specifying a user](http://mesos.apache.org/documentation/latest/fetcher/#specifying-a-user-name)
+for further details).
 
-When executing `mesos-appmaster.sh`, it will create a job manager on the machine where you executed the script.
-In contrast to that, the task managers will be run as Mesos tasks in the Mesos cluster.
+The Flink on Mesos cluster is now deployed in [Session Mode]({% link deployment/index.md %}#session-mode).
+Note that you can run multiple Flink jobs on a Session cluster. Each job needs to be submitted to the 
+cluster. TaskManagers are deployed on the Mesos workers as needed. Keep in mind that you can only run as 
+many jobs as the Mesos cluster allows in terms of resources provided by the Mesos workers. Play around 
+with Flink's parameters to find the right resource utilization for your needs.
 
-### Flink job cluster on Mesos
+Check out [Flink's Mesos configuration]({% link deployment/config.md %}#mesos) to further influence 
+the resources Flink on Mesos is going to allocate.
 
-A Flink job cluster is a dedicated cluster which runs a single job.
-There is no extra job submission needed.
+## Deployment Modes Supported by Flink on Mesos
 
-In the `/bin` directory of the Flink distribution, you find one startup script
-which manage the Flink processes in a Mesos cluster:
+For production use, we recommend deploying Flink Applications in the 
+[Per-Job Mode]({% link deployment/index.md %}#per-job-mode), as it provides a better isolation 
+for each job.
 
-1. `mesos-appmaster-job.sh`
-   This starts the Mesos application master which will register the Mesos scheduler, retrieve the job graph and then launch the task managers accordingly.
+### Application Mode
 
-In order to run the `mesos-appmaster-job.sh` script you have to define `mesos.master` and `internal.jobgraph-path` in the `flink-conf.yaml`
-or pass it via `-Dmesos.master=... -Dinterval.jobgraph-path=...` to the Java process.
+Flink on Mesos does not support Application Mode.
 
-The job graph file may be generated like this way:
+### Per-Job Cluster Mode
 
+A job which is executed in Per-Job Cluster Mode spins up a dedicated Flink cluster that is only 
+used for that specific job. No extra job submission is needed. `bin/mesos-appmaster-job.sh` is used 
+as the startup script. It will start a Flink cluster for a dedicated job which is passed as a 
+JobGraph file. This file can be created by applying the following code to your Job source code:
 {% highlight java %}
 final JobGraph jobGraph = env.getStreamGraph().getJobGraph();
 final String jobGraphFilename = "job.graph";
 File jobGraphFile = new File(jobGraphFilename);
 try (FileOutputStream output = new FileOutputStream(jobGraphFile);
-	ObjectOutputStream obOutput = new ObjectOutputStream(output)){
-	obOutput.writeObject(jobGraph);
+    ObjectOutputStream obOutput = new ObjectOutputStream(output)){
+    obOutput.writeObject(jobGraph);
 }
 {% endhighlight %}
 
-<span class="label label-info">Note</span> Make sure that all Mesos processes have the user code jar on the classpath. There are two ways:
-
-1. One way is putting them in the `lib/` directory, which will result in the user code jar being loaded by the system classloader.
-1. The other way is creating a `usrlib/` directory in the parent directory of `lib/` and putting the user code jar in the `usrlib/` directory.
-After launching a job cluster via `bin/mesos-appmaster-job.sh ...`, the user code jar will be loaded by the user code classloader.
-
-#### General configuration
-
-It is possible to completely parameterize a Mesos application through Java properties passed to the Mesos application master.
-This also allows to specify general Flink configuration parameters.
-For example:
-
-    bin/mesos-appmaster.sh \
-        -Dmesos.master=master.foobar.org:5050 \
-        -Djobmanager.memory.process.size=1472m \
-        -Djobmanager.rpc.port=6123 \
-        -Drest.port=8081 \
-        -Dtaskmanager.memory.process.size=3500m \
-        -Dtaskmanager.numberOfTaskSlots=2 \
-        -Dparallelism.default=10
-
-### High Availability
-
-You will need to run a service like Marathon or Apache Aurora which takes care of restarting the JobManager process in case of node or process failures.
-In addition, Zookeeper needs to be configured like described in the [High Availability section of the Flink docs]({% link deployment/ha/index.md %}).
+Flink on Mesos Per-Job cluster can be started in the following way:
+{% highlight bash %}
+# (0) set required environment variables
+export HADOOP_CLASSPATH=$(hadoop classpath)
+export MESOS_NATIVE_JAVA_LIBRARY=/path/to/lib/libmesos.so
+
+# (1) create Per-Job Flink on Mesos cluster
+./bin/mesos-appmaster-job.sh \
+    -Dmesos.master=$MESOS_MASTER:5050 \
+    -Djobmanager.rpc.address=$MESOS_MASTER \
+    -Dmesos.resourcemanager.framework.user=$FLINK_USER \
+    -Dinternal.jobgraph-path=$JOB_GRAPH_FILE
+{% endhighlight %} 
+
+`JOB_GRAPH_FILE` in the command above refers to the path of the uploaded JobGraph file defining the 
+job that shall be executed on the Per-Job Flink cluster. The meaning of `MESOS_MASTER` and `FLINK_USER` 
+are described in the [Getting Started](#starting-a-flink-session-on-mesos) guide of this page.
+
+### Session Mode
+
+The [Getting Started](#starting-a-flink-session-on-mesos) guide at the top of this page describes 
+deploying Flink in Session Mode.
+
+## Flink on Mesos Reference
+
+### Flink on Mesos Architecture
+
+The Flink on Mesos implementation consists of two components: The application master and the workers. 
+The workers are simple TaskManagers parameterized by the environment which is set up through the 
+application master. The most sophisticated component of the Flink on Mesos implementation is the 
+application master. The application master currently hosts the following components:
+- **Mesos Scheduler**: The Scheduler is responsible for registering a framework with Mesos, requesting 
+  resources, and launching worker nodes. The Scheduler continuously needs to report back to Mesos to 
+  ensure the framework is in a healthy state. To verify the health of the cluster, the Scheduler 
+  monitors the spawned workers, marks them as failed and restarts them if necessary.
+
+  Flink's Mesos Scheduler itself is currently not highly available. However, it persists all necessary 
+  information about its state (e.g. configuration, list of workers) in [ZooKeeper](#high-availability-on-mesos). 
+  In the presence of a failure, it relies on an external system to bring up a new Scheduler (see the 
+  [Marathon subsection](#marathon) for further details). The Scheduler will then register with Mesos 
+  again and go through the reconciliation phase. In the reconciliation phase, the Scheduler receives 
+  a list of running workers nodes. It matches these against the recovered information from ZooKeeper 
+  and makes sure to bring back the cluster in the state before the failure.
+- **Artifact Server**: The Artifact Server is responsible for providing resources to the worker nodes. 
+  The resources can be anything from the Flink binaries to shared secrets or configuration files. 
+  For instance, in non-containerized environments, the Artifact Server will provide the Flink binaries. 
+  What files will be served depends on the configuration overlay used.
+
+Flink's Mesos startup scripts `bin/mesos-appmaster.sh` and `bin/mesos-appmaster-job.sh` provide a way 
+to configure and start the application master. The worker nodes inherit all further configuration. 
+They are deployed through `bin/mesos-taskmanager.sh`. The configuration inheritance is achieved using 
+configuration overlays. Configuration overlays provide a way to infer a configuration from environment 
+variables and config files which are shipped to the worker nodes.
+
+See [Mesos Architecture](http://mesos.apache.org/documentation/latest/architecture/) for a more details 
+on how frameworks are handled by Mesos.
+
+### Deploying User Libraries
+
+User libraries can be passed to the Mesos workers by placing them in Flink's `lib/` folder. This way, 
+they will be picked by Mesos' Fetcher and copied over into the worker's sandbox folders. Alternatively, 
+Docker containerization can be used as described in [Installing Flink on the Workers](#installing-flink-on-the-workers).
+
+### Installing Flink on the Workers
+
+Flink on Mesos offers two ways to distribute the Flink and user binaries within the Mesos cluster:
+1. **Using Mesos' Artifact Server**: The Artifact Server provides the resources which are moved by 
+   [Mesos' Fetcher](http://mesos.apache.org/documentation/latest/fetcher/) into the Mesos worker's 
+   [sandbox folders](http://mesos.apache.org/documentation/latest/sandbox/). It can be explicitly 
+   specified by setting [mesos.resourcemanager.tasks.container.type]({% link deployment/config.md %}#mesos-resourcemanager-tasks-container-type) 
+   to `mesos`. This is the default option and is used in the example commands of this page.
+2. **Using Docker containerization**: This enables the user to provide user libraries and other 
+   customizations as part of a Docker image. Docker utilization can be enabled by setting 
+   [mesos.resourcemanager.tasks.container.type]({% link deployment/config.md %}#mesos-resourcemanager-tasks-container-type) 
+   to `docker` and by providing the image name through [mesos.resourcemanager.tasks.container.image.name]({% link deployment/config.md %}#mesos-resourcemanager-tasks-container-image-name).
+
+### High Availability on Mesos
+
+You will need to run a service like Marathon or Apache Aurora which takes care of restarting the 
+JobManager process in case of node or process failures. In addition, Zookeeper needs to be configured 
+as described in the [High Availability section of the Flink docs]({% link deployment/ha/index.md %}).
 
 #### Marathon
 
-Marathon needs to be set up to launch the `bin/mesos-appmaster.sh` script.
-In particular, it should also adjust any configuration parameters for the Flink cluster.
+Marathon needs to be set up to launch the `bin/mesos-appmaster.sh` script. In particular, it should 
+also adjust any configuration parameters for the Flink cluster.
 
 Here is an example configuration for Marathon:
+{% highlight javascript %}
+{
+    "id": "flink",
+    "cmd": "/opt/flink-1.11.2/bin/mesos-appmaster.sh -Dmesos.resourcemanager.framework.user=root -Dmesos.resourcemanager.tasks.taskmanager-cmd=/opt/flink-1.11.2/bin/mesos-taskmanager.sh -Dmesos.master=master:5050 -Djobmanager.memory.process.size=1472m -Dtaskmanager.memory.process.size=3500m -Dtaskmanager.numberOfTaskSlots=2 -Dparallelism.default=2",

Review comment:
       Do we have to specify all the memory configurations or can we use the default values?

##########
File path: docs/deployment/resource-providers/mesos.md
##########
@@ -26,226 +26,217 @@ under the License.
 * This will be replaced by the TOC
 {:toc}
 
-## Background
+## Getting Started
 
-The Mesos implementation consists of two components: The Application Master and
-the Worker. The workers are simple TaskManagers which are parameterized by the environment
-set up by the application master. The most sophisticated component of the Mesos
-implementation is the application master. The application master currently hosts
-the following components:
+This *Getting Started* section guides you through setting up a fully functional Flink Cluster on Mesos.
 
-### Mesos Scheduler
+### Introduction
 
-The scheduler is responsible for registering the framework with Mesos,
-requesting resources, and launching worker nodes. The scheduler continuously
-needs to report back to Mesos to ensure the framework is in a healthy state. To
-verify the health of the cluster, the scheduler monitors the spawned workers and
-marks them as failed and restarts them if necessary.
+[Apache Mesos](http://mesos.apache.org/) is another resource provider supported by 
+Apache Flink. Flink utilizes the worker's provided by Mesos to run its TaskManagers.
+Apache Flink provides the script `bin/mesos-appmaster.sh` to create the Flink 
+on Mesos cluster.
 
-Flink's Mesos scheduler itself is currently not highly available. However, it
-persists all necessary information about its state (e.g. configuration, list of
-workers) in Zookeeper. In the presence of a failure, it relies on an external
-system to bring up a new scheduler. The scheduler will then register with Mesos
-again and go through the reconciliation phase. In the reconciliation phase, the
-scheduler receives a list of running workers nodes. It matches these against the
-recovered information from Zookeeper and makes sure to bring back the cluster in
-the state before the failure.
 
-### Artifact Server
+### Preparation
 
-The artifact server is responsible for providing resources to the worker
-nodes. The resources can be anything from the Flink binaries to shared secrets
-or configuration files. For instance, in non-containerized environments, the
-artifact server will provide the Flink binaries. What files will be served
-depends on the configuration overlay used.
+Flink on Mesos expects a Mesos cluster to be around. It also requires the Flink binaries being deployed
+ontothe the Mesos master. Additionally, Hadoop needs to be installed on the very same machine.
 
-### Flink's JobManager and Web Interface
+Flink provides `bin/mesos-appmaster.sh` to create a Flink on Mesos cluster. It will instantiate a 
+JobManager process on the Mesos master. The Mesos workers will be utilized to run the TaskManager 
+processes.
 
-The JobManager and the web interface provide a central point for monitoring,
-job submission, and other client interaction with the cluster
-(see [FLIP-6](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65147077)).
-
-### Startup script and configuration overlays
-
-The startup script provide a way to configure and start the application
-master. All further configuration is then inherited by the workers nodes. This
-is achieved using configuration overlays. Configuration overlays provide a way
-to infer configuration from environment variables and config files which are
-shipped to the worker nodes.
-
-
-## DC/OS
-
-This section refers to [DC/OS](https://dcos.io) which is a Mesos distribution
-with a sophisticated application management layer. It comes pre-installed with
-Marathon, a service to supervise applications and maintain their state in case
-of failures.
-
-If you don't have a running DC/OS cluster, please follow the
-[instructions on how to install DC/OS on the official website](https://dcos.io/install/).
-
-Once you have a DC/OS cluster, you may install Flink through the DC/OS
-Universe. In the search prompt, just search for Flink. Alternatively, you can use the DC/OS CLI:
-
-    dcos package install flink
-
-Further information can be found in the
-[DC/OS examples documentation](https://github.com/dcos/examples/tree/master/1.8/flink).
-
-
-## Mesos without DC/OS
-
-You can also run Mesos without DC/OS.
-
-### Installing Mesos
-
-Please follow the [instructions on how to setup Mesos on the official website](http://mesos.apache.org/getting-started/).
-
-After installation you have to configure the set of master and agent nodes by creating the files `MESOS_HOME/etc/mesos/masters` and `MESOS_HOME/etc/mesos/slaves`.
-These files contain in each row a single hostname on which the respective component will be started (assuming SSH access to these nodes).
-
-Next you have to create `MESOS_HOME/etc/mesos/mesos-master-env.sh` or use the template found in the same directory.
-In this file, you have to define
-
-    export MESOS_work_dir=WORK_DIRECTORY
-
-and it is recommended to uncommment
-
-    export MESOS_log_dir=LOGGING_DIRECTORY
-
-
-In order to configure the Mesos agents, you have to create `MESOS_HOME/etc/mesos/mesos-agent-env.sh` or use the template found in the same directory.
-You have to configure
-
-    export MESOS_master=MASTER_HOSTNAME:MASTER_PORT
-
-and uncomment
-
-    export MESOS_log_dir=LOGGING_DIRECTORY
-    export MESOS_work_dir=WORK_DIRECTORY
-
-#### Mesos Library
-
-In order to run Java applications with Mesos you have to export `MESOS_NATIVE_JAVA_LIBRARY=MESOS_HOME/lib/libmesos.so` on Linux.
-Under Mac OS X you have to export `MESOS_NATIVE_JAVA_LIBRARY=MESOS_HOME/lib/libmesos.dylib`.
-
-#### Deploying Mesos
-
-In order to start your mesos cluster, use the deployment script `MESOS_HOME/sbin/mesos-start-cluster.sh`.
-In order to stop your mesos cluster, use the deployment script `MESOS_HOME/sbin/mesos-stop-cluster.sh`.
-More information about the deployment scripts can be found [here](http://mesos.apache.org/documentation/latest/deploy-scripts/).
-
-### Installing Marathon
-
-Optionally, you may also [install Marathon](https://mesosphere.github.io/marathon/docs/) which enables you to run Flink in [high availability (HA) mode](#high-availability).
-
-### Pre-installing Flink vs Docker/Mesos containers
-
-You may install Flink on all of your Mesos Master and Agent nodes.
-You can also pull the binaries from the Flink web site during deployment and apply your custom configuration before launching the application master.
-A more convenient and easier to maintain approach is to use Docker containers to manage the Flink binaries and configuration.
-
-This is controlled via the following configuration entries:
-
-    mesos.resourcemanager.tasks.container.type: mesos _or_ docker
-
-If set to 'docker', specify the image name:
-
-    mesos.resourcemanager.tasks.container.image.name: image_name
+For `bin/mesos-appmaster.sh` to work, you have to set the two variables `HADOOP_CLASSPATH` and 
+`MESOS_NATIVE_JAVA_LIBRARY`:
 
+{% highlight bash %}
+export HADOOP_CLASSPATH=$(hadoop classpath)
+export MESOS_NATIVE_JAVA_LIBRARY=/path/to/lib/libmesos.so
+{% endhighlight %}
 
-### Flink session cluster on Mesos
+`MESOS_NATIVE_JAVA_LIBRARY` needs to point to Mesos' native Java library. The library name `libmesos.so` 
+used above refers to Mesos' Linux library. Running Mesos on MacOS would require you to use 
+`libmesos.dylib` instead.
 
-A Flink session cluster is executed as a long-running Mesos Deployment. Note that you can run multiple Flink jobs on a session cluster. Each job needs to be submitted to the cluster after the cluster has been deployed.
+### Starting a Flink Session on Mesos
 
-In the `/bin` directory of the Flink distribution, you find two startup scripts
-which manage the Flink processes in a Mesos cluster:
+Connect to the Mesos workers, change into Flink's home directory and call `bin/mesos-appmaster.sh`:
 
-1. `mesos-appmaster.sh`
-   This starts the Mesos application master which will register the Mesos scheduler.
-   It is also responsible for starting up the worker nodes.
+{% highlight bash %}
+# (0) set required environment variables
+export HADOOP_CLASSPATH=$(hadoop classpath)
+export MESOS_NATIVE_JAVA_LIBRARY=/path/to/lib/libmesos.so
 
-2. `mesos-taskmanager.sh`
-   The entry point for the Mesos worker processes.
-   You don't need to explicitly execute this script.
-   It is automatically launched by the Mesos worker node to bring up a new TaskManager.
+# (1) create Flink on Mesos cluster
+./bin/mesos-appmaster.sh \
+    -Dmesos.master=$MESOS_MASTER:5050 \
+    -Djobmanager.rpc.address=$MESOS_MASTER \
+    -Dmesos.resourcemanager.framework.user=$FLINK_USER \
+    -Dmesos.resourcemanager.tasks.cpus=6
+{% endhighlight %}
 
-In order to run the `mesos-appmaster.sh` script you have to define `mesos.master` in the `flink-conf.yaml` or pass it via `-Dmesos.master=...` to the Java process.
+The call above uses two variables not introduced, yet, as they depend on the cluster:
+* `MESOS_MASTER` refers to the Mesos master's IP address or hostname. It's important to not use `localhost` 
+  or `127.0.0.1` as the corresponding parameters are being shared with the Mesos cluster and the TaskManagers.
+* `FLINK_USER` refers to the user that owns the Mesos master's Flink installation directory (see Mesos' 
+documentation on [specifying a user](http://mesos.apache.org/documentation/latest/fetcher/#specifying-a-user-name)
+for further details).
 
-When executing `mesos-appmaster.sh`, it will create a job manager on the machine where you executed the script.
-In contrast to that, the task managers will be run as Mesos tasks in the Mesos cluster.
+The Flink on Mesos cluster is now deployed in [Session Mode]({% link deployment/index.md %}#session-mode).
+Note that you can run multiple Flink jobs on a Session cluster. Each job needs to be submitted to the 
+cluster. TaskManagers are deployed on the Mesos workers as needed. Keep in mind that you can only run as 
+many jobs as the Mesos cluster allows in terms of resources provided by the Mesos workers. Play around 
+with Flink's parameters to find the right resource utilization for your needs.
 
-### Flink job cluster on Mesos
+Check out [Flink's Mesos configuration]({% link deployment/config.md %}#mesos) to further influence 
+the resources Flink on Mesos is going to allocate.
 
-A Flink job cluster is a dedicated cluster which runs a single job.
-There is no extra job submission needed.
+## Deployment Modes Supported by Flink on Mesos
 
-In the `/bin` directory of the Flink distribution, you find one startup script
-which manage the Flink processes in a Mesos cluster:
+For production use, we recommend deploying Flink Applications in the 
+[Per-Job Mode]({% link deployment/index.md %}#per-job-mode), as it provides a better isolation 
+for each job.
 
-1. `mesos-appmaster-job.sh`
-   This starts the Mesos application master which will register the Mesos scheduler, retrieve the job graph and then launch the task managers accordingly.
+### Application Mode
 
-In order to run the `mesos-appmaster-job.sh` script you have to define `mesos.master` and `internal.jobgraph-path` in the `flink-conf.yaml`
-or pass it via `-Dmesos.master=... -Dinterval.jobgraph-path=...` to the Java process.
+Flink on Mesos does not support Application Mode.
 
-The job graph file may be generated like this way:
+### Per-Job Cluster Mode
 
+A job which is executed in Per-Job Cluster Mode spins up a dedicated Flink cluster that is only 
+used for that specific job. No extra job submission is needed. `bin/mesos-appmaster-job.sh` is used 
+as the startup script. It will start a Flink cluster for a dedicated job which is passed as a 
+JobGraph file. This file can be created by applying the following code to your Job source code:
 {% highlight java %}
 final JobGraph jobGraph = env.getStreamGraph().getJobGraph();
 final String jobGraphFilename = "job.graph";
 File jobGraphFile = new File(jobGraphFilename);
 try (FileOutputStream output = new FileOutputStream(jobGraphFile);
-	ObjectOutputStream obOutput = new ObjectOutputStream(output)){
-	obOutput.writeObject(jobGraph);
+    ObjectOutputStream obOutput = new ObjectOutputStream(output)){
+    obOutput.writeObject(jobGraph);
 }
 {% endhighlight %}
 
-<span class="label label-info">Note</span> Make sure that all Mesos processes have the user code jar on the classpath. There are two ways:
-
-1. One way is putting them in the `lib/` directory, which will result in the user code jar being loaded by the system classloader.
-1. The other way is creating a `usrlib/` directory in the parent directory of `lib/` and putting the user code jar in the `usrlib/` directory.
-After launching a job cluster via `bin/mesos-appmaster-job.sh ...`, the user code jar will be loaded by the user code classloader.
-
-#### General configuration
-
-It is possible to completely parameterize a Mesos application through Java properties passed to the Mesos application master.
-This also allows to specify general Flink configuration parameters.
-For example:
-
-    bin/mesos-appmaster.sh \
-        -Dmesos.master=master.foobar.org:5050 \
-        -Djobmanager.memory.process.size=1472m \
-        -Djobmanager.rpc.port=6123 \
-        -Drest.port=8081 \
-        -Dtaskmanager.memory.process.size=3500m \
-        -Dtaskmanager.numberOfTaskSlots=2 \
-        -Dparallelism.default=10
-
-### High Availability
-
-You will need to run a service like Marathon or Apache Aurora which takes care of restarting the JobManager process in case of node or process failures.
-In addition, Zookeeper needs to be configured like described in the [High Availability section of the Flink docs]({% link deployment/ha/index.md %}).
+Flink on Mesos Per-Job cluster can be started in the following way:
+{% highlight bash %}
+# (0) set required environment variables
+export HADOOP_CLASSPATH=$(hadoop classpath)
+export MESOS_NATIVE_JAVA_LIBRARY=/path/to/lib/libmesos.so
+
+# (1) create Per-Job Flink on Mesos cluster
+./bin/mesos-appmaster-job.sh \
+    -Dmesos.master=$MESOS_MASTER:5050 \
+    -Djobmanager.rpc.address=$MESOS_MASTER \
+    -Dmesos.resourcemanager.framework.user=$FLINK_USER \
+    -Dinternal.jobgraph-path=$JOB_GRAPH_FILE
+{% endhighlight %} 
+
+`JOB_GRAPH_FILE` in the command above refers to the path of the uploaded JobGraph file defining the 
+job that shall be executed on the Per-Job Flink cluster. The meaning of `MESOS_MASTER` and `FLINK_USER` 
+are described in the [Getting Started](#starting-a-flink-session-on-mesos) guide of this page.
+
+### Session Mode
+
+The [Getting Started](#starting-a-flink-session-on-mesos) guide at the top of this page describes 
+deploying Flink in Session Mode.
+
+## Flink on Mesos Reference
+
+### Flink on Mesos Architecture
+
+The Flink on Mesos implementation consists of two components: The application master and the workers. 
+The workers are simple TaskManagers parameterized by the environment which is set up through the 
+application master. The most sophisticated component of the Flink on Mesos implementation is the 
+application master. The application master currently hosts the following components:
+- **Mesos Scheduler**: The Scheduler is responsible for registering a framework with Mesos, requesting 
+  resources, and launching worker nodes. The Scheduler continuously needs to report back to Mesos to 
+  ensure the framework is in a healthy state. To verify the health of the cluster, the Scheduler 
+  monitors the spawned workers, marks them as failed and restarts them if necessary.
+
+  Flink's Mesos Scheduler itself is currently not highly available. However, it persists all necessary 
+  information about its state (e.g. configuration, list of workers) in [ZooKeeper](#high-availability-on-mesos). 
+  In the presence of a failure, it relies on an external system to bring up a new Scheduler (see the 
+  [Marathon subsection](#marathon) for further details). The Scheduler will then register with Mesos 
+  again and go through the reconciliation phase. In the reconciliation phase, the Scheduler receives 
+  a list of running workers nodes. It matches these against the recovered information from ZooKeeper 
+  and makes sure to bring back the cluster in the state before the failure.
+- **Artifact Server**: The Artifact Server is responsible for providing resources to the worker nodes. 
+  The resources can be anything from the Flink binaries to shared secrets or configuration files. 
+  For instance, in non-containerized environments, the Artifact Server will provide the Flink binaries. 
+  What files will be served depends on the configuration overlay used.
+
+Flink's Mesos startup scripts `bin/mesos-appmaster.sh` and `bin/mesos-appmaster-job.sh` provide a way 
+to configure and start the application master. The worker nodes inherit all further configuration. 
+They are deployed through `bin/mesos-taskmanager.sh`. The configuration inheritance is achieved using 
+configuration overlays. Configuration overlays provide a way to infer a configuration from environment 
+variables and config files which are shipped to the worker nodes.
+
+See [Mesos Architecture](http://mesos.apache.org/documentation/latest/architecture/) for a more details 
+on how frameworks are handled by Mesos.
+
+### Deploying User Libraries
+
+User libraries can be passed to the Mesos workers by placing them in Flink's `lib/` folder. This way, 
+they will be picked by Mesos' Fetcher and copied over into the worker's sandbox folders. Alternatively, 
+Docker containerization can be used as described in [Installing Flink on the Workers](#installing-flink-on-the-workers).
+
+### Installing Flink on the Workers
+
+Flink on Mesos offers two ways to distribute the Flink and user binaries within the Mesos cluster:
+1. **Using Mesos' Artifact Server**: The Artifact Server provides the resources which are moved by 
+   [Mesos' Fetcher](http://mesos.apache.org/documentation/latest/fetcher/) into the Mesos worker's 
+   [sandbox folders](http://mesos.apache.org/documentation/latest/sandbox/). It can be explicitly 
+   specified by setting [mesos.resourcemanager.tasks.container.type]({% link deployment/config.md %}#mesos-resourcemanager-tasks-container-type) 
+   to `mesos`. This is the default option and is used in the example commands of this page.
+2. **Using Docker containerization**: This enables the user to provide user libraries and other 
+   customizations as part of a Docker image. Docker utilization can be enabled by setting 
+   [mesos.resourcemanager.tasks.container.type]({% link deployment/config.md %}#mesos-resourcemanager-tasks-container-type) 
+   to `docker` and by providing the image name through [mesos.resourcemanager.tasks.container.image.name]({% link deployment/config.md %}#mesos-resourcemanager-tasks-container-image-name).
+
+### High Availability on Mesos
+
+You will need to run a service like Marathon or Apache Aurora which takes care of restarting the 
+JobManager process in case of node or process failures. In addition, Zookeeper needs to be configured 
+as described in the [High Availability section of the Flink docs]({% link deployment/ha/index.md %}).
 
 #### Marathon
 
-Marathon needs to be set up to launch the `bin/mesos-appmaster.sh` script.
-In particular, it should also adjust any configuration parameters for the Flink cluster.
+Marathon needs to be set up to launch the `bin/mesos-appmaster.sh` script. In particular, it should 
+also adjust any configuration parameters for the Flink cluster.
 
 Here is an example configuration for Marathon:
+{% highlight javascript %}
+{
+    "id": "flink",
+    "cmd": "/opt/flink-1.11.2/bin/mesos-appmaster.sh -Dmesos.resourcemanager.framework.user=root -Dmesos.resourcemanager.tasks.taskmanager-cmd=/opt/flink-1.11.2/bin/mesos-taskmanager.sh -Dmesos.master=master:5050 -Djobmanager.memory.process.size=1472m -Dtaskmanager.memory.process.size=3500m -Dtaskmanager.numberOfTaskSlots=2 -Dparallelism.default=2",

Review comment:
       Where comes `master:5050` from?

##########
File path: docs/deployment/resource-providers/mesos.md
##########
@@ -26,226 +26,217 @@ under the License.
 * This will be replaced by the TOC
 {:toc}
 
-## Background
+## Getting Started
 
-The Mesos implementation consists of two components: The Application Master and
-the Worker. The workers are simple TaskManagers which are parameterized by the environment
-set up by the application master. The most sophisticated component of the Mesos
-implementation is the application master. The application master currently hosts
-the following components:
+This *Getting Started* section guides you through setting up a fully functional Flink Cluster on Mesos.
 
-### Mesos Scheduler
+### Introduction
 
-The scheduler is responsible for registering the framework with Mesos,
-requesting resources, and launching worker nodes. The scheduler continuously
-needs to report back to Mesos to ensure the framework is in a healthy state. To
-verify the health of the cluster, the scheduler monitors the spawned workers and
-marks them as failed and restarts them if necessary.
+[Apache Mesos](http://mesos.apache.org/) is another resource provider supported by 
+Apache Flink. Flink utilizes the worker's provided by Mesos to run its TaskManagers.
+Apache Flink provides the script `bin/mesos-appmaster.sh` to create the Flink 
+on Mesos cluster.
 
-Flink's Mesos scheduler itself is currently not highly available. However, it
-persists all necessary information about its state (e.g. configuration, list of
-workers) in Zookeeper. In the presence of a failure, it relies on an external
-system to bring up a new scheduler. The scheduler will then register with Mesos
-again and go through the reconciliation phase. In the reconciliation phase, the
-scheduler receives a list of running workers nodes. It matches these against the
-recovered information from Zookeeper and makes sure to bring back the cluster in
-the state before the failure.
 
-### Artifact Server
+### Preparation
 
-The artifact server is responsible for providing resources to the worker
-nodes. The resources can be anything from the Flink binaries to shared secrets
-or configuration files. For instance, in non-containerized environments, the
-artifact server will provide the Flink binaries. What files will be served
-depends on the configuration overlay used.
+Flink on Mesos expects a Mesos cluster to be around. It also requires the Flink binaries being deployed
+ontothe the Mesos master. Additionally, Hadoop needs to be installed on the very same machine.

Review comment:
       Do I need to install Hadoop or is it enough to provide the Hadoop dependencies (e.g. via Flink's pre bundled Hadoop dependency (https://flink.apache.org/downloads.html))?

##########
File path: docs/deployment/resource-providers/mesos.md
##########
@@ -26,226 +26,217 @@ under the License.
 * This will be replaced by the TOC
 {:toc}
 
-## Background
+## Getting Started
 
-The Mesos implementation consists of two components: The Application Master and
-the Worker. The workers are simple TaskManagers which are parameterized by the environment
-set up by the application master. The most sophisticated component of the Mesos
-implementation is the application master. The application master currently hosts
-the following components:
+This *Getting Started* section guides you through setting up a fully functional Flink Cluster on Mesos.
 
-### Mesos Scheduler
+### Introduction
 
-The scheduler is responsible for registering the framework with Mesos,
-requesting resources, and launching worker nodes. The scheduler continuously
-needs to report back to Mesos to ensure the framework is in a healthy state. To
-verify the health of the cluster, the scheduler monitors the spawned workers and
-marks them as failed and restarts them if necessary.
+[Apache Mesos](http://mesos.apache.org/) is another resource provider supported by 
+Apache Flink. Flink utilizes the worker's provided by Mesos to run its TaskManagers.
+Apache Flink provides the script `bin/mesos-appmaster.sh` to create the Flink 
+on Mesos cluster.
 
-Flink's Mesos scheduler itself is currently not highly available. However, it
-persists all necessary information about its state (e.g. configuration, list of
-workers) in Zookeeper. In the presence of a failure, it relies on an external
-system to bring up a new scheduler. The scheduler will then register with Mesos
-again and go through the reconciliation phase. In the reconciliation phase, the
-scheduler receives a list of running workers nodes. It matches these against the
-recovered information from Zookeeper and makes sure to bring back the cluster in
-the state before the failure.
 
-### Artifact Server
+### Preparation
 
-The artifact server is responsible for providing resources to the worker
-nodes. The resources can be anything from the Flink binaries to shared secrets
-or configuration files. For instance, in non-containerized environments, the
-artifact server will provide the Flink binaries. What files will be served
-depends on the configuration overlay used.
+Flink on Mesos expects a Mesos cluster to be around. It also requires the Flink binaries being deployed
+ontothe the Mesos master. Additionally, Hadoop needs to be installed on the very same machine.
 
-### Flink's JobManager and Web Interface
+Flink provides `bin/mesos-appmaster.sh` to create a Flink on Mesos cluster. It will instantiate a 
+JobManager process on the Mesos master. The Mesos workers will be utilized to run the TaskManager 
+processes.
 
-The JobManager and the web interface provide a central point for monitoring,
-job submission, and other client interaction with the cluster
-(see [FLIP-6](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65147077)).
-
-### Startup script and configuration overlays
-
-The startup script provide a way to configure and start the application
-master. All further configuration is then inherited by the workers nodes. This
-is achieved using configuration overlays. Configuration overlays provide a way
-to infer configuration from environment variables and config files which are
-shipped to the worker nodes.
-
-
-## DC/OS
-
-This section refers to [DC/OS](https://dcos.io) which is a Mesos distribution
-with a sophisticated application management layer. It comes pre-installed with
-Marathon, a service to supervise applications and maintain their state in case
-of failures.
-
-If you don't have a running DC/OS cluster, please follow the
-[instructions on how to install DC/OS on the official website](https://dcos.io/install/).
-
-Once you have a DC/OS cluster, you may install Flink through the DC/OS
-Universe. In the search prompt, just search for Flink. Alternatively, you can use the DC/OS CLI:
-
-    dcos package install flink
-
-Further information can be found in the
-[DC/OS examples documentation](https://github.com/dcos/examples/tree/master/1.8/flink).
-
-
-## Mesos without DC/OS
-
-You can also run Mesos without DC/OS.
-
-### Installing Mesos
-
-Please follow the [instructions on how to setup Mesos on the official website](http://mesos.apache.org/getting-started/).
-
-After installation you have to configure the set of master and agent nodes by creating the files `MESOS_HOME/etc/mesos/masters` and `MESOS_HOME/etc/mesos/slaves`.
-These files contain in each row a single hostname on which the respective component will be started (assuming SSH access to these nodes).
-
-Next you have to create `MESOS_HOME/etc/mesos/mesos-master-env.sh` or use the template found in the same directory.
-In this file, you have to define
-
-    export MESOS_work_dir=WORK_DIRECTORY
-
-and it is recommended to uncommment
-
-    export MESOS_log_dir=LOGGING_DIRECTORY
-
-
-In order to configure the Mesos agents, you have to create `MESOS_HOME/etc/mesos/mesos-agent-env.sh` or use the template found in the same directory.
-You have to configure
-
-    export MESOS_master=MASTER_HOSTNAME:MASTER_PORT
-
-and uncomment
-
-    export MESOS_log_dir=LOGGING_DIRECTORY
-    export MESOS_work_dir=WORK_DIRECTORY
-
-#### Mesos Library
-
-In order to run Java applications with Mesos you have to export `MESOS_NATIVE_JAVA_LIBRARY=MESOS_HOME/lib/libmesos.so` on Linux.
-Under Mac OS X you have to export `MESOS_NATIVE_JAVA_LIBRARY=MESOS_HOME/lib/libmesos.dylib`.
-
-#### Deploying Mesos
-
-In order to start your mesos cluster, use the deployment script `MESOS_HOME/sbin/mesos-start-cluster.sh`.
-In order to stop your mesos cluster, use the deployment script `MESOS_HOME/sbin/mesos-stop-cluster.sh`.
-More information about the deployment scripts can be found [here](http://mesos.apache.org/documentation/latest/deploy-scripts/).
-
-### Installing Marathon
-
-Optionally, you may also [install Marathon](https://mesosphere.github.io/marathon/docs/) which enables you to run Flink in [high availability (HA) mode](#high-availability).
-
-### Pre-installing Flink vs Docker/Mesos containers
-
-You may install Flink on all of your Mesos Master and Agent nodes.
-You can also pull the binaries from the Flink web site during deployment and apply your custom configuration before launching the application master.
-A more convenient and easier to maintain approach is to use Docker containers to manage the Flink binaries and configuration.
-
-This is controlled via the following configuration entries:
-
-    mesos.resourcemanager.tasks.container.type: mesos _or_ docker
-
-If set to 'docker', specify the image name:
-
-    mesos.resourcemanager.tasks.container.image.name: image_name
+For `bin/mesos-appmaster.sh` to work, you have to set the two variables `HADOOP_CLASSPATH` and 
+`MESOS_NATIVE_JAVA_LIBRARY`:
 
+{% highlight bash %}
+export HADOOP_CLASSPATH=$(hadoop classpath)
+export MESOS_NATIVE_JAVA_LIBRARY=/path/to/lib/libmesos.so
+{% endhighlight %}
 
-### Flink session cluster on Mesos
+`MESOS_NATIVE_JAVA_LIBRARY` needs to point to Mesos' native Java library. The library name `libmesos.so` 
+used above refers to Mesos' Linux library. Running Mesos on MacOS would require you to use 
+`libmesos.dylib` instead.
 
-A Flink session cluster is executed as a long-running Mesos Deployment. Note that you can run multiple Flink jobs on a session cluster. Each job needs to be submitted to the cluster after the cluster has been deployed.
+### Starting a Flink Session on Mesos
 
-In the `/bin` directory of the Flink distribution, you find two startup scripts
-which manage the Flink processes in a Mesos cluster:
+Connect to the Mesos workers, change into Flink's home directory and call `bin/mesos-appmaster.sh`:
 
-1. `mesos-appmaster.sh`
-   This starts the Mesos application master which will register the Mesos scheduler.
-   It is also responsible for starting up the worker nodes.
+{% highlight bash %}
+# (0) set required environment variables
+export HADOOP_CLASSPATH=$(hadoop classpath)
+export MESOS_NATIVE_JAVA_LIBRARY=/path/to/lib/libmesos.so
 
-2. `mesos-taskmanager.sh`
-   The entry point for the Mesos worker processes.
-   You don't need to explicitly execute this script.
-   It is automatically launched by the Mesos worker node to bring up a new TaskManager.
+# (1) create Flink on Mesos cluster
+./bin/mesos-appmaster.sh \

Review comment:
       Maybe state that this will start `MesosSessionClusterEntrypoint` as a local process.

##########
File path: docs/deployment/resource-providers/mesos.md
##########
@@ -26,226 +26,217 @@ under the License.
 * This will be replaced by the TOC
 {:toc}
 
-## Background
+## Getting Started
 
-The Mesos implementation consists of two components: The Application Master and
-the Worker. The workers are simple TaskManagers which are parameterized by the environment
-set up by the application master. The most sophisticated component of the Mesos
-implementation is the application master. The application master currently hosts
-the following components:
+This *Getting Started* section guides you through setting up a fully functional Flink Cluster on Mesos.
 
-### Mesos Scheduler
+### Introduction
 
-The scheduler is responsible for registering the framework with Mesos,
-requesting resources, and launching worker nodes. The scheduler continuously
-needs to report back to Mesos to ensure the framework is in a healthy state. To
-verify the health of the cluster, the scheduler monitors the spawned workers and
-marks them as failed and restarts them if necessary.
+[Apache Mesos](http://mesos.apache.org/) is another resource provider supported by 
+Apache Flink. Flink utilizes the worker's provided by Mesos to run its TaskManagers.
+Apache Flink provides the script `bin/mesos-appmaster.sh` to create the Flink 
+on Mesos cluster.
 
-Flink's Mesos scheduler itself is currently not highly available. However, it
-persists all necessary information about its state (e.g. configuration, list of
-workers) in Zookeeper. In the presence of a failure, it relies on an external
-system to bring up a new scheduler. The scheduler will then register with Mesos
-again and go through the reconciliation phase. In the reconciliation phase, the
-scheduler receives a list of running workers nodes. It matches these against the
-recovered information from Zookeeper and makes sure to bring back the cluster in
-the state before the failure.
 
-### Artifact Server
+### Preparation
 
-The artifact server is responsible for providing resources to the worker
-nodes. The resources can be anything from the Flink binaries to shared secrets
-or configuration files. For instance, in non-containerized environments, the
-artifact server will provide the Flink binaries. What files will be served
-depends on the configuration overlay used.
+Flink on Mesos expects a Mesos cluster to be around. It also requires the Flink binaries being deployed
+ontothe the Mesos master. Additionally, Hadoop needs to be installed on the very same machine.
 
-### Flink's JobManager and Web Interface
+Flink provides `bin/mesos-appmaster.sh` to create a Flink on Mesos cluster. It will instantiate a 
+JobManager process on the Mesos master. The Mesos workers will be utilized to run the TaskManager 
+processes.
 
-The JobManager and the web interface provide a central point for monitoring,
-job submission, and other client interaction with the cluster
-(see [FLIP-6](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65147077)).
-
-### Startup script and configuration overlays
-
-The startup script provide a way to configure and start the application
-master. All further configuration is then inherited by the workers nodes. This
-is achieved using configuration overlays. Configuration overlays provide a way
-to infer configuration from environment variables and config files which are
-shipped to the worker nodes.
-
-
-## DC/OS
-
-This section refers to [DC/OS](https://dcos.io) which is a Mesos distribution
-with a sophisticated application management layer. It comes pre-installed with
-Marathon, a service to supervise applications and maintain their state in case
-of failures.
-
-If you don't have a running DC/OS cluster, please follow the
-[instructions on how to install DC/OS on the official website](https://dcos.io/install/).
-
-Once you have a DC/OS cluster, you may install Flink through the DC/OS
-Universe. In the search prompt, just search for Flink. Alternatively, you can use the DC/OS CLI:
-
-    dcos package install flink
-
-Further information can be found in the
-[DC/OS examples documentation](https://github.com/dcos/examples/tree/master/1.8/flink).
-
-
-## Mesos without DC/OS
-
-You can also run Mesos without DC/OS.
-
-### Installing Mesos
-
-Please follow the [instructions on how to setup Mesos on the official website](http://mesos.apache.org/getting-started/).
-
-After installation you have to configure the set of master and agent nodes by creating the files `MESOS_HOME/etc/mesos/masters` and `MESOS_HOME/etc/mesos/slaves`.
-These files contain in each row a single hostname on which the respective component will be started (assuming SSH access to these nodes).
-
-Next you have to create `MESOS_HOME/etc/mesos/mesos-master-env.sh` or use the template found in the same directory.
-In this file, you have to define
-
-    export MESOS_work_dir=WORK_DIRECTORY
-
-and it is recommended to uncommment
-
-    export MESOS_log_dir=LOGGING_DIRECTORY
-
-
-In order to configure the Mesos agents, you have to create `MESOS_HOME/etc/mesos/mesos-agent-env.sh` or use the template found in the same directory.
-You have to configure
-
-    export MESOS_master=MASTER_HOSTNAME:MASTER_PORT
-
-and uncomment
-
-    export MESOS_log_dir=LOGGING_DIRECTORY
-    export MESOS_work_dir=WORK_DIRECTORY
-
-#### Mesos Library
-
-In order to run Java applications with Mesos you have to export `MESOS_NATIVE_JAVA_LIBRARY=MESOS_HOME/lib/libmesos.so` on Linux.
-Under Mac OS X you have to export `MESOS_NATIVE_JAVA_LIBRARY=MESOS_HOME/lib/libmesos.dylib`.
-
-#### Deploying Mesos
-
-In order to start your mesos cluster, use the deployment script `MESOS_HOME/sbin/mesos-start-cluster.sh`.
-In order to stop your mesos cluster, use the deployment script `MESOS_HOME/sbin/mesos-stop-cluster.sh`.
-More information about the deployment scripts can be found [here](http://mesos.apache.org/documentation/latest/deploy-scripts/).
-
-### Installing Marathon
-
-Optionally, you may also [install Marathon](https://mesosphere.github.io/marathon/docs/) which enables you to run Flink in [high availability (HA) mode](#high-availability).
-
-### Pre-installing Flink vs Docker/Mesos containers
-
-You may install Flink on all of your Mesos Master and Agent nodes.
-You can also pull the binaries from the Flink web site during deployment and apply your custom configuration before launching the application master.
-A more convenient and easier to maintain approach is to use Docker containers to manage the Flink binaries and configuration.
-
-This is controlled via the following configuration entries:
-
-    mesos.resourcemanager.tasks.container.type: mesos _or_ docker
-
-If set to 'docker', specify the image name:
-
-    mesos.resourcemanager.tasks.container.image.name: image_name
+For `bin/mesos-appmaster.sh` to work, you have to set the two variables `HADOOP_CLASSPATH` and 
+`MESOS_NATIVE_JAVA_LIBRARY`:
 
+{% highlight bash %}
+export HADOOP_CLASSPATH=$(hadoop classpath)
+export MESOS_NATIVE_JAVA_LIBRARY=/path/to/lib/libmesos.so
+{% endhighlight %}
 
-### Flink session cluster on Mesos
+`MESOS_NATIVE_JAVA_LIBRARY` needs to point to Mesos' native Java library. The library name `libmesos.so` 
+used above refers to Mesos' Linux library. Running Mesos on MacOS would require you to use 
+`libmesos.dylib` instead.
 
-A Flink session cluster is executed as a long-running Mesos Deployment. Note that you can run multiple Flink jobs on a session cluster. Each job needs to be submitted to the cluster after the cluster has been deployed.
+### Starting a Flink Session on Mesos
 
-In the `/bin` directory of the Flink distribution, you find two startup scripts
-which manage the Flink processes in a Mesos cluster:
+Connect to the Mesos workers, change into Flink's home directory and call `bin/mesos-appmaster.sh`:
 
-1. `mesos-appmaster.sh`
-   This starts the Mesos application master which will register the Mesos scheduler.
-   It is also responsible for starting up the worker nodes.
+{% highlight bash %}
+# (0) set required environment variables
+export HADOOP_CLASSPATH=$(hadoop classpath)
+export MESOS_NATIVE_JAVA_LIBRARY=/path/to/lib/libmesos.so
 
-2. `mesos-taskmanager.sh`
-   The entry point for the Mesos worker processes.
-   You don't need to explicitly execute this script.
-   It is automatically launched by the Mesos worker node to bring up a new TaskManager.
+# (1) create Flink on Mesos cluster
+./bin/mesos-appmaster.sh \
+    -Dmesos.master=$MESOS_MASTER:5050 \
+    -Djobmanager.rpc.address=$MESOS_MASTER \
+    -Dmesos.resourcemanager.framework.user=$FLINK_USER \
+    -Dmesos.resourcemanager.tasks.cpus=6
+{% endhighlight %}
 
-In order to run the `mesos-appmaster.sh` script you have to define `mesos.master` in the `flink-conf.yaml` or pass it via `-Dmesos.master=...` to the Java process.
+The call above uses two variables not introduced, yet, as they depend on the cluster:
+* `MESOS_MASTER` refers to the Mesos master's IP address or hostname. It's important to not use `localhost` 
+  or `127.0.0.1` as the corresponding parameters are being shared with the Mesos cluster and the TaskManagers.
+* `FLINK_USER` refers to the user that owns the Mesos master's Flink installation directory (see Mesos' 
+documentation on [specifying a user](http://mesos.apache.org/documentation/latest/fetcher/#specifying-a-user-name)
+for further details).
 
-When executing `mesos-appmaster.sh`, it will create a job manager on the machine where you executed the script.
-In contrast to that, the task managers will be run as Mesos tasks in the Mesos cluster.
+The Flink on Mesos cluster is now deployed in [Session Mode]({% link deployment/index.md %}#session-mode).
+Note that you can run multiple Flink jobs on a Session cluster. Each job needs to be submitted to the 
+cluster. TaskManagers are deployed on the Mesos workers as needed. Keep in mind that you can only run as 
+many jobs as the Mesos cluster allows in terms of resources provided by the Mesos workers. Play around 
+with Flink's parameters to find the right resource utilization for your needs.
 
-### Flink job cluster on Mesos
+Check out [Flink's Mesos configuration]({% link deployment/config.md %}#mesos) to further influence 
+the resources Flink on Mesos is going to allocate.
 
-A Flink job cluster is a dedicated cluster which runs a single job.
-There is no extra job submission needed.
+## Deployment Modes Supported by Flink on Mesos
 
-In the `/bin` directory of the Flink distribution, you find one startup script
-which manage the Flink processes in a Mesos cluster:
+For production use, we recommend deploying Flink Applications in the 
+[Per-Job Mode]({% link deployment/index.md %}#per-job-mode), as it provides a better isolation 
+for each job.
 
-1. `mesos-appmaster-job.sh`
-   This starts the Mesos application master which will register the Mesos scheduler, retrieve the job graph and then launch the task managers accordingly.
+### Application Mode
 
-In order to run the `mesos-appmaster-job.sh` script you have to define `mesos.master` and `internal.jobgraph-path` in the `flink-conf.yaml`
-or pass it via `-Dmesos.master=... -Dinterval.jobgraph-path=...` to the Java process.
+Flink on Mesos does not support Application Mode.
 
-The job graph file may be generated like this way:
+### Per-Job Cluster Mode
 
+A job which is executed in Per-Job Cluster Mode spins up a dedicated Flink cluster that is only 
+used for that specific job. No extra job submission is needed. `bin/mesos-appmaster-job.sh` is used 
+as the startup script. It will start a Flink cluster for a dedicated job which is passed as a 
+JobGraph file. This file can be created by applying the following code to your Job source code:
 {% highlight java %}
 final JobGraph jobGraph = env.getStreamGraph().getJobGraph();
 final String jobGraphFilename = "job.graph";
 File jobGraphFile = new File(jobGraphFilename);
 try (FileOutputStream output = new FileOutputStream(jobGraphFile);
-	ObjectOutputStream obOutput = new ObjectOutputStream(output)){
-	obOutput.writeObject(jobGraph);
+    ObjectOutputStream obOutput = new ObjectOutputStream(output)){
+    obOutput.writeObject(jobGraph);
 }
 {% endhighlight %}
 
-<span class="label label-info">Note</span> Make sure that all Mesos processes have the user code jar on the classpath. There are two ways:
-
-1. One way is putting them in the `lib/` directory, which will result in the user code jar being loaded by the system classloader.
-1. The other way is creating a `usrlib/` directory in the parent directory of `lib/` and putting the user code jar in the `usrlib/` directory.
-After launching a job cluster via `bin/mesos-appmaster-job.sh ...`, the user code jar will be loaded by the user code classloader.
-
-#### General configuration
-
-It is possible to completely parameterize a Mesos application through Java properties passed to the Mesos application master.
-This also allows to specify general Flink configuration parameters.
-For example:
-
-    bin/mesos-appmaster.sh \
-        -Dmesos.master=master.foobar.org:5050 \
-        -Djobmanager.memory.process.size=1472m \
-        -Djobmanager.rpc.port=6123 \
-        -Drest.port=8081 \
-        -Dtaskmanager.memory.process.size=3500m \
-        -Dtaskmanager.numberOfTaskSlots=2 \
-        -Dparallelism.default=10
-
-### High Availability
-
-You will need to run a service like Marathon or Apache Aurora which takes care of restarting the JobManager process in case of node or process failures.
-In addition, Zookeeper needs to be configured like described in the [High Availability section of the Flink docs]({% link deployment/ha/index.md %}).
+Flink on Mesos Per-Job cluster can be started in the following way:
+{% highlight bash %}
+# (0) set required environment variables
+export HADOOP_CLASSPATH=$(hadoop classpath)
+export MESOS_NATIVE_JAVA_LIBRARY=/path/to/lib/libmesos.so
+
+# (1) create Per-Job Flink on Mesos cluster
+./bin/mesos-appmaster-job.sh \
+    -Dmesos.master=$MESOS_MASTER:5050 \
+    -Djobmanager.rpc.address=$MESOS_MASTER \
+    -Dmesos.resourcemanager.framework.user=$FLINK_USER \
+    -Dinternal.jobgraph-path=$JOB_GRAPH_FILE
+{% endhighlight %} 
+
+`JOB_GRAPH_FILE` in the command above refers to the path of the uploaded JobGraph file defining the 
+job that shall be executed on the Per-Job Flink cluster. The meaning of `MESOS_MASTER` and `FLINK_USER` 
+are described in the [Getting Started](#starting-a-flink-session-on-mesos) guide of this page.
+
+### Session Mode
+
+The [Getting Started](#starting-a-flink-session-on-mesos) guide at the top of this page describes 
+deploying Flink in Session Mode.
+
+## Flink on Mesos Reference
+
+### Flink on Mesos Architecture

Review comment:
       I would move this to the back of the "Flink on Mesos Reference" section.

##########
File path: docs/deployment/resource-providers/mesos.md
##########
@@ -26,226 +26,217 @@ under the License.
 * This will be replaced by the TOC
 {:toc}
 
-## Background
+## Getting Started
 
-The Mesos implementation consists of two components: The Application Master and
-the Worker. The workers are simple TaskManagers which are parameterized by the environment
-set up by the application master. The most sophisticated component of the Mesos
-implementation is the application master. The application master currently hosts
-the following components:
+This *Getting Started* section guides you through setting up a fully functional Flink Cluster on Mesos.
 
-### Mesos Scheduler
+### Introduction
 
-The scheduler is responsible for registering the framework with Mesos,
-requesting resources, and launching worker nodes. The scheduler continuously
-needs to report back to Mesos to ensure the framework is in a healthy state. To
-verify the health of the cluster, the scheduler monitors the spawned workers and
-marks them as failed and restarts them if necessary.
+[Apache Mesos](http://mesos.apache.org/) is another resource provider supported by 
+Apache Flink. Flink utilizes the worker's provided by Mesos to run its TaskManagers.
+Apache Flink provides the script `bin/mesos-appmaster.sh` to create the Flink 
+on Mesos cluster.
 
-Flink's Mesos scheduler itself is currently not highly available. However, it
-persists all necessary information about its state (e.g. configuration, list of
-workers) in Zookeeper. In the presence of a failure, it relies on an external
-system to bring up a new scheduler. The scheduler will then register with Mesos
-again and go through the reconciliation phase. In the reconciliation phase, the
-scheduler receives a list of running workers nodes. It matches these against the
-recovered information from Zookeeper and makes sure to bring back the cluster in
-the state before the failure.
 
-### Artifact Server
+### Preparation
 
-The artifact server is responsible for providing resources to the worker
-nodes. The resources can be anything from the Flink binaries to shared secrets
-or configuration files. For instance, in non-containerized environments, the
-artifact server will provide the Flink binaries. What files will be served
-depends on the configuration overlay used.
+Flink on Mesos expects a Mesos cluster to be around. It also requires the Flink binaries being deployed
+ontothe the Mesos master. Additionally, Hadoop needs to be installed on the very same machine.
 
-### Flink's JobManager and Web Interface
+Flink provides `bin/mesos-appmaster.sh` to create a Flink on Mesos cluster. It will instantiate a 
+JobManager process on the Mesos master. The Mesos workers will be utilized to run the TaskManager 
+processes.
 
-The JobManager and the web interface provide a central point for monitoring,
-job submission, and other client interaction with the cluster
-(see [FLIP-6](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65147077)).
-
-### Startup script and configuration overlays
-
-The startup script provide a way to configure and start the application
-master. All further configuration is then inherited by the workers nodes. This
-is achieved using configuration overlays. Configuration overlays provide a way
-to infer configuration from environment variables and config files which are
-shipped to the worker nodes.
-
-
-## DC/OS
-
-This section refers to [DC/OS](https://dcos.io) which is a Mesos distribution
-with a sophisticated application management layer. It comes pre-installed with
-Marathon, a service to supervise applications and maintain their state in case
-of failures.
-
-If you don't have a running DC/OS cluster, please follow the
-[instructions on how to install DC/OS on the official website](https://dcos.io/install/).
-
-Once you have a DC/OS cluster, you may install Flink through the DC/OS
-Universe. In the search prompt, just search for Flink. Alternatively, you can use the DC/OS CLI:
-
-    dcos package install flink
-
-Further information can be found in the
-[DC/OS examples documentation](https://github.com/dcos/examples/tree/master/1.8/flink).
-
-
-## Mesos without DC/OS
-
-You can also run Mesos without DC/OS.
-
-### Installing Mesos
-
-Please follow the [instructions on how to setup Mesos on the official website](http://mesos.apache.org/getting-started/).
-
-After installation you have to configure the set of master and agent nodes by creating the files `MESOS_HOME/etc/mesos/masters` and `MESOS_HOME/etc/mesos/slaves`.
-These files contain in each row a single hostname on which the respective component will be started (assuming SSH access to these nodes).
-
-Next you have to create `MESOS_HOME/etc/mesos/mesos-master-env.sh` or use the template found in the same directory.
-In this file, you have to define
-
-    export MESOS_work_dir=WORK_DIRECTORY
-
-and it is recommended to uncommment
-
-    export MESOS_log_dir=LOGGING_DIRECTORY
-
-
-In order to configure the Mesos agents, you have to create `MESOS_HOME/etc/mesos/mesos-agent-env.sh` or use the template found in the same directory.
-You have to configure
-
-    export MESOS_master=MASTER_HOSTNAME:MASTER_PORT
-
-and uncomment
-
-    export MESOS_log_dir=LOGGING_DIRECTORY
-    export MESOS_work_dir=WORK_DIRECTORY
-
-#### Mesos Library
-
-In order to run Java applications with Mesos you have to export `MESOS_NATIVE_JAVA_LIBRARY=MESOS_HOME/lib/libmesos.so` on Linux.
-Under Mac OS X you have to export `MESOS_NATIVE_JAVA_LIBRARY=MESOS_HOME/lib/libmesos.dylib`.
-
-#### Deploying Mesos
-
-In order to start your mesos cluster, use the deployment script `MESOS_HOME/sbin/mesos-start-cluster.sh`.
-In order to stop your mesos cluster, use the deployment script `MESOS_HOME/sbin/mesos-stop-cluster.sh`.
-More information about the deployment scripts can be found [here](http://mesos.apache.org/documentation/latest/deploy-scripts/).
-
-### Installing Marathon
-
-Optionally, you may also [install Marathon](https://mesosphere.github.io/marathon/docs/) which enables you to run Flink in [high availability (HA) mode](#high-availability).
-
-### Pre-installing Flink vs Docker/Mesos containers
-
-You may install Flink on all of your Mesos Master and Agent nodes.
-You can also pull the binaries from the Flink web site during deployment and apply your custom configuration before launching the application master.
-A more convenient and easier to maintain approach is to use Docker containers to manage the Flink binaries and configuration.
-
-This is controlled via the following configuration entries:
-
-    mesos.resourcemanager.tasks.container.type: mesos _or_ docker
-
-If set to 'docker', specify the image name:
-
-    mesos.resourcemanager.tasks.container.image.name: image_name
+For `bin/mesos-appmaster.sh` to work, you have to set the two variables `HADOOP_CLASSPATH` and 
+`MESOS_NATIVE_JAVA_LIBRARY`:
 
+{% highlight bash %}
+export HADOOP_CLASSPATH=$(hadoop classpath)
+export MESOS_NATIVE_JAVA_LIBRARY=/path/to/lib/libmesos.so
+{% endhighlight %}
 
-### Flink session cluster on Mesos
+`MESOS_NATIVE_JAVA_LIBRARY` needs to point to Mesos' native Java library. The library name `libmesos.so` 
+used above refers to Mesos' Linux library. Running Mesos on MacOS would require you to use 
+`libmesos.dylib` instead.
 
-A Flink session cluster is executed as a long-running Mesos Deployment. Note that you can run multiple Flink jobs on a session cluster. Each job needs to be submitted to the cluster after the cluster has been deployed.
+### Starting a Flink Session on Mesos
 
-In the `/bin` directory of the Flink distribution, you find two startup scripts
-which manage the Flink processes in a Mesos cluster:
+Connect to the Mesos workers, change into Flink's home directory and call `bin/mesos-appmaster.sh`:
 
-1. `mesos-appmaster.sh`
-   This starts the Mesos application master which will register the Mesos scheduler.
-   It is also responsible for starting up the worker nodes.
+{% highlight bash %}
+# (0) set required environment variables
+export HADOOP_CLASSPATH=$(hadoop classpath)
+export MESOS_NATIVE_JAVA_LIBRARY=/path/to/lib/libmesos.so
 
-2. `mesos-taskmanager.sh`
-   The entry point for the Mesos worker processes.
-   You don't need to explicitly execute this script.
-   It is automatically launched by the Mesos worker node to bring up a new TaskManager.
+# (1) create Flink on Mesos cluster
+./bin/mesos-appmaster.sh \
+    -Dmesos.master=$MESOS_MASTER:5050 \
+    -Djobmanager.rpc.address=$MESOS_MASTER \
+    -Dmesos.resourcemanager.framework.user=$FLINK_USER \
+    -Dmesos.resourcemanager.tasks.cpus=6
+{% endhighlight %}
 
-In order to run the `mesos-appmaster.sh` script you have to define `mesos.master` in the `flink-conf.yaml` or pass it via `-Dmesos.master=...` to the Java process.
+The call above uses two variables not introduced, yet, as they depend on the cluster:
+* `MESOS_MASTER` refers to the Mesos master's IP address or hostname. It's important to not use `localhost` 
+  or `127.0.0.1` as the corresponding parameters are being shared with the Mesos cluster and the TaskManagers.
+* `FLINK_USER` refers to the user that owns the Mesos master's Flink installation directory (see Mesos' 
+documentation on [specifying a user](http://mesos.apache.org/documentation/latest/fetcher/#specifying-a-user-name)
+for further details).
 
-When executing `mesos-appmaster.sh`, it will create a job manager on the machine where you executed the script.
-In contrast to that, the task managers will be run as Mesos tasks in the Mesos cluster.
+The Flink on Mesos cluster is now deployed in [Session Mode]({% link deployment/index.md %}#session-mode).
+Note that you can run multiple Flink jobs on a Session cluster. Each job needs to be submitted to the 
+cluster. TaskManagers are deployed on the Mesos workers as needed. Keep in mind that you can only run as 
+many jobs as the Mesos cluster allows in terms of resources provided by the Mesos workers. Play around 
+with Flink's parameters to find the right resource utilization for your needs.
 
-### Flink job cluster on Mesos
+Check out [Flink's Mesos configuration]({% link deployment/config.md %}#mesos) to further influence 
+the resources Flink on Mesos is going to allocate.
 
-A Flink job cluster is a dedicated cluster which runs a single job.
-There is no extra job submission needed.
+## Deployment Modes Supported by Flink on Mesos
 
-In the `/bin` directory of the Flink distribution, you find one startup script
-which manage the Flink processes in a Mesos cluster:
+For production use, we recommend deploying Flink Applications in the 
+[Per-Job Mode]({% link deployment/index.md %}#per-job-mode), as it provides a better isolation 
+for each job.
 
-1. `mesos-appmaster-job.sh`
-   This starts the Mesos application master which will register the Mesos scheduler, retrieve the job graph and then launch the task managers accordingly.
+### Application Mode
 
-In order to run the `mesos-appmaster-job.sh` script you have to define `mesos.master` and `internal.jobgraph-path` in the `flink-conf.yaml`
-or pass it via `-Dmesos.master=... -Dinterval.jobgraph-path=...` to the Java process.
+Flink on Mesos does not support Application Mode.
 
-The job graph file may be generated like this way:
+### Per-Job Cluster Mode
 
+A job which is executed in Per-Job Cluster Mode spins up a dedicated Flink cluster that is only 
+used for that specific job. No extra job submission is needed. `bin/mesos-appmaster-job.sh` is used 
+as the startup script. It will start a Flink cluster for a dedicated job which is passed as a 
+JobGraph file. This file can be created by applying the following code to your Job source code:
 {% highlight java %}
 final JobGraph jobGraph = env.getStreamGraph().getJobGraph();
 final String jobGraphFilename = "job.graph";
 File jobGraphFile = new File(jobGraphFilename);
 try (FileOutputStream output = new FileOutputStream(jobGraphFile);
-	ObjectOutputStream obOutput = new ObjectOutputStream(output)){
-	obOutput.writeObject(jobGraph);
+    ObjectOutputStream obOutput = new ObjectOutputStream(output)){
+    obOutput.writeObject(jobGraph);
 }
 {% endhighlight %}
 
-<span class="label label-info">Note</span> Make sure that all Mesos processes have the user code jar on the classpath. There are two ways:
-
-1. One way is putting them in the `lib/` directory, which will result in the user code jar being loaded by the system classloader.
-1. The other way is creating a `usrlib/` directory in the parent directory of `lib/` and putting the user code jar in the `usrlib/` directory.
-After launching a job cluster via `bin/mesos-appmaster-job.sh ...`, the user code jar will be loaded by the user code classloader.
-
-#### General configuration
-
-It is possible to completely parameterize a Mesos application through Java properties passed to the Mesos application master.
-This also allows to specify general Flink configuration parameters.
-For example:
-
-    bin/mesos-appmaster.sh \
-        -Dmesos.master=master.foobar.org:5050 \
-        -Djobmanager.memory.process.size=1472m \
-        -Djobmanager.rpc.port=6123 \
-        -Drest.port=8081 \
-        -Dtaskmanager.memory.process.size=3500m \
-        -Dtaskmanager.numberOfTaskSlots=2 \
-        -Dparallelism.default=10
-
-### High Availability
-
-You will need to run a service like Marathon or Apache Aurora which takes care of restarting the JobManager process in case of node or process failures.
-In addition, Zookeeper needs to be configured like described in the [High Availability section of the Flink docs]({% link deployment/ha/index.md %}).
+Flink on Mesos Per-Job cluster can be started in the following way:
+{% highlight bash %}
+# (0) set required environment variables
+export HADOOP_CLASSPATH=$(hadoop classpath)
+export MESOS_NATIVE_JAVA_LIBRARY=/path/to/lib/libmesos.so
+
+# (1) create Per-Job Flink on Mesos cluster
+./bin/mesos-appmaster-job.sh \
+    -Dmesos.master=$MESOS_MASTER:5050 \
+    -Djobmanager.rpc.address=$MESOS_MASTER \
+    -Dmesos.resourcemanager.framework.user=$FLINK_USER \
+    -Dinternal.jobgraph-path=$JOB_GRAPH_FILE
+{% endhighlight %} 
+
+`JOB_GRAPH_FILE` in the command above refers to the path of the uploaded JobGraph file defining the 
+job that shall be executed on the Per-Job Flink cluster. The meaning of `MESOS_MASTER` and `FLINK_USER` 
+are described in the [Getting Started](#starting-a-flink-session-on-mesos) guide of this page.
+
+### Session Mode
+
+The [Getting Started](#starting-a-flink-session-on-mesos) guide at the top of this page describes 
+deploying Flink in Session Mode.
+
+## Flink on Mesos Reference
+
+### Flink on Mesos Architecture
+
+The Flink on Mesos implementation consists of two components: The application master and the workers. 
+The workers are simple TaskManagers parameterized by the environment which is set up through the 
+application master. The most sophisticated component of the Flink on Mesos implementation is the 
+application master. The application master currently hosts the following components:
+- **Mesos Scheduler**: The Scheduler is responsible for registering a framework with Mesos, requesting 
+  resources, and launching worker nodes. The Scheduler continuously needs to report back to Mesos to 
+  ensure the framework is in a healthy state. To verify the health of the cluster, the Scheduler 
+  monitors the spawned workers, marks them as failed and restarts them if necessary.
+
+  Flink's Mesos Scheduler itself is currently not highly available. However, it persists all necessary 
+  information about its state (e.g. configuration, list of workers) in [ZooKeeper](#high-availability-on-mesos). 
+  In the presence of a failure, it relies on an external system to bring up a new Scheduler (see the 
+  [Marathon subsection](#marathon) for further details). The Scheduler will then register with Mesos 
+  again and go through the reconciliation phase. In the reconciliation phase, the Scheduler receives 
+  a list of running workers nodes. It matches these against the recovered information from ZooKeeper 
+  and makes sure to bring back the cluster in the state before the failure.
+- **Artifact Server**: The Artifact Server is responsible for providing resources to the worker nodes. 
+  The resources can be anything from the Flink binaries to shared secrets or configuration files. 
+  For instance, in non-containerized environments, the Artifact Server will provide the Flink binaries. 
+  What files will be served depends on the configuration overlay used.
+
+Flink's Mesos startup scripts `bin/mesos-appmaster.sh` and `bin/mesos-appmaster-job.sh` provide a way 
+to configure and start the application master. The worker nodes inherit all further configuration. 
+They are deployed through `bin/mesos-taskmanager.sh`. The configuration inheritance is achieved using 
+configuration overlays. Configuration overlays provide a way to infer a configuration from environment 
+variables and config files which are shipped to the worker nodes.
+
+See [Mesos Architecture](http://mesos.apache.org/documentation/latest/architecture/) for a more details 
+on how frameworks are handled by Mesos.
+
+### Deploying User Libraries
+
+User libraries can be passed to the Mesos workers by placing them in Flink's `lib/` folder. This way, 
+they will be picked by Mesos' Fetcher and copied over into the worker's sandbox folders. Alternatively, 
+Docker containerization can be used as described in [Installing Flink on the Workers](#installing-flink-on-the-workers).
+
+### Installing Flink on the Workers
+
+Flink on Mesos offers two ways to distribute the Flink and user binaries within the Mesos cluster:
+1. **Using Mesos' Artifact Server**: The Artifact Server provides the resources which are moved by 
+   [Mesos' Fetcher](http://mesos.apache.org/documentation/latest/fetcher/) into the Mesos worker's 
+   [sandbox folders](http://mesos.apache.org/documentation/latest/sandbox/). It can be explicitly 
+   specified by setting [mesos.resourcemanager.tasks.container.type]({% link deployment/config.md %}#mesos-resourcemanager-tasks-container-type) 
+   to `mesos`. This is the default option and is used in the example commands of this page.
+2. **Using Docker containerization**: This enables the user to provide user libraries and other 
+   customizations as part of a Docker image. Docker utilization can be enabled by setting 
+   [mesos.resourcemanager.tasks.container.type]({% link deployment/config.md %}#mesos-resourcemanager-tasks-container-type) 
+   to `docker` and by providing the image name through [mesos.resourcemanager.tasks.container.image.name]({% link deployment/config.md %}#mesos-resourcemanager-tasks-container-image-name).
+
+### High Availability on Mesos
+
+You will need to run a service like Marathon or Apache Aurora which takes care of restarting the 
+JobManager process in case of node or process failures. In addition, Zookeeper needs to be configured 
+as described in the [High Availability section of the Flink docs]({% link deployment/ha/index.md %}).
 
 #### Marathon
 
-Marathon needs to be set up to launch the `bin/mesos-appmaster.sh` script.
-In particular, it should also adjust any configuration parameters for the Flink cluster.
+Marathon needs to be set up to launch the `bin/mesos-appmaster.sh` script. In particular, it should 
+also adjust any configuration parameters for the Flink cluster.
 
 Here is an example configuration for Marathon:
+{% highlight javascript %}
+{
+    "id": "flink",
+    "cmd": "/opt/flink-1.11.2/bin/mesos-appmaster.sh -Dmesos.resourcemanager.framework.user=root -Dmesos.resourcemanager.tasks.taskmanager-cmd=/opt/flink-1.11.2/bin/mesos-taskmanager.sh -Dmesos.master=master:5050 -Djobmanager.memory.process.size=1472m -Dtaskmanager.memory.process.size=3500m -Dtaskmanager.numberOfTaskSlots=2 -Dparallelism.default=2",
+    "cpus": 2,
+    "mem": 1024,
+    "disk": 0,
+    "instances": 1,
+    "env": {
+        "MESOS_NATIVE_JAVA_LIBRARY": "/usr/lib/libmesos.so",
+        "HADOOP_CLASSPATH": "/opt/hadoop-2.10.1/etc/hadoop:/opt/hadoop-2.10.1/share/hadoop/common/lib/*:/opt/hadoop-2.10.1/share/hadoop/common/*:/opt/hadoop-2.10.1/share/hadoop/hdfs:/opt/hadoop-2.10.1/share/hadoop/hdfs/lib/*:/opt/hadoop-2.10.1/share/hadoop/hdfs/*:/opt/hadoop-2.10.1/share/hadoop/yarn:/opt/hadoop-2.10.1/share/hadoop/yarn/lib/*:/opt/hadoop-2.10.1/share/hadoop/yarn/*:/opt/hadoop-2.10.1/share/hadoop/mapreduce/lib/*:/opt/hadoop-2.10.1/share/hadoop/mapreduce/*:/contrib/capacity-scheduler/*.jar"

Review comment:
       Maybe it is simpler (for the example) to use Flink's bundled Hadoop dependency. That way we might work around the problem that Mesos relies on Hadoop even if you don't use it.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org