Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2020/12/04 16:16:13 UTC

[GitHub] [flink] zentol commented on a change in pull request #14305: [FLINK-20355][docs] Add new native K8s documentation page

zentol commented on a change in pull request #14305:
URL: https://github.com/apache/flink/pull/14305#discussion_r536213406



##########
File path: docs/deployment/resource-providers/native_kubernetes.md
##########
@@ -24,394 +23,244 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-This page describes how to deploy a Flink session cluster natively on [Kubernetes](https://kubernetes.io).
+This page describes how to deploy Flink natively on [Kubernetes](https://kubernetes.io).
 
 * This will be replaced by the TOC
 {:toc}
 
-<div class="alert alert-warning">
-Flink's native Kubernetes integration is still experimental. There may be changes in the configuration and CLI flags in later versions.
-</div>
+## Getting Started
 
-## Requirements
+This *Getting Started* section guides you through setting up a fully functional Flink Cluster on Kubernetes.
 
-- Kubernetes 1.9 or above.
-- KubeConfig, which has access to list, create, delete pods and services, configurable via `~/.kube/config`. You can verify permissions by running `kubectl auth can-i <list|create|edit|delete> pods`.
-- Kubernetes DNS enabled.
-- A service Account with [RBAC](#rbac) permissions to create, delete pods.
-
-## Flink Kubernetes Session
+### Introduction
 
-### Start Flink Session
+Kubernetes is a popular container-orchestration system for automating computer application deployment, scaling, and management.
+Flink's native Kubernetes integration allows you to deploy Flink directly on a running Kubernetes cluster.
+Moreover, Flink is able to dynamically allocate and de-allocate TaskManagers depending on the required resources, because it can talk to Kubernetes directly.
 
-Follow these instructions to start a Flink Session within your Kubernetes cluster.
+### Preparation
 
-A session will start all required Flink services (JobManager and TaskManagers) so that you can submit programs to the cluster.
-Note that you can run multiple programs per session.
+This *Getting Started* section assumes a running Kubernetes cluster fulfilling the following requirements:
 
-{% highlight bash %}
-$ ./bin/kubernetes-session.sh
-{% endhighlight %}
+- Kubernetes >= 1.9.
+- A KubeConfig with permissions to list, create, and delete pods and services, configurable via `~/.kube/config`. You can verify permissions by running `kubectl auth can-i <list|create|edit|delete> pods` (see the sketch after this list).
+- Enabled Kubernetes DNS.
+- A `default` service account with [RBAC](#rbac) permissions to create and delete pods.
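+
+For example, a minimal permission check (a sketch, assuming `kubectl` is installed and points at your cluster) could look like this:
+
+{% highlight bash %}
+# Each call prints "yes" or "no" for the current user/context
+$ kubectl auth can-i list pods
+$ kubectl auth can-i create pods
+$ kubectl auth can-i edit pods
+$ kubectl auth can-i delete pods
+{% endhighlight %}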
 
-All the Kubernetes configuration options can be found in our [configuration guide]({% link deployment/config.md %}#kubernetes).
+If you have problems setting up a Kubernetes cluster, then take a look at [how to set up a Kubernetes cluster](https://kubernetes.io/docs/setup/).
 
-**Example**: Issue the following command to start a session cluster with 4 GB of memory and 2 CPUs with 4 slots per TaskManager:
+### Starting a Flink Session on Kubernetes
 
-In this example we override the `resourcemanager.taskmanager-timeout` setting to make
-the pods with task managers remain for a longer period than the default of 30 seconds.
-Although this setting may cause more cloud cost it has the effect that starting new jobs is in some scenarios
-faster and during development you have more time to inspect the logfiles of your job.
+Once you have your Kubernetes cluster running and `kubectl` configured to point to it, you can launch a Flink session cluster via
 
 {% highlight bash %}
-$ ./bin/kubernetes-session.sh \
-  -Dkubernetes.cluster-id=<ClusterId> \
-  -Dtaskmanager.memory.process.size=4096m \
-  -Dkubernetes.taskmanager.cpu=2 \
-  -Dtaskmanager.numberOfTaskSlots=4 \
-  -Dresourcemanager.taskmanager-timeout=3600000
-{% endhighlight %}
+# (1) Start Kubernetes session
+$ ./bin/kubernetes-session.sh -Dkubernetes.cluster-id=my-first-flink-cluster
 
-The system will use the configuration in `conf/flink-conf.yaml`.
-Please follow our [configuration guide]({% link deployment/config.md %}) if you want to change something.
+# (2) Submit example job
+$ ./bin/flink run --target kubernetes-session -Dkubernetes.cluster-id=my-first-flink-cluster ./examples/streaming/TopSpeedWindowing.jar
 
-If you do not specify a particular name for your session by `kubernetes.cluster-id`, the Flink client will generate a UUID name.
+# (3) Stop Kubernetes session by deleting cluster deployment
+$ kubectl delete deployment/my-first-flink-cluster
 
-<span class="label label-info">Note</span> A docker image with Python and PyFlink installed is required if you are going to start a session cluster for Python Flink Jobs.
-Please refer to the following [section](#custom-flink-docker-image).
+{% endhighlight %}
 
-### Custom Flink Docker image
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
+<span class="label label-info">Note</span> When using [Minikube](https://minikube.sigs.k8s.io/docs/), you need to call `minikube tunnel` in order to [expose Flink's LoadBalancer service on Minikube](https://minikube.sigs.k8s.io/docs/handbook/accessing/#using-minikube-tunnel).
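+
+For reference, a minimal Minikube setup (a sketch, assuming Minikube is installed) might look like this:
+
+{% highlight bash %}
+# Start a local Kubernetes cluster
+$ minikube start
+# Run in a separate terminal; it routes LoadBalancer services to the host
+$ minikube tunnel
+{% endhighlight %}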
 
-If you want to use a custom Docker image to deploy Flink containers, check [the Flink Docker image documentation]({% link deployment/resource-providers/standalone/docker.md %}),
-[its tags]({% link deployment/resource-providers/standalone/docker.md %}#image-tags), [how to customize the Flink Docker image]({% link deployment/resource-providers/standalone/docker.md %}#customize-flink-image) and [enable plugins]({% link deployment/resource-providers/standalone/docker.md %}#using-plugins).
-If you created a custom Docker image you can provide it by setting the [`kubernetes.container.image`]({% link deployment/config.md %}#kubernetes-container-image) configuration option:
+Congratulations! You have successfully run a Flink application by deploying Flink on Kubernetes.
 
-{% highlight bash %}
-$ ./bin/kubernetes-session.sh \
-  -Dkubernetes.cluster-id=<ClusterId> \
-  -Dtaskmanager.memory.process.size=4096m \
-  -Dkubernetes.taskmanager.cpu=2 \
-  -Dtaskmanager.numberOfTaskSlots=4 \
-  -Dresourcemanager.taskmanager-timeout=3600000 \
-  -Dkubernetes.container.image=<CustomImageName>
-{% endhighlight %}
-</div>
+{% top %}
 
-<div data-lang="python" markdown="1">
-To build a custom image which has Python and Pyflink prepared, you can refer to the following Dockerfile:
-{% highlight Dockerfile %}
-FROM flink
+## Deployment Modes Supported by Flink on Kubernetes
 
-# install python3 and pip3
-RUN apt-get update -y && \
-    apt-get install -y python3.7 python3-pip python3.7-dev && rm -rf /var/lib/apt/lists/*
-RUN ln -s /usr/bin/python3 /usr/bin/python
-    
-# install Python Flink
-RUN pip3 install apache-flink
-{% endhighlight %}
+For production use, we recommend deploying Flink applications in the [Per-job or Application Mode]({% link deployment/index.md %}#deployment-modes), as these modes provide better isolation for the applications.
 
-Build the image named as **pyflink:latest**:
+### Application Mode
 
-{% highlight bash %}
-sudo docker build -t pyflink:latest .
-{% endhighlight %}
+The [Application Mode]({% link deployment/index.md %}#application-mode) requires that the user code is bundled together with the Flink image because it runs the user code's `main()` method on the cluster.
+The Application Mode makes sure that all Flink components are properly cleaned up after the termination of the application.
 
-Then you are able to start a PyFlink session cluster by setting the [`kubernetes.container.image`]({% link deployment/config.md %}#kubernetes-container-image) 
-configuration option value to be the name of custom image:
+The Flink community provides a [base Docker image]({% link deployment/resource-providers/standalone/docker.md %}#docker-hub-flink-images) which can be used to bundle the user code:
 
-{% highlight bash %}
-$ ./bin/kubernetes-session.sh \
-  -Dkubernetes.cluster-id=<ClusterId> \
-  -Dtaskmanager.memory.process.size=4096m \
-  -Dkubernetes.taskmanager.cpu=2 \
-  -Dtaskmanager.numberOfTaskSlots=4 \
-  -Dresourcemanager.taskmanager-timeout=3600000 \
-  -Dkubernetes.container.image=pyflink:latest
+{% highlight dockerfile %}
+FROM flink
+RUN mkdir -p $FLINK_HOME/usrlib
+COPY /path/of/my-flink-job.jar $FLINK_HOME/usrlib/my-flink-job.jar
 {% endhighlight %}
-</div>
-
-</div>
 
-### Submitting jobs to an existing Session
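+As a sketch (assuming a local Docker daemon and a reachable registry; `custom-image-name` is a placeholder), the image could be built and published like this:
+
+{% highlight bash %}
+$ docker build -t custom-image-name .
+$ docker push custom-image-name
+{% endhighlight %}
+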
+After creating and publishing the Docker image under `custom-image-name`, you can start an Application cluster with the following command:
 
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-Use the following command to submit a Flink Job to the Kubernetes cluster.
 {% highlight bash %}
-$ ./bin/flink run -d -t kubernetes-session -Dkubernetes.cluster-id=<ClusterId> examples/streaming/WindowJoin.jar
+$ ./bin/flink run-application --target kubernetes-application \
+  -Dkubernetes.cluster-id=my-first-application-cluster \
+  -Dkubernetes.container.image=custom-image-name \
+  local:///opt/flink/usrlib/my-flink-job.jar
 {% endhighlight %}
-</div>
 
-<div data-lang="python" markdown="1">
-Use the following command to submit a PyFlink Job to the Kubernetes cluster.
-{% highlight bash %}
-$ ./bin/flink run -d -t kubernetes-session -Dkubernetes.cluster-id=<ClusterId> -pym scala_function -pyfs examples/python/table/udf
-{% endhighlight %}
-</div>
-</div>
+<span class="label label-info">Note</span> `local` is the only supported scheme in application mode.
 
-### Accessing Job Manager UI
+The configuration `kubernetes.cluster-id` specifies the cluster name and must be unique.
+If you do not specify this option, then Flink will generate a random name.
 
-There are several ways to expose a Service onto an external (outside of your cluster) IP address.
-This can be configured using [`kubernetes.rest-service.exposed.type`]({% link deployment/config.md %}#kubernetes-rest-service-exposed-type).
+The configuration `kubernetes.container.image` specifies the image to start the pods with.
 
-- `ClusterIP`: Exposes the service on a cluster-internal IP.
-The Service is only reachable within the cluster. If you want to access the Job Manager ui or submit job to the existing session, you need to start a local proxy.
-You can then use `localhost:8081` to submit a Flink job to the session or view the dashboard.
+Once the application cluster is deployed, you can interact with it:
 
 {% highlight bash %}
-$ kubectl port-forward service/<ServiceName> 8081
+# List running jobs on the cluster
+$ ./bin/flink list --target kubernetes-application -Dkubernetes.cluster-id=my-first-application-cluster
+# Cancel running job
+$ ./bin/flink cancel --target kubernetes-application -Dkubernetes.cluster-id=my-first-application-cluster <jobId>
 {% endhighlight %}
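+
+As a further sketch, stopping a job with a savepoint should work the same way; this assumes `stop` accepts the same `--target`/`-D` options as `list` and `cancel`, and the savepoint directory below is illustrative:
+
+{% highlight bash %}
+# Stop a running job, taking a savepoint first
+$ ./bin/flink stop --target kubernetes-application -Dkubernetes.cluster-id=my-first-application-cluster \
+  --savepointPath /tmp/flink-savepoints <jobId>
+{% endhighlight %}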
 
-- `NodePort`: Exposes the service on each Node’s IP at a static port (the `NodePort`). `<NodeIP>:<NodePort>` could be used to contact the Job Manager Service. `NodeIP` could be easily replaced with Kubernetes ApiServer address.
-You could find it in your kube config file.
+The system will use the configuration in `conf/flink-conf.yaml` and override those values with any key-value pairs (`-Dkey=value`) provided to `bin/flink`.
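+
+For example, TaskManager resources could be configured via such key-value pairs (the values shown are illustrative):
+
+{% highlight bash %}
+$ ./bin/flink run-application --target kubernetes-application \
+  -Dkubernetes.cluster-id=my-first-application-cluster \
+  -Dkubernetes.container.image=custom-image-name \
+  -Dtaskmanager.memory.process.size=4096m \
+  -Dkubernetes.taskmanager.cpu=2 \
+  -Dtaskmanager.numberOfTaskSlots=4 \
+  local:///opt/flink/usrlib/my-flink-job.jar
+{% endhighlight %}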
 
-- `LoadBalancer`: Exposes the service externally using a cloud provider’s load balancer.
-Since the cloud provider and Kubernetes needs some time to prepare the load balancer, you may get a `NodePort` JobManager Web Interface in the client log.
-You can use `kubectl get services/<ClusterId>-rest` to get EXTERNAL-IP and then construct the load balancer JobManager Web Interface manually `http://<EXTERNAL-IP>:8081`.
+### Session Mode
 
-  <span class="label label-warning">Warning!</span> Your JobManager (which can run arbitary jar files) might be exposed to the public internet, without authentication.
+You have seen the deployment of a Session cluster in the [Getting Started](#getting-started) guide at the top of the page.
 
-- `ExternalName`: Map a service to a DNS name, not supported in current version.
+The Session Mode can be executed in two ways:
 
-Please reference the official documentation on [publishing services in Kubernetes](https://kubernetes.io/docs/concepts/services-networking/service/#publishing-services-service-types) for more information.
+* **detached mode** (default): The `kubernetes-session.sh` script deploys the Flink cluster on Kubernetes and then terminates.
 
-### Attach to an existing Session
+* **attached mode** (`-Dexecution.attached=true`): The `kubernetes-session.sh` script stays alive and allows entering commands that control the running Flink cluster.
+  For example, `stop` stops the running Session cluster.
+  Type `help` to see all supported commands.
 
-The Kubernetes session is started in detached mode by default, meaning the Flink client will exit after submitting all the resources to the Kubernetes cluster. Use the following command to attach to an existing session.
+In order to re-attach to a running Session cluster with the cluster id `my-first-flink-cluster`, use the following command:
 
 {% highlight bash %}
-$ ./bin/kubernetes-session.sh -Dkubernetes.cluster-id=<ClusterId> -Dexecution.attached=true
+$ ./bin/kubernetes-session.sh -Dkubernetes.cluster-id=my-first-flink-cluster -Dexecution.attached=true
 {% endhighlight %}
 
-### Stop Flink Session
+The system will use the configuration in `conf/flink-conf.yaml` and override those values with any key-value pairs (`-Dkey=value`) provided to `bin/kubernetes-session.sh`.
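+
+For example, a session cluster with 4 GB of memory, 2 CPUs, and 4 slots per TaskManager (illustrative values) could be started like this:
+
+{% highlight bash %}
+$ ./bin/kubernetes-session.sh \
+  -Dkubernetes.cluster-id=my-first-flink-cluster \
+  -Dtaskmanager.memory.process.size=4096m \
+  -Dkubernetes.taskmanager.cpu=2 \
+  -Dtaskmanager.numberOfTaskSlots=4
+{% endhighlight %}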
 
-To stop a Flink Kubernetes session, attach the Flink client to the cluster and type `stop`.
+#### Stop a Running Session Cluster
+
+In order to stop a running Session cluster with the cluster id `my-first-flink-cluster`, you can either [delete the Flink deployment](#manual-resource-cleanup) or use:
 
 {% highlight bash %}
-$ echo 'stop' | ./bin/kubernetes-session.sh -Dkubernetes.cluster-id=<ClusterId> -Dexecution.attached=true
+$ echo 'stop' | ./bin/kubernetes-session.sh -Dkubernetes.cluster-id=my-first-flink-cluster -Dexecution.attached=true
 {% endhighlight %}
 
-#### Manual Resource Cleanup
+{% top %}
 
-Flink uses [Kubernetes OwnerReference's](https://kubernetes.io/docs/concepts/workloads/controllers/garbage-collection/) to cleanup all cluster components.
-All the Flink created resources, including `ConfigMap`, `Service`, `Pod`, have been set the OwnerReference to `deployment/<ClusterId>`.
-When the deployment is deleted, all other resources will be deleted automatically.
+## Flink on Kubernetes Reference
 
-{% highlight bash %}
-$ kubectl delete deployment/<ClusterID>
-{% endhighlight %}
+### Configuring Flink on Kubernetes
 
-## Flink Kubernetes Application
+The Kubernetes-specific configurations are listed on the [configuration page]({% link deployment/config.md %}#kubernetes).
 
-### Start Flink Application
-<div class="codetabs" markdown="1">
-Application mode allows users to create a single image containing their Job and the Flink runtime, which will automatically create and destroy cluster components as needed. The Flink community provides base docker images [customized]({% link deployment/resource-providers/standalone/docker.md %}#customize-flink-image) for any use case.
-<div data-lang="java" markdown="1">
-{% highlight dockerfile %}
-FROM flink
-RUN mkdir -p $FLINK_HOME/usrlib
-COPY /path/of/my-flink-job-*.jar $FLINK_HOME/usrlib/my-flink-job.jar
-{% endhighlight %}
+### Accessing Flink's Web UI
+
+There are several ways to expose Flink's Web UI and REST endpoint.
+This can be configured using [`kubernetes.rest-service.exposed.type`]({% link deployment/config.md %}#kubernetes-rest-service-exposed-type).
+
+- **ClusterIP**: Exposes the service on a cluster-internal IP.
+  The Service is only reachable within the cluster.
+  If you want to access the JobManager UI or submit a job to the existing session, you need to start a local proxy.
+  You can then use `localhost:8081` to submit a Flink job to the session or view the dashboard.
 
-Use the following command to start a Flink application.
 {% highlight bash %}
-$ ./bin/flink run-application -p 8 -t kubernetes-application \
-  -Dkubernetes.cluster-id=<ClusterId> \
-  -Dtaskmanager.memory.process.size=4096m \
-  -Dkubernetes.taskmanager.cpu=2 \
-  -Dtaskmanager.numberOfTaskSlots=4 \
-  -Dkubernetes.container.image=<CustomImageName> \
-  local:///opt/flink/usrlib/my-flink-job.jar
+$ kubectl port-forward service/<ServiceName> 8081
 {% endhighlight %}
-</div>
 
-<div data-lang="python" markdown="1">
-{% highlight dockerfile %}
-FROM flink
+- **NodePort**: Exposes the service on each Node’s IP at a static port (the `NodePort`).
+  `<NodeIP>:<NodePort>` can be used to contact the JobManager service.
+  `NodeIP` can also be replaced with the Kubernetes ApiServer address,
+  which you can find in your kube config file.
 
-# install python3 and pip3
-RUN apt-get update -y && \
-    apt-get install -y python3.7 python3-pip python3.7-dev && rm -rf /var/lib/apt/lists/*
-RUN ln -s /usr/bin/python3 /usr/bin/python
+- **LoadBalancer**: Exposes the service externally using a cloud provider’s load balancer.
+  Since the cloud provider and Kubernetes need some time to prepare the load balancer, you may get a `NodePort` JobManager Web Interface in the client log.
+  You can use `kubectl get services/<cluster-id>-rest` to get the EXTERNAL-IP and then construct the JobManager Web Interface URL manually: `http://<EXTERNAL-IP>:8081` (see the lookup sketch below).
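+
+As an illustrative check (assuming the cluster id `my-first-flink-cluster`), the external IP could be looked up like this:
+
+{% highlight bash %}
+# The EXTERNAL-IP column is populated once the load balancer is ready
+$ kubectl get services/my-first-flink-cluster-rest
+{% endhighlight %}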
 
-# install Python Flink
-RUN pip3 install apache-flink
-COPY /path/of/python/codes /opt/python_codes
+Please refer to the official documentation on [publishing services in Kubernetes](https://kubernetes.io/docs/concepts/services-networking/service/#publishing-services-service-types) for more information.
 
-# if there are third party python dependencies, users can install them when building the image
-COPY /path/to/requirements.txt /opt/requirements.txt
-RUN pip3 install -r requirements.txt
+### Accessing the Logs

Review comment:
       Documentation for configuring log4j, as in modifying the log4j properties file contents, should go into the logging documentation; that said, we currently do not have such documentation, nor do I intend to add it at the moment.
   
   _How_ to do this on Kubernetes (i.e., editing the config map) should stay here.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org