Posted to commits@seatunnel.apache.org by ki...@apache.org on 2022/04/20 02:37:08 UTC

[incubator-seatunnel] branch dev updated: [doc] Add guide on how to Set Up SeaTunnel with Kubernetes (#1712)

This is an automated email from the ASF dual-hosted git repository.

kirs pushed a commit to branch dev
in repository https://gitbox.apache.org/repos/asf/incubator-seatunnel.git


The following commit(s) were added to refs/heads/dev by this push:
     new ac7c8969 [doc] Add guide on how to Set Up SeaTunnel with Kubernetes (#1712)
ac7c8969 is described below

commit ac7c89693dfed0912f76f412b3bff75d44370280
Author: Gezim Sejdiu <g....@gmail.com>
AuthorDate: Wed Apr 20 04:37:02 2022 +0200

    [doc] Add guide on how to Set Up SeaTunnel with Kubernetes (#1712)
    
    * [doc] Add guide on how to Set Up SeaTunnel with Kubernetes
    
    * Update docs/en/start/kubernetes.mdx
    
    Use `Dockerfile` md syntax.
    
    Co-authored-by: Jiajie Zhong <zh...@gmail.com>
    
    * Update docs/en/start/kubernetes.mdx
    
    Use `bash` instead of `zsh`.
    
    Co-authored-by: Jiajie Zhong <zh...@gmail.com>
    
    * Add kubernetes item into sidebars.js
    
    * Update SeaTunnel-Flink image to use the right one on FlinkDeployment
    
    Co-authored-by: Jiajie Zhong <zh...@gmail.com>
---
 docs/en/start/kubernetes.mdx | 269 +++++++++++++++++++++++++++++++++++++++++++
 docs/sidebars.js             |   1 +
 2 files changed, 270 insertions(+)

diff --git a/docs/en/start/kubernetes.mdx b/docs/en/start/kubernetes.mdx
new file mode 100644
index 00000000..a73d36bc
--- /dev/null
+++ b/docs/en/start/kubernetes.mdx
@@ -0,0 +1,269 @@
+---
+sidebar_position: 4
+---
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+# Set Up with Kubernetes
+
+This section provides a quick guide to using SeaTunnel with Kubernetes. 
+
+## Prerequisites
+
+We assume that you have local installations of the following:
+
+- [docker](https://docs.docker.com/)
+- [kubernetes](https://kubernetes.io/)
+- [helm](https://helm.sh/docs/intro/quickstart/)
+
+This ensures that the `kubectl` and `helm` commands are available on your local system.
+
+For Kubernetes, [minikube](https://minikube.sigs.k8s.io/docs/start/) is our choice; at the time of writing, we are using Kubernetes version v1.23.3. You can start a cluster with the following command:
+
+```bash
+minikube start --kubernetes-version=v1.23.3
+```
+
+## Installation
+
+### SeaTunnel docker image
+
+To run the image with SeaTunnel, first create a `Dockerfile`:
+
+<Tabs
+  groupId="engine-type"
+  defaultValue="flink"
+  values={[
+    {label: 'Flink', value: 'flink'},
+  ]}>
+<TabItem value="flink">
+
+```Dockerfile
+FROM flink:1.13
+
+ENV SEATUNNEL_VERSION="2.1.0"
+
+RUN wget https://archive.apache.org/dist/incubator/seatunnel/${SEATUNNEL_VERSION}/apache-seatunnel-incubating-${SEATUNNEL_VERSION}-bin.tar.gz
+RUN tar -xzvf apache-seatunnel-incubating-${SEATUNNEL_VERSION}-bin.tar.gz
+
+RUN mkdir -p $FLINK_HOME/usrlib
+RUN cp apache-seatunnel-incubating-${SEATUNNEL_VERSION}/lib/seatunnel-core-flink.jar $FLINK_HOME/usrlib/seatunnel-core-flink.jar
+
+RUN rm -fr apache-seatunnel-incubating-${SEATUNNEL_VERSION}*
+```
+
+Then run the following command to build the image:
+```bash
+docker build -t seatunnel:2.1.0-flink-1.13 -f Dockerfile .
+```
+The image `seatunnel:2.1.0-flink-1.13` needs to be present on the host (minikube) so that the deployment can take place.
+
+Load the image into minikube via:
+```bash
+minikube image load seatunnel:2.1.0-flink-1.13
+```
+
+</TabItem>
+</Tabs>
+
+### Deploying the operator
+
+<Tabs
+  groupId="engine-type"
+  defaultValue="flink"
+  values={[
+    {label: 'Flink', value: 'flink'},
+  ]}>
+<TabItem value="flink">
+
+The steps below provide a quick walk-through on setting up the Flink Kubernetes Operator. 
+
+Install the certificate manager on your Kubernetes cluster to enable adding the webhook component (only needed once per Kubernetes cluster):
+
+```bash
+kubectl create -f https://github.com/jetstack/cert-manager/releases/download/v1.7.1/cert-manager.yaml
+```
+Now you can deploy the latest stable Flink Kubernetes Operator version using the included Helm chart:
+
+```bash
+
+helm repo add flink-operator-repo https://downloads.apache.org/flink/flink-kubernetes-operator-0.1.0/
+
+helm install flink-kubernetes-operator flink-operator-repo/flink-kubernetes-operator
+```
+
+You may verify your installation via `kubectl`:
+
+```bash
+kubectl get pods
+NAME                                                   READY   STATUS    RESTARTS      AGE
+flink-kubernetes-operator-5f466b8549-mgchb             1/1     Running   3 (23h ago)   16d
+
+```
+
+</TabItem>
+</Tabs>
+
+## Run SeaTunnel Application
+
+**Run Application:** SeaTunnel already provides out-of-the-box [configurations](https://github.com/apache/incubator-seatunnel/tree/dev/config).
+
+<Tabs
+  groupId="engine-type"
+  defaultValue="flink"
+  values={[
+    {label: 'Flink', value: 'flink'},
+  ]}>
+<TabItem value="flink">
+
+In this guide we are going to use [flink.streaming.conf](https://github.com/apache/incubator-seatunnel/blob/dev/config/flink.streaming.conf.template):
+ 
+ ```conf
+env {
+  execution.parallelism = 1
+}
+
+source {
+    FakeSourceStream {
+      result_table_name = "fake"
+      field_name = "name,age"
+    }
+}
+
+transform {
+    sql {
+      sql = "select name,age from fake"
+    }
+}
+
+sink {
+  ConsoleSink {}
+}
+ ```
+
+This configuration needs to be present when we deploy the application (SeaTunnel) to the Flink cluster (on Kubernetes). We also need to configure a Pod to use a PersistentVolume for storage.
+- Create `/mnt/data` on your Node. Open a shell to the single Node in your cluster. How you open a shell depends on how you set up your cluster. In our case we are using minikube, so you can open a shell to your Node by entering `minikube ssh`.
+  In your shell on that Node, create a `/mnt/data` directory:
+  ```bash
+  minikube ssh
+  
+  # This assumes that your Node uses "sudo" to run commands
+  # as the superuser
+  sudo mkdir /mnt/data
+  ```
+- Copy the application (SeaTunnel) configuration file to your Node.
+  ```bash
+  minikube cp flink.streaming.conf /mnt/data/flink.streaming.conf
+  ```
+
+Once the Flink Kubernetes Operator is running, as seen in the previous steps, you are ready to submit a Flink (SeaTunnel) job:
+- Create the `seatunnel-flink.yaml` FlinkDeployment manifest:
+  ```yaml
+  apiVersion: flink.apache.org/v1alpha1
+  kind: FlinkDeployment
+  metadata:
+    namespace: default
+    name: seatunnel-flink-streaming-example
+  spec:
+    image: seatunnel:2.1.0-flink-1.13
+    flinkVersion: v1_14
+    flinkConfiguration:
+      taskmanager.numberOfTaskSlots: "2"
+    serviceAccount: flink
+    jobManager:
+      replicas: 1
+      resource:
+        memory: "2048m"
+        cpu: 1
+    taskManager:
+      resource:
+        memory: "2048m"
+        cpu: 2
+    podTemplate:
+      spec:
+        containers:
+          - name: flink-main-container
+            volumeMounts:
+              - mountPath: /data
+                name: config-volume
+        #     volumeMounts:
+        #       - mountPath: /data
+        #         name: task-pv-storage
+        # volumes:
+        #   - name: task-pv-storage
+        #     persistentVolumeClaim:
+        #       claimName: task-pv-claim
+
+        volumes:
+          - name: config-volume
+            hostPath:
+              path: "/mnt/data"
+              type: Directory
+
+    job:
+      jarURI: local:///opt/flink/usrlib/seatunnel-core-flink.jar
+      entryClass: org.apache.seatunnel.SeatunnelFlink
+      args: ["--config", "/data/flink.streaming.conf"]
+      parallelism: 2
+      upgradeMode: stateless
+
+  ```
+- Run the example application:
+  ```bash
+  kubectl apply -f seatunnel-flink.yaml
+  ```
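+
+If you would rather mount the configuration through a PersistentVolumeClaim than through the `hostPath` volume, the commented-out `task-pv-storage` lines in the manifest above expect a claim named `task-pv-claim`. The following is a minimal sketch of such a volume and claim, not part of the original setup: the volume name `task-pv-volume`, the `manual` storage class, and the `1Gi` size are assumptions chosen for illustration.
+
+```yaml
+# Sketch only: a PersistentVolume backed by the /mnt/data directory created earlier.
+# The name, storage class, and capacity are assumed values; adjust them as needed.
+apiVersion: v1
+kind: PersistentVolume
+metadata:
+  name: task-pv-volume
+spec:
+  storageClassName: manual
+  capacity:
+    storage: 1Gi
+  accessModes:
+    - ReadWriteOnce
+  hostPath:
+    path: "/mnt/data"
+---
+# Claim that binds to the volume above; the name matches the
+# commented-out claimName in the FlinkDeployment manifest.
+apiVersion: v1
+kind: PersistentVolumeClaim
+metadata:
+  name: task-pv-claim
+  namespace: default
+spec:
+  storageClassName: manual
+  accessModes:
+    - ReadWriteOnce
+  resources:
+    requests:
+      storage: 1Gi
+```
+
+Assuming you save this as `seatunnel-pv.yaml` (a hypothetical file name), apply it with `kubectl apply -f seatunnel-pv.yaml` before the FlinkDeployment, and swap the `config-volume` entries in the manifest for the commented-out `task-pv-storage` ones.
+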
+</TabItem>
+</Tabs>
+
+**See The Output**
+
+<Tabs
+  groupId="engine-type"
+  defaultValue="flink"
+  values={[
+    {label: 'Flink', value: 'flink'},
+  ]}>
+<TabItem value="flink">
+
+After a successful startup (which can take on the order of a minute in a fresh environment, seconds afterwards), you can follow the logs of your job:
+
+```bash
+kubectl logs -f deploy/seatunnel-flink-streaming-example
+```
+
+To expose the Flink Dashboard you may add a port-forward rule:
+```bash
+kubectl port-forward svc/seatunnel-flink-streaming-example-rest 8081
+```
+Now the Flink Dashboard is accessible at [localhost:8081](http://localhost:8081).
+
+Or launch `minikube dashboard` for a web-based Kubernetes user interface.
+
+To see the content printed in the TaskManager stdout log, run:
+```bash
+kubectl logs \
+-l 'app in (seatunnel-flink-streaming-example), component in (taskmanager)' \
+--tail=-1 \
+-f
+```
+The output looks like the following (your content may be different since we use `FakeSourceStream` to automatically generate random stream data):
+
+```shell
++I[Kid Xiong, 1650316786086]
++I[Ricky Huo, 1650316787089]
++I[Ricky Huo, 1650316788089]
++I[Ricky Huo, 1650316789090]
++I[Kid Xiong, 1650316790090]
++I[Kid Xiong, 1650316791091]
++I[Kid Xiong, 1650316792092]
+```
+</TabItem>
+</Tabs>
+
+Happy SeaTunneling!
+
+## What's More
+
+Now that you have taken a quick look at SeaTunnel, you can see [connector](/category/connector) to find all the sources and sinks SeaTunnel supports.
+Or see [deployment](../deployment.mdx) if you want to submit your application to another kind of engine cluster.
diff --git a/docs/sidebars.js b/docs/sidebars.js
index 45e49dd3..d96db4c6 100644
--- a/docs/sidebars.js
+++ b/docs/sidebars.js
@@ -68,6 +68,7 @@ const sidebars = {
       items: [
         'start/local',
         'start/docker',
+        'start/kubernetes',
       ],
     },
     {