Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/02/21 18:45:12 UTC

[GitHub] [airflow] kaxil commented on a change in pull request #7406: [AIRFLOW-XXXX] Add architecture section to k8sexec docs

URL: https://github.com/apache/airflow/pull/7406#discussion_r382744565
 
 

 ##########
 File path: docs/executor/kubernetes.rst
 ##########
 @@ -34,3 +34,71 @@ The volumes are optional and depend on your configuration. There are two volumes
   - By storing logs onto a persistent disk, the files are accessible by workers and the webserver. If you don't configure this, the logs will be lost after the worker pod shuts down
 
   - Another option is to use S3/GCS/etc to store logs
+
+KubernetesExecutor Architecture
+################################
+
+The KubernetesExecutor runs as a process in the Scheduler that only requires access to the Kubernetes API (it does *not* need to run inside a Kubernetes cluster). The KubernetesExecutor requires a non-SQLite database as the backend, but no external brokers or persistent workers are needed.
+For these reasons, we recommend the KubernetesExecutor for deployments that have long periods of dormancy between DAG executions.
+
+
+.. image:: ../img/k8s-0-worker.jpeg
+
+
+When a DAG submits a task, the KubernetesExecutor requests a worker pod from the Kubernetes API. The worker pod then runs the task, reports the result, and terminates.
+
+
+
+.. image:: ../img/k8s-3-worker.jpeg
+
+.. @startuml
+.. Airflow_Scheduler -> Kubernetes: Request a new pod with command "airflow run..."
+.. Kubernetes -> Airflow_Worker: Create Airflow worker with command "airflow run..."
+.. Airflow_Worker -> Airflow_DB: Report task passing or failure to DB
+.. Airflow_Worker -> Kubernetes: Pod completes with state "Succeeded" and k8s records in ETCD
+.. Kubernetes -> Airflow_Scheduler: Airflow scheduler reads "Succeeded" from k8s watcher thread
+.. @enduml
+.. image:: ../img/k8s-happy-path.png
+
+
+***************
+Fault Tolerance
+***************
+
+===========================
+Handling Worker Pod Crashes
+===========================
+
+A distributed system must assume that any component can crash at any moment, for reasons ranging from OOM errors to node upgrades.
+
+In the case where a worker dies before it can report its status to the backend DB, the executor can use a Kubernetes watcher thread to discover the failed pod.
+
+.. image:: ../img/k8s-watcher-2.jpeg
 
 Review comment:
  We don't have this image!
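
  Since the figure is missing, the behaviour the quoted paragraph describes can be sketched roughly in code. This is a minimal illustration only, not Airflow's implementation: the `reconcile` function, the `(pod_name, phase)` event shape, and the `"failed"` / `"success"` state strings are all hypothetical. The only thing taken from Kubernetes is the set of pod phase names (`Pending`, `Running`, `Succeeded`, `Failed`, `Unknown`).

```python
from typing import Dict, Iterable, Tuple

# Hypothetical event shape: (pod_name, phase), where phase is one of the
# Kubernetes pod phases: Pending, Running, Succeeded, Failed, Unknown.
PodEvent = Tuple[str, str]


def reconcile(events: Iterable[PodEvent], reported: Dict[str, str]) -> Dict[str, str]:
    """Combine worker-reported task states with pod phases seen by a watcher.

    `reported` maps pod name -> state the worker wrote to the DB itself.
    A pod that reaches the "Failed" phase without having reported a state
    is treated as a crashed worker and its task is marked "failed".
    """
    final = dict(reported)  # never overwrite what a worker reported
    for pod_name, phase in events:
        if phase == "Failed" and pod_name not in reported:
            final[pod_name] = "failed"
    return final


# pod-a reported success on its own; pod-b died before it could report.
events = [
    ("pod-a", "Running"),
    ("pod-a", "Succeeded"),
    ("pod-b", "Running"),
    ("pod-b", "Failed"),
]
states = reconcile(events, reported={"pod-a": "success"})
```

  In this sketch the worker's own report always wins, and the watcher only fills in a state for pods that failed silently, which matches the case the "Handling Worker Pod Crashes" text covers.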
