Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/04/08 17:12:26 UTC

[GitHub] [airflow] jedcunningham commented on a change in pull request #15270: The KubernetesPodOperator Comprehensive Guide

jedcunningham commented on a change in pull request #15270:
URL: https://github.com/apache/airflow/pull/15270#discussion_r609907818



##########
File path: docs/apache-airflow-providers-cncf-kubernetes/operators.rst
##########
@@ -35,23 +36,52 @@ you to create and run Pods on a Kubernetes cluster.
   :ref:`GKEStartPodOperator <howto/operator:GKEStartPodOperator>` operator as it
   simplifies the Kubernetes authorization process.
 
-.. note::
-  The :doc:`Kubernetes executor <apache-airflow:executor/kubernetes>` is **not** required to use this operator.
-
 How does this operator work?
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
 The :class:`~airflow.providers.cncf.kubernetes.operators.kubernetes_pod.KubernetesPodOperator` uses the
 Kubernetes API to launch a pod in a Kubernetes cluster. By supplying an
 image URL and a command with optional arguments, the operator uses the Kube Python Client to generate a Kubernetes API
 request that dynamically launches those individual pods.
+Under the hood, :class:`~airflow.providers.cncf.kubernetes.hooks.kubernetes.KubernetesHook` creates the connection to
+the Kubernetes API server.
+Essentially, the KubernetesPodOperator packages all the supplied parameters into a request object, which is then
+shipped off to the Kubernetes API server to create the pod that executes your task. Whenever a task is triggered,
+a new worker pod is spun up to execute that task. Once the task completes, the worker pod is deleted by default
+and its resources are reclaimed.
 Users can specify a kubeconfig file using the ``config_file`` parameter, otherwise the operator will default
 to ``~/.kube/config``.
 
-The :class:`~airflow.providers.cncf.kubernetes.operators.kubernetes_pod.KubernetesPodOperator` enables task-level
-resource configuration and is optimal for custom Python
-dependencies that are not available through the public PyPI repository. It also allows users to supply a template
-YAML file using the ``pod_template_file`` parameter.
-Ultimately, it allows Airflow to act a job orchestrator - no matter the language those jobs are written in.
+How does the KubernetesPodOperator differ from the KubernetesExecutor?
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+.. note::
+  The :doc:`Kubernetes executor <apache-airflow:executor/kubernetes>` is **not** required to use this operator.
+
+
+What problems does KubernetesPodOperator solve?
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+* The :class:`~airflow.providers.cncf.kubernetes.operators.kubernetes_pod.KubernetesPodOperator` enables task-level
+  resource configuration and is optimal for custom Python dependencies that are not available through the
+  public PyPI repository.
+
+* It allows users to supply a template YAML file using the ``pod_template_file`` parameter.
+
+* It allows isolation of deployments, configuration reuse, delegation and better management of secrets.
+
+* Ultimately, it allows Airflow to act as a job orchestrator - no matter the language those jobs are written in.

Review comment:
       I think a nice succinct way to think of it is: 'Easy way to run any image on Kubernetes as a task', and I feel like this should be first! (related to the orchestrator and any language points imo)
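
  The framing above can be sketched outside Airflow (illustrative only; the task names, images, and commands below are invented for this example, not taken from the PR): under "run any image on Kubernetes as a task", a task reduces to an image plus a command, so the language the job is written in never matters to Airflow.

  ```python
  # Illustrative sketch, not Airflow code: a "task" is just an image plus
  # a command. The task names, images, and commands are invented examples.
  TASKS = {
      "python-job": ("python:3.9-slim", ["python", "-c", "print('hello')"]),
      "go-job":     ("golang:1.16",     ["go", "version"]),
      "shell-job":  ("alpine:3.13",     ["sh", "-c", "echo hello"]),
  }

  def container_spec(task_id):
      """Turn a task definition into the container portion of a pod spec."""
      image, command = TASKS[task_id]
      return {"name": task_id, "image": image, "command": command}

  # Any language works because each image carries its own runtime.
  specs = [container_spec(t) for t in TASKS]
  ```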

##########
File path: docs/apache-airflow-providers-cncf-kubernetes/operators.rst
##########
@@ -35,23 +36,52 @@ you to create and run Pods on a Kubernetes cluster.
   :ref:`GKEStartPodOperator <howto/operator:GKEStartPodOperator>` operator as it
   simplifies the Kubernetes authorization process.
 
-.. note::
-  The :doc:`Kubernetes executor <apache-airflow:executor/kubernetes>` is **not** required to use this operator.
-
 How does this operator work?
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
 The :class:`~airflow.providers.cncf.kubernetes.operators.kubernetes_pod.KubernetesPodOperator` uses the
 Kubernetes API to launch a pod in a Kubernetes cluster. By supplying an
 image URL and a command with optional arguments, the operator uses the Kube Python Client to generate a Kubernetes API
 request that dynamically launches those individual pods.
+Under the hood, :class:`~airflow.providers.cncf.kubernetes.hooks.kubernetes.KubernetesHook` creates the connection to
+the Kubernetes API server.

Review comment:
       ```suggestion
   ```
   
  I think this is redundant with the beginning of this paragraph.

##########
File path: docs/apache-airflow-providers-cncf-kubernetes/connections/kubernetes.rst
##########
@@ -20,7 +20,9 @@
 Kubernetes cluster Connection
 =============================
 
-The Kubernetes cluster Connection type enables connection to a Kubernetes cluster by :class:`~airflow.providers.cncf.kubernetes.operators.spark_kubernetes.SparkKubernetesOperator` tasks. They are not used by ``KubernetesPodOperator`` tasks.
+The Kubernetes cluster Connection type enables connection to a Kubernetes cluster by
+:class:`~airflow.providers.cncf.kubernetes.operators.spark_kubernetes.SparkKubernetesOperator` tasks.
+They are not used by ``KubernetesPodOperator`` tasks.

Review comment:
       ```suggestion
  They are not used by :class:`~airflow.providers.cncf.kubernetes.operators.kubernetes_pod.KubernetesPodOperator` tasks.
   ```
   
   Probably worth linking it.

##########
File path: docs/apache-airflow-providers-cncf-kubernetes/operators.rst
##########
@@ -19,11 +19,12 @@
 
 .. _howto/operator:KubernetesPodOperator:
 
-KubernetesPodOperator
-=====================
+KubernetesPodOperator - The Comprehensive Guide
+===============================================
 
 The :class:`~airflow.providers.cncf.kubernetes.operators.kubernetes_pod.KubernetesPodOperator` allows
-you to create and run Pods on a Kubernetes cluster.
+you to create and run Pods on a Kubernetes cluster. The task wrapped in the KubernetesPodOperator is then executed in
+these pods.

Review comment:
       ```suggestion
   you to create and run a Pod on a Kubernetes cluster as a task.
   ```
   
   Maybe? 

##########
File path: docs/apache-airflow-providers-cncf-kubernetes/operators.rst
##########
@@ -35,23 +36,52 @@ you to create and run Pods on a Kubernetes cluster.
   :ref:`GKEStartPodOperator <howto/operator:GKEStartPodOperator>` operator as it
   simplifies the Kubernetes authorization process.
 
-.. note::
-  The :doc:`Kubernetes executor <apache-airflow:executor/kubernetes>` is **not** required to use this operator.
-
 How does this operator work?
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
 The :class:`~airflow.providers.cncf.kubernetes.operators.kubernetes_pod.KubernetesPodOperator` uses the
 Kubernetes API to launch a pod in a Kubernetes cluster. By supplying an
 image URL and a command with optional arguments, the operator uses the Kube Python Client to generate a Kubernetes API
 request that dynamically launches those individual pods.
+Under the hood, :class:`~airflow.providers.cncf.kubernetes.hooks.kubernetes.KubernetesHook` creates the connection to
+the Kubernetes API server.
+Essentially, the KubernetesPodOperator packages all the supplied parameters into a request object, which is then
+shipped off to the Kubernetes API server to create the pod that executes your task. Whenever a task is triggered,
+a new worker pod is spun up to execute that task. Once the task completes, the worker pod is deleted by default
+and its resources are reclaimed.

Review comment:
       I'd be careful with terminology here. "Worker pods" are a KubeExecutor thing, not KPO.
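
  The "packages all the supplied parameters into a request object" step in the hunk above can be sketched in plain Python. This is a hypothetical helper, not the provider's actual internals: the real operator builds a ``V1Pod`` object through the official Kubernetes Python client rather than a raw dict.

  ```python
  # Hypothetical sketch of how operator parameters map onto the pod
  # manifest sent to the Kubernetes API server. The real operator builds
  # a V1Pod via the Kubernetes Python client; names here are illustrative.
  def build_pod_request(name, image, cmds=None, arguments=None,
                        namespace="default"):
      """Package task parameters into a Pod creation request body."""
      return {
          "apiVersion": "v1",
          "kind": "Pod",
          "metadata": {"name": name, "namespace": namespace},
          "spec": {
              "restartPolicy": "Never",  # a task pod runs once, then exits
              "containers": [{
                  "name": "base",
                  "image": image,
                  "command": cmds or [],
                  "args": arguments or [],
              }],
          },
      }

  request = build_pod_request("echo-task", "alpine:3.13",
                              cmds=["echo"], arguments=["hello"])
  ```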




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org