Posted to commits@airflow.apache.org by "Hamid Mahmood (Jira)" <ji...@apache.org> on 2020/01/22 13:40:00 UTC
[jira] [Assigned] (AIRFLOW-6551) KubernetesExecutor does not create
dynamic pods for tasks inside subdag
[ https://issues.apache.org/jira/browse/AIRFLOW-6551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hamid Mahmood reassigned AIRFLOW-6551:
--------------------------------------
Assignee: (was: Daniel Imberman)
> KubernetesExecutor does not create dynamic pods for tasks inside subdag
> -----------------------------------------------------------------------
>
> Key: AIRFLOW-6551
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6551
> Project: Apache Airflow
> Issue Type: Bug
> Components: executor-kubernetes
> Affects Versions: 1.10.7
> Reporter: Hamid Mahmood
> Priority: Major
>
> KubernetesExecutor does not create dynamic pods for tasks inside a subdag.
> I am running Airflow 1.10.7 on EKS 1.14. I have multiple SubDagOperators inside a main_dag. The hierarchy is as follows:
> * main_dag:
> ** subdagA
> *** taskA1
> *** taskA2
> *** taskA3
> ** subdagB
> *** subdagB_1
> **** taskB1
> **** taskB2
> ** task_main1
> ** task_main2
> ** task_main3
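The hierarchy above can be sketched as a plain-Python model (a hypothetical illustration using the names from this report, not the actual DAG file). Under KubernetesExecutor the reporter's expectation is one dynamic pod per leaf task, so counting leaf tasks in the tree gives the number of pods one would expect:

```python
# Task tree from the report: empty dict = leaf task, nested dict = subdag.
main_dag = {
    "subdagA": {"taskA1": {}, "taskA2": {}, "taskA3": {}},
    "subdagB": {"subdagB_1": {"taskB1": {}, "taskB2": {}}},
    "task_main1": {},
    "task_main2": {},
    "task_main3": {},
}

def count_leaf_tasks(node):
    """Count leaf tasks; each leaf should map to one dynamic pod."""
    if not node:
        return 1
    return sum(count_leaf_tasks(child) for child in node.values())

print(count_leaf_tasks(main_dag))  # 8 leaf tasks -> 8 expected pods
```

In Scenario 1 the observed behavior is instead one pod per subdag plus one pod per top-level task, rather than one pod per leaf.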
> I have tested the following three scenarios.
> *Scenario:1*
> I have set the following parameter
> {code:none}
> AIRFLOW__CORE__EXECUTOR=KubernetesExecutor
> {code}
> When I run the workflow main_dag, only a single pod is created, named subdagA-309c4c564b9841529236a31dfaf135c5, and all of the tasks (taskA1, taskA2, taskA3) run inside that single pod.
> In theory there should be 3 separate pods for these 3 tasks inside subdagA.
> The same is the case for subdagB: only a single pod is created to run subdagB, and subdagB_1 and its tasks run inside that pod.
> However, the tasks (task_main1, task_main2, task_main3) that are not inside any subdag do run in dynamic pods.
>
> *Scenario: 2*
> I have set the following
> {code:none}
> AIRFLOW__CORE__EXECUTOR=KubernetesExecutor
> {code}
> and passed executor=KubernetesExecutor() when creating the subdag.
> {code:python}
> SubDagOperator(task_id=task_id, subdag=subdag, dag=parent_dag, executor=KubernetesExecutor(), **kwargs)
> {code}
> Still only one pod is created for subdagA, but now taskA1 inside subdagA gets stuck in the queued state. When I checked the logs of this pod I saw the following:
> {code:none}
> Running %s on host %s <TaskInstance: main_dag.subdagA 2020-01-12T02:15:00+00:00 [queued]> maindagsubdagA-1a31ef3e2aef489daf1b329160646{code}
> *Scenario: 3*
> I have set the following
> {code:none}
> AIRFLOW__CORE__EXECUTOR=LocalExecutor
> {code}
> and passed executor=KubernetesExecutor() when creating the subdag.
> {code:python}
> SubDagOperator(task_id=task_id, subdag=subdag, dag=parent_dag, executor=KubernetesExecutor(), **kwargs)
> {code}
> This time dynamic pods are created for each task inside subdagA and subdagB, and I get the desired parallelism. In this case KubernetesExecutor shows the required behavior. The downside is that the three tasks task_main1, task_main2, and task_main3 use the LocalExecutor and run inside the scheduler pod.
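The executor split in Scenario 3 can be modeled with a small stdlib-only sketch (a hypothetical helper; the names are taken from this report, and it only illustrates which executor ends up running which task, it is not Airflow code):

```python
# Scenario 3: core executor is LocalExecutor; each SubDagOperator is
# given executor=KubernetesExecutor(), so only subdag tasks get pods.
SUBDAG_TASKS = {
    "subdagA": ["taskA1", "taskA2", "taskA3"],
    "subdagB.subdagB_1": ["taskB1", "taskB2"],
}
TOP_LEVEL_TASKS = ["task_main1", "task_main2", "task_main3"]

def executor_for(task, core="LocalExecutor"):
    """Tasks inside a subdag use that subdag's executor; the rest use core."""
    for tasks in SUBDAG_TASKS.values():
        if task in tasks:
            return "KubernetesExecutor"
    return core

for t in ["taskA1", "taskB2", "task_main1"]:
    print(t, "->", executor_for(t))
# taskA1 -> KubernetesExecutor
# taskB2 -> KubernetesExecutor
# task_main1 -> LocalExecutor
```

This mirrors the downside described above: subdag tasks run in their own dynamic pods, while the top-level tasks fall back to the LocalExecutor inside the scheduler pod.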
--
This message was sent by Atlassian Jira
(v8.3.4#803005)