You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/03/24 05:51:06 UTC

[GitHub] [airflow] yuqian90 commented on issue #14864: Using TaskGroup without context manager (Graph view visual bug)

yuqian90 commented on issue #14864:
URL: https://github.com/apache/airflow/issues/14864#issuecomment-805520556


   Hi @ryanahamilton and @wolfier , I can explain why this is happening, but I'll need some suggestions from you and others on how to improve this.
   There are two ways of putting a task into a `TaskGroup`:
   1. Create the task inside the contextmanager of the `TaskGroup`;
   2. Pass the `task_group` as an argument to the constructor of the task.
   
   The reproducing example is doing neither of these. Instead it's calling `TaskGroup.add` **after** the task has already been created. At the time of creating the task, there was no `TaskGroup` contextmanager nor was the `task_group` specified. So the tasks were already added to the root `task_group` of the dag, resulting in this inconsistent behaviour.
   
   To fix this, you just need to use a contextmanager. Or do this:
   ```python
   from airflow.models import DAG
   from airflow.operators.bash import BashOperator
   from airflow.operators.dummy import DummyOperator
   from airflow.utils.task_group import TaskGroup
   from datetime import datetime
   with DAG(dag_id="example_task_group", start_date=datetime(2021, 1, 1), tags=["example"], catchup=False) as dag:
       start = BashOperator(task_id="start", bash_command='echo 1; sleep 10; echo 2;')
       tg = TaskGroup("section_1", tooltip="Tasks for section_1")
       task_1 = DummyOperator(task_id="task_1", task_group=tg)
       task_2 = BashOperator(task_id="task_2", bash_command='echo 1', task_group=tg)
       task_3 = DummyOperator(task_id="task_3", task_group=tg)
   ```
   
   Some ideas for improvement, not sure what's best:
   1. Add a check somewhere to make sure a task is not added to more than one `TaskGroup`. Where should such a check be?
   2. Make `TaskGroup.add` a private method. The reason it currently looks like a public method is that it's called in places such as the constructor of `BaseOperator`. But strictly speaking, users of Airflow should not be encouraged to call `TaskGroup.add` directly because it doesn't do what the name says. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org