You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/08/02 11:19:06 UTC

[GitHub] [airflow] andreafromcape commented on issue #25165: Dynamic Tasks inside of TaskGroup do not have group_id prepended to task_id

andreafromcape commented on issue #25165:
URL: https://github.com/apache/airflow/issues/25165#issuecomment-1202350973

   No. I already went through this once with you in the issue I referenced but I must assume I was completely ignored. This is the complete snippet:
   
   ```
       task_group = task_group or TaskGroupContext.get_current_task_group(dag)
       tg_task_id = task_group.child_id(task_id) if task_group else task_id
   
       if tg_task_id not in dag.task_ids:
           return task_id
   ```
   
   This happens at the time `expand` is called: albeit `tg_task_id` indeed contains the correct label, it is ignored because `tg_task_id not in dag.task_ids` evaluates to `True` (for obvious reasons, I hope), so the naked `task_id` is returned by `get_unique_task_id`.
   
   And again as I said in the referenced issue, this was probably a rough two-liners to fix the obvious collision [with this other snippet](https://github.com/apache/airflow/blob/3138604b264878f27505223bd14c7814eacc1e57/airflow/models/baseoperator.py#L759-L762) inside `BaseOperator`'s `__init__` which does the exact same thing (prepending the task_group_id to the task_id):
   
   ```
           if task_group:
               self.task_id = task_group.child_id(task_id)
           else:
               self.task_id = task_id
   ```
   
   So while `get_unique_task_id` does **_not_** return an unique task_id in some cases, for most use cases it is fixed later by `BaseOperator`, but the fix above is not contained in `MappedOperator`'s `__init__`, and `MappedOperator` does not inherit from `BaseOperator` either so it's not going to fix the task_id and it's going to result in a duplicate.
   
   So, to summarize:
   
   - `get_unique_task_id` does _not_ return an unique task_id in case a task is inside a task group
   - This is normally fixed later by `BaseOperator` at initiation which will correctly prepend the task_group_id
   - **_But_** MappedOperator does not fix the task_id as `BaseOperator` does
   - Hence, we have a duplicate task_id in case the `@task` decorator is used inside a task_group
   
   I urge you to re-read the code, @uranusjr, because it does _not_ do what you think it does.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org