You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/01/14 10:50:34 UTC

[GitHub] [airflow] uranusjr opened a new pull request #20870: Fix task ID deduplication in @task_group

uranusjr opened a new pull request #20870:
URL: https://github.com/apache/airflow/pull/20870


   When appending `__{n}` for task ID deduplication, we are currently using `dag.task_ids`, which prepends the surrounding task group ID. This means that we must also use the group-prefixed *current* task ID to detect duplication, not the raw, un-prefixed, task ID.
   
   Fix #19902


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uranusjr merged pull request #20870: Fix task ID deduplication in @task_group

Posted by GitBox <gi...@apache.org>.
uranusjr merged pull request #20870:
URL: https://github.com/apache/airflow/pull/20870


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uranusjr commented on a change in pull request #20870: Fix task ID deduplication in @task_group

Posted by GitBox <gi...@apache.org>.
uranusjr commented on a change in pull request #20870:
URL: https://github.com/apache/airflow/pull/20870#discussion_r784746579



##########
File path: airflow/decorators/base.py
##########
@@ -86,15 +89,18 @@ def get_unique_task_id(
 
     if tg_task_id not in dag.task_ids:
         return task_id
-    core = re.split(r'__\d+$', task_id)[0]
-    suffixes = sorted(
-        int(re.split(r'^.+__', task_id)[1])
-        for task_id in dag.task_ids
-        if re.match(rf'^{core}__\d+$', task_id)
-    )
-    if not suffixes:
-        return f'{core}__1'
-    return f'{core}__{suffixes[-1] + 1}'
+
+    def _find_id_suffixes(dag: DAG) -> Iterator[int]:
+        prefix = re.split(r"__\d+$", tg_task_id)[0]

Review comment:
       This is the main fix; the previous logic incorrectly used `task_id`. Everything else is just re-organising code to make variable names more straightforward for future maintenance.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] commented on pull request #20870: Fix task ID deduplication in @task_group

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #20870:
URL: https://github.com/apache/airflow/pull/20870#issuecomment-1015217448


   The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uranusjr commented on a change in pull request #20870: Fix task ID deduplication in @task_group

Posted by GitBox <gi...@apache.org>.
uranusjr commented on a change in pull request #20870:
URL: https://github.com/apache/airflow/pull/20870#discussion_r786574757



##########
File path: airflow/decorators/base.py
##########
@@ -46,7 +47,7 @@
 from airflow.utils.task_group import TaskGroup, TaskGroupContext
 
 
-def validate_python_callable(python_callable):
+def validate_python_callable(python_callable: Any) -> None:

Review comment:
       No because this function accepts non-callables; it validates the input is a callable, and throws an exception when it’s not. So we _don’t_ want Mypy to complain when we do (say) `validate_python_callable(1)`; it’s an expected usage.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kazanzhy commented on a change in pull request #20870: Fix task ID deduplication in @task_group

Posted by GitBox <gi...@apache.org>.
kazanzhy commented on a change in pull request #20870:
URL: https://github.com/apache/airflow/pull/20870#discussion_r786572141



##########
File path: airflow/decorators/base.py
##########
@@ -46,7 +47,7 @@
 from airflow.utils.task_group import TaskGroup, TaskGroupContext
 
 
-def validate_python_callable(python_callable):
+def validate_python_callable(python_callable: Any) -> None:

Review comment:
       I wonder if `Callable` type could be used here?
   
   ```suggestion
   def validate_python_callable(python_callable: Callable) -> None:
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org