Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/08/03 18:03:47 UTC

[GitHub] [airflow] Gollum999 commented on issue #23020: Names for expanded tasks

Gollum999 commented on issue #23020:
URL: https://github.com/apache/airflow/issues/23020#issuecomment-1204302199

   For the sake of discussion, how do people feel about letting these names entirely *replace* `map_index`?
   
   My primary motivation is that an integer-based `map_index` can behave unpredictably if the list of mapped tasks changes when re-running a DAG.  In my experience, you can end up with missing and/or duplicate tasks due to the mismatched indices.  As a result you are generally forced to re-run all of the mapped task instances (if not the entire DAG Run), even if 99% of the tasks are unchanged.  I can give more concrete examples if anyone is interested.
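   
   As a rough, hypothetical illustration of the index mismatch (plain Python only, not Airflow internals; the file names and the `index_by_position` helper are made up for this sketch), suppose the list that drives the mapping loses its first element and gains a new one between the original run and the re-run:

```python
# Hypothetical illustration only: plain Python, no Airflow APIs involved.
def index_by_position(items):
    """Mimic positional indexing: each input is identified only by its list position."""
    return {i: item for i, item in enumerate(items)}


first_run = index_by_position(["a.csv", "b.csv", "c.csv"])
# On re-run the upstream list changed: "a.csv" is gone and "d.csv" is new.
rerun = index_by_position(["b.csv", "c.csv", "d.csv"])

# Indices whose input differs between the two runs.
mismatched = sorted(i for i in first_run if rerun.get(i) != first_run[i])
# Inputs present in both runs, which ideally would not need to be re-run.
unchanged_inputs = sorted(set(first_run.values()) & set(rerun.values()))

print(mismatched)
print(unchanged_inputs)
```

   In this sketch every index now points at a different input, even though two of the three inputs did not change.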
   
   I'd propose letting the mapped args become (part of?) the primary key for the mapped task.  This would solve OP's goal of improving the UX of mapped tasks (no more abstract indices), but would also allow for some amount of consistency between runs (not just for re-runs, but potentially between separate DAG Runs as well).
   
   Conceptually I think this is doable since the args already must be JSON serializable, but I'm sure the implementation would be more complex than I am imagining.
   
   A couple of considerations that come to mind:
   1. These IDs ultimately need to be unique.  So something like `map_index` might still be needed to augment the key and resolve duplicates.  I imagine this could be similar to how duplicate task_ids are resolved (`task`, `task__1`, etc.).
   2. The values used to generate the list of dynamic tasks could be very large and complex.  So it seems like you would need the ability to override the generated name with a "short name" when args are too complex.
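   
   To make the two considerations above concrete, here is a minimal sketch of how such keys could be derived.  Everything in it is an assumption for the sake of discussion and not an existing Airflow API: the `assign_keys` and `default_key` functions, the `name_fn` override hook, the `MAX_KEY_LENGTH` cutoff, and the hash fallback for overly long args are all hypothetical.  Only the `__1` suffix style is borrowed from how duplicate task_ids look today.

```python
import hashlib
import json
from typing import Any, Callable, List, Optional

MAX_KEY_LENGTH = 40  # assumed cutoff; a real limit would need discussion


def default_key(args: Any) -> str:
    """Canonical JSON of the mapped args, or a short hash when that is too long."""
    text = json.dumps(args, sort_keys=True, separators=(",", ":"))
    if len(text) <= MAX_KEY_LENGTH:
        return text
    return "args-" + hashlib.sha256(text.encode("utf-8")).hexdigest()[:12]


def assign_keys(
    mapped_args: List[Any],
    name_fn: Optional[Callable[[Any], Optional[str]]] = None,
) -> List[str]:
    """Give each set of mapped args a unique key.

    name_fn is a hypothetical user-supplied "short name" override; when it is
    missing or returns None, the key is derived from the args themselves.
    Duplicates get a numeric suffix: key, key__1, key__2, ...
    """
    used = set()
    keys = []
    for args in mapped_args:
        base = name_fn(args) if name_fn is not None else None
        if base is None:
            base = default_key(args)
        key, n = base, 0
        while key in used:  # resolve duplicates with an increasing suffix
            n += 1
            key = f"{base}__{n}"
        used.add(key)
        keys.append(key)
    return keys


inputs = [{"file": "a.csv"}, {"file": "b.csv"}, {"file": "a.csv"}]
print(assign_keys(inputs))
print(assign_keys(inputs, name_fn=lambda a: a["file"]))
```

   Note that the suffix on a duplicate still depends on ordering, so this only keeps keys stable for args that are unique within the list.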

