Posted to commits@airflow.apache.org by "hussein-awala (via GitHub)" <gi...@apache.org> on 2023/02/02 18:51:47 UTC

[GitHub] [airflow] hussein-awala commented on issue #29084: Add `max_active_tis_per_dagrun` for Dynamic Task Mapping

hussein-awala commented on issue #29084:
URL: https://github.com/apache/airflow/issues/29084#issuecomment-1414209722

   I have two different use cases for this feature:
   - I have an API that users call to submit ML processing requests for some data against a list of models (they choose a subset of ~30 models). These requests are sent to Kafka and consumed by a streaming app, which creates a `DagRun` for each request via the REST API; the DAG then processes the data in parallel, one task per chosen model. Currently the requests are processed in FIFO order and I cannot set a quota per `DagRun`, so if one client requests a large set of models, the other requests stay in the `queued` state until that run finishes. With the new setting, I can set a quota for each request (at most 5 parallel tasks per request), which gives fair scheduling.
   - Another use case is any parallel workload on a database sharded or partitioned by date, where we cannot create a pool for each partition. Reducing `max_active_tis_per_dag` is not the best solution, because the other partitions may live on a different shard and processing them concurrently is not a problem; we only need to limit parallelism within each partition.
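
To illustrate the first use case, here is a minimal sketch of how a per-run cap changes completion order. It is a toy simulation, not Airflow's scheduler: the `simulate` function, its parameters, the run names, and the assumption that every task takes exactly one tick are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional, Tuple


def simulate(
    runs: List[Tuple[str, int]],
    total_slots: int,
    per_run_cap: Optional[int] = None,
) -> Dict[str, int]:
    """Return the tick at which each run finishes (toy model, not Airflow).

    Assumption: every task takes exactly one tick. On each tick the
    scheduler walks the runs in arrival order and hands out free slots,
    giving a single run at most `per_run_cap` slots (no cap when None).
    """
    if total_slots < 1 or (per_run_cap is not None and per_run_cap < 1):
        raise ValueError("total_slots and per_run_cap must be at least 1")

    remaining = {run_id: n_tasks for run_id, n_tasks in runs}
    # Runs without tasks are considered finished before the first tick.
    finished_at = {run_id: 0 for run_id, n_tasks in runs if n_tasks == 0}
    tick = 0
    while len(finished_at) < len(runs):
        tick += 1
        free = total_slots
        for run_id, _ in runs:
            if free == 0:
                break
            if remaining[run_id] == 0:
                continue
            limit = free if per_run_cap is None else min(free, per_run_cap)
            started = min(limit, remaining[run_id])
            remaining[run_id] -= started
            free -= started
            if remaining[run_id] == 0:
                finished_at[run_id] = tick
    return finished_at


# One large request arrives first, followed by two small ones.
requests = [("big", 30), ("small_a", 3), ("small_b", 3)]

fifo = simulate(requests, total_slots=10)
capped = simulate(requests, total_slots=10, per_run_cap=5)
print(fifo)    # {'big': 3, 'small_a': 4, 'small_b': 4}
print(capped)  # {'small_a': 1, 'small_b': 2, 'big': 6}
```

In this toy model the small requests no longer wait behind the large one once the cap is in place, at the cost of the large request taking longer to finish.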

