Posted to commits@airflow.apache.org by "t oo (Jira)" <ji...@apache.org> on 2019/12/11 09:29:00 UTC
[jira] [Created] (AIRFLOW-6227) Ability to assign multiple pool names to a single task
t oo created AIRFLOW-6227:
-----------------------------
Summary: Ability to assign multiple pool names to a single task
Key: AIRFLOW-6227
URL: https://issues.apache.org/jira/browse/AIRFLOW-6227
Project: Apache Airflow
Issue Type: New Feature
Components: scheduler
Affects Versions: 1.10.6
Reporter: t oo
Right now only a single pool name can be assigned to each task instance.
Ideally, 2 different pool names could be assigned to a task_instance.
Use case:
I have 300 Spark tasks writing to 60 different tables (i.e. multiple tasks write to the same table).
I want both:
# Maximum of 30 Spark tasks running in parallel
# Never more than 1 Spark task writing to the same table in parallel
If I have a 'spark' pool of 30 and assign the 'spark' pool to those tasks, then I risk having 2 tasks writing to the same table at once.
If I instead have a 'tableA' pool of 1, a 'tableB' pool of 1, a 'tableC' pool of 1, etc., and assign the relevant table pool to each task, then I risk having more than 30 Spark tasks running in parallel.
I can't use 'parallelism' or other settings because I have other non-Spark tasks that I don't want to limit.
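
To make the request concrete, here is a rough sketch in plain Python (not Airflow code, and not how the scheduler is actually implemented) of the behaviour I have in mind: a task names several pools and may start only if every one of them has a free slot. The names POOL_SLOTS, try_start and finish are hypothetical and exist only for this illustration.

```python
from collections import Counter

# Hypothetical pool sizes: one shared 'spark' pool plus a single-slot pool per table.
POOL_SLOTS = {"spark": 30, "tableA": 1, "tableB": 1}


def try_start(task_pools, used, pool_slots=POOL_SLOTS):
    """Claim one slot in every pool the task names, or claim nothing at all."""
    # If any named pool is already full, the task has to wait.
    if any(used[pool] >= pool_slots[pool] for pool in task_pools):
        return False
    for pool in task_pools:
        used[pool] += 1
    return True


def finish(task_pools, used):
    """Release the slots a finished task was holding."""
    for pool in task_pools:
        used[pool] -= 1


used = Counter()  # slots currently occupied, per pool
first = try_start(["spark", "tableA"], used)   # starts
second = try_start(["spark", "tableA"], used)  # same table: has to wait
third = try_start(["spark", "tableB"], used)   # different table: starts
```

With this rule the 'spark' pool caps the total at 30 while each table pool keeps writers to the same table from overlapping, and non-Spark tasks are unaffected because they name neither kind of pool.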
--
This message was sent by Atlassian Jira
(v8.3.4#803005)