Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/03/03 01:35:56 UTC

[GitHub] [airflow] HardikVijayPatel opened a new issue #21951: Tasks stuck in scheduled state from different dags waiting for other dag tasks to complete first

HardikVijayPatel opened a new issue #21951:
URL: https://github.com/apache/airflow/issues/21951


   ### Apache Airflow version
   
   2.2.3
   
   ### What happened
   
   What I see is that tasks from one DAG wait in the scheduled state until another DAG that was triggered earlier has completed all of its scheduled tasks, even though parallelism and dag_concurrency are set to higher values.
   
   I have the following configuration set in airflow for concurrency:
   parallelism = 64
   max_active_tasks_per_dag = 16
   dag_concurrency = 32
   
   Consider two DAGs, A and B.
   DAG A consists of 500 tasks that can run in parallel, with no dependencies on other tasks.
   DAG B consists of 2 tasks that run serially.
   
   Below is the scenario:
   DAG A is triggered first; all 500 tasks get scheduled and run 16 at a time (due to max_active_tasks_per_dag).
   Next, DAG B is triggered and its task gets scheduled, but DAG B waits for all 500 tasks of DAG A to complete before moving to the running state, even though parallelism and DAG concurrency are set higher.
   
   ### What you expected to happen
   
   Since the total number of active DAGs is less than the "dag_concurrency" value, the tasks of DAG B should move to the running state without having to wait for DAG A's scheduled tasks to complete.
   
   ### How to reproduce
   
   Run multiple DAGs one after the other, with one DAG that has more tasks than "max_active_tasks_per_dag" allows, followed by other DAGs.
   
   ### Operating System
   
   Linux
   
   ### Versions of Apache Airflow Providers
   
    pip freeze | grep apache-airflow-providers
   apache-airflow-providers-ftp==2.0.1
   apache-airflow-providers-http==2.0.2
   apache-airflow-providers-imap==2.1.0
   apache-airflow-providers-mysql==2.1.1
   apache-airflow-providers-sqlite==2.0.1
   
   
   ### Deployment
   
   Virtualenv installation
   
   ### Deployment details
   
   _No response_
   
   ### Anything else
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ephraimbuddy commented on issue #21951: Tasks stuck in scheduled state from different dags waiting for other dag tasks to complete first

Posted by GitBox <gi...@apache.org>.
ephraimbuddy commented on issue #21951:
URL: https://github.com/apache/airflow/issues/21951#issuecomment-1059085757


   max_active_tasks_per_dag == dag_concurrency.
   
   There was a rename of dag_concurrency to max_active_tasks_per_dag. Can you set this correctly? 
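A minimal sketch of how one might check a config for this situation, where both the old and the new option name are set at once. It uses only the standard library rather than any Airflow API; the helper name `find_conflicting_options` and the placement of the options under `[core]` are assumptions made for illustration.

```python
import configparser

# Old option name -> current name, as described in this thread.
RENAMED_CORE_OPTIONS = {"dag_concurrency": "max_active_tasks_per_dag"}


def find_conflicting_options(cfg_text):
    """Return (old, new) pairs where both names are set in [core].

    Hypothetical helper for illustration; not part of Airflow.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    core = parser["core"] if parser.has_section("core") else {}
    return [
        (old, new)
        for old, new in RENAMED_CORE_OPTIONS.items()
        if old in core and new in core
    ]


# Values as reported in the issue (section name assumed to be [core]).
reported_cfg = """
[core]
parallelism = 64
max_active_tasks_per_dag = 16
dag_concurrency = 32
"""

# The same config with the old name removed.
cleaned_cfg = """
[core]
parallelism = 64
max_active_tasks_per_dag = 16
"""

print(find_conflicting_options(reported_cfg))
print(find_conflicting_options(cleaned_cfg))
```

The first call reports the conflicting pair; the second returns an empty list once only `max_active_tasks_per_dag` and `parallelism` remain.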





[GitHub] [airflow] HardikVijayPatel commented on issue #21951: Tasks stuck in scheduled state from different dags waiting for other dag tasks to complete first

Posted by GitBox <gi...@apache.org>.
HardikVijayPatel commented on issue #21951:
URL: https://github.com/apache/airflow/issues/21951#issuecomment-1057576440


   This behavior worked fine in Airflow V1; I noticed the problem after upgrading to V2.





[GitHub] [airflow] HardikVijayPatel commented on issue #21951: Tasks stuck in scheduled state from different dags waiting for other dag tasks to complete first

Posted by GitBox <gi...@apache.org>.
HardikVijayPatel commented on issue #21951:
URL: https://github.com/apache/airflow/issues/21951#issuecomment-1059458794


   > max_active_tasks_per_dag == dag_concurrency.
   > 
   > There was a rename of dag_concurrency to max_active_tasks_per_dag. Can you set this correctly?
   
   I do have "max_active_tasks_per_dag" set as well. Are you saying I should remove the "dag_concurrency" setting itself and define only "max_active_tasks_per_dag" and "parallelism"?
   





[GitHub] [airflow] ephraimbuddy commented on issue #21951: Tasks stuck in scheduled state from different dags waiting for other dag tasks to complete first

Posted by GitBox <gi...@apache.org>.
ephraimbuddy commented on issue #21951:
URL: https://github.com/apache/airflow/issues/21951#issuecomment-1066179851


   > > max_active_tasks_per_dag == dag_concurrency.
   > > There was a rename of dag_concurrency to max_active_tasks_per_dag. Can you set this correctly?
   > 
   > @ephraimbuddy I do have "max_active_tasks_per_dag" set as well. Are you saying I should remove the "dag_concurrency" setting itself and define only "max_active_tasks_per_dag" and "parallelism"?
   
   Yes





[GitHub] [airflow] SamWheating commented on issue #21951: Tasks stuck in scheduled state from different dags waiting for other dag tasks to complete first

Posted by GitBox <gi...@apache.org>.
SamWheating commented on issue #21951:
URL: https://github.com/apache/airflow/issues/21951#issuecomment-1058491389


   This is the same issue as reported in https://github.com/apache/airflow/issues/19622 and should be fixed by the PR linked above. 
   
   As a temporary workaround, you can either:
   
    - Increase the value of [`max_tis_per_query`](https://airflow.apache.org/docs/apache-airflow/stable/configurations-ref.html#max-tis-per-query) to a number larger than the number of tasks in the first DAG
    - Increase the [`priority_weight`](https://airflow.apache.org/docs/apache-airflow/stable/concepts/priority-weight.html) of the tasks in the second DAG. 
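To make the two workarounds concrete, here is a toy model of one scheduler pass. It is a sketch under stated assumptions, not Airflow's actual implementation: it assumes the scheduler examines only the first `max_tis_per_query` scheduled task instances, ordered by priority, and then drops those whose DAG is already at `max_active_tasks_per_dag`. The names `ScheduledTask` and `pick_tasks_to_queue`, and the query limit of 32, are hypothetical; the real scheduler also accounts for pools, executor slots, and more.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScheduledTask:
    dag_id: str
    task_id: str
    priority_weight: int = 1


def pick_tasks_to_queue(scheduled, running_per_dag, max_tis_per_query,
                        max_active_tasks_per_dag):
    """Simplified model of one scheduler pass (not Airflow's real code)."""
    # Highest priority first; sorted() is stable, so ties keep trigger order.
    ordered = sorted(scheduled, key=lambda t: -t.priority_weight)
    # Only the first max_tis_per_query candidates are examined at all.
    candidates = ordered[:max_tis_per_query]
    running = dict(running_per_dag)
    picked = []
    for task in candidates:
        # Skip tasks whose DAG has already reached its per-DAG limit.
        if running.get(task.dag_id, 0) < max_active_tasks_per_dag:
            picked.append(task)
            running[task.dag_id] = running.get(task.dag_id, 0) + 1
    return picked


# DAG A: 16 of its 500 tasks are running, 484 are still scheduled.
dag_a = [ScheduledTask("dag_a", f"a_{i}") for i in range(484)]
running = {"dag_a": 16}

# Reported behaviour: the examined window holds only blocked DAG A tasks.
starved = pick_tasks_to_queue(
    dag_a + [ScheduledTask("dag_b", "b_0")], running,
    max_tis_per_query=32, max_active_tasks_per_dag=16)

# Workaround 1: a query limit larger than DAG A's task count.
wider = pick_tasks_to_queue(
    dag_a + [ScheduledTask("dag_b", "b_0")], running,
    max_tis_per_query=1000, max_active_tasks_per_dag=16)

# Workaround 2: a higher priority_weight for DAG B's task.
boosted = pick_tasks_to_queue(
    dag_a + [ScheduledTask("dag_b", "b_0", priority_weight=10)], running,
    max_tis_per_query=32, max_active_tasks_per_dag=16)

print([t.task_id for t in starved])  # []
print([t.task_id for t in wider])    # ['b_0']
print([t.task_id for t in boosted])  # ['b_0']
```

In this model, DAG B is starved only because its task never enters the examined window. Either widening the window or moving the task to the front of the ordering lets it through.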





[GitHub] [airflow] tanelk commented on issue #21951: Tasks stuck in scheduled state from different dags waiting for other dag tasks to complete first

Posted by GitBox <gi...@apache.org>.
tanelk commented on issue #21951:
URL: https://github.com/apache/airflow/issues/21951#issuecomment-1057826868


   It is most likely the same issue: #19747
   I would like to get that PR merged.





[GitHub] [airflow] HardikVijayPatel edited a comment on issue #21951: Tasks stuck in scheduled state from different dags waiting for other dag tasks to complete first

Posted by GitBox <gi...@apache.org>.
HardikVijayPatel edited a comment on issue #21951:
URL: https://github.com/apache/airflow/issues/21951#issuecomment-1059458794


   > max_active_tasks_per_dag == dag_concurrency.
   > 
   > There was a rename of dag_concurrency to max_active_tasks_per_dag. Can you set this correctly?
   
   @ephraimbuddy I do have "max_active_tasks_per_dag" set as well. Are you saying I should remove the "dag_concurrency" setting itself and define only "max_active_tasks_per_dag" and "parallelism"?
   

