Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/03/03 01:35:56 UTC
[GitHub] [airflow] HardikVijayPatel opened a new issue #21951: Tasks stuck in scheduled state from different dags waiting for other dag tasks to complete first
HardikVijayPatel opened a new issue #21951:
URL: https://github.com/apache/airflow/issues/21951
### Apache Airflow version
2.2.3
### What happened
What I see is that tasks from one DAG wait in the scheduled state until another DAG that was triggered first has completed all of its scheduled tasks, even though parallelism and dag_concurrency are set to higher values.
I have the following configuration set in airflow for concurrency:
parallelism = 64
max_active_tasks_per_dag = 16
dag_concurrency = 32
Consider two DAGs, A and B.
DAG A consists of 500 tasks that can all run in parallel, with no dependencies on other tasks.
DAG B consists of 2 tasks that run serially.
Below is the scenario:
DAG A is triggered first; all 500 tasks get scheduled and run 16 at a time (due to max_active_tasks_per_dag).
Next, DAG B is triggered and its task gets scheduled, but DAG B waits for all 500 tasks of DAG A to complete before moving to the running state, even though parallelism and DAG concurrency are set higher.
### What you expected to happen
Since the total number of active DAGs is less than the "dag_concurrency" value, the tasks of DAG B should move to the running state without having to wait for DAG A's scheduled tasks to complete.
### How to reproduce
Run multiple DAGs one after the other, where the first DAG has more tasks than "max_active_tasks_per_dag" allows, followed by other DAGs.
### Operating System
Linux
### Versions of Apache Airflow Providers
pip freeze | grep apache-airflow-providers
apache-airflow-providers-ftp==2.0.1
apache-airflow-providers-http==2.0.2
apache-airflow-providers-imap==2.1.0
apache-airflow-providers-mysql==2.1.1
apache-airflow-providers-sqlite==2.0.1
### Deployment
Virtualenv installation
### Deployment details
_No response_
### Anything else
_No response_
### Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] ephraimbuddy commented on issue #21951: Tasks stuck in scheduled state from different dags waiting for other dag tasks to complete first
Posted by GitBox <gi...@apache.org>.
ephraimbuddy commented on issue #21951:
URL: https://github.com/apache/airflow/issues/21951#issuecomment-1059085757
max_active_tasks_per_dag == dag_concurrency.
There was a rename of dag_concurrency to max_active_tasks_per_dag. Can you set this correctly?
[GitHub] [airflow] HardikVijayPatel commented on issue #21951: Tasks stuck in scheduled state from different dags waiting for other dag tasks to complete first
Posted by GitBox <gi...@apache.org>.
HardikVijayPatel commented on issue #21951:
URL: https://github.com/apache/airflow/issues/21951#issuecomment-1057576440
This behavior was working fine in Airflow V1; I noticed the problem after upgrading to V2.
[GitHub] [airflow] HardikVijayPatel commented on issue #21951: Tasks stuck in scheduled state from different dags waiting for other dag tasks to complete first
Posted by GitBox <gi...@apache.org>.
HardikVijayPatel commented on issue #21951:
URL: https://github.com/apache/airflow/issues/21951#issuecomment-1059458794
> max_active_tasks_per_dag == dag_concurrency.
>
> There was a rename of dag_concurrency to max_active_tasks_per_dag. Can you set this correctly?
I do have "max_active_tasks_per_dag" set as well. Are you saying I should remove the "dag_concurrency" setting itself and just have "max_active_tasks_per_dag" and "parallelism" defined?
[GitHub] [airflow] ephraimbuddy commented on issue #21951: Tasks stuck in scheduled state from different dags waiting for other dag tasks to complete first
Posted by GitBox <gi...@apache.org>.
ephraimbuddy commented on issue #21951:
URL: https://github.com/apache/airflow/issues/21951#issuecomment-1066179851
> > max_active_tasks_per_dag == dag_concurrency.
> > There was a rename of dag_concurrency to max_active_tasks_per_dag. Can you set this correctly?
>
> @ephraimbuddy I do have the "max_active_tasks_per_dag" set as well, are you saying i should remove "dag_concurrency" setting itself and just have the "max_active_tasks_per_dag" and "parallelism" defined?
Yes
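
To make the advice above concrete, here is a minimal sketch that checks a config file's text for the old and new option names being set side by side. It assumes both options live in the `[core]` section of `airflow.cfg`; the helper name `find_conflicting_options` and the sample config are hypothetical and only mirror the values reported in this issue. It is not part of Airflow.

```python
import configparser

# Old -> new option names, per the rename described in this thread.
# Assumption: both options are read from the [core] section.
RENAMED_CORE_OPTIONS = {"dag_concurrency": "max_active_tasks_per_dag"}


def find_conflicting_options(cfg_text):
    """Return (old, new) pairs where both names are set in [core]."""
    # interpolation=None so that '%' characters in values are not interpreted.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    if not parser.has_section("core"):
        return []
    core = parser["core"]
    return [
        (old, new)
        for old, new in RENAMED_CORE_OPTIONS.items()
        if old in core and new in core
    ]


# Hypothetical config text using the values from the issue report.
SAMPLE_CFG = """
[core]
parallelism = 64
max_active_tasks_per_dag = 16
dag_concurrency = 32
"""

conflicts = find_conflicting_options(SAMPLE_CFG)
for old, new in conflicts:
    print(f"Both '{old}' and '{new}' are set; keep only '{new}'.")
```

If the helper reports a pair, removing the old name and keeping only `max_active_tasks_per_dag` matches what is suggested here.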
[GitHub] [airflow] SamWheating commented on issue #21951: Tasks stuck in scheduled state from different dags waiting for other dag tasks to complete first
Posted by GitBox <gi...@apache.org>.
SamWheating commented on issue #21951:
URL: https://github.com/apache/airflow/issues/21951#issuecomment-1058491389
This is the same issue as reported in https://github.com/apache/airflow/issues/19622 and should be fixed by the PR linked above.
As a temporary workaround, you can either:
- Increase the value of [`max_tis_per_query`](https://airflow.apache.org/docs/apache-airflow/stable/configurations-ref.html#max-tis-per-query) to a number larger than the number of tasks in the first DAG
- Increase the [`priority_weight`](https://airflow.apache.org/docs/apache-airflow/stable/concepts/priority-weight.html) of the tasks in the second DAG.
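
To illustrate why both workarounds can help, here is a toy model of a single scheduling pass. It is a deliberately simplified sketch and not Airflow's actual scheduler code: it assumes the scheduler first takes at most `max_tis_per_query` scheduled tasks, ordered by priority, and only afterwards discards those whose DAG is already at its per-DAG limit. The `Task` class, the `pick_tasks_to_queue` function, and the values 32 and 1000 are all hypothetical, chosen only to mirror the scenario in this issue.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    dag_id: str
    task_id: str
    priority_weight: int = 1


def pick_tasks_to_queue(scheduled, running_per_dag, max_tis_per_query,
                        max_active_tasks_per_dag):
    """One simplified scheduling pass (toy model, not Airflow code)."""
    # Step 1: take only the first max_tis_per_query candidates, highest
    # priority first. sorted() is stable, so ties keep their original
    # (earlier-triggered-first) order.
    candidates = sorted(scheduled, key=lambda t: -t.priority_weight)
    candidates = candidates[:max_tis_per_query]

    # Step 2: drop candidates whose DAG is already at its per-DAG limit.
    running = dict(running_per_dag)
    picked = []
    for task in candidates:
        if running.get(task.dag_id, 0) < max_active_tasks_per_dag:
            picked.append(task)
            running[task.dag_id] = running.get(task.dag_id, 0) + 1
    return picked


# DAG A: 16 tasks already running, 484 still scheduled. DAG B: 1 scheduled.
a_scheduled = [Task("A", f"a_{i}") for i in range(484)]
running = {"A": 16}

# Small query window: every candidate belongs to DAG A, which is at its
# limit, so nothing is queued and DAG B never gets looked at.
starved = pick_tasks_to_queue(a_scheduled + [Task("B", "b_1")], running, 32, 16)

# Workaround 1: a window larger than DAG A's backlog reaches DAG B's task.
wide = pick_tasks_to_queue(a_scheduled + [Task("B", "b_1")], running, 1000, 16)

# Workaround 2: a higher priority_weight moves DAG B's task to the front.
boosted = pick_tasks_to_queue(
    a_scheduled + [Task("B", "b_1", priority_weight=10)], running, 32, 16
)

print(len(starved), [t.task_id for t in wide], [t.task_id for t in boosted])
```

In this model the first pass queues nothing, while either workaround lets DAG B's task through. The real scheduler applies further limits (pools, overall parallelism) that are left out here.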
[GitHub] [airflow] tanelk commented on issue #21951: Tasks stuck in scheduled state from different dags waiting for other dag tasks to complete first
Posted by GitBox <gi...@apache.org>.
tanelk commented on issue #21951:
URL: https://github.com/apache/airflow/issues/21951#issuecomment-1057826868
It is most likely the same issue: #19747
Would like to get that PR merged.
[GitHub] [airflow] HardikVijayPatel edited a comment on issue #21951: Tasks stuck in scheduled state from different dags waiting for other dag tasks to complete first
Posted by GitBox <gi...@apache.org>.
HardikVijayPatel edited a comment on issue #21951:
URL: https://github.com/apache/airflow/issues/21951#issuecomment-1059458794
> max_active_tasks_per_dag == dag_concurrency.
>
> There was a rename of dag_concurrency to max_active_tasks_per_dag. Can you set this correctly?
@ephraimbuddy I do have "max_active_tasks_per_dag" set as well. Are you saying I should remove the "dag_concurrency" setting itself and just have "max_active_tasks_per_dag" and "parallelism" defined?