Posted to commits@airflow.apache.org by "Nidhi (Jira)" <ji...@apache.org> on 2019/12/17 17:54:01 UTC

[jira] [Updated] (AIRFLOW-6264) Airflow not scheduling tasks after staying in Running state

     [ https://issues.apache.org/jira/browse/AIRFLOW-6264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nidhi updated AIRFLOW-6264:
---------------------------
    Description: 
I am trying to run my tasks using Airflow and the Celery Executor. I have created one DAG that contains approximately 60000 tasks. When I trigger the DAG, it starts running but does not schedule my tasks. It stays in the running state for 2 days without scheduling the tasks.

The main bug is this: if I schedule 5000 tasks, it takes less than 1 minute to schedule them, but if I try to schedule more than that, it does not schedule my tasks.

I have also tried different solutions to solve this. The first one is:
 * *{{PARALLELISM=1000}}*
 * NON_POOLED_TASK_SLOT_COUNT=1000
 * DAG_CONCURRENCY=10000

*But this does not work, because the new version of Airflow does not support non_pooled_task_slot_count.*

*I have tried to use CELERYD_COUNT, which also does not work in my case. I have tried almost every change that might be useful, but none of them works for me.*

*Can someone let me know how I can schedule this many tasks?*
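
For reference, a minimal sketch of how the two still-supported settings from the list above could be expressed through Airflow's AIRFLOW__{SECTION}__{KEY} environment-variable convention. It assumes both options belong to the [core] section, as in Airflow 1.10.x. The helper name airflow_env_name is hypothetical and used only for illustration. The report does not say how the values were actually applied, so this is an illustration, not the reporter's configuration.

```python
# Sketch: map the settings from the list above to Airflow's
# AIRFLOW__{SECTION}__{KEY} environment-variable names.
# Assumption: both options live in the [core] section (Airflow 1.10.x).
# non_pooled_task_slot_count is left out because, per the report,
# the version in use no longer supports it.


def airflow_env_name(section, key):
    """Build the environment-variable name for one config option.

    Hypothetical helper, not part of Airflow itself.
    """
    return "AIRFLOW__{}__{}".format(section.upper(), key.upper())


# Values copied from the report; the section names are assumed.
settings = {
    ("core", "parallelism"): 1000,
    ("core", "dag_concurrency"): 10000,
}

# Environment variables are strings, so convert the values.
env = {airflow_env_name(s, k): str(v) for (s, k), v in settings.items()}

for name in sorted(env):
    print("{}={}".format(name, env[name]))
```

These values only raise concurrency limits. Whether they affect a single DAG with roughly 60000 tasks that is never scheduled is not established by the report.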

  was:
I am trying to run my tasks using Airflow and the Celery Executor. I have created one DAG that contains approximately 60000 tasks. When I trigger the DAG, it starts running but does not schedule my tasks. It stays in the running state for 2 days without scheduling the tasks.

The main bug is this: if I schedule 5000 tasks, it takes less than 1 minute to schedule them, but if I try to schedule more than that, it does not schedule my tasks.

I have also tried different solutions to solve this. The first one is:
 * *{{PARALLELISM=1000}}*
 * NON_POOLED_TASK_SLOT_COUNT=1000
 * DAG_CONCURRENCY=10000
 * 

*But this does not work, because the new version of Airflow does not support non_pooled_task_slot_count.*

*I have tried to use CELERYD_COUNT, which also does not work in my case. I have tried almost every change that might be useful, but none of them works for me.*

*Can someone let me know how I can schedule this many tasks?*


> Airflow not scheduling tasks after staying in Running state
> -----------------------------------------------------------
>
>                 Key: AIRFLOW-6264
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-6264
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: celery, DAG, operators
>    Affects Versions: 1.10.6
>            Reporter: Nidhi
>            Priority: Major
>
> I am trying to run my tasks using Airflow and the Celery Executor. I have created one DAG that contains approximately 60000 tasks. When I trigger the DAG, it starts running but does not schedule my tasks. It stays in the running state for 2 days without scheduling the tasks.
> The main bug is this: if I schedule 5000 tasks, it takes less than 1 minute to schedule them, but if I try to schedule more than that, it does not schedule my tasks.
> I have also tried different solutions to solve this. The first one is:
>  * *{{PARALLELISM=1000}}*
>  * NON_POOLED_TASK_SLOT_COUNT=1000
>  * DAG_CONCURRENCY=10000
> *But this does not work, because the new version of Airflow does not support non_pooled_task_slot_count.*
> *I have tried to use CELERYD_COUNT, which also does not work in my case. I have tried almost every change that might be useful, but none of them works for me.*
> *Can someone let me know how I can schedule this many tasks?*



--
This message was sent by Atlassian Jira
(v8.3.4#803005)