Posted to commits@airflow.apache.org by "Nidhi (Jira)" <ji...@apache.org> on 2019/12/17 17:54:01 UTC
[jira] [Updated] (AIRFLOW-6264) Airflow not scheduling tasks after staying in Running state
[ https://issues.apache.org/jira/browse/AIRFLOW-6264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Nidhi updated AIRFLOW-6264:
---------------------------
Description:
I am trying to run my tasks with Airflow and the Celery Executor. I have created a single DAG that contains approximately 60000 tasks. When I trigger the DAG, it starts running but does not schedule any of its tasks. It has stayed in the running state for 2 days without scheduling them.
The main bug is this: if I schedule 5000 tasks, it takes less than 1 minute to schedule them, but if I try to schedule more than that, it does not schedule my tasks.
I have tried several solutions to this. The first was to set:
* *{{PARALLELISM=1000}}*
* NON_POOLED_TASK_SLOT_COUNT=1000
* DAG_CONCURRENCY=10000
*However, this does not work, because the new version of Airflow does not support non_pooled_task_slot_count.*
*I have also tried CELERYD_COUNT, which does not work in my case either. I have tried almost every change that seemed useful, but none of them works for me.*
*Can someone let me know how I can schedule this many tasks?*
was:
I am trying to run my tasks with Airflow and the Celery Executor. I have created a single DAG that contains approximately 60000 tasks. When I trigger the DAG, it starts running but does not schedule any of its tasks. It has stayed in the running state for 2 days without scheduling them.
The main bug is this: if I schedule 5000 tasks, it takes less than 1 minute to schedule them, but if I try to schedule more than that, it does not schedule my tasks.
I have tried several solutions to this. The first was to set:
* *{{PARALLELISM=1000}}*
* NON_POOLED_TASK_SLOT_COUNT=1000
* DAG_CONCURRENCY=10000
*
*However, this does not work, because the new version of Airflow does not support non_pooled_task_slot_count.*
*I have also tried CELERYD_COUNT, which does not work in my case either. I have tried almost every change that seemed useful, but none of them works for me.*
*Can someone let me know how I can schedule this many tasks?*
> Airflow not scheduling tasks after staying in Running state
> -----------------------------------------------------------
>
> Key: AIRFLOW-6264
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6264
> Project: Apache Airflow
> Issue Type: Bug
> Components: celery, DAG, operators
> Affects Versions: 1.10.6
> Reporter: Nidhi
> Priority: Major
>
> I am trying to run my tasks with Airflow and the Celery Executor. I have created a single DAG that contains approximately 60000 tasks. When I trigger the DAG, it starts running but does not schedule any of its tasks. It has stayed in the running state for 2 days without scheduling them.
> The main bug is this: if I schedule 5000 tasks, it takes less than 1 minute to schedule them, but if I try to schedule more than that, it does not schedule my tasks.
> I have tried several solutions to this. The first was to set:
> * *{{PARALLELISM=1000}}*
> * NON_POOLED_TASK_SLOT_COUNT=1000
> * DAG_CONCURRENCY=10000
> *However, this does not work, because the new version of Airflow does not support non_pooled_task_slot_count.*
> *I have also tried CELERYD_COUNT, which does not work in my case either. I have tried almost every change that seemed useful, but none of them works for me.*
> *Can someone let me know how I can schedule this many tasks?*
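
For context on the settings listed in the description, here is a minimal sketch of how such limits bound the number of task instances of one DAG that can run at once. It assumes the limits combine as a simple minimum, and that in releases without non_pooled_task_slot_count the unpooled tasks draw from a default pool whose size (128 here) is an assumed value. The function name is hypothetical and is not part of Airflow. These settings cap concurrency, so raising them may not by itself change how quickly the scheduler works through a very large DAG.

```python
def effective_running_cap(parallelism: int, dag_concurrency: int, pool_slots: int) -> int:
    """Return an upper bound on concurrently running task instances of one DAG.

    Hypothetical helper: assumes the installation-wide limit (parallelism),
    the per-DAG limit (dag_concurrency) and the pool size all apply at once,
    so the tightest of the three wins.
    """
    return min(parallelism, dag_concurrency, pool_slots)


# Values reported in the issue.
reported_parallelism = 1000
reported_dag_concurrency = 10000

# Assumption: the tasks use a default pool with 128 slots.
assumed_pool_slots = 128

cap = effective_running_cap(
    reported_parallelism, reported_dag_concurrency, assumed_pool_slots
)
print(f"At most {cap} task instances would run at once under these assumptions")
```

Under these assumptions the pool size, not PARALLELISM or DAG_CONCURRENCY, would be the binding limit, which is one thing to check before raising the other two values further.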
--
This message was sent by Atlassian Jira
(v8.3.4#803005)