Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/04/29 08:57:20 UTC

[GitHub] [airflow] leeft95 opened a new issue, #23356: Tasks set to queued by a backfill get cleared and rescheduled by the kubernetes executor, breaking the backfill

leeft95 opened a new issue, #23356:
URL: https://github.com/apache/airflow/issues/23356

   ### Apache Airflow version
   
   2.2.5 (latest released)
   
   ### What happened
   
   A backfill launched from the scheduler pod queues tasks as it should, but while they are still starting, the Kubernetes executor loop running in the scheduler clears these tasks and reschedules them via this function: https://github.com/apache/airflow/blob/9449a107f092f2f6cfa9c8bbcf5fd62fadfa01be/airflow/executors/kubernetes_executor.py#L444
   
   This stops the backfill from queuing any more tasks, and it enters an endless loop of waiting for the tasks it has already queued to complete.
   
   I have mitigated this by setting `AIRFLOW__KUBERNETES__WORKER_PODS_QUEUED_CHECK_INTERVAL` to 3600, which is not ideal.
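
As a rough illustration of the mitigation: Airflow reads configuration overrides from environment variables named `AIRFLOW__{SECTION}__{KEY}`, so this variable corresponds to `worker_pods_queued_check_interval` in the `[kubernetes]` section. The helper `airflow_env_to_option` below is hypothetical (it is not part of Airflow) and only shows the name mapping; the value is interpreted in seconds.

```python
import os
from typing import Tuple


def airflow_env_to_option(name: str) -> Tuple[str, str]:
    """Split an AIRFLOW__{SECTION}__{KEY} variable name into (section, key).

    Hypothetical helper for illustration only; Airflow does this internally.
    """
    prefix = "AIRFLOW__"
    if not name.startswith(prefix):
        raise ValueError(f"not an Airflow config variable: {name}")
    section, _, key = name[len(prefix):].partition("__")
    if not section or not key:
        raise ValueError(f"expected AIRFLOW__SECTION__KEY, got: {name}")
    return section.lower(), key.lower()


VAR = "AIRFLOW__KUBERNETES__WORKER_PODS_QUEUED_CHECK_INTERVAL"

# Mitigation from the report: run the queued-task check once an hour.
# This must be set in the scheduler's environment before it starts.
os.environ[VAR] = "3600"

section, key = airflow_env_to_option(VAR)
print(f"[{section}] {key} = {os.environ[VAR]}")
```

This only makes the check run less often, so the race is less likely but not removed.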
   
   ### What you think should happen instead
   
   The function `clear_not_launched_queued_tasks` should respect tasks queued by a backfill process and not clear them.
   
   ### How to reproduce
   
   Start a backfill with a large number of tasks and watch them get queued and then rescheduled by the Kubernetes executor running in the scheduler pod.
   
   ### Operating System
   
   Debian GNU/Linux 10 (buster)
   
   ### Versions of Apache Airflow Providers
   
   ```
   
   apache-airflow            2.2.5            py38h578d9bd_0    
   apache-airflow-providers-cncf-kubernetes 3.0.2              pyhd8ed1ab_0    
   apache-airflow-providers-docker 2.4.1              pyhd8ed1ab_0    
   apache-airflow-providers-ftp 2.1.2              pyhd8ed1ab_0    
   apache-airflow-providers-http 2.1.2              pyhd8ed1ab_0    
   apache-airflow-providers-imap 2.2.3              pyhd8ed1ab_0    
   apache-airflow-providers-postgres 3.0.0              pyhd8ed1ab_0    
   apache-airflow-providers-sqlite 2.1.3              pyhd8ed1ab_0    
   
   ```
   
   ### Deployment
   
   Other 3rd-party Helm chart
   
   ### Deployment details
   
   Deployment is running the latest helm chart of Airflow Community Edition
   
   ### Anything else
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] uranusjr commented on issue #23356: Tasks set to queued by a backfill get cleared and rescheduled by the kubernetes executor, breaking the backfill

Posted by GitBox <gi...@apache.org>.
uranusjr commented on issue #23356:
URL: https://github.com/apache/airflow/issues/23356#issuecomment-1135368285

   Merging into https://github.com/apache/airflow/issues/23145




[GitHub] [airflow] uranusjr closed issue #23356: Tasks set to queued by a backfill get cleared and rescheduled by the kubernetes executor, breaking the backfill

Posted by GitBox <gi...@apache.org>.
uranusjr closed issue #23356: Tasks set to queued by a backfill get cleared and rescheduled by the kubernetes executor, breaking the backfill
URL: https://github.com/apache/airflow/issues/23356




[GitHub] [airflow] snjypl commented on issue #23356: Tasks set to queued by a backfill get cleared and rescheduled by the kubernetes executor, breaking the backfill

Posted by GitBox <gi...@apache.org>.
snjypl commented on issue #23356:
URL: https://github.com/apache/airflow/issues/23356#issuecomment-1127487263

   duplicate of [#23145](https://github.com/apache/airflow/issues/23145)




[GitHub] [airflow] potiuk closed issue #23356: Tasks set to queued by a backfill get cleared and rescheduled by the kubernetes executor, breaking the backfill

Posted by GitBox <gi...@apache.org>.
potiuk closed issue #23356: Tasks set to queued by a backfill get cleared and rescheduled by the kubernetes executor, breaking the backfill
URL: https://github.com/apache/airflow/issues/23356




[GitHub] [airflow] boring-cyborg[bot] commented on issue #23356: Tasks set to queued by a backfill get cleared and rescheduled by the kubernetes executor, breaking the backfill

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on issue #23356:
URL: https://github.com/apache/airflow/issues/23356#issuecomment-1113072923

   Thanks for opening your first issue here! Be sure to follow the issue template!
   




[GitHub] [airflow] snjypl commented on issue #23356: Tasks set to queued by a backfill get cleared and rescheduled by the kubernetes executor, breaking the backfill

Posted by GitBox <gi...@apache.org>.
snjypl commented on issue #23356:
URL: https://github.com/apache/airflow/issues/23356#issuecomment-1119085253

   Seems similar to #23048.
   
   https://github.com/apache/airflow/blob/83784d9e7b79d2400307454ccafdacddaee16769/airflow/executors/kubernetes_executor.py#L461-L464
   
   I think we need to add a filter `TaskInstance.queued_by_job_id == self.job_id` so that the SchedulerJob does not clear the BackfillJob's task instances, and vice versa.
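
A minimal sketch of the proposed filter, assuming a simplified in-memory model rather than Airflow's real ORM query. The `TaskInstance` dataclass, the `pod_launched` flag, and the `filter_by_job` switch are hypothetical stand-ins for illustration; only the `queued_by_job_id == job_id` comparison comes from the suggestion above.

```python
from dataclasses import dataclass
from typing import List, Optional

QUEUED = "queued"
SCHEDULED = "scheduled"


@dataclass
class TaskInstance:
    """Hypothetical stand-in for a task instance row (not Airflow's model)."""
    task_id: str
    state: str
    queued_by_job_id: Optional[int]
    pod_launched: bool = False


def clear_not_launched_queued_tasks(
    tis: List[TaskInstance], job_id: int, filter_by_job: bool = True
) -> List[str]:
    """Reset queued task instances that have no pod yet.

    With filter_by_job=True, only instances queued by `job_id` are touched,
    which mirrors the proposed `queued_by_job_id == self.job_id` filter.
    Returns the task ids that were reset.
    """
    cleared = []
    for ti in tis:
        if ti.state != QUEUED or ti.pod_launched:
            continue
        if filter_by_job and ti.queued_by_job_id != job_id:
            continue  # queued by another job (e.g. a backfill): leave it alone
        ti.state = SCHEDULED
        cleared.append(ti.task_id)
    return cleared


SCHEDULER_JOB_ID = 1  # assumed ids, for illustration only
BACKFILL_JOB_ID = 2

tis = [
    TaskInstance("scheduled_by_scheduler", QUEUED, SCHEDULER_JOB_ID),
    TaskInstance("queued_by_backfill", QUEUED, BACKFILL_JOB_ID),
]

# The scheduler's executor only resets its own queued task.
print(clear_not_launched_queued_tasks(tis, SCHEDULER_JOB_ID))
print([(ti.task_id, ti.state) for ti in tis])
```

Calling the function with `filter_by_job=False` reproduces the reported behaviour in this model: the backfill's queued task is reset as well.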
   

