Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/01/19 09:12:33 UTC

[GitHub] [airflow] tcchong opened a new issue #20944: Unable to complete backfilling job with k8s executor

tcchong opened a new issue #20944:
URL: https://github.com/apache/airflow/issues/20944


   ### Apache Airflow version
   
   2.2.2
   
   ### What happened
   
   I tried to run a backfill job with the k8s executor. The pod was created and ran the job fine, but once the pod was marked `Completed` in k8s, the task got stuck in the `scheduled` state with no further updates.
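   
   For anyone reproducing this, the stuck state can also be checked from the CLI instead of the web UI (a rough sketch using the `echo` DAG and backfill run id from the steps below; exact argument forms may differ between 2.x versions):
   ```
   $ airflow tasks states-for-dag-run echo backfill__2022-01-17T00:00:00+00:00
   ```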
   
   ### What you expected to happen
   
   I would expect the task state to change once the k8s pod has completed the job.
   
   These are the scheduler and executor logs:
   ```
   [2022-01-19 08:40:14,670] {kubernetes_executor.py:454} INFO - Found 0 queued task instances
   [2022-01-19 08:40:24,973] {kubernetes_executor.py:454} INFO - Found 0 queued task instances
   [2022-01-19 08:40:35,358] {kubernetes_executor.py:454} INFO - Found 0 queued task instances
   [2022-01-19 08:40:45,655] {kubernetes_executor.py:454} INFO - Found 0 queued task instances
   [2022-01-19 08:40:55,828] {kubernetes_executor.py:454} INFO - Found 0 queued task instances
   [2022-01-19 08:41:05,981] {kubernetes_executor.py:454} INFO - Found 0 queued task instances
   [2022-01-19 08:41:16,138] {kubernetes_executor.py:454} INFO - Found 0 queued task instances
   [2022-01-19 08:41:26,299] {kubernetes_executor.py:454} INFO - Found 0 queued task instances
   [2022-01-19 08:41:36,587] {kubernetes_executor.py:454} INFO - Found 0 queued task instances
   [2022-01-19 08:41:46,939] {kubernetes_executor.py:454} INFO - Found 0 queued task instances
   [2022-01-19 08:41:51,823] {scheduler_job.py:1114} INFO - Resetting orphaned tasks for active dag runs
   [2022-01-19 08:41:51,857] {kubernetes_executor.py:730} INFO - Attempting to adopt pod echostart.9babbb429d7f43a98ff44f2976b59343
   [2022-01-19 08:41:51,884] {kubernetes_executor.py:730} INFO - Attempting to adopt pod echostart.d902f94926784e27afcef857f90e7411
   [2022-01-19 08:41:51,887] {kubernetes_executor.py:147} INFO - Event: echostart.9babbb429d7f43a98ff44f2976b59343 had an event of type ADDED
   [2022-01-19 08:41:51,888] {kubernetes_executor.py:206} INFO - Event: echostart.9babbb429d7f43a98ff44f2976b59343 Succeeded
   [2022-01-19 08:41:51,905] {kubernetes_executor.py:147} INFO - Event: echostart.d902f94926784e27afcef857f90e7411 had an event of type ADDED
   [2022-01-19 08:41:51,905] {kubernetes_executor.py:206} INFO - Event: echostart.d902f94926784e27afcef857f90e7411 Succeeded
   [2022-01-19 08:41:52,019] {kubernetes_executor.py:374} INFO - Attempting to finish pod; pod_id: echostart.9babbb429d7f43a98ff44f2976b59343; state: None; annotations: {'dag_id': 'echo', 'task_id': 'start', 'execution_date': None, 'run_id': 'backfill__2022-01-17T00:00:00+00:00', 'try_number': '1'}
   [2022-01-19 08:41:52,020] {kubernetes_executor.py:374} INFO - Attempting to finish pod; pod_id: echostart.d902f94926784e27afcef857f90e7411; state: None; annotations: {'dag_id': 'echo', 'task_id': 'start', 'execution_date': None, 'run_id': 'scheduled__2022-01-18T00:00:00+00:00', 'try_number': '1'}
   [2022-01-19 08:41:52,021] {kubernetes_executor.py:576} INFO - Changing state of (TaskInstanceKey(dag_id='echo', task_id='start', run_id='backfill__2022-01-17T00:00:00+00:00', try_number=1), None, 'echostart.9babbb429d7f43a98ff44f2976b59343', 'airflow-staging', '352215076') to None
   [2022-01-19 08:41:52,022] {kubernetes_executor.py:576} INFO - Changing state of (TaskInstanceKey(dag_id='echo', task_id='start', run_id='scheduled__2022-01-18T00:00:00+00:00', try_number=1), None, 'echostart.d902f94926784e27afcef857f90e7411', 'airflow-staging', '352215077') to None
   [2022-01-19 08:41:52,023] {scheduler_job.py:504} INFO - Executor reports execution of echo.start run_id=backfill__2022-01-17T00:00:00+00:00 exited with status None for try_number 1
   [2022-01-19 08:41:52,024] {scheduler_job.py:504} INFO - Executor reports execution of echo.start run_id=scheduled__2022-01-18T00:00:00+00:00 exited with status None for try_number 1
   [2022-01-19 08:41:57,107] {kubernetes_executor.py:454} INFO - Found 0 queued task instances
   [2022-01-19 08:42:07,279] {kubernetes_executor.py:454} INFO - Found 0 queued task instances
   [2022-01-19 08:42:17,419] {kubernetes_executor.py:454} INFO - Found 0 queued task instances
   [2022-01-19 08:42:27,574] {kubernetes_executor.py:454} INFO - Found 0 queued task instances
   [2022-01-19 08:42:37,743] {kubernetes_executor.py:454} INFO - Found 0 queued task instances
   [2022-01-19 08:42:48,097] {kubernetes_executor.py:454} INFO - Found 0 queued task instances
   [2022-01-19 08:42:58,292] {kubernetes_executor.py:454} INFO - Found 0 queued task instances
   ```
   
   Pod status:
   ```
   NAME                                         READY   STATUS      RESTARTS   AGE
   echostart.9babbb429d7f43a98ff44f2976b59343   0/1     Completed   0          3m55s
   echostart.d902f94926784e27afcef857f90e7411   0/1     Completed   0          3m55s
   ```
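   
   The executor maps each completed pod back to a task instance via the pod annotations shown in the scheduler log above (`dag_id`, `task_id`, `run_id`, `try_number`). They can be inspected directly if needed (a sketch assuming the `airflow-staging` namespace that appears in the log):
   ```
   $ kubectl -n airflow-staging get pod echostart.9babbb429d7f43a98ff44f2976b59343 -o jsonpath='{.metadata.annotations}'
   ```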
   
   Airflow Web:
   ![Screenshot 2022-01-19 at 4 45 56 PM](https://user-images.githubusercontent.com/2786720/150098813-999e445a-2d86-40e4-a96f-858998d9279a.png)
   
   
   
   ### How to reproduce
   
   1. Create a DAG as shown below
   
   ```python
   from datetime import timedelta
   
   from airflow import DAG
   from airflow.operators.dummy import DummyOperator
   from airflow.operators.bash import BashOperator
   from airflow.utils.dates import days_ago
   
   default_args = {
       'owner': 'echo',
       'depends_on_past': False,
       'start_date': days_ago(2),
       'email': [],
       'email_on_failure': False,
       'email_on_retry': False,
       'retries': 1,
       'retry_delay': timedelta(minutes=1),
        # no-op placeholder for a custom failure_alert callback
        'on_failure_callback': lambda context: None,
        'execution_timeout': timedelta(hours=1)
   }
   
   with DAG('echo', default_args=default_args, schedule_interval='@daily') as dag:
       start = DummyOperator(task_id='start')
   
       hello = BashOperator(
           task_id='hello',
           bash_command='echo hello',
       )
   
       end = DummyOperator(task_id='end')
   
       start >> hello >> end
   ```
   
   2. Run the backfill command
   ```
   $ airflow dags backfill echo -s 20220117
   ```
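   
   To retry the same window while testing, the backfill command can also be given an explicit end date and told to reset existing dag runs (a sketch; check `airflow dags backfill --help` on your version for the exact flags):
   ```
   $ airflow dags backfill echo -s 20220117 -e 20220118 --reset-dagruns
   ```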
   
   ### Operating System
   
   Debian GNU/Linux 10 (buster)
   
   ### Versions of Apache Airflow Providers
   
   _No response_
   
   ### Deployment
   
   Official Apache Airflow Helm Chart
   
   ### Deployment details
   
   _No response_
   
   ### Anything else
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   




[GitHub] [airflow] boring-cyborg[bot] commented on issue #20944: Unable to complete backfilling job with k8s executor

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on issue #20944:
URL: https://github.com/apache/airflow/issues/20944#issuecomment-1016230716


   Thanks for opening your first issue here! Be sure to follow the issue template!
   

