Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/01/19 09:12:33 UTC
[GitHub] [airflow] tcchong opened a new issue #20944: Unable to complete backfilling job with k8s executor
tcchong opened a new issue #20944:
URL: https://github.com/apache/airflow/issues/20944
### Apache Airflow version
2.2.2
### What happened
I ran a backfill job with the k8s executor. The pod was created and the job ran fine.
Once the pod is marked as `Completed` in k8s, the task stays stuck in the `scheduled` state without any further updates.
### What you expected to happen
I would expect the task state to change once the k8s pod has completed the job.
These are the scheduler and executor logs:
```
[2022-01-19 08:40:14,670] {kubernetes_executor.py:454} INFO - Found 0 queued task instances
[2022-01-19 08:40:24,973] {kubernetes_executor.py:454} INFO - Found 0 queued task instances
[2022-01-19 08:40:35,358] {kubernetes_executor.py:454} INFO - Found 0 queued task instances
[2022-01-19 08:40:45,655] {kubernetes_executor.py:454} INFO - Found 0 queued task instances
[2022-01-19 08:40:55,828] {kubernetes_executor.py:454} INFO - Found 0 queued task instances
[2022-01-19 08:41:05,981] {kubernetes_executor.py:454} INFO - Found 0 queued task instances
[2022-01-19 08:41:16,138] {kubernetes_executor.py:454} INFO - Found 0 queued task instances
[2022-01-19 08:41:26,299] {kubernetes_executor.py:454} INFO - Found 0 queued task instances
[2022-01-19 08:41:36,587] {kubernetes_executor.py:454} INFO - Found 0 queued task instances
[2022-01-19 08:41:46,939] {kubernetes_executor.py:454} INFO - Found 0 queued task instances
[2022-01-19 08:41:51,823] {scheduler_job.py:1114} INFO - Resetting orphaned tasks for active dag runs
[2022-01-19 08:41:51,857] {kubernetes_executor.py:730} INFO - Attempting to adopt pod echostart.9babbb429d7f43a98ff44f2976b59343
[2022-01-19 08:41:51,884] {kubernetes_executor.py:730} INFO - Attempting to adopt pod echostart.d902f94926784e27afcef857f90e7411
[2022-01-19 08:41:51,887] {kubernetes_executor.py:147} INFO - Event: echostart.9babbb429d7f43a98ff44f2976b59343 had an event of type ADDED
[2022-01-19 08:41:51,888] {kubernetes_executor.py:206} INFO - Event: echostart.9babbb429d7f43a98ff44f2976b59343 Succeeded
[2022-01-19 08:41:51,905] {kubernetes_executor.py:147} INFO - Event: echostart.d902f94926784e27afcef857f90e7411 had an event of type ADDED
[2022-01-19 08:41:51,905] {kubernetes_executor.py:206} INFO - Event: echostart.d902f94926784e27afcef857f90e7411 Succeeded
[2022-01-19 08:41:52,019] {kubernetes_executor.py:374} INFO - Attempting to finish pod; pod_id: echostart.9babbb429d7f43a98ff44f2976b59343; state: None; annotations: {'dag_id': 'echo', 'task_id': 'start', 'execution_date': None, 'run_id': 'backfill__2022-01-17T00:00:00+00:00', 'try_number': '1'}
[2022-01-19 08:41:52,020] {kubernetes_executor.py:374} INFO - Attempting to finish pod; pod_id: echostart.d902f94926784e27afcef857f90e7411; state: None; annotations: {'dag_id': 'echo', 'task_id': 'start', 'execution_date': None, 'run_id': 'scheduled__2022-01-18T00:00:00+00:00', 'try_number': '1'}
[2022-01-19 08:41:52,021] {kubernetes_executor.py:576} INFO - Changing state of (TaskInstanceKey(dag_id='echo', task_id='start', run_id='backfill__2022-01-17T00:00:00+00:00', try_number=1), None, 'echostart.9babbb429d7f43a98ff44f2976b59343', 'airflow-staging', '352215076') to None
[2022-01-19 08:41:52,022] {kubernetes_executor.py:576} INFO - Changing state of (TaskInstanceKey(dag_id='echo', task_id='start', run_id='scheduled__2022-01-18T00:00:00+00:00', try_number=1), None, 'echostart.d902f94926784e27afcef857f90e7411', 'airflow-staging', '352215077') to None
[2022-01-19 08:41:52,023] {scheduler_job.py:504} INFO - Executor reports execution of echo.start run_id=backfill__2022-01-17T00:00:00+00:00 exited with status None for try_number 1
[2022-01-19 08:41:52,024] {scheduler_job.py:504} INFO - Executor reports execution of echo.start run_id=scheduled__2022-01-18T00:00:00+00:00 exited with status None for try_number 1
[2022-01-19 08:41:57,107] {kubernetes_executor.py:454} INFO - Found 0 queued task instances
[2022-01-19 08:42:07,279] {kubernetes_executor.py:454} INFO - Found 0 queued task instances
[2022-01-19 08:42:17,419] {kubernetes_executor.py:454} INFO - Found 0 queued task instances
[2022-01-19 08:42:27,574] {kubernetes_executor.py:454} INFO - Found 0 queued task instances
[2022-01-19 08:42:37,743] {kubernetes_executor.py:454} INFO - Found 0 queued task instances
[2022-01-19 08:42:48,097] {kubernetes_executor.py:454} INFO - Found 0 queued task instances
[2022-01-19 08:42:58,292] {kubernetes_executor.py:454} INFO - Found 0 queued task instances
```
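In the logs above, both pods emit a `Succeeded` event, yet the scheduler is told each task "exited with status None". To pick those reports out of a longer scheduler log, something like the following may help. It is only a minimal sketch: `REPORT_RE`, `find_stateless_reports` and `SAMPLE_LOG` are hypothetical names made up for this illustration and are not part of Airflow, and the regex assumes the log message format shown above.

```python
import re

# Matches the scheduler line that reports the executor's final status for a task.
# Assumption: the message format is the one seen in the log excerpt above.
REPORT_RE = re.compile(
    r"Executor reports execution of (?P<task>\S+) "
    r"run_id=(?P<run_id>\S+) exited with status (?P<status>\S+) "
    r"for try_number (?P<try_number>\d+)"
)


def find_stateless_reports(log_text):
    """Return (task, run_id, try_number) for every report whose status is None."""
    hits = []
    for line in log_text.splitlines():
        match = REPORT_RE.search(line)
        if match and match.group("status") == "None":
            hits.append(
                (
                    match.group("task"),
                    match.group("run_id"),
                    int(match.group("try_number")),
                )
            )
    return hits


# Three lines copied from the log excerpt above.
SAMPLE_LOG = """\
[2022-01-19 08:41:52,023] {scheduler_job.py:504} INFO - Executor reports execution of echo.start run_id=backfill__2022-01-17T00:00:00+00:00 exited with status None for try_number 1
[2022-01-19 08:41:52,024] {scheduler_job.py:504} INFO - Executor reports execution of echo.start run_id=scheduled__2022-01-18T00:00:00+00:00 exited with status None for try_number 1
[2022-01-19 08:41:57,107] {kubernetes_executor.py:454} INFO - Found 0 queued task instances
"""

if __name__ == "__main__":
    for task, run_id, try_number in find_stateless_reports(SAMPLE_LOG):
        print(task, run_id, try_number)
```

Run against the excerpt, this lists both the `backfill__` and the `scheduled__` runs of `echo.start`, which is what the web UI shows as stuck.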
Pod status:
```
NAME                                         READY   STATUS      RESTARTS   AGE
echostart.9babbb429d7f43a98ff44f2976b59343   0/1     Completed   0          3m55s
echostart.d902f94926784e27afcef857f90e7411   0/1     Completed   0          3m55s
```
Airflow Web:
![Screenshot 2022-01-19 at 4 45 56 PM](https://user-images.githubusercontent.com/2786720/150098813-999e445a-2d86-40e4-a96f-858998d9279a.png)
### How to reproduce
1. Create a DAG as below
```python
from datetime import timedelta

from airflow import DAG
from airflow.operators.bash import BashOperator
from airflow.operators.dummy import DummyOperator
from airflow.utils.dates import days_ago


def failure_alert(context):
    """Placeholder: the real callback was not included in this report."""


default_args = {
    'owner': 'echo',
    'depends_on_past': False,
    'start_date': days_ago(2),
    'email': [],
    'email_on_failure': False,
    'email_on_retry': False,
    'retries': 1,
    'retry_delay': timedelta(minutes=1),
    'on_failure_callback': failure_alert,
    'timeout': 60 * 60
}

with DAG('echo', default_args=default_args, schedule_interval='@daily') as dag:
    start = DummyOperator(task_id='start')
    hello = BashOperator(
        task_id='hello',
        bash_command='echo hello',
    )
    end = DummyOperator(task_id='end')

    # start -> hello -> end
    start >> hello >> end
```
2. Run the backfill command
```
$ airflow dags backfill echo -s 20220117
```
### Operating System
Debian GNU/Linux 10 (buster)
### Versions of Apache Airflow Providers
_No response_
### Deployment
Official Apache Airflow Helm Chart
### Deployment details
_No response_
### Anything else
_No response_
### Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] boring-cyborg[bot] commented on issue #20944: Unable to complete backfilling job with k8s executor
Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on issue #20944:
URL: https://github.com/apache/airflow/issues/20944#issuecomment-1016230716
Thanks for opening your first issue here! Be sure to follow the issue template!