Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/05/02 00:25:26 UTC

[GitHub] [airflow] zsmeijin opened a new issue #8676: Task exited with return code 1 without any warning/error message after reboot server and restart service

zsmeijin opened a new issue #8676:
URL: https://github.com/apache/airflow/issues/8676


   **Apache Airflow version**: 1.10.10
   
   **Kubernetes version (if you are using kubernetes)** (use `kubectl version`): Not using Kubernetes or docker
   
   **Environment**: CentOS Linux release 7.7.1908 (Core) Linux 3.10.0-1062.el7.x86_64
   
   **Python Version**:  3.7.6
   
   **Executor**:  LocalExecutor
   
   **What happened**:
   
   I wrote a simple DAG to clean up Airflow logs. Everything works when I test it with the 'airflow test' command. It also works when I trigger it manually in the web UI, which uses the 'airflow run' command to start the task.
   
   **But after I rebooted my server and restarted the webserver and scheduler services (in daemon mode),** every time I trigger **the exact same DAG**, it still gets scheduled as usual, but the task exits with code 1 immediately after a new process is started to run it.
   
   I ran 'airflow test' again to check whether something is now wrong with my code. Everything looks fine with 'airflow test', but the task exits silently with 'airflow run', which is really weird.
   
   Here's the task log from a manual trigger in the web UI (I've changed the log level to DEBUG, but still can't find anything useful); you can also read the attached log file: [task error log.txt](https://github.com/apache/airflow/files/4566767/task.error.log.txt)
   
   
   > Reading local file: /root/airflow/logs/airflow_log_cleanup/log_cleanup_worker_num_1/2020-04-29T13:51:44.071744+00:00/1.log
   > [2020-04-29 21:51:53,744] {base_task_runner.py:61} DEBUG - Planning to run as the  user
   > [2020-04-29 21:51:53,750] {taskinstance.py:686} DEBUG - <TaskInstance: airflow_log_cleanup.log_cleanup_worker_num_1 2020-04-29T13:51:44.071744+00:00 [queued]> dependency 'Previous Dagrun State' PASSED: True, The task did not have depends_on_past set.
   > [2020-04-29 21:51:53,754] {taskinstance.py:686} DEBUG - <TaskInstance: airflow_log_cleanup.log_cleanup_worker_num_1 2020-04-29T13:51:44.071744+00:00 [queued]> dependency 'Not In Retry Period' PASSED: True, The task instance was not marked for retrying.
   > [2020-04-29 21:51:53,754] {taskinstance.py:686} DEBUG - <TaskInstance: airflow_log_cleanup.log_cleanup_worker_num_1 2020-04-29T13:51:44.071744+00:00 [queued]> dependency 'Task Instance State' PASSED: True, Task state queued was valid.
   > [2020-04-29 21:51:53,754] {taskinstance.py:669} INFO - Dependencies all met for <TaskInstance: airflow_log_cleanup.log_cleanup_worker_num_1 2020-04-29T13:51:44.071744+00:00 [queued]>
   > [2020-04-29 21:51:53,757] {taskinstance.py:686} DEBUG - <TaskInstance: airflow_log_cleanup.log_cleanup_worker_num_1 2020-04-29T13:51:44.071744+00:00 [queued]> dependency 'Previous Dagrun State' PASSED: True, The task did not have depends_on_past set.
   > [2020-04-29 21:51:53,760] {taskinstance.py:686} DEBUG - <TaskInstance: airflow_log_cleanup.log_cleanup_worker_num_1 2020-04-29T13:51:44.071744+00:00 [queued]> dependency 'Pool Slots Available' PASSED: True, ('There are enough open slots in %s to execute the task', 'default_pool')
   > [2020-04-29 21:51:53,766] {taskinstance.py:686} DEBUG - <TaskInstance: airflow_log_cleanup.log_cleanup_worker_num_1 2020-04-29T13:51:44.071744+00:00 [queued]> dependency 'Not In Retry Period' PASSED: True, The task instance was not marked for retrying.
   > [2020-04-29 21:51:53,768] {taskinstance.py:686} DEBUG - <TaskInstance: airflow_log_cleanup.log_cleanup_worker_num_1 2020-04-29T13:51:44.071744+00:00 [queued]> dependency 'Task Concurrency' PASSED: True, Task concurrency is not set.
   > [2020-04-29 21:51:53,768] {taskinstance.py:669} INFO - Dependencies all met for <TaskInstance: airflow_log_cleanup.log_cleanup_worker_num_1 2020-04-29T13:51:44.071744+00:00 [queued]>
   > [2020-04-29 21:51:53,768] {taskinstance.py:879} INFO - 
   > --------------------------------------------------------------------------------
   > [2020-04-29 21:51:53,768] {taskinstance.py:880} INFO - Starting attempt 1 of 2
   > [2020-04-29 21:51:53,768] {taskinstance.py:881} INFO - 
   > --------------------------------------------------------------------------------
   > [2020-04-29 21:51:53,779] {taskinstance.py:900} INFO - Executing <Task(BashOperator): log_cleanup_worker_num_1> on 2020-04-29T13:51:44.071744+00:00
   > [2020-04-29 21:51:53,781] {standard_task_runner.py:53} INFO - Started process 29718 to run task
   > [2020-04-29 21:51:53,805] {logging_mixin.py:112} INFO - [2020-04-29 21:51:53,805] {cli_action_loggers.py:68} DEBUG - Calling callbacks: [<function default_action_log at 0x7fc9a62513b0>]
   > [2020-04-29 21:51:53,818] {logging_mixin.py:112} INFO - [2020-04-29 21:51:53,817] {cli_action_loggers.py:86} DEBUG - Calling callbacks: []
   > [2020-04-29 21:51:58,759] {logging_mixin.py:112} INFO - [2020-04-29 21:51:58,759] {base_job.py:200} DEBUG - [heartbeat]
   > [2020-04-29 21:51:58,759] {logging_mixin.py:112} INFO - [2020-04-29 21:51:58,759] {local_task_job.py:124} DEBUG - Time since last heartbeat(0.01 s) < heartrate(5.0 s), sleeping for 4.98824 s
   > [2020-04-29 21:52:03,753] {logging_mixin.py:112} INFO - [2020-04-29 21:52:03,753] {local_task_job.py:103} INFO - Task exited with return code 1
   
   **How to reproduce it**:
   
   I really don't know how to reproduce it, because it started suddenly and now seems to be permanent.
   
   **Anything else we need to know**:
   
   I tried to figure out the difference between 'airflow test' and 'airflow run'. I guess it might have something to do with forking a process?
   
   What I've tried so far, none of which solved the problem:
   
   - cleared all DAG / DAG run / task instance info, removed all files under /root/airflow except the config file, and restarted my services
   
   - rebooted my server again
   
   - uninstalled Airflow and installed it again
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] agails commented on issue #8676: Task exited with return code 1 without any warning/error message after reboot server and restart service

Posted by GitBox <gi...@apache.org>.
agails commented on issue #8676:
URL: https://github.com/apache/airflow/issues/8676#issuecomment-885717221


   > @zsmeijin does this also happen on Airflow 2?
   
   Yes, I'm on version 2.1.0 and just ran into this problem; thanks to @zsmeijin I managed to solve it! For the more experienced folks: is there an alternative or a proper solution for this?





[GitHub] [airflow] boring-cyborg[bot] commented on issue #8676: Task exited with return code 1 without any warning/error message after reboot server and restart service

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on issue #8676:
URL: https://github.com/apache/airflow/issues/8676#issuecomment-622630054


   Thanks for opening your first issue here! Be sure to follow the issue template!
   





[GitHub] [airflow] zsmeijin commented on issue #8676: Task exited with return code 1 without any warning/error message after reboot server and restart service

Posted by GitBox <gi...@apache.org>.
zsmeijin commented on issue #8676:
URL: https://github.com/apache/airflow/issues/8676#issuecomment-626074212


   I finally figured out how to reproduce this bug.
   
   If you configure email in airflow.cfg and your DAG contains an email operator or otherwise uses the SMTP service, and something is wrong with the SMTP service itself (I still need some time to figure that out) or with the config (maybe it is because my smtp_user is 'airflow', since I run my own SMTP service), the first task of your DAG will always exit with return code 1 without any error information. In my case the first task is merely a Python operator. When I use Hotmail as my SMTP server, everything is fine.
   
   Although I think it's my fault for messing up the SMTP service, there should be some reasonable hint. It actually took me a whole week to debug this: I had to reset everything in my Airflow environment and change the configuration step by step to see when the bug happens.
   
   Hope this information is helpful
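
One way to surface the underlying SMTP error, sketched here as an assumption rather than anything Airflow itself does, is to attempt the login outside Airflow with Python's standard `smtplib`. The function name `check_smtp_login` and its `smtp_factory` parameter are hypothetical; the host, port, user, and password would be the values from the `[smtp]` section of airflow.cfg.

```python
import smtplib


def check_smtp_login(host, port, user, password, use_starttls=True,
                     timeout=10, smtp_factory=smtplib.SMTP):
    """Try to log in to an SMTP server and return (ok, message).

    smtp_factory exists only so the check can be exercised without a
    real server; by default it is smtplib.SMTP.
    """
    try:
        with smtp_factory(host, port, timeout=timeout) as conn:
            if use_starttls:
                conn.starttls()
            if user:
                conn.login(user, password)
        return True, "login succeeded"
    except (smtplib.SMTPException, OSError) as exc:
        # Report the exception type and text that the task log did not show.
        return False, f"{type(exc).__name__}: {exc}"
```

Run against the real server, this may show a connection or authentication error directly, which could narrow down whether the service or the credentials are at fault.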








[GitHub] [airflow] potiuk commented on issue #8676: Task exited with return code 1 without any warning/error message after reboot server and restart service

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #8676:
URL: https://github.com/apache/airflow/issues/8676#issuecomment-886224734


   Question @zsmeijin @agails @vietpm -> could you please tell me how you configured the SMTP service? Was it via an SMTP connection or the config file? I would love to reproduce this and provide a good error message in this case. I understand how frustrating it can be.





[GitHub] [airflow] vietpm edited a comment on issue #8676: Task exited with return code 1 without any warning/error message after reboot server and restart service

Posted by GitBox <gi...@apache.org>.
vietpm edited a comment on issue #8676:
URL: https://github.com/apache/airflow/issues/8676#issuecomment-764417236


   Thank you very much. I hit the same issue when I rebooted the server. My SMTP password contains @ and #. I then commented out all lines in the smtp section of airflow.cfg and restarted the scheduler and Celery worker. Everything works fine now.





[GitHub] [airflow] zsmeijin edited a comment on issue #8676: Task exited with return code 1 without any warning/error message after reboot server and restart service

Posted by GitBox <gi...@apache.org>.
zsmeijin edited a comment on issue #8676:
URL: https://github.com/apache/airflow/issues/8676#issuecomment-626074212


   I finally figured out how to reproduce this bug.
   
   If you configure email in airflow.cfg and your DAG contains an email operator or otherwise uses the SMTP service, and your SMTP password contains a character like "^", the first task of your DAG will always exit with return code 1 without any error information. In my case the first task is merely a Python operator.
   
   Although I think it's my fault for messing up the SMTP service, there should be some reasonable hint. It actually took me a whole week to debug this: I had to reset everything in my Airflow environment and change the configuration step by step to see when the bug happens.
   
   Hope this information is helpful
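
To check quickly whether a password falls into this category, a minimal sketch (not part of Airflow) can read the `[smtp]` section and list every character in `smtp_password` that is not a letter or digit. The helper name, the sample config, and the "letters and digits are safe" rule are assumptions for illustration; the thread only reports `^`, `@`, and `#` in affected passwords and does not establish which characters actually trigger the failure.

```python
import configparser
import string

# Assumption: treat only ASCII letters and digits as "safe".
SAFE_CHARS = set(string.ascii_letters + string.digits)


def special_chars_in_smtp_password(cfg_text):
    """Return the sorted special characters found in [smtp] smtp_password."""
    # interpolation=None reads the value raw, so '%' is not expanded.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    password = parser.get("smtp", "smtp_password", fallback="")
    return sorted({ch for ch in password if ch not in SAFE_CHARS})


# Hypothetical config text, not taken from anyone's real airflow.cfg.
SAMPLE = """
[smtp]
smtp_host = localhost
smtp_user = airflow
smtp_password = p@ss#w^rd
"""

print(special_chars_in_smtp_password(SAMPLE))  # prints ['#', '@', '^']
```

A non-empty result only marks the password as worth testing with a simpler value; it does not by itself confirm the cause.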





[GitHub] [airflow] eladkal commented on issue #8676: Task exited with return code 1 without any warning/error message after reboot server and restart service

Posted by GitBox <gi...@apache.org>.
eladkal commented on issue #8676:
URL: https://github.com/apache/airflow/issues/8676#issuecomment-825869910


   @zsmeijin does this also happen on Airflow 2?








[GitHub] [airflow] zkan commented on issue #8676: Task exited with return code 1 without any warning/error message after reboot server and restart service

Posted by GitBox <gi...@apache.org>.
zkan commented on issue #8676:
URL: https://github.com/apache/airflow/issues/8676#issuecomment-886402356


   This issue seems related to https://github.com/apache/airflow/issues/15133.

