Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/11/17 08:07:48 UTC

[GitHub] [airflow] ajaymalikbol opened a new issue #19635: Received SIGTERM. Terminating subprocesses (State of this instance has been externally set to success. Terminating instance.)

ajaymalikbol opened a new issue #19635:
URL: https://github.com/apache/airflow/issues/19635


   ### Apache Airflow version
   
   2.0.2
   
   ### Operating System
   
   PRETTY_NAME="Debian GNU/Linux 10 (buster)" NAME="Debian GNU/Linux" VERSION_ID="10" VERSION="10 (buster)" VERSION_CODENAME=buster ID=debian HOME_URL="https://www.debian.org/" SUPPORT_URL="https://www.debian.org/support" BUG_REPORT_URL="https://bugs.debian.org/"
   
   ### Versions of Apache Airflow Providers
   
   apache-airflow-backport-providers-google==2021.3.3
   apache-airflow-backport-providers-amazon==2021.3.3
   
   
   ### Deployment
   
   Docker-Compose
   
   ### Deployment details
   
   _No response_
   
   ### What happened
   
   We use Airflow to schedule an ETL pipeline, and we use the EMR step sensor for the transformation step.
   We see the `Received SIGTERM. Terminating subprocesses.` error very frequently: https://github.com/apache/airflow/blob/88583095c408ef9ea60f793e7072e3fd4b88e329/airflow/models/taskinstance.py#L1394
   From the logs, it looks like the Airflow task sets the operator status to success, and then Airflow treats this change as invalid: `State of this instance has been externally set to success. Terminating instance.` https://github.com/apache/airflow/blob/88583095c408ef9ea60f793e7072e3fd4b88e329/airflow/jobs/local_task_job.py#L211
   
   The Airflow UI shows no failures, but we see the errors below in the Airflow logs.
   
   
   ```
   [2021-11-08 01:12:13,303] {taskinstance.py:1089} INFO - Executing <Task(EmrStepSensor): etl_step_sensor_task> on 2021-11-08T00:01:00+00:00
   [2021-11-08 01:12:13,405] {standard_task_runner.py:52} INFO - Started process 27548 to run task
   [2021-11-08 01:12:14,178] {standard_task_runner.py:76} INFO - Running: ['airflow', 'tasks', 'run', 'ETL_PIPELINE_JOB', 'etl_step_sensor_task', '2021-11-08T00:01:00+00:00', '--job-id', '100495', '--pool', 'default_pool', '--raw', '--subdir', 'DAGS_FOLDER/etl_pipeline_script.py', '--cfg-path', '/tmp/tmp_6kklqa5', '--error-file', '/tmp/tmpo2ug1xs3']
   [2021-11-08 01:12:14,279] {standard_task_runner.py:77} INFO - Job 100495: Subtask etl_step_sensor_task
   [2021-11-08 01:12:15,800] {logging_mixin.py:104} INFO - Running <TaskInstance: ETL_PIPELINE_JOB.etl_step_sensor_task 2021-11-08T00:01:00+00:00 [running]> on host e363e366c87b
   [2021-11-08 01:12:17,015] {taskinstance.py:1281} INFO - Exporting the following env vars:
   AIRFLOW_CTX_DAG_EMAIL=airflow@airflow.com
   AIRFLOW_CTX_DAG_OWNER=airflow
   AIRFLOW_CTX_DAG_ID=ETL_PIPELINE_JOB
   AIRFLOW_CTX_TASK_ID=etl_step_sensor_task
   AIRFLOW_CTX_EXECUTION_DATE=2021-11-08T00:01:00+00:00
   AIRFLOW_CTX_DAG_RUN_ID=scheduled__2021-11-08T00:01:00+00:00
   [2021-11-08 01:12:17,174] {base_aws.py:362} INFO - Airflow Connection: aws_conn_id=aws_default
   [2021-11-08 01:12:17,462] {base_aws.py:385} WARNING - Unable to use Airflow Connection for credentials.
   [2021-11-08 01:12:17,462] {base_aws.py:386} INFO - Fallback on boto3 credential strategy
   [2021-11-08 01:12:17,462] {base_aws.py:389} INFO - Creating session using boto3 credential strategy region_name=None
   [2021-11-08 01:12:18,035] {emr_step.py:75} INFO - Poking step step-id on cluster cluster_id
   [2021-11-08 01:12:18,395] {emr_base.py:68} INFO - Job flow currently COMPLETED
   [2021-11-08 01:12:18,395] {base.py:245} INFO - Success criteria met. Exiting.
   [2021-11-08 01:12:18,522] {taskinstance.py:1185} INFO - Marking task as SUCCESS. dag_id=ETL_PIPELINE_JOB, task_id=etl_step_sensor_task, execution_date=20211108T000100, start_date=20211108T011212, end_date=20211108T011218
   [2021-11-08 01:12:19,269] {local_task_job.py:187} WARNING - State of this instance has been externally set to success. Terminating instance.
   [2021-11-08 01:12:19,272] {process_utils.py:100} INFO - Sending Signals.SIGTERM to GPID 27548
   [2021-11-08 01:12:19,419] {taskinstance.py:1265} ERROR - Received SIGTERM. Terminating subprocesses.
   [2021-11-08 01:12:20,636] {process_utils.py:66} INFO - Process psutil.Process(pid=27548, status='terminated', exitcode=1, started='01:12:12') (27548) terminated with exit code 1
   ```
   
   ### What you expected to happen
   
   We expect no errors such as `Received SIGTERM. Terminating subprocesses.` in the Airflow task log.
   
   If this is expected, how can we fix it in our Airflow build?
   
   
   ### How to reproduce
   
   This is an intermittent issue. We are not sure whether these errors can be reproduced with a predefined set of steps.
   
   ### Anything else
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk closed issue #19635: Received SIGTERM. Terminating subprocesses (State of this instance has been externally set to success. Terminating instance.)

Posted by GitBox <gi...@apache.org>.
potiuk closed issue #19635:
URL: https://github.com/apache/airflow/issues/19635


   





[GitHub] [airflow] potiuk commented on issue #19635: Received SIGTERM. Terminating subprocesses (State of this instance has been externally set to success. Terminating instance.)

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #19635:
URL: https://github.com/apache/airflow/issues/19635#issuecomment-971367473


   BTW. You are not supposed to have backport providers installed with Airflow 2.





[GitHub] [airflow] potiuk commented on issue #19635: Received SIGTERM. Terminating subprocesses (State of this instance has been externally set to success. Terminating instance.)

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #19635:
URL: https://github.com/apache/airflow/issues/19635#issuecomment-971366734


   I believe this has already been fixed in 2.1 and 2.2. There could be different reasons for it, but several of them have been fixed to address similar issues. Please upgrade to 2.2.2 and comment here if the issue remains.
   
   





[GitHub] [airflow] potiuk edited a comment on issue #19635: Received SIGTERM. Terminating subprocesses (State of this instance has been externally set to success. Terminating instance.)

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on issue #19635:
URL: https://github.com/apache/airflow/issues/19635#issuecomment-971367473


   BTW. You are not supposed to have backport providers installed with Airflow 2. You should have the regular ones (`apache-airflow-providers-google`, for example) instead.
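
To check an environment for the situation described above, a minimal sketch along these lines could list installed backport providers and the regular package each would presumably map to. The helper names are hypothetical, and the prefix-swap mapping is an assumption: it holds for the `google` and `amazon` packages named in this thread, but other providers should be verified on PyPI before installing.

```python
from importlib.metadata import distributions

BACKPORT_PREFIX = "apache-airflow-backport-providers-"
REGULAR_PREFIX = "apache-airflow-providers-"


def suggest_regular_providers(installed_names):
    """Map each backport provider name to its assumed regular counterpart."""
    suggestions = {}
    for name in installed_names:
        # Normalize the distribution name so "_" and "-" variants compare equal.
        normalized = name.lower().replace("_", "-")
        if normalized.startswith(BACKPORT_PREFIX):
            suffix = normalized[len(BACKPORT_PREFIX):]
            suggestions[normalized] = REGULAR_PREFIX + suffix
    return suggestions


def installed_distribution_names():
    """Return the names of all distributions visible to this interpreter."""
    names = []
    for dist in distributions():
        name = dist.metadata["Name"]
        if name:  # skip distributions with broken metadata
            names.append(name)
    return names


if __name__ == "__main__":
    found = suggest_regular_providers(installed_distribution_names())
    for backport, regular in sorted(found.items()):
        print(f"{backport} -> {regular}")
```

Run inside the same Python environment (for example, inside the Airflow container) so that it sees the packages Airflow itself imports. Note that the sketch says nothing about versions; the backport packages in this report use calendar versions such as `2021.3.3`, and the regular packages are versioned separately.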





[GitHub] [airflow] potiuk commented on issue #19635: Received SIGTERM. Terminating subprocesses (State of this instance has been externally set to success. Terminating instance.)

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #19635:
URL: https://github.com/apache/airflow/issues/19635#issuecomment-975282863


   But as I wrote, there is not enough data in your report to be definitely sure. Please simply test whether Airflow 2.2.2 solves your problem. This is by far the fastest way (for you, but also for busy maintainers who mostly investigate issues for free) to find out whether the problem still exists.





[GitHub] [airflow] lidalei commented on issue #19635: Received SIGTERM. Terminating subprocesses (State of this instance has been externally set to success. Terminating instance.)

Posted by GitBox <gi...@apache.org>.
lidalei commented on issue #19635:
URL: https://github.com/apache/airflow/issues/19635#issuecomment-1020360043


   Hi @potiuk , we have seen a similar issue since we upgraded from Airflow 2.0.2 to Airflow 2.2.3, for example in a BashOperator.
   ```
   [2022-01-24, 16:18:18 UTC] {subprocess.py:93} INFO - Command exited with return code 0
   [2022-01-24, 16:18:18 UTC] {taskinstance.py:1267} INFO - Marking task as SUCCESS. dag_id=XXX, task_id=XXX, execution_date=20220124T120500, start_date=20220124T161701, end_date=20220124T161818
   [2022-01-24, 16:18:18 UTC] {local_task_job.py:211} WARNING - State of this instance has been externally set to success. Terminating instance.
   [2022-01-24, 16:18:18 UTC] {process_utils.py:120} INFO - Sending Signals.SIGTERM to group 553436. PIDs of all processes in the group: [553436]
   [2022-01-24, 16:18:18 UTC] {process_utils.py:75} INFO - Sending the signal Signals.SIGTERM to group 553436
   [2022-01-24, 16:18:18 UTC] {taskinstance.py:1408} ERROR - Received SIGTERM. Terminating subprocesses.
   [2022-01-24, 16:18:18 UTC] {subprocess.py:99} INFO - Sending SIGTERM signal to process group
   [2022-01-24, 16:18:18 UTC] {process_utils.py:70} INFO - Process psutil.Process(pid=553436, status='terminated', exitcode=1, started='16:17:01') (553436) terminated with exit code 1
   ```
   Besides, we have seen a related error
   ```
   ProcessLookupError: [Errno 3] No such process
     File "airflow/executors/celery_executor.py", line 121, in _execute_in_fork
       args.func(args)
     File "airflow/cli/cli_parser.py", line 48, in command
       return func(*args, **kwargs)
     File "airflow/utils/cli.py", line 92, in wrapper
       return f(*args, **kwargs)
     File "airflow/cli/commands/task_command.py", line 298, in task_run
       _run_task_by_selected_method(args, dag, ti)
     File "airflow/cli/commands/task_command.py", line 105, in _run_task_by_selected_method
       _run_task_by_local_task_job(args, ti)
     File "airflow/cli/commands/task_command.py", line 163, in _run_task_by_local_task_job
       run_job.run()
     File "airflow/jobs/base_job.py", line 245, in run
       self._execute()
     File "airflow/jobs/local_task_job.py", line 103, in _execute
       self.task_runner.start()
     File "airflow/task/task_runner/standard_task_runner.py", line 41, in start
       self.process = self._start_by_fork()
     File "airflow/task/task_runner/standard_task_runner.py", line 96, in _start_by_fork
       Sentry.flush()
     File "airflow/sentry.py", line 188, in flush
       sentry_sdk.flush()
     File "threading.py", line 306, in wait
       gotit = waiter.acquire(True, timeout)
     File "airflow/models/taskinstance.py", line 1409, in signal_handler
       self.task.on_kill()
     File "airflow/operators/bash.py", line 193, in on_kill
       self.subprocess_hook.send_sigterm()
     File "airflow/hooks/subprocess.py", line 101, in send_sigterm
       os.killpg(os.getpgid(self.sub_process.pid), signal.SIGTERM)
   ```
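
The traceback above ends in `os.killpg(os.getpgid(...))` raising `ProcessLookupError`, which is what happens when the target process has already exited by the time the signal handler runs. The following is only an illustrative sketch of how a caller can tolerate that race; it is not Airflow's code or its fix, and the function name `send_sigterm_if_alive` is hypothetical.

```python
import os
import signal
import subprocess
import sys


def send_sigterm_if_alive(proc: subprocess.Popen) -> bool:
    """Send SIGTERM to the process group of `proc`.

    Returns True if the signal was sent, False if the process was already gone.
    """
    try:
        os.killpg(os.getpgid(proc.pid), signal.SIGTERM)
    except ProcessLookupError:
        # The child exited (and was reaped) before we could signal it.
        return False
    return True


if __name__ == "__main__":
    # start_new_session=True puts the child in its own process group,
    # so signalling the group does not hit this script itself.
    child = subprocess.Popen([sys.executable, "-c", "pass"], start_new_session=True)
    child.wait()  # the child has finished and been reaped
    print(send_sigterm_if_alive(child))
```

The sketch is POSIX-only, because `os.killpg` and `os.getpgid` are not available on Windows.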





[GitHub] [airflow] potiuk commented on issue #19635: Received SIGTERM. Terminating subprocesses (State of this instance has been externally set to success. Terminating instance.)

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #19635:
URL: https://github.com/apache/airflow/issues/19635#issuecomment-1030841249


   > Hi @potiuk , we have seen a similar issue since we upgraded from Airflow 2.0.2 to Airflow 2.2.3, for example in a BashOperator.
   
   This issue is already closed. Without knowing your configuration, it's impossible to say what the problem is and whether it's the same or not. Something sending a TERM signal will usually result in this kind of problem, but there might be multiple reasons for that.
   
   I propose you open a new issue where you describe your circumstances, or you could take a look at other issues that describe similar problems in various circumstances to see if they match yours. If you find an open one, post details in it to help diagnose and fix it with more certainty:
   
   Here are all similar issues: https://github.com/apache/airflow/issues?q=is%3Aissue+SIGTERM+label%3Akind%3Abug+
   And here are only the open ones: https://github.com/apache/airflow/issues?q=is%3Aissue+SIGTERM+label%3Akind%3Abug+is%3Aopen
   
   If you find that the circumstances are similar in one of those open issues, I suggest you describe a few more details: your deployment, when you experience the problem, what remedies you have tried so far, and anything else that might help.
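
When gathering details for such a report, it may help to count how often the pattern from this thread occurs. The sketch below scans task log files for the three messages quoted in the logs above, in the order they appeared. The function names, the `*.log` file pattern, and the idea of a single log root directory are assumptions for illustration, not part of Airflow.

```python
from pathlib import Path

# Message fragments copied from the logs quoted in this thread.
MARKERS = (
    "Marking task as SUCCESS",
    "externally set to success",
    "Received SIGTERM",
)


def has_success_then_sigterm(log_text: str) -> bool:
    """Return True if all three messages appear in the text, in order."""
    position = 0
    for marker in MARKERS:
        position = log_text.find(marker, position)
        if position == -1:
            return False
        position += len(marker)
    return True


def affected_logs(log_root: str):
    """List log files under `log_root` (assumed layout) showing the pattern."""
    matches = []
    for path in Path(log_root).rglob("*.log"):
        if has_success_then_sigterm(path.read_text(errors="replace")):
            matches.append(str(path))
    return sorted(matches)
```

Calling `affected_logs` with the directory that holds your task logs returns the matching files, which gives a rough frequency and a set of concrete examples to attach to a new issue.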
   








[GitHub] [airflow] potiuk edited a comment on issue #19635: Received SIGTERM. Terminating subprocesses (State of this instance has been externally set to success. Terminating instance.)

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on issue #19635:
URL: https://github.com/apache/airflow/issues/19635#issuecomment-975283762


   If you find that it is still not fixed in 2.2.2 and report more details from 2.2.2 (detailed stack traces and exact reproduction steps), we can always reopen this issue.





[GitHub] [airflow] ajaymalikbol commented on issue #19635: Received SIGTERM. Terminating subprocesses (State of this instance has been externally set to success. Terminating instance.)

Posted by GitBox <gi...@apache.org>.
ajaymalikbol commented on issue #19635:
URL: https://github.com/apache/airflow/issues/19635#issuecomment-975225255


   > I believe this has been fixed already in 2.1 and 2.2. There could be different reasons for that, but several of them have been fixed to address similar issues. Please upgrade to 2.2.2 and comment here if the issue still remains.
   
   @potiuk Thanks for replying. Can you please share any PR/ticket for that fix?
   
   





[GitHub] [airflow] potiuk commented on issue #19635: Received SIGTERM. Terminating subprocesses (State of this instance has been externally set to success. Terminating instance.)

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #19635:
URL: https://github.com/apache/airflow/issues/19635#issuecomment-975280603


   There were quite a few fixes related to similar behaviour. You can take a look at the changelog and see if any of them sound familiar: https://github.com/apache/airflow/blob/main/CHANGELOG.txt





[GitHub] [airflow] potiuk commented on issue #19635: Received SIGTERM. Terminating subprocesses (State of this instance has been externally set to success. Terminating instance.)

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #19635:
URL: https://github.com/apache/airflow/issues/19635#issuecomment-975283762


   If you find that it is still not fixed in 2.2.2 and report more details from 2.2.2 (detailed stack traces etc.), we can always reopen this issue.

