Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/01/09 21:35:28 UTC

[GitHub] [airflow] dstandish opened a new issue #13591: Graceful termination does not work with apache chart running 1.10.14

dstandish opened a new issue #13591:
URL: https://github.com/apache/airflow/issues/13591


   In apache/airflow, the Helm chart sets the worker default `terminationGracePeriodSeconds: 600`.
   
   After deploying with 1.10.14, I observed that the worker was terminated immediately.  This reproduced consistently.
   
   Did something perhaps change in 2.0.0?
   
   Does anyone have hints on what to look into?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish edited a comment on issue #13591: Graceful termination does not work with apache chart running 1.10.14

Posted by GitBox <gi...@apache.org>.
dstandish edited a comment on issue #13591:
URL: https://github.com/apache/airflow/issues/13591#issuecomment-757444101


   ### What _does_ work:
   
   I verified, at least with 2.0, that warm shutdown **_works_** with the following change to the worker deployment:
   ```yaml
             command: ["airflow"]
             args: ["celery", "worker"]
   ```
   
   ### More things that don't work
   
   Before arriving at the above, I tried this (didn't work):
   ```yaml
             command: ["/usr/bin/dumb-init", "--", "airflow"]
             args: ["celery", "worker"]
   ```
   This also didn't work:
   ```yaml
             command: ["/usr/bin/dumb-init"]
             args: ["airflow", "celery", "worker"]
   ```
   
   Produced more of these errors:
   ```
   [2021-01-10 09:36:09,037: ERROR/MainProcess] Task handler raised error: WorkerLostError('Worker exited prematurely: signal 15 (SIGTERM).')
   ```
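
To make the three variants above easier to compare, here is a minimal sketch of how Kubernetes combines a container's `command` and `args` with the image defaults. The rules are the documented ones: `command` replaces the image ENTRYPOINT, `args` replaces the image CMD, and setting only `command` discards the image CMD. The function name `effective_argv` and the `ENTRYPOINT` / `CMD` values are assumptions for illustration; they are not read from the actual Airflow Dockerfile. The sketch only shows which process tree each variant produces, not why signal handling differs between them.

```python
from typing import List, Optional


def effective_argv(
    image_entrypoint: List[str],
    image_cmd: List[str],
    command: Optional[List[str]] = None,
    args: Optional[List[str]] = None,
) -> List[str]:
    """Return the argv a container starts with, given image defaults and overrides."""
    # `command` replaces the image ENTRYPOINT when it is set.
    entry = image_entrypoint if command is None else list(command)
    if args is not None:
        # `args` replaces the image CMD.
        tail = list(args)
    elif command is None:
        # Nothing overridden: the image CMD is used as-is.
        tail = list(image_cmd)
    else:
        # Only `command` set: the image CMD is ignored.
        tail = []
    return entry + tail


# ASSUMPTION: placeholder image defaults, for illustration only.
ENTRYPOINT = ["/usr/bin/dumb-init", "--", "/entrypoint"]
CMD = ["--help"]

worked = effective_argv(ENTRYPOINT, CMD, ["airflow"], ["celery", "worker"])
failed_1 = effective_argv(
    ENTRYPOINT, CMD, ["/usr/bin/dumb-init", "--", "airflow"], ["celery", "worker"]
)
failed_2 = effective_argv(
    ENTRYPOINT, CMD, ["/usr/bin/dumb-init"], ["airflow", "celery", "worker"]
)
args_only = effective_argv(ENTRYPOINT, CMD, None, ["celery", "worker"])
```

Under these assumptions, the variant that worked starts `airflow celery worker` directly with no init process in front of it, while both variants that did not work keep `dumb-init` as the first process, and an `args`-only override would keep the whole image entrypoint.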





[GitHub] [airflow] dstandish commented on issue #13591: Graceful termination does not work with apache chart running 1.10.14

Posted by GitBox <gi...@apache.org>.
dstandish commented on issue #13591:
URL: https://github.com/apache/airflow/issues/13591#issuecomment-757724804


   > I will take a look later this week. It also depends which command is used to run airflow components. You are talking about the current master version of the 'chart', yeah? No modification to the entrypoint or command?
   
   correct
   
   > So if you expect the worker to terminate immediately you might have observed actually wrong behaviour where someone sent more than one SIGTERM to those workers (I've seen such setups) - but this is a rather bad idea IMHO
   
   No, I do not want the worker to terminate immediately.  I want it to do what it is supposed to, namely warm shutdown -- i.e. stop taking tasks, and run until either all tasks are done or the grace period has elapsed
   





[GitHub] [airflow] potiuk commented on issue #13591: Graceful termination does not work with apache chart running 1.10.14

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #13591:
URL: https://github.com/apache/airflow/issues/13591#issuecomment-757720643


   I will take a look later this week. It also depends which command is used to run airflow components. You are talking about the current master version of the 'chart', yeah? No modification to the entrypoint or command?
   
   dumb-init and tini are equivalent, and they are indeed there to forward signals to the running processes. This is really useful when you have a bash script as the entrypoint: if bash is the direct entrypoint, it will not forward signals to its children. There are two ways to solve it:
   
   a) dumb-init or tini as the entrypoint
   b) exec 'binary' at the end of the bash script (exec of another bash won't work)
   
   The default entrypoint in the prod image is dumb-init, so it should propagate the signals properly. But as @xinbinhuang mentioned, the celery worker has a number of config options: when you send a SIGTERM to the celery worker, it stops spawning new processes and waits for all the running tasks to terminate. So by definition the worker might take quite some time to exit. There is the termination grace period, which controls how long celery will wait for all processes to terminate before it does a 'kill -9' and exits 'non-gracefully'.
   
   Also there is another gotcha - if you send a SECOND SIGTERM to such a celery worker while it is waiting for tasks, it will terminate all the processes with 'kill -9' and exit immediately.
   
   So if you expect the worker to terminate immediately, you might actually have observed wrong behaviour where someone sent more than one SIGTERM to those workers (I've seen such setups) - but this is a rather bad idea IMHO.
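
The two-stage behaviour described above (first SIGTERM starts a warm shutdown, a second SIGTERM forces an immediate exit) can be sketched as a small state machine. This is a toy illustration, not Celery's implementation; the class name `TwoStageShutdown`, the state names, and the `install` helper are all hypothetical.

```python
import signal


class TwoStageShutdown:
    """Track how many SIGTERMs have arrived and expose the resulting state."""

    def __init__(self) -> None:
        self.sigterm_count = 0

    def handle(self, signum, frame) -> None:
        # Same signature that signal.signal() expects from a handler.
        self.sigterm_count += 1

    @property
    def state(self) -> str:
        if self.sigterm_count == 0:
            return "running"
        if self.sigterm_count == 1:
            # Warm: stop taking new tasks, let running tasks finish.
            return "warm"
        # Cold: a second SIGTERM arrived, give up on running tasks.
        return "cold"


def install(shutdown: TwoStageShutdown) -> None:
    """Register the handler for SIGTERM; must be called from the main thread."""
    signal.signal(signal.SIGTERM, shutdown.handle)
```

A worker loop built on this would poll `state` between tasks: stop fetching work once it is `"warm"`, and abort outright once it is `"cold"`. It also shows why a setup that sends SIGTERM twice defeats the grace period.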
   
   





[GitHub] [airflow] dstandish commented on issue #13591: Graceful termination does not work with apache chart

Posted by GitBox <gi...@apache.org>.
dstandish commented on issue #13591:
URL: https://github.com/apache/airflow/issues/13591#issuecomment-757773879


   and to clarify, @potiuk: yes, it is latest master
   





[GitHub] [airflow] anitakar commented on issue #13591: Graceful termination does not work with apache chart

Posted by GitBox <gi...@apache.org>.
anitakar commented on issue #13591:
URL: https://github.com/apache/airflow/issues/13591#issuecomment-822287818


   also this one does: https://github.com/celery/billiard/issues/273#issuecomment-453265476
   





[GitHub] [airflow] auvipy commented on issue #13591: Graceful termination does not work with apache chart

Posted by GitBox <gi...@apache.org>.
auvipy commented on issue #13591:
URL: https://github.com/apache/airflow/issues/13591#issuecomment-790278819


   this sounds interesting https://github.com/celery/billiard/issues/273#issuecomment-734461528





[GitHub] [airflow] dstandish commented on issue #13591: Graceful termination does not work with apache chart running 1.10.14

Posted by GitBox <gi...@apache.org>.
dstandish commented on issue #13591:
URL: https://github.com/apache/airflow/issues/13591#issuecomment-757526400


   @potiuk i think you are the main architect of the dockerfile.  do you know what's going on here?  i don't really understand this area very well... dumb-init / tini / gosu, and what happens when combined with entrypoints and args...  though i'd like to!





[GitHub] [airflow] xinbinhuang commented on issue #13591: Graceful termination does not work with apache chart running 1.10.14

Posted by GitBox <gi...@apache.org>.
xinbinhuang commented on issue #13591:
URL: https://github.com/apache/airflow/issues/13591#issuecomment-757540492


   This issue from celery might be relevant: https://github.com/celery/billiard/issues/273
   
   Some people are saying it's Linux distro related: https://github.com/celery/billiard/issues/273#issuecomment-472665083





[GitHub] [airflow] dstandish edited a comment on issue #13591: Graceful termination does not work with apache chart running 1.10.14

Posted by GitBox <gi...@apache.org>.
dstandish edited a comment on issue #13591:
URL: https://github.com/apache/airflow/issues/13591#issuecomment-757526592


   astronomer's helm chart uses gosu and tini, it seems. fwiw, in my previous company we used astronomer EE and the termination did work





[GitHub] [airflow] dstandish edited a comment on issue #13591: Graceful termination does not work with apache chart

Posted by GitBox <gi...@apache.org>.
dstandish edited a comment on issue #13591:
URL: https://github.com/apache/airflow/issues/13591#issuecomment-852681806


   i think we can consider this resolved by https://github.com/apache/airflow/pull/16153
   
   graceful termination still does not work out of the box with the released 1.0.0 chart, but with that PR you can use the command / args combination that works, namely this:
   ```yaml
             command: ["airflow"]
             args: ["celery", "worker"]
   ```
   









[GitHub] [airflow] dstandish edited a comment on issue #13591: Graceful termination does not work with apache chart running 1.10.14

Posted by GitBox <gi...@apache.org>.
dstandish edited a comment on issue #13591:
URL: https://github.com/apache/airflow/issues/13591#issuecomment-757724804


   > I will take a look later this week. It also depends which command is used to run airflow components. You are talking about the current master version of the 'chart', yeah? No modification to the entrypoint or command?
   
   correct, no mods to entrypoint.  You can see which things i tried in the helm config above -- different values of args or command.
   
   > So if you expect the worker to terminate immediately you might have observed actually wrong behaviour where someone sent more than one SIGTERM to those workers (I've seen such setups) - but this is a rather bad idea IMHO
   
   No, I do not want the worker to terminate immediately.  I want it to do what it is supposed to, namely warm shutdown -- i.e. stop taking tasks, and run until either all tasks are done or the grace period has elapsed
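
Seen from outside the worker, the expected sequence is the one Kubernetes applies on pod termination: send SIGTERM, wait up to `terminationGracePeriodSeconds`, then send SIGKILL if the process is still alive. A minimal sketch of that sequence against a local child process follows; the helper name `terminate_with_grace` is hypothetical, and the child process merely stands in for a worker.

```python
import subprocess


def terminate_with_grace(proc: subprocess.Popen, grace_seconds: float) -> int:
    """Send SIGTERM, wait up to grace_seconds, then fall back to SIGKILL.

    Returns the child's exit code; on POSIX this is the negative signal
    number when the child was ended by a signal.
    """
    proc.terminate()  # SIGTERM on POSIX: the polite request to shut down
    try:
        return proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL on POSIX: cannot be caught or ignored
        return proc.wait()
```

With a warm shutdown, the worker should exit on its own somewhere inside the grace window once its running tasks finish. The behaviour reported in this issue corresponds to tasks dying right after the SIGTERM, long before the fallback branch would ever be reached.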
   





[GitHub] [airflow] dstandish closed issue #13591: Graceful termination does not work with apache chart

Posted by GitBox <gi...@apache.org>.
dstandish closed issue #13591:
URL: https://github.com/apache/airflow/issues/13591


   

