You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@mesos.apache.org by Ben Mahler <be...@gmail.com> on 2014/03/01 00:30:43 UTC

Re: Review Request 18604: Changed SIGKILL to SIGTERM in command executor shutdown.


> On Feb. 28, 2014, 6:25 p.m., Ben Mahler wrote:
> > src/launcher/executor.cpp, lines 215-216
> > <https://reviews.apache.org/r/18604/diff/1/?file=506639#file506639line215>
> >
> >     Do we still need this TODO, and/or should we document the fact that the slave will take care of kill escalation?
> 
> Niklas Nielsen wrote:
>     Ah - shoot. I should have removed that comment.
>     
>     How about leaving a comment along the lines of the RR description:
>     
>     // If the processes doesn't respond to SIGTERM, the slave will call
>     // containerizer->destroy() after EXECUTOR_SHUTDOWN_GRACE_PERIOD seconds.
>     // MesosContainerizer::destroy() calls SIGKILL on the process tree which
>     // effectively is the necessary signal escalation.

I think there's a bug here!

Consider killTask() when the child process swallows SIGTERM. The reaped() callback will never be invoked, and the driver will never be stopped. There is no grace period for killTask() as far as the slave is concerned so the command executor becomes effectively 'stuck'.


- Ben


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18604/#review35807
-----------------------------------------------------------


On Feb. 28, 2014, 2:07 a.m., Niklas Nielsen wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/18604/
> -----------------------------------------------------------
> 
> (Updated Feb. 28, 2014, 2:07 a.m.)
> 
> 
> Review request for mesos.
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> If command executor processes doesn't respond to SIGTERM during shutdown, the slave will call containerizer->destroy() after EXECUTOR_SHUTDOWN_GRACE_PERIOD seconds.
> MesosContainerizer::destroy() calls SIGKILL on the process tree which effectively is the necessary signal escalation.
> 
> 
> Diffs
> -----
> 
>   src/launcher/executor.cpp e30d77a 
> 
> Diff: https://reviews.apache.org/r/18604/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Niklas Nielsen
> 
>