Posted to issues@aurora.apache.org by "Bill Farner (JIRA)" <ji...@apache.org> on 2015/07/09 19:12:06 UTC

[jira] [Commented] (AURORA-1388) If mesos_slave gets a SIGUSR1, thermos doesn't shutdown cleanly

    [ https://issues.apache.org/jira/browse/AURORA-1388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14620863#comment-14620863 ] 

Bill Farner commented on AURORA-1388:
-------------------------------------

Relevant: if you are doing things like fleet-wide maintenance, you should consider using the maintenance commands in {{aurora_admin}}. They should drain hosts safely, in a way that minimizes churn. We should fix this bug regardless, however.

> If mesos_slave gets a SIGUSR1, thermos doesn't shutdown cleanly
> ---------------------------------------------------------------
>
>                 Key: AURORA-1388
>                 URL: https://issues.apache.org/jira/browse/AURORA-1388
>             Project: Aurora
>          Issue Type: Bug
>            Reporter: Brian Brazil
>
> https://issues.apache.org/jira/browse/MESOS-1475 allows a SIGUSR1 to be sent to a mesos slave in order to shut down the slave and any of its processes cleanly, which is useful for changing slave attributes.
> I tried this with my aurora setup and found via tcpdump that it sent the first {{/shutdown}} http request to the task, but nothing after it. The process also kept running and held onto a static port, which in my case prevented things from working when a task was scheduled on that slave after it came back up.
> We should ensure that thermos behaves correctly when the mesos slave gets a SIGUSR1, following the lifecycle policy and ultimately killing the processes if needed.
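
For illustration only, the escalation described above (ask the process to stop, wait for a grace period, then kill it if it is still alive) could be sketched roughly as below. This is not Thermos code: {{stop_process}} and {{grace_secs}} are hypothetical names, a plain SIGTERM stands in for the {{/shutdown}} http request, and the sketch assumes a POSIX host.

```python
import signal
import subprocess
import sys


def stop_process(proc, grace_secs=5.0):
    """Ask proc to exit, then force-kill it if it outlives the grace period.

    Hypothetical helper, not part of Thermos or Aurora. Returns the name of
    the last signal sent, or "none" if the process had already exited.
    """
    if proc.poll() is not None:
        return "none"

    # Graceful request first (stand-in for the /shutdown http request).
    proc.terminate()  # sends SIGTERM on POSIX
    try:
        proc.wait(timeout=grace_secs)
        return "SIGTERM"
    except subprocess.TimeoutExpired:
        # The process ignored the graceful request: kill it so that it
        # cannot keep holding resources such as a static port.
        proc.kill()  # sends SIGKILL on POSIX
        proc.wait()
        return "SIGKILL"


if __name__ == "__main__":
    # A child that ignores SIGTERM, mimicking a task that never shuts down.
    stubborn = subprocess.Popen(
        [sys.executable, "-c",
         "import signal, time;"
         "signal.signal(signal.SIGTERM, signal.SIG_IGN);"
         "print('ready', flush=True);"
         "time.sleep(60)"],
        stdout=subprocess.PIPE,
    )
    stubborn.stdout.readline()  # wait until the handler is installed
    print(stop_process(stubborn, grace_secs=1.0))
```

The point of the sketch is only that the forced kill must not depend on the graceful step succeeding; the actual grace periods and requests would come from the lifecycle policy.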



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)