
Posted to issues@mesos.apache.org by "QIHANG CHEN (JIRA)" <ji...@apache.org> on 2016/11/02 22:03:58 UTC

[jira] [Commented] (MESOS-2115) Improve recovering Docker containers when slave is contained

    [ https://issues.apache.org/jira/browse/MESOS-2115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15630683#comment-15630683 ] 

QIHANG CHEN commented on MESOS-2115:
------------------------------------

Is this issue fixed? Is there any documentation on how to configure slave recovery for a containerized mesos-slave?

I'm using the latest release, 1.0.2-rc1, of the mesos-slave container, and slave recovery still fails when I try to restart the `mesos-slave` container.


> Improve recovering Docker containers when slave is contained
> ------------------------------------------------------------
>
>                 Key: MESOS-2115
>                 URL: https://issues.apache.org/jira/browse/MESOS-2115
>             Project: Mesos
>          Issue Type: Epic
>          Components: docker
>            Reporter: Timothy Chen
>            Assignee: Timothy Chen
>              Labels: docker
>             Fix For: 0.23.0
>
>
> Currently, when the docker containerizer recovers, it checks the checkpointed executor pids to determine which containers are still running, and removes the remaining containers listed by docker ps that it doesn't recognize.
> This is problematic when the slave itself runs in a docker container: when the slave container dies, all of its forked processes are removed as well, so the checkpointed executor pids are no longer valid.
> We have to assume the docker containers might still be running even though the checkpointed executor pids are not.
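
The recovery decision described in the issue can be sketched roughly as follows. This is only an illustration of the logic in the description, not the actual Mesos code (which is C++); `plan_recovery`, its parameters, and the `assume_running` switch are hypothetical names introduced here.

```python
def plan_recovery(checkpointed, live_pids, docker_ps, assume_running=False):
    """Decide which docker containers to recover and which to remove.

    checkpointed:   dict mapping container name -> checkpointed executor pid
    live_pids:      set of pids currently alive from the slave's point of view
    docker_ps:      set of container names reported by `docker ps`
    assume_running: hypothetical switch modelling the proposed behaviour, where
                    a container with a checkpoint is kept even if its executor
                    pid is gone (e.g. because the slave container died)
    """
    recover, remove = [], []
    for name in sorted(docker_ps):
        pid = checkpointed.get(name)
        if pid is None:
            # No checkpoint for this container: it is not recognized.
            remove.append(name)
        elif pid in live_pids or assume_running:
            recover.append(name)
        else:
            # Checkpointed, but the executor pid is no longer valid.
            remove.append(name)
    return recover, remove


# Hypothetical state after the slave container was restarted:
checkpointed = {"mesos-a": 101, "mesos-b": 102}
live_pids = {101}
docker_ps = {"mesos-a", "mesos-b", "mesos-c"}

current = plan_recovery(checkpointed, live_pids, docker_ps)
proposed = plan_recovery(checkpointed, live_pids, docker_ps, assume_running=True)
```

Under these assumed inputs, the pid-based check drops `mesos-b` because its executor pid is gone, while the proposed behaviour keeps it and removes only the unrecognized `mesos-c`.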



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)