Posted to issues@mesos.apache.org by "Vinod Kone (JIRA)" <ji...@apache.org> on 2018/02/15 17:26:00 UTC

[jira] [Assigned] (MESOS-8574) Docker executor makes no progress when 'docker inspect' hangs

     [ https://issues.apache.org/jira/browse/MESOS-8574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod Kone reassigned MESOS-8574:
---------------------------------

    Assignee: Andrei Budnik

> Docker executor makes no progress when 'docker inspect' hangs
> -------------------------------------------------------------
>
>                 Key: MESOS-8574
>                 URL: https://issues.apache.org/jira/browse/MESOS-8574
>             Project: Mesos
>          Issue Type: Improvement
>          Components: docker, executor
>    Affects Versions: 1.5.0
>            Reporter: Greg Mann
>            Assignee: Andrei Budnik
>            Priority: Major
>              Labels: mesosphere
>
> In the Docker executor, many calls later in the executor's lifecycle are gated on an initial {{docker inspect}} call returning: https://github.com/apache/mesos/blob/bc6b61bca37752689cffa40a14c53ad89f24e8fc/src/docker/executor.cpp#L223
> If that first call to {{docker inspect}} never returns, the executor becomes stuck in a state where it makes no progress and cannot be killed.
> It's tempting to have the executor simply terminate itself after a timeout, but we must be careful about the case in which the executor's Docker container is running successfully while the Docker daemon is unresponsive. In that case, we do not want to send TASK_FAILED or TASK_KILLED, since the task's container is still running.
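
The concern described above can be illustrated with a small sketch. This is not the Mesos executor code (which is C++), only a hypothetical Python illustration of the idea: bound the wait on the inspect command, and treat a timeout as "state unknown" rather than as a task failure. The names InspectResult, inspect_with_timeout and may_send_terminal_update are invented for this example, and the policy that a non-zero exit code counts as a definite failure is an assumption, not something the issue states.

```python
import subprocess
from enum import Enum


class InspectResult(Enum):
    OK = "ok"            # the command returned exit code 0
    FAILED = "failed"    # the command returned a non-zero exit code
    UNKNOWN = "unknown"  # the command did not return in time


def inspect_with_timeout(command, timeout_seconds):
    """Run `command` and classify the outcome without blocking forever.

    `command` is an argument list such as
    ["docker", "inspect", "<container name>"].
    """
    try:
        completed = subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_seconds,  # on expiry the child is killed
        )
    except subprocess.TimeoutExpired:
        # The daemon may be unresponsive while the container is still
        # running, so nothing can be concluded about the task here.
        return InspectResult.UNKNOWN
    if completed.returncode == 0:
        return InspectResult.OK
    return InspectResult.FAILED


def may_send_terminal_update(result):
    # Assumed policy for this sketch: only a definite failure justifies
    # a terminal update such as TASK_FAILED. A timeout never does.
    return result is InspectResult.FAILED
```

A caller would invoke inspect_with_timeout(["docker", "inspect", name], 30) and, on UNKNOWN, retry or escalate instead of reporting a terminal state. The 30-second value is arbitrary.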



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)