Posted to issues@mesos.apache.org by "haosdent (JIRA)" <ji...@apache.org> on 2015/04/25 14:18:38 UTC

[jira] [Commented] (MESOS-2656) Slave should send status update immediately when container launch fails.

    [ https://issues.apache.org/jira/browse/MESOS-2656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14512474#comment-14512474 ] 

haosdent commented on MESOS-2656:
---------------------------------

[~jieyu] Do you mean adding a containerizer->destroy call in slave.cpp, like this?

{code}
if (!future.isReady()) {
  // The containerizer will clean up if the launch fails; we'll just log this.
  LOG(ERROR) << "Container '" << containerId
             << "' for executor '" << executorId
             << "' of framework '" << frameworkId
             << "' failed to start: "
             << (future.isFailed() ? future.failure() : " future discarded");
  containerizer->destroy(containerId);
  return;
} else if (!future.get()) {
  LOG(ERROR) << "Container '" << containerId
             << "' for executor '" << executorId
             << "' of framework '" << frameworkId
             << "' failed to start: None of the enabled containerizers ("
             << flags.containerizers << ") could create a container for the "
             << "provided TaskInfo/ExecutorInfo message.";
  containerizer->destroy(containerId);
  return;
}
{code}

> Slave should send status update immediately when container launch fails.
> ------------------------------------------------------------------------
>
>                 Key: MESOS-2656
>                 URL: https://issues.apache.org/jira/browse/MESOS-2656
>             Project: Mesos
>          Issue Type: Bug
>    Affects Versions: 0.22.1
>            Reporter: Jie Yu
>
> Right now, if the containerizer launch fails, the slave doesn't send a status update to the scheduler until the executor reregistration timeout expires. Because someone using the docker containerizer might set a very large timeout value, the slave should ideally send a status update to the scheduler right after the containerizer launch fails.
> The simplest solution is to add a containerizer->destroy(..) call in executorLaunched when containerizer->launch fails. That will trigger containerizer->wait and thus send a status update to the scheduler.
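
The mechanism described above (destroy completes the pending wait, which in turn produces the status update) can be illustrated with a toy model. This is only a sketch under stated assumptions: ToyContainerizer, ToySlave and runScenario are hypothetical names, callbacks stand in for futures, the wait callback is assumed to be registered before the launch result is handled, and none of this is the real Mesos or libprocess API.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a containerizer. NOT the Mesos Containerizer interface.
class ToyContainerizer {
public:
  using WaitCallback = std::function<void(const std::string&)>;

  // Simulated launch: the caller decides whether it succeeds.
  bool launch(const std::string& /*containerId*/, bool succeed) {
    return succeed;
  }

  // Registers a callback that fires once the container is gone.
  void wait(const std::string& containerId, WaitCallback callback) {
    waiters_[containerId].push_back(std::move(callback));
  }

  // Destroying a container completes every pending wait on it.
  void destroy(const std::string& containerId) {
    auto it = waiters_.find(containerId);
    if (it == waiters_.end()) {
      return;
    }
    std::vector<WaitCallback> callbacks = std::move(it->second);
    waiters_.erase(it);
    for (auto& callback : callbacks) {
      callback(containerId);
    }
  }

private:
  std::map<std::string, std::vector<WaitCallback>> waiters_;
};

// Toy stand-in for the slave: records the status updates it would send.
class ToySlave {
public:
  std::vector<std::string> updates;

  void launchExecutor(ToyContainerizer& containerizer,
                      const std::string& containerId,
                      bool launchSucceeds,
                      bool destroyOnFailure) {
    assert(!containerId.empty());

    // Assumption of this sketch: a wait is pending on the container, and
    // its completion is what produces the status update.
    containerizer.wait(containerId, [this](const std::string& id) {
      updates.push_back("TASK_FAILED for " + id);
    });

    if (!containerizer.launch(containerId, launchSucceeds)) {
      // Without destroy, nothing completes the wait, so no update is sent
      // until some timeout (not modeled here) fires.
      if (destroyOnFailure) {
        containerizer.destroy(containerId);
      }
    }
  }
};

// Runs one failed launch and returns the updates the toy slave recorded.
std::vector<std::string> runScenario(bool destroyOnFailure) {
  ToyContainerizer containerizer;
  ToySlave slave;
  slave.launchExecutor(containerizer, "c1", /*launchSucceeds=*/false,
                       destroyOnFailure);
  return slave.updates;
}
```

In this model, runScenario(false) records no update after the failed launch, while runScenario(true) records one immediately, which is the behavior the issue asks for.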


