Posted to dev@mesos.apache.org by "Vinod Kone (JIRA)" <ji...@apache.org> on 2013/08/16 20:34:49 UTC

[jira] [Updated] (MESOS-644) Slave doesn't correctly handle checkpointed terminal update whose ack doesn't reach the executor

     [ https://issues.apache.org/jira/browse/MESOS-644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod Kone updated MESOS-644:
-----------------------------

    Issue Type: Bug  (was: Task)
    
> Slave doesn't correctly handle checkpointed terminal update whose ack doesn't reach the executor
> ------------------------------------------------------------------------------------------------
>
>                 Key: MESOS-644
>                 URL: https://issues.apache.org/jira/browse/MESOS-644
>             Project: Mesos
>          Issue Type: Bug
>            Reporter: Vinod Kone
>            Assignee: Vinod Kone
>             Fix For: 0.14.0
>
>
> This is the scenario:
> The slave dies after checkpointing a terminal status update but before the ACK reaches the executor.
> The recovered slave (through its status update manager, SUM) retries the update and cleans it up once it gets an ACK from the scheduler.
> When the executor re-registers after this point, it still has the update pending, but the slave cannot find the executor for this update because the task has already completed. Currently the slave forwards this update to the SUM anyway but never ACKs the executor, so the update stays pending on the executor.
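
The sequence described above can be modelled with a small toy simulation. This is only a sketch, not Mesos code: the classes `Slave` and `Executor`, their methods, and the `ack_completed_tasks` switch are hypothetical names invented for illustration, and the switch shows one conceivable handling rather than the fix that was actually applied.

```python
from dataclasses import dataclass, field


@dataclass
class Executor:
    """Toy executor: an update stays pending until the slave ACKs it."""
    pending: set = field(default_factory=set)

    def send_update(self, slave, task_id):
        self.pending.add(task_id)
        slave.on_executor_update(self, task_id)

    def on_ack(self, task_id):
        self.pending.discard(task_id)

    def reregister(self, slave):
        # On re-registration, resend every update that was never ACKed.
        for task_id in sorted(self.pending):
            slave.on_executor_update(self, task_id)


@dataclass
class Slave:
    """Toy slave with an embedded status update manager (SUM)."""
    live_tasks: set = field(default_factory=set)
    crash_before_ack: bool = False
    ack_completed_tasks: bool = False  # hypothetical switch, see lead-in
    checkpointed: list = field(default_factory=list)
    sum_pending: set = field(default_factory=set)

    def on_executor_update(self, executor, task_id):
        if task_id in self.live_tasks:
            self.checkpointed.append(task_id)  # survives a restart
            self.sum_pending.add(task_id)
            if self.crash_before_ack:
                return  # slave dies here: the executor never sees the ACK
            executor.on_ack(task_id)
        else:
            # Task already completed, so no executor is found for the update.
            # The update is forwarded to the SUM anyway.
            self.sum_pending.add(task_id)
            if self.ack_completed_tasks:
                executor.on_ack(task_id)

    def recover(self):
        # The recovered SUM retries every checkpointed update.
        self.crash_before_ack = False
        self.sum_pending.update(self.checkpointed)

    def on_scheduler_ack(self, task_id):
        # Terminal update acknowledged: clean up and mark the task completed.
        self.sum_pending.discard(task_id)
        self.live_tasks.discard(task_id)


def run_scenario(ack_completed_tasks):
    slave = Slave(live_tasks={"task-1"}, crash_before_ack=True,
                  ack_completed_tasks=ack_completed_tasks)
    executor = Executor()
    executor.send_update(slave, "task-1")  # checkpointed, ACK lost
    slave.recover()                        # SUM retries the update
    slave.on_scheduler_ack("task-1")       # scheduler ACKs, task completed
    executor.reregister(slave)             # executor resends pending update
    return executor.pending
```

Calling `run_scenario(False)` follows the behaviour reported here and leaves the update pending on the executor; `run_scenario(True)` shows that ACKing updates for already-completed tasks would clear it in this toy model.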
