Posted to issues@mesos.apache.org by "Benjamin Bannier (JIRA)" <ji...@apache.org> on 2018/01/04 12:29:00 UTC

[jira] [Updated] (MESOS-8350) Resource provider-capable agents not correctly synchronizing checkpointed agent resources on reregistration

     [ https://issues.apache.org/jira/browse/MESOS-8350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benjamin Bannier updated MESOS-8350:
------------------------------------
    Fix Version/s: 1.5.0

> Resource provider-capable agents not correctly synchronizing checkpointed agent resources on reregistration
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: MESOS-8350
>                 URL: https://issues.apache.org/jira/browse/MESOS-8350
>             Project: Mesos
>          Issue Type: Bug
>          Components: master
>            Reporter: Benjamin Bannier
>            Assignee: Benjamin Bannier
>            Priority: Critical
>             Fix For: 1.5.0, 1.6.0
>
>
> For resource provider-capable agents, the master does not re-send checkpointed resources on agent reregistration; instead, the checkpointed resources sent as part of the {{ReregisterSlaveMessage}} should be used.
> This is not what happens in practice. For example, if checkpointing of an offer operation fails and the agent then fails over, the resulting checkpointed resources are, as expected, not reflected in the agent, but the master still assumes them.
> A workaround is to fail over the master, which causes the newly elected master to bootstrap its agent state from the {{ReregisterSlaveMessage}}.
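
The expected reregistration behavior described above can be sketched as a small toy model. This is not Mesos code: the class names, fields, and resource strings are hypothetical, and the branch for agents without the capability (the master re-sending its own view) is an assumption inferred from the wording of the report.

```python
from dataclasses import dataclass, field


@dataclass
class ReregisterMessage:
    """Hypothetical stand-in for the parts of ReregisterSlaveMessage used here."""
    agent_id: str
    checkpointed_resources: frozenset
    resource_provider_capable: bool


@dataclass
class Master:
    # The master's view of each agent's checkpointed resources, keyed by agent ID.
    checkpointed: dict = field(default_factory=dict)

    def reregister(self, msg):
        """Return the resources the master would re-send to the agent, or None."""
        if msg.resource_provider_capable:
            # Expected behavior per the report: the agent's message is the
            # source of truth, so the master replaces its own view with it.
            self.checkpointed[msg.agent_id] = msg.checkpointed_resources
            return None
        # Assumption: for other agents the master re-sends its own view.
        return self.checkpointed.get(msg.agent_id, frozenset())


master = Master()
# The master assumed an operation (here, a reservation) had been applied ...
master.checkpointed["agent-1"] = frozenset({"cpus:2", "disk(reserved):10"})
# ... but checkpointing failed on the agent, which then failed over.
msg = ReregisterMessage("agent-1", frozenset({"cpus:2"}), True)
resent = master.reregister(msg)
```

In this sketch the master ends up with the agent's view after reregistration. The bug in the report corresponds to the master keeping its stale entry instead of overwriting it.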



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)