Posted to issues@mesos.apache.org by "Yongqiao Wang (JIRA)" <ji...@apache.org> on 2015/12/21 09:10:46 UTC

[jira] [Assigned] (MESOS-3403) Add support for removing no re-registered slaves with timeout(--slave_reregister_timeout) from an external allocator

     [ https://issues.apache.org/jira/browse/MESOS-3403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yongqiao Wang reassigned MESOS-3403:
------------------------------------

    Assignee: Yongqiao Wang  (was: James Wang)

> Add support for removing no re-registered slaves with timeout(--slave_reregister_timeout) from an external allocator
> --------------------------------------------------------------------------------------------------------------------
>
>                 Key: MESOS-3403
>                 URL: https://issues.apache.org/jira/browse/MESOS-3403
>             Project: Mesos
>          Issue Type: Improvement
>          Components: master
>            Reporter: James Wang
>            Assignee: Yongqiao Wang
>
> Consider an external Mesos allocator that does not run in the same OS process as the Mesos master and may even be deployed on a different host. In that case, the Mesos allocator module should be implemented as a proxy that delegates calls to the actual allocator.
> Such an external allocator stores the total and allocated resources itself, so after Mesos master recovery (such as a fail-over) it needs to sync up with the Mesos master. Under normal circumstances, all slaves reregister after master recovery, so the total and used resources of each slave can be synced in the allocator->addSlave call. In the abnormal case, a slave does not reregister after master recovery, and the master calls Master::removeSlave(const Registry::Slave& slave) to remove it from the Registry once the timeout (slave_reregister_timeout) expires. That function, however, does not call the allocator to remove the related resources. To support syncing resources with the external allocator in this abnormal case, Master::removeSlave(const Registry::Slave& slave) needs to be enhanced to call allocator->removeSlave, so that the related resources are also removed from the external allocator.
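
The scenario described above can be illustrated with a minimal, self-contained sketch. This is not Mesos code: ExternalAllocator, ProxyAllocator, ToyMaster, and SlaveResources are hypothetical stand-ins, the real allocator interface has different signatures, and a real proxy would talk to the external allocator over some RPC mechanism rather than through a pointer. The sketch only shows why the timeout path needs to reach allocator->removeSlave.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <utility>

// Hypothetical per-slave bookkeeping kept by the external allocator.
struct SlaveResources {
  double totalCpus;
  double usedCpus;
};

// Hypothetical external allocator. Its state survives a master fail-over
// because it lives outside the master process.
class ExternalAllocator {
 public:
  void addSlave(const std::string& id, const SlaveResources& r) {
    slaves_[id] = r;  // Insert or overwrite: this is the "sync up" step.
  }

  void removeSlave(const std::string& id) { slaves_.erase(id); }

  std::size_t slaveCount() const { return slaves_.size(); }

  double totalCpus() const {
    double sum = 0.0;
    for (const auto& entry : slaves_) {
      sum += entry.second.totalCpus;
    }
    return sum;
  }

 private:
  std::map<std::string, SlaveResources> slaves_;
};

// Hypothetical allocator module loaded into the master. It only delegates;
// the direct pointer call stands in for a remote call (assumption).
class ProxyAllocator {
 public:
  explicit ProxyAllocator(ExternalAllocator* backend) : backend_(backend) {
    assert(backend_ != nullptr);
  }

  void addSlave(const std::string& id, const SlaveResources& r) {
    backend_->addSlave(id, r);
  }

  void removeSlave(const std::string& id) { backend_->removeSlave(id); }

 private:
  ExternalAllocator* backend_;
};

// Toy model of a recovered master. `recovered` holds the slave ids read
// from the registry that have not reregistered yet.
class ToyMaster {
 public:
  ToyMaster(ProxyAllocator* allocator, std::set<std::string> recovered)
      : allocator_(allocator), recovered_(std::move(recovered)) {}

  // Normal case: the slave comes back and its resources are re-synced.
  void reregisterSlave(const std::string& id, const SlaveResources& r) {
    recovered_.erase(id);
    allocator_->addSlave(id, r);
  }

  // Abnormal case: the reregistration timeout expired. Besides dropping the
  // slaves from the registry view, also tell the allocator, which is the
  // call this issue proposes to add.
  void removeUnreregisteredSlaves() {
    for (const std::string& id : recovered_) {
      allocator_->removeSlave(id);
    }
    recovered_.clear();
  }

 private:
  ProxyAllocator* allocator_;
  std::set<std::string> recovered_;
};
```

If the allocator_->removeSlave(id) call in removeUnreregisteredSlaves() were omitted, the external allocator would keep counting the resources of slaves that never came back, which is the stale state the issue describes.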



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)