Posted to issues@mesos.apache.org by "Jacob Janco (JIRA)" <ji...@apache.org> on 2017/01/11 03:24:58 UTC

[jira] [Commented] (MESOS-6904) Track resource allocation candidates and batch allocation work

    [ https://issues.apache.org/jira/browse/MESOS-6904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15817026#comment-15817026 ] 

Jacob Janco commented on MESOS-6904:
------------------------------------

Reviews currently in progress: 
https://reviews.apache.org/r/51027/
https://reviews.apache.org/r/51028/
https://reviews.apache.org/r/52534/
WIP from [~gyliu]
https://reviews.apache.org/r/51621/

> Track resource allocation candidates and batch allocation work
> --------------------------------------------------------------
>
>                 Key: MESOS-6904
>                 URL: https://issues.apache.org/jira/browse/MESOS-6904
>             Project: Mesos
>          Issue Type: Bug
>          Components: allocation
>            Reporter: Jacob Janco
>            Assignee: Jacob Janco
>              Labels: allocator
>
> "Our deployment environments have a lot of churn, with many short-lived frameworks that often revive offers. Running the allocator takes a long time (from seconds up to minutes).
> In this situation, event-triggered allocation causes the event queue in the allocator process to get very long, and the allocator effectively becomes unresponsive (e.g. a revive offers message takes too long to come to the head of the queue)." - MESOS-3157
> To remedy the above scenario, the proposal is to track allocation candidates and to dispatch allocation work only if no allocation is already pending in the allocator queue. When an enqueued allocation is processed, the tracked set of candidates is cleared.
> Currently, allocation work is triggered by cluster events (e.g. `addSlave()`, `addFramework()`, etc.) as well as by the periodic batch allocation that runs at a defined time interval.
> This ticket tracks the new direction the work has taken since the discussion in MESOS-3157, where a previous solution by [~jamespeach] introduced batched-only allocation (which we currently run), as well as an approach to reducing redundant work in the queue.
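
For readers skimming the description, the proposed "track candidates, dispatch only when nothing is pending" behavior can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the code under review: `EventQueue`, `BatchingAllocator`, `requestAllocation()` and the counters are hypothetical names invented for this sketch, and a plain FIFO of callbacks stands in for the allocator process's event queue.

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <set>
#include <string>
#include <utility>

// Hypothetical stand-in for the allocator process's event queue: callbacks
// run one at a time, in FIFO order, when drain() is called.
class EventQueue {
public:
  void dispatch(std::function<void()> f) { events_.push(std::move(f)); }

  void drain() {
    while (!events_.empty()) {
      std::function<void()> f = std::move(events_.front());
      events_.pop();
      f();
    }
  }

private:
  std::queue<std::function<void()>> events_;
};

// Hypothetical allocator that coalesces allocation requests (not Mesos API).
class BatchingAllocator {
public:
  explicit BatchingAllocator(EventQueue* queue) : queue_(queue) {}

  // Called from cluster-event handlers (agent added, offers revived, ...).
  void requestAllocation(const std::string& agentId) {
    candidates_.insert(agentId);

    // An allocation is already enqueued; it will pick up this candidate.
    if (allocationPending_) {
      return;
    }

    allocationPending_ = true;
    queue_->dispatch([this]() { allocate(); });
  }

  std::size_t runs() const { return runs_; }
  std::size_t lastBatchSize() const { return lastBatchSize_; }

private:
  void allocate() {
    // Take the tracked candidates and leave the tracked set empty.
    std::set<std::string> batch;
    batch.swap(candidates_);
    allocationPending_ = false;

    ++runs_;
    lastBatchSize_ = batch.size();
    // A real allocator would compute offers for the agents in `batch` here.
  }

  EventQueue* queue_;
  std::set<std::string> candidates_;
  bool allocationPending_ = false;
  std::size_t runs_ = 0;
  std::size_t lastBatchSize_ = 0;
};
```

In this sketch, three back-to-back requests for two distinct agents enqueue a single allocation covering both agents once the queue is drained, instead of three separate allocation runs. A request that arrives after that run has been processed enqueues a new one.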



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)