Posted to issues@mesos.apache.org by "Joerg Schad (JIRA)" <ji...@apache.org> on 2015/08/04 21:32:05 UTC

[jira] [Updated] (MESOS-3202) Avoid frameworks starving in DRF allocator.

     [ https://issues.apache.org/jira/browse/MESOS-3202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joerg Schad updated MESOS-3202:
-------------------------------
    Description: 
We currently run into issues with the DRF scheduler that frameworks do not receive offers (see https://github.com/mesosphere/marathon/issues/1931 for details). 

Imagine that we have 10 frameworks and unallocated resources from a single slave.
The allocation interval is 1 sec, and refuse_seconds (i.e., the time for which a declined resource is filtered) is 3 sec for all frameworks.
The allocator offers the resources to framework 1 (according to DRF), which declines the offer immediately.
In the next allocation interval, framework 1 is skipped because of its earlier decline, so the resources are offered to the next framework, framework 2, which also declines.
The same happens in the following allocation interval with framework 3.

In the allocation interval after that, the refuse_seconds filter for framework 1 has expired, and since framework 1 still has the lowest DRF share, it is offered the resources again and declines them again. The cycle then repeats.

As a result, framework 4 (which is actually waiting for these resources) is never offered them.
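
The cycle described above can be reproduced with a small toy simulation. This is only a sketch of the scenario, not the Mesos allocator code: the function name simulate, its parameters, and the tie-breaking rule (frameworks with equal DRF share are visited in order of their number) are assumptions made for illustration. Every framework holds no resources, so the DRF order never changes while offers keep being declined.

```python
def simulate(num_frameworks=10, refuse_seconds=3, ticks=12, accepting=frozenset({4})):
    """Toy model of the scenario: one resource, one offer per allocation interval.

    Assumptions (not taken from Mesos itself): the allocation interval is
    1 second, all frameworks have equal DRF share, and ties are broken by
    framework number. Frameworks not in `accepting` decline immediately.
    Returns the list of (time, framework) offers that were made.
    """
    # Time until which each framework's decline filter is still active.
    filtered_until = {fw: 0 for fw in range(1, num_frameworks + 1)}
    offers = []
    for now in range(ticks):
        for fw in sorted(filtered_until):
            if now < filtered_until[fw]:
                continue  # framework declined recently, filter still active
            offers.append((now, fw))
            if fw in accepting:
                return offers  # the resource is finally used
            # Declined: filter this framework for refuse_seconds.
            filtered_until[fw] = now + refuse_seconds
            break  # only one offer of the resource per allocation interval
    return offers


if __name__ == "__main__":
    for now, fw in simulate():
        print(f"t={now}s: offer to framework {fw}")
```

With the default values from the description (refuse_seconds of 3 sec, allocation interval of 1 sec), only frameworks 1, 2 and 3 ever appear in the offer list. In this toy model, raising refuse_seconds to 4 is enough for framework 4 to receive the offer, which suggests that the starvation depends on the ratio between refuse_seconds and the allocation interval.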


 

  was:We currently run into issues with the DRF scheduler that frameworks do not receive offers (see https://github.com/mesosphere/marathon/issues/1931 for details). 


> Avoid frameworks starving in DRF allocator.
> -------------------------------------------
>
>                 Key: MESOS-3202
>                 URL: https://issues.apache.org/jira/browse/MESOS-3202
>             Project: Mesos
>          Issue Type: Bug
>            Reporter: Joerg Schad
>
> We currently run into issues with the DRF scheduler that frameworks do not receive offers (see https://github.com/mesosphere/marathon/issues/1931 for details). 
> Imagine that we have 10 frameworks and unallocated resources from a single slave.
> The allocation interval is 1 sec, and refuse_seconds (i.e., the time for which a declined resource is filtered) is 3 sec for all frameworks.
> The allocator offers the resources to framework 1 (according to DRF), which declines the offer immediately.
> In the next allocation interval, framework 1 is skipped because of its earlier decline, so the resources are offered to the next framework, framework 2, which also declines.
> The same happens in the following allocation interval with framework 3.
> In the allocation interval after that, the refuse_seconds filter for framework 1 has expired, and since framework 1 still has the lowest DRF share, it is offered the resources again and declines them again. The cycle then repeats.
> As a result, framework 4 (which is actually waiting for these resources) is never offered them.
>  


