Posted to issues@mesos.apache.org by "Bartek Plotka (JIRA)" <ji...@apache.org> on 2015/09/09 14:57:45 UTC

[jira] [Commented] (MESOS-2930) Allow the Resource Estimator to express over-allocation of revocable resources.

    [ https://issues.apache.org/jira/browse/MESOS-2930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14736795#comment-14736795 ] 

Bartek Plotka commented on MESOS-2930:
--------------------------------------

1. Do we need this? What is the worst-case scenario? With a proper QoS Controller, new best-effort (BE) tasks launched from outdated revocable offers will simply be killed the moment they start. And we have to use a proper QoS Controller anyway so that the Resource Estimator is aware of the slave's over-allocation state.

2. Some ideas to mitigate the above issue:
* What about rescinding/removing outdated oversubscription offers? As far as I know, rescinding only happens when a slave is deactivated or disconnected, and the allocator has no handler for it. Only the master does, so we could try to trigger it in the Master code. It would be really nice, though, because we would be able to invalidate that offer. Basically, an offer with revocable resources can be treated as a _best-effort_ offer as well, right? (: Is there any other solution?
* The second issue is returning negative resources:
** We could create a new "SignedResources" class and use it in the ResourceEstimator and the whole flow up to the Master. We could create converters to normal Resources as well. (I created something like that some time ago while writing my own allocation module.) I'm not sure whether there is currently any other use case for such a special Resources class, so...
** We could just add an optional flag to the Estimator API that virtually makes the slack resources negative.
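
To make the "SignedResources" idea a bit more concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the class, its methods (overAllocated, toUnsigned, deficit) and the plain name-to-scalar map are hypothetical and are not part of the Mesos API. Real Mesos Resources also carry roles, ranges and sets, which this ignores.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical, simplified "SignedResources": named scalar quantities
// that are allowed to be negative. A negative value means the slave is
// over-allocated for that oversubscribed resource by that amount.
class SignedResources {
public:
  SignedResources() = default;

  explicit SignedResources(std::map<std::string, double> scalars)
    : scalars_(std::move(scalars)) {}

  // Signed quantity for a resource name; 0 if the name is absent.
  double get(const std::string& name) const {
    auto it = scalars_.find(name);
    return it == scalars_.end() ? 0.0 : it->second;
  }

  // True if at least one resource is negative (over-allocated).
  bool overAllocated() const {
    for (const auto& entry : scalars_) {
      if (entry.second < 0.0) {
        return true;
      }
    }
    return false;
  }

  // Converter to "normal" non-negative quantities: negatives are
  // clamped to zero, so nothing is offered for those resources.
  std::map<std::string, double> toUnsigned() const {
    std::map<std::string, double> result;
    for (const auto& entry : scalars_) {
      result[entry.first] = entry.second > 0.0 ? entry.second : 0.0;
    }
    return result;
  }

  // The over-allocated part only, as positive amounts: how much
  // recovered revocable resource should NOT be re-offered.
  std::map<std::string, double> deficit() const {
    std::map<std::string, double> result;
    for (const auto& entry : scalars_) {
      if (entry.second < 0.0) {
        result[entry.first] = -entry.second;
      }
    }
    return result;
  }

private:
  std::map<std::string, double> scalars_;
};
```

With something like this, an estimate of cpus = -2 and mem = 512 would mean "512 of mem is still available, but 2 recovered cpus must be withheld", which is case (3) from the issue description. The optional-flag alternative carries the same information, only with the sign kept out of band.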

Or maybe it's already designed?  (:

> Allow the Resource Estimator to express over-allocation of revocable resources.
> -------------------------------------------------------------------------------
>
>                 Key: MESOS-2930
>                 URL: https://issues.apache.org/jira/browse/MESOS-2930
>             Project: Mesos
>          Issue Type: Improvement
>          Components: slave
>            Reporter: Benjamin Mahler
>            Assignee: Klaus Ma
>
> Currently the resource estimator returns the amount of oversubscription resources that are available. Since resources cannot be negative, this allows the resource estimator to express the following:
> (1) Return empty resources: We are fully allocated for oversubscription resources.
> (2) Return non-empty resources: We are under-allocated for oversubscription resources. In other words, some are available.
> However, there is an additional situation that we cannot express:
> (3) Analogous to returning non-empty "negative" resources: We are over-allocated for oversubscription resources. Do not re-offer any of the over-allocated oversubscription resources that are recovered.
> Without (3), the slave can only shrink the total pool of oversubscription resources by returning (1) as resources are recovered, until the pool is shrunk to the desired size. However, this approach is only best-effort: it's possible for a framework to launch more tasks in the window of time (15 seconds by default) between the slave's polls of the estimator.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)