Posted to issues@mesos.apache.org by "Guangya Liu (JIRA)" <ji...@apache.org> on 2016/09/28 13:40:20 UTC

[jira] [Comment Edited] (MESOS-5524) Expose resource allocation constraints (quota, shares) to schedulers.

    [ https://issues.apache.org/jira/browse/MESOS-5524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15522430#comment-15522430 ] 

Guangya Liu edited comment on MESOS-5524 at 9/28/16 1:39 PM:
-------------------------------------------------------------

[~bmahler] One question I would like to discuss with you: when exposing the resource allocation constraints, do we need to expose the resources at the {{role}} level or at the {{framework}} level?

If we expose them at the {{role}} level, there may be problems when one role has multiple frameworks: each framework in the same role will see the same resource constraints, and we cannot guarantee that any one framework can always get the exposed resources.

The {{framework}} level is also not good. The problem is how we define the {{framework}} level: do we just expose the resources evenly to all {{frameworks}} under the same {{role}}, or do it some other way? Exposing the resources evenly to all {{frameworks}} under the same {{role}} is also not accurate, as one {{framework}} may have quite a lot of tasks while others have none, and the framework with a lot of tasks will use up all of the resources.
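
To make the concern about an even per-framework split concrete, here is a minimal sketch. It is not Mesos code: the function name, the framework names, and the numbers are all hypothetical, and it assumes the role-level constraint is a simple resource map that frameworks in the role consume from a shared pool.

```python
def even_split(role_quota, frameworks):
    """Hypothetical policy: advertise an equal slice of the role's quota to each framework."""
    count = len(frameworks)
    return {
        fw: {name: amount / count for name, amount in role_quota.items()}
        for fw in frameworks
    }


# Hypothetical role with two frameworks; fw-a already runs many tasks, fw-b runs none.
role_quota = {"cpus": 10.0, "mem": 20.0}
allocated = {
    "fw-a": {"cpus": 9.0, "mem": 18.0},
    "fw-b": {"cpus": 0.0, "mem": 0.0},
}

advertised = even_split(role_quota, allocated)

# What is actually left under the role's quota once current allocations are subtracted.
remaining = {
    name: role_quota[name] - sum(alloc[name] for alloc in allocated.values())
    for name in role_quota
}

# fw-b is told it may use 5 cpus, but only 1 cpu is left for the whole role.
print(advertised["fw-b"], remaining)
```

Under these assumptions the advertised per-framework figure overstates what the idle framework can obtain, which is the inaccuracy described above.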


was (Author: gyliu):
[~bmahler] One question I would like to discuss with you: when exposing the resource allocation constraints, do we need to expose the resources at the {{role}} level or at the {{framework}} level?

If we expose them at the {{role}} level, there may be problems when one role has multiple frameworks: each framework in the same role will see the same resource constraints, and we cannot guarantee that any one framework can always get the exposed resources.

It seems the {{framework}} level is more accurate, but even at the {{framework}} level it may still not be accurate, because of the allocator's coarse-grained mode for resource allocation when there are more frameworks than agents in the cluster. Any comments?

> Expose resource allocation constraints (quota, shares) to schedulers.
> ---------------------------------------------------------------------
>
>                 Key: MESOS-5524
>                 URL: https://issues.apache.org/jira/browse/MESOS-5524
>             Project: Mesos
>          Issue Type: Epic
>          Components: allocation, scheduler api
>            Reporter: Benjamin Mahler
>
> Currently, schedulers do not have visibility into their quota or shares of the cluster. By providing this information, we give the scheduler the ability to make better decisions. As we start to allow schedulers to decide how they'd like to use a particular resource (e.g. as non-revocable or revocable), schedulers need visibility into their quota and shares to make an effective decision (otherwise they may accidentally exceed their quota and will not find out until Mesos replies with TASK_LOST REASON_QUOTA_EXCEEDED).
> We would start by exposing the following information:
> * quota: e.g. cpus:10, mem:20, disk:40
> * shares: e.g. cpus:20, mem:40, disk:80
> Currently, quota is used for non-revocable resources and the idea is to use shares only for consuming revocable resources since the number of shares available to a role changes dynamically as resources come and go, frameworks come and go, or the operator manipulates the amount of resources sectioned off for quota.
> By exposing quota and shares, the framework knows when it can consume additional non-revocable resources (i.e. when it has fewer non-revocable resources allocated to it than its quota) or when it can consume revocable resources (always! but in the future, it cannot revoke another user's revocable resources if the framework is above its fair share).
> This also allows schedulers to determine whether they have sufficient quota assigned to them, and to alert the operator if they need more to run safely. Also, by viewing their fair share, the framework can expose monitoring information that shows the discrepancy between how much it would like and its fair share (note that the framework can actually exceed its fair share but in the future this will mean increased potential for revocation).
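
The check described in the issue (a framework can consume additional non-revocable resources while its non-revocable allocation is below its quota) can be sketched as follows. This is only an illustration under stated assumptions: the helper names are hypothetical, the quota string reuses the example values from the issue, and the current allocation is invented. It does not reflect any actual Mesos API.

```python
def parse_resources(text):
    """Parse a string such as 'cpus:10, mem:20, disk:40' into a dict of floats."""
    resources = {}
    for item in text.split(","):
        name, amount = item.strip().split(":")
        resources[name] = float(amount)
    return resources


def non_revocable_headroom(quota, allocated):
    """Quota minus the current non-revocable allocation, clamped at zero."""
    return {
        name: max(limit - allocated.get(name, 0.0), 0.0)
        for name, limit in quota.items()
    }


quota = parse_resources("cpus:10, mem:20, disk:40")     # example values from the issue
allocated = parse_resources("cpus:6, mem:20, disk:10")  # hypothetical current allocation

headroom = non_revocable_headroom(quota, allocated)

# A scheduler could accept more non-revocable resources only where headroom remains.
can_take_more = {name: amount > 0 for name, amount in headroom.items()}
print(headroom, can_take_more)
```

With these numbers the framework still has headroom for cpus and disk but has exhausted its mem quota, so a scheduler with this visibility could avoid the TASK_LOST REASON_QUOTA_EXCEEDED reply mentioned in the issue.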



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)