Posted to user@mesos.apache.org by Meng Zhu <mz...@mesosphere.com> on 2019/01/21 04:07:01 UTC

Quota 2.0 proposal

Hi folks:

I am excited to propose Quota 2.0 for better resource management on Mesos,
with explicit limits (decoupled from guarantee), generic quota (which can
be set on resources with metadata and on more generic resources such as the
number of containers) and bright shiny new APIs.

You can find the design doc here
<https://docs.google.com/document/d/13vG5uH4YVwM79ErBPYAZfnqYFOBbUy2Lym0_9iAQ5Uk/edit?usp=sharing>.
Please feel free to leave comments and suggestions.

I have also added an agenda item to the upcoming API working group meeting
on Tuesday (Jan 22nd, 11am PST). Please join if you are interested.

Thanks,
Meng

Fwd: Quota 2.0 proposal

Posted by Meng Zhu <mz...@mesosphere.com>.
Hi folks:

During the design review meetings, the main discussion point was whether we
should allow setting quota guarantees on resources with specific metadata.
The main use case is disks with profiles.

The current proposal in the doc is to allow setting guarantees only on
top-level resources such as "cpus" and "disk". Limits can be set on any
resource, including resources with metadata. There is a caveat, though: if
limits are set "underneath" a guarantee (e.g. a guarantee on disk co-exists
with a limit on disk with a specific profile), the guarantee might not be
satisfied, depending on cluster usage.
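
To make the caveat concrete, here is a minimal sketch in Python. The
numbers, the "fast" profile name, and the helper max_allocatable are all
made up for illustration; none of them come from the design doc or from
Mesos itself.

```python
# Hypothetical illustration: a role has a guarantee on top-level "disk"
# and a limit on disk with the (made-up) profile "fast".

def max_allocatable(free, limits):
    """Return the most disk a role can receive from the free pool.

    free:   free disk in the cluster, keyed by profile (None = no profile)
    limits: the role's per-profile limits; a missing key means "no limit"
    """
    total = 0
    for profile, amount in free.items():
        cap = limits.get(profile, float("inf"))
        total += min(amount, cap)
    return total


guarantee = 100                  # guarantee on top-level "disk"
limits = {"fast": 20}            # limit on disk with profile "fast"
free = {None: 50, "fast": 200}   # what the cluster currently has free

reachable = max_allocatable(free, limits)
print(reachable)                 # 70
print(reachable >= guarantee)    # False: the guarantee cannot be met
```

Even though 250 units of disk are free in this example, the role can reach
only 70 of its guaranteed 100, because most of the free disk sits under the
profile it is limited on.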

This proposal does not support the use case of setting quota for disks with
profiles. This limitation and the caveat mentioned above both stem from
the quota propagation issue that arises when there is a resource metadata
hierarchy, as explained in the related sections of the design doc
<https://docs.google.com/document/d/13vG5uH4YVwM79ErBPYAZfnqYFOBbUy2Lym0_9iAQ5Uk/edit#heading=h.i4lsj45vylfu>.

It looks like there are a few options here with regard to setting quotas on
disks with profiles:

1. Stick to the current proposal, but treat disks with a profile as a
top-level resource (think of it as something completely unrelated to
"disk", just as "cpus" is), so that guarantees can be set on it.

2. Add support for setting guarantees on any metadata resource, but with
the restriction that once a guarantee or a limit is set on a resource
with metadata, no more quotas can be configured for resources on the same
path in the metadata hierarchy. For example, disk, disk with a fast
profile, and disk from vendor A are all considered resources on the
same path. Once one type of resource has a quota, no other resource type
on the same path can have a quota.

3. Add support for setting guarantees and limits on any metadata
resource, at the risk that guarantees might not be satisfied.

4. Add support for setting guarantees and limits on any metadata
resource, and use a linear programming model to figure out how to
satisfy all the quotas.

5. Stick to the current proposal and do not support setting quotas on
disks with profiles.

Option 1 raises the question of whether we should treat resources like EBS
as something completely different from vanilla local disk. If so (as the
option suggests), we need to update other parts of the system accordingly.
For example, endpoints, metrics, the allocator, etc. should stop
treating disks with profiles as "disk".

Option 2 seems too restrictive. It can be hard to reason about and unwieldy
for the user.
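
As a rough sketch of what the option 2 restriction could look like as a
validation step, consider the Python below. The QuotaConfig class, its
set_quota method, and the metadata strings are hypothetical. The sketch
also assumes that "on the same path" means "sharing a top-level resource
name", which matches the disk example above but may not match the
hierarchy definition in the design doc.

```python
# Hypothetical sketch of the option 2 restriction. Assumption: two resources
# are "on the same path" when they share a top-level name such as "disk".

class QuotaConfig:
    def __init__(self):
        # top-level name -> the one resource that holds a quota on that path
        self._owner = {}

    def set_quota(self, name, metadata=None):
        """Record a quota for `name`, optionally qualified by metadata.

        Raises ValueError if a different resource on the same path already
        has a quota.
        """
        resource = (name, metadata)
        existing = self._owner.get(name)
        if existing is not None and existing != resource:
            raise ValueError(
                f"{resource} conflicts with existing quota on {existing}")
        self._owner[name] = resource


config = QuotaConfig()
config.set_quota("cpus")
config.set_quota("disk", "profile=fast")

try:
    config.set_quota("disk", "vendor=A")  # same path as disk[profile=fast]
    rejected = False
except ValueError:
    rejected = True

print(rejected)  # True
```

The second disk quota is rejected only because an unrelated-looking quota
already exists on the same path, which is the kind of behavior a user may
find hard to reason about.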

Option 3 would certainly be easy to use. But after setting up the
guarantees, users would expect them to be satisfied, which Mesos
may not be able to deliver. And when that happens, there is no easy
explanation for why the guarantees are not satisfied.

Option 4 allows and enforces all the guarantees optimally. However, the
performance implications of running the optimization solvers are not
clear. Also, since guarantees are not part of the long-term
plan as we introduce priority tiers, we should ask whether it is worth the
complexity and effort.
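
For reference, one possible way to write the option 4 problem as a linear
feasibility program is sketched below. This is an illustrative formulation
only, not the one in the design doc. All symbols are assumptions: roles r,
disjoint leaf resource classes k (e.g. disk without a profile, disk with a
fast profile) with capacity C_k, guarantees g_j held by role r_j over a set
of classes S_j, limits l_i held by role r_i over a set of classes T_i, and
x_{r,k} for the amount of class k set aside for role r.

```latex
\begin{aligned}
\text{find } \; & x_{r,k} \ge 0 \\
\text{s.t. } \; & \sum_{k \in S_j} x_{r_j,k} \ge g_j
    && \text{for every guarantee } j \\
& \sum_{k \in T_i} x_{r_i,k} \le l_i
    && \text{for every limit } i \\
& \sum_{r} x_{r,k} \le C_k
    && \text{for every resource class } k
\end{aligned}
```

Under these assumptions, every guarantee can be honored exactly when this
system has a solution. The number of variables grows with the number of
roles times the number of resource classes, which is one source of the
performance concern above.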

Option 5 essentially kicks the can down the road, as the use case for
setting quotas on disks with profiles is not immediate. For the MVP, we
could stick to the design proposal and prepare to extend it when the need
arises (likely in the medium term).

Thoughts?

Thanks,
Meng

On Thu, Jan 24, 2019 at 9:58 AM Meng Zhu <mz...@mesosphere.com> wrote:

> After the API WG sync, we want to schedule a follow-up meeting to discuss
> Quota 2.0 further. If you are interested, please join us at 12:30pm PST
> today (Jan 24th) using the Zoom link below. Sorry for the short notice.
>
> -Meng
>
> Join Zoom Meeting: https://zoom.us/j/574632536
>
> One tap mobile:
> +16699006833,,574632536# US (San Jose)
> +16465588656,,574632536# US (New York)
>
> Dial by your location:
> +1 669 900 6833 US (San Jose)
> +1 646 558 8656 US (New York)
>
> Meeting ID: 574 632 536
> Find your local number: https://zoom.us/u/acZYnvuO63

Re: Quota 2.0 proposal

Posted by Meng Zhu <mz...@mesosphere.com>.
After the API WG sync, we want to schedule a follow-up meeting to discuss
Quota 2.0 further. If you are interested, please join us at 12:30pm PST
today (Jan 24th) using the Zoom link below. Sorry for the short notice.

-Meng

Join Zoom Meeting: https://zoom.us/j/574632536

One tap mobile:
+16699006833,,574632536# US (San Jose)
+16465588656,,574632536# US (New York)

Dial by your location:
+1 669 900 6833 US (San Jose)
+1 646 558 8656 US (New York)

Meeting ID: 574 632 536
Find your local number: https://zoom.us/u/acZYnvuO63