Posted to reviews@mesos.apache.org by Meng Zhu <mz...@mesosphere.io> on 2019/05/29 15:02:34 UTC

Review Request 70750: Removed quota role sorter in the allocator.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70750/
-----------------------------------------------------------

Review request for mesos, Andrei Sekretenko and Benjamin Mahler.


Bugs: MESOS-9802
    https://issues.apache.org/jira/browse/MESOS-9802


Repository: mesos


Description
-------

This patch removes the dedicated quota role sorter
in favor of using the same sorter both when satisfying
guarantees and when bursting above guarantees up to limits.
The dedicated quota role sorter is tech debt from when
a "quota role" was considered different from a "non-quota"
role. However, they are the same; a "non-quota" role just has
the default quota.

This helps to simplify the logic in the allocator.
Benchmark results show negligible performance change for
clusters with a small number of roles (e.g. 30 and 300 roles).
For a large number of roles (e.g. 3000 roles), a 5% performance
degradation is observed.

The patch results in a behavior change if a cluster is
using oversubscribed resources with quota under DRF.
Previously, in the quota allocation stage, revocable resources
were counted towards neither the total resource pool nor a role's
allocated resources when sorting with DRF. This is arguably the
right behavior. After this patch, however, all resources,
both revocable and non-revocable, are counted when
calculating DRF shares in the quota allocation stage. This means
that a quota role that consumes a lot of revocable resources
but few non-revocable ones, which previously would be sorted
towards the head of the queue, is now likely to be sorted towards
the tail of the queue.
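
To make the behavior change concrete, here is a minimal sketch of how counting revocable resources can flip the sort order. It is not the allocator's actual code: it assumes a single resource kind (so the DRF dominant share reduces to a plain fraction), and `RoleUsage`, `share` and `sortedRoles` are hypothetical names introduced only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified model: one resource kind (e.g. CPUs) per role.
struct RoleUsage {
  std::string name;
  double nonRevocable;  // Allocated non-revocable resources.
  double revocable;     // Allocated revocable (oversubscribed) resources.
};

// Fraction of the pool held by `role`. When `countRevocable` is false,
// revocable resources are excluded from both the allocation and the total.
double share(
    const RoleUsage& role,
    double totalNonRevocable,
    double totalRevocable,
    bool countRevocable) {
  double allocated =
    role.nonRevocable + (countRevocable ? role.revocable : 0.0);
  double total =
    totalNonRevocable + (countRevocable ? totalRevocable : 0.0);
  assert(total > 0.0);
  return allocated / total;
}

// Returns role names ordered by ascending share (head of the queue first).
std::vector<std::string> sortedRoles(
    std::vector<RoleUsage> roles,
    double totalNonRevocable,
    double totalRevocable,
    bool countRevocable) {
  std::stable_sort(
      roles.begin(),
      roles.end(),
      [&](const RoleUsage& a, const RoleUsage& b) {
        return share(a, totalNonRevocable, totalRevocable, countRevocable) <
               share(b, totalNonRevocable, totalRevocable, countRevocable);
      });

  std::vector<std::string> names;
  for (const RoleUsage& role : roles) {
    names.push_back(role.name);
  }
  return names;
}
```

With a pool of 10 non-revocable and 20 revocable units, a role "a" holding 2 non-revocable and 20 revocable units sorts ahead of a role "b" holding 5 non-revocable units when revocable resources are ignored (0.2 vs. 0.5), but behind it when they are counted (22/30 vs. 5/30).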


Diffs
-----

  src/master/allocator/mesos/hierarchical.hpp 71a9656fb934bf9ac58e3165254ea49cb09efa8b 
  src/master/allocator/mesos/hierarchical.cpp 40c8363afddccdd5275ca06318a8cc2cc6fa21af 


Diff: https://reviews.apache.org/r/70750/diff/1/


Testing
-------

make check

Benchmark using `QuotaParam/BENCHMARK_HierarchicalAllocator_WithQuotaParam.LargeAndSmallQuota` with random sorter shows:

No performance change for 30 and 300 roles
About 5% performance degradation for 3000 roles.


Thanks,

Meng Zhu


Re: Review Request 70750: Removed quota role sorter in the allocator.

Posted by Meng Zhu <mz...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70750/
-----------------------------------------------------------

(Updated June 2, 2019, 5:15 p.m.)


Review request for mesos, Andrei Sekretenko and Benjamin Mahler.


Changes
-------

Updated benchmark result per Ben's suggestion.


Bugs: MESOS-9802
    https://issues.apache.org/jira/browse/MESOS-9802


Repository: mesos


Description
-------

This patch removes the dedicated quota role sorter
in favor of using the same sorter both when satisfying
guarantees and when bursting above guarantees up to limits.
The dedicated quota role sorter is tech debt from when
a "quota role" was considered different from a "non-quota"
role. However, they are the same; a "non-quota" role just has
the default quota.

This helps to simplify the logic in the allocator.
Benchmark results show negligible performance change for
clusters with a small number of roles (e.g. 30 and 300 roles).
For a large number of roles (e.g. 3000 roles), a 5% performance
degradation is observed.

The patch results in a behavior change if a cluster is
using oversubscribed resources with quota under DRF.
Previously, in the quota allocation stage, revocable resources
were counted towards neither the total resource pool nor a role's
allocated resources when sorting with DRF. This is arguably the
right behavior. After this patch, however, all resources,
both revocable and non-revocable, are counted when
calculating DRF shares in the quota allocation stage. This means
that a quota role that consumes a lot of revocable resources
but few non-revocable ones, which previously would be sorted
towards the head of the queue, is now likely to be sorted towards
the tail of the queue.


Diffs
-----

  src/master/allocator/mesos/hierarchical.hpp 71a9656fb934bf9ac58e3165254ea49cb09efa8b 
  src/master/allocator/mesos/hierarchical.cpp 40c8363afddccdd5275ca06318a8cc2cc6fa21af 


Diff: https://reviews.apache.org/r/70750/diff/1/


Testing (updated)
-------

make check

Benchmark using `NonQuotaVsQuotaParam/BENCHMARK_HierarchicalAllocator_WithNonQuotaVsQuotaParam.NonQuotaVsQuota/10` with random sorter shows:

**50% performance slowdown** with 2000 agents, 1000 roles, and 2000 frameworks.


Thanks,

Meng Zhu


Re: Review Request 70750: Removed quota role sorter in the allocator.

Posted by Benjamin Mahler <bm...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70750/#review215582
-----------------------------------------------------------


Ship it!




Can you run a benchmark where there are a lot of roles without any quota set? That would show the worst-case behavior of this change: the first loop will have to iterate over all agents, call sort for each, and skip every role.

- Benjamin Mahler
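
The worst case described above can be sketched roughly as follows. This is an illustrative model only, not the code in `hierarchical.cpp`: `Role`, `LoopStats` and `firstStage` are hypothetical names, and the sketch assumes the first stage sorts all roles once per agent and then skips every role that has no quota guarantee.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified role: a sort key plus whether a guarantee is set.
struct Role {
  double share;
  bool hasQuotaGuarantee;
};

// Counters describing how much work the modelled first stage performed.
struct LoopStats {
  std::size_t sortCalls = 0;
  std::size_t rolesSkipped = 0;
  std::size_t rolesConsidered = 0;
};

// Models the first (guarantee) stage: for every agent, sort all roles and
// then walk them, skipping any role that has no quota guarantee.
LoopStats firstStage(std::size_t agentCount, std::vector<Role> roles) {
  LoopStats stats;

  for (std::size_t agent = 0; agent < agentCount; ++agent) {
    // With a shared sorter, every role is sorted, not only quota roles.
    std::sort(
        roles.begin(),
        roles.end(),
        [](const Role& a, const Role& b) { return a.share < b.share; });
    ++stats.sortCalls;

    for (const Role& role : roles) {
      if (!role.hasQuotaGuarantee) {
        ++stats.rolesSkipped;  // Pure overhead: nothing is allocated here.
        continue;
      }
      ++stats.rolesConsidered;
    }
  }

  return stats;
}
```

In this model, when no role has quota set, the stage still performs one sort per agent and visits `agentCount * roles.size()` roles without allocating anything, which is the overhead such a benchmark would expose.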

