Posted to reviews@mesos.apache.org by Benjamin Mahler <bm...@apache.org> on 2018/09/04 04:59:59 UTC
Re: Review Request 68490: Optimized `class Resources` with copy-on-write.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68490/#review208286
-----------------------------------------------------------
Thanks! Can you also include some commentary on the rest of the allocation benchmarks as well as the overhead within the Resources micro-benchmarks?
include/mesos/resources.hpp
Lines 590-591 (patched)
<https://reviews.apache.org/r/68490/#comment292011>
```
// We use `boost::indirect_iterator` to expose `Resource_` iteration,
// while actually storing `shared_ptr<Resource_>`.
```
include/mesos/resources.hpp
Lines 592 (patched)
<https://reviews.apache.org/r/68490/#comment292010>
Remove the newline?
include/mesos/resources.hpp
Lines 588-590 (original), 597-599 (patched)
<https://reviews.apache.org/r/68490/#comment292009>
This comment now looks a little weird, how about:
```
// NOTE: Non-const `begin()` and `end()` intentionally return const
// iterators to prevent mutable access to the `Resource` objects.
```
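For readers without the diff at hand, the two comments suggested above can be illustrated with a minimal sketch. It is not the patch itself: `Resource_` and `Resources` here are simplified hypothetical stand-ins, and a small hand-rolled adapter plays the role that `boost::indirect_iterator` plays in the patch, so the example only needs the standard library.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for the real `Resource_` wrapper.
struct Resource_ {
  std::string name;
  double value;
};

// Hypothetical container: stores `shared_ptr<Resource_>` internally but
// exposes iteration over `const Resource_&`.
class Resources {
public:
  // Hand-rolled dereferencing adapter (assumption: stands in for
  // `boost::indirect_iterator`, which the patch uses instead).
  class const_iterator {
  public:
    using Inner = std::vector<std::shared_ptr<Resource_>>::const_iterator;

    explicit const_iterator(Inner it) : it_(it) {}

    // Dereference through the pointer; only const access is exposed.
    const Resource_& operator*() const { return **it_; }
    const Resource_* operator->() const { return it_->get(); }

    const_iterator& operator++() { ++it_; return *this; }
    bool operator==(const const_iterator& that) const { return it_ == that.it_; }
    bool operator!=(const const_iterator& that) const { return it_ != that.it_; }

  private:
    Inner it_;
  };

  void add(const Resource_& r) {
    items_.push_back(std::make_shared<Resource_>(r));
  }

  // NOTE: Non-const `begin()` and `end()` intentionally return const
  // iterators to prevent mutable access to the stored objects.
  const_iterator begin() { return const_iterator(items_.cbegin()); }
  const_iterator end() { return const_iterator(items_.cend()); }
  const_iterator begin() const { return const_iterator(items_.cbegin()); }
  const_iterator end() const { return const_iterator(items_.cend()); }

private:
  std::vector<std::shared_ptr<Resource_>> items_;
};

// Callers iterate over objects and never see the pointers.
double total(Resources& resources) {
  double sum = 0.0;
  for (const Resource_& r : resources) {
    sum += r.value;
  }
  return sum;
}
```

Because even the non-const overloads hand out const iterators, a caller cannot mutate an object that might be shared with another `Resources` instance.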
include/mesos/resources.hpp
Lines 678-679 (patched)
<https://reviews.apache.org/r/68490/#comment292012>
What's this referring to? Reading this I see it already has an rvalue reference?
include/mesos/resources.hpp
Lines 682 (patched)
<https://reviews.apache.org/r/68490/#comment292017>
Why do we need this? It seems like the callers should just be de-referencing their shared_ptr prior to calling?
include/mesos/resources.hpp
Line 676 (original), 691-693 (patched)
<https://reviews.apache.org/r/68490/#comment292013>
A comment would be helpful for the reader:
```
// Resources are stored using copy-on-write:
//
//   (1) Copies are done by copying the `shared_ptr`. This
//       makes read-only filtering (e.g. `unreserved()`)
//       inexpensive as we do not have to perform copies
//       of the resource objects.
//
//   (2) When a write occurs:
//       (a) If there's a single reference to the resource
//           object, we mutate directly.
//       (b) If there's more than a single reference to the
//           resource object, we copy first, then mutate the
//           copy.
```
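The scheme in the comment above can be sketched as follows. This is only an illustration under stated assumptions: the `Resource_` and `Resources` types are simplified hypothetical stand-ins, and `filterByRole`, `setRole`, `at`, `size` and `useCount` are made-up names that do not appear in the patch.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for the real `Resource_` wrapper.
struct Resource_ {
  std::string role;
  double value;
};

class Resources {
public:
  void add(std::string role, double value) {
    items_.push_back(
        std::make_shared<Resource_>(Resource_{std::move(role), value}));
  }

  // (1) Read-only filtering copies the `shared_ptr`s, not the objects.
  Resources filterByRole(const std::string& role) const {
    Resources result;
    for (const std::shared_ptr<Resource_>& item : items_) {
      if (item->role == role) {
        result.items_.push_back(item);
      }
    }
    return result;
  }

  // (2) Writes mutate directly when we hold the only reference, and
  // copy first when the object is shared.
  void setRole(const std::string& role) {
    for (std::shared_ptr<Resource_>& item : items_) {
      if (item.use_count() > 1) {
        item = std::make_shared<Resource_>(*item);
      }
      item->role = role;
    }
  }

  const Resource_& at(std::size_t i) const { return *items_.at(i); }
  std::size_t size() const { return items_.size(); }

  // Exposed only so that the example can observe sharing.
  long useCount(std::size_t i) const { return items_.at(i).use_count(); }

private:
  std::vector<std::shared_ptr<Resource_>> items_;
};
```

Filtering leaves the filtered object shared between both containers. A later `setRole` on the filtered copy detaches it and leaves the original untouched. Note that `use_count()` is only a reliable ownership check when the pointers are not being copied concurrently on other threads.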
src/common/resources.cpp
Line 1471 (original), 1473 (patched)
<https://reviews.apache.org/r/68490/#comment292020>
I'm a little puzzled about these `const shared_ptr<...>` foreach loops. Why don't the const ones do `const Resource_&` loops?
```
foreach (const Resource_& resource_, that) {
```
Something I'm missing?
src/common/resources.cpp
Line 1503 (original), 1505 (patched)
<https://reviews.apache.org/r/68490/#comment292014>
It seems ok to slip this in here, but generally please send unrelated cleanups as separate patches. :)
src/common/resources.cpp
Line 1514 (original), 1516-1518 (patched)
<https://reviews.apache.org/r/68490/#comment292016>
I'm not sure how obvious these will be to readers, maybe a comment on all of these?
```
// Copy-on-write (if more than 1 reference).
if (resource_.use_count() > 1) {
  resource_ = make_shared<Resource_>(*resource_);
}
```
Maybe something like (since we can't use `mutable`):
```
make_mutable(resource_);
```
Maybe also make it composable?
```
make_mutable(resource_)->resource.mutable_allocation_info()->set_role(role);
```
Still, seems rather error prone and easy to forget to check for copies prior to mutating? Wonder if there's a way to make it less brittle.
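One possible shape for the composable `make_mutable` floated above is sketched here. It is an assumption, not code from the patch: the helper's signature and the simplified `Resource_` type are hypothetical, and the pointer is assumed to be non-null.

```cpp
#include <memory>
#include <string>

// Hypothetical, simplified stand-in for the real `Resource_` wrapper.
struct Resource_ {
  std::string role;
};

// Hypothetical helper: makes `ptr` the sole owner of its object,
// copying only if the object is currently shared, and returns the
// pointer so that a mutation can be chained onto the call.
// Assumes `ptr` is non-null.
template <typename T>
std::shared_ptr<T>& make_mutable(std::shared_ptr<T>& ptr) {
  // Copy-on-write (if more than 1 reference).
  if (ptr.use_count() > 1) {
    ptr = std::make_shared<T>(*ptr);
  }
  return ptr;
}
```

A call site would then read `make_mutable(resource_)->role = "web";`. This does not remove the brittleness noted above, since nothing stops a caller from mutating through the pointer without calling the helper first.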
src/common/resources.cpp
Lines 1543-1544 (patched)
<https://reviews.apache.org/r/68490/#comment292015>
Interesting suggestion!
src/common/resources.cpp
Lines 1667-1670 (patched)
<https://reviews.apache.org/r/68490/#comment292018>
Seems odd, why don't we use `Resource` directly here?
```
Resource r = *resource_;
r.clear_reservations();
result.add(std::move(r));
```
(I left a comment above about removing the add that takes a shared_ptr, since it doesn't seem needed?)
src/common/resources.cpp
Line 1676 (original), 1692 (patched)
<https://reviews.apache.org/r/68490/#comment292019>
Feel free to pull some of these unrelated improvements out in front of this, it would help make the diff easier for reviewers.
src/common/resources.cpp
Lines 1924-1925 (original), 1940-1941 (patched)
<https://reviews.apache.org/r/68490/#comment292021>
Ditto here and elsewhere for the question above about const shared pointer loops vs `const Resource_&` loops
src/common/resources.cpp
Line 1954 (original), 1972-1979 (patched)
<https://reviews.apache.org/r/68490/#comment292022>
Why is this needed? Isn't the point of this change that toUnreserved avoids copying now?
src/common/resources.cpp
Lines 1958-1963 (original), 1983-1993 (patched)
<https://reviews.apache.org/r/68490/#comment292023>
Hm.. is there a way to simplify this? This seems to be just adding in the reserved resources? Could we call pushReservation or something instead of touching the resources field directly?
Right now `find()` is not performance critical, so it doesn't need to be optimal if it's simpler not to be.
- Benjamin Mahler
On Aug. 23, 2018, 10:13 p.m., Meng Zhu wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68490/
> -----------------------------------------------------------
>
> (Updated Aug. 23, 2018, 10:13 p.m.)
>
>
> Review request for mesos and Benjamin Mahler.
>
>
> Bugs: MESOS-6765
> https://issues.apache.org/jira/browse/MESOS-6765
>
>
> Repository: mesos
>
>
> Description
> -------
>
> This patch makes `class Resources` store only shared
> pointers to the underlying resource objects, so that
> read-only filter operations such as `reserved()`,
> `unreserved()`, etc. can avoid copying the whole
> resource objects. Instead, only the shared pointers
> are copied.
>
> In write operations, we check whether there is more than
> one reference to the resource object. If so, a copy is made
> so that the mutation does not affect the other owners.
>
> To maintain the usage abstraction that `class Resources`
> still holds resource objects, we use the
> `boost::indirect_iterator` iterator adapter to dereference
> the shared pointers as we iterate.
>
>
> Diffs
> -----
>
> include/mesos/resources.hpp 6f81b14f8bc090a144eeae8f15639c370366166d
> include/mesos/v1/resources.hpp 09110530da16678abf6bf6b308906dd8ccc8180a
> src/common/resources.cpp 0110c0ee3e810ad1c29dfa5507b13ebd5d0222a2
> src/v1/resources.cpp 228a7327ffe7934d37b56ee67b8be9ae1e119ca8
>
>
> Diff: https://reviews.apache.org/r/68490/diff/1/
>
>
> Testing
> -------
>
> make check
>
> Did a quick test on Mac with an optimized build, running the benchmark `HierarchicalAllocator_BENCHMARK_Test.ResourceLabels`. The "before" and "after" results are below. Note that DVFS was not fixed, and this was only a partial run to verify the validity of the patch; a full evaluation is coming soon.
>
> **Overall, a 33% performance improvement for the 1st setup (1000 agents and 1 framework) and a 32% improvement for the first 50 iterations of the 2nd setup (1000 agents and 50 frameworks).**
>
> Before:
>
> [ RUN ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.ResourceLabels/0
>
> Using 1000 agents and 1 frameworks
> Added 1 frameworks in 1.220022ms
> Added 1000 agents in 465.045382ms
> round 0 allocate() took 54.803773ms to make 0 offers after filtering 1000 offers
> round 1 allocate() took 66.690456ms to make 0 offers after filtering 1000 offers
>
> Using 1000 agents and 50 frameworks
> Added 50 frameworks in 1.40774ms
> Added 1000 agents in 460.562488ms
> round 0 allocate() took 155.500548ms to make 1000 offers after filtering 1000 offers
> round 1 allocate() took 155.194986ms to make 1000 offers after filtering 2000 offers
> round 2 allocate() took 160.706712ms to make 1000 offers after filtering 3000 offers
> round 3 allocate() took 156.701883ms to make 1000 offers after filtering 4000 offers
> round 4 allocate() took 164.246987ms to make 1000 offers after filtering 5000 offers
> round 5 allocate() took 160.791809ms to make 1000 offers after filtering 6000 offers
> round 6 allocate() took 166.064652ms to make 1000 offers after filtering 7000 offers
> round 7 allocate() took 162.177213ms to make 1000 offers after filtering 8000 offers
> round 8 allocate() took 166.620163ms to make 1000 offers after filtering 9000 offers
> round 9 allocate() took 167.970045ms to make 1000 offers after filtering 10000 offers
> round 10 allocate() took 166.189463ms to make 1000 offers after filtering 11000 offers
> round 11 allocate() took 170.863462ms to make 1000 offers after filtering 12000 offers
> round 12 allocate() took 172.260473ms to make 1000 offers after filtering 13000 offers
> round 13 allocate() took 170.553911ms to make 1000 offers after filtering 14000 offers
> round 14 allocate() took 235.764593ms to make 1000 offers after filtering 15000 offers
> round 15 allocate() took 247.250433ms to make 1000 offers after filtering 16000 offers
> round 16 allocate() took 276.932824ms to make 1000 offers after filtering 17000 offers
> round 17 allocate() took 248.888469ms to make 1000 offers after filtering 18000 offers
> round 18 allocate() took 193.890556ms to make 1000 offers after filtering 19000 offers
> round 19 allocate() took 195.105346ms to make 1000 offers after filtering 20000 offers
> round 20 allocate() took 194.447428ms to make 1000 offers after filtering 21000 offers
> round 21 allocate() took 205.486287ms to make 1000 offers after filtering 22000 offers
> round 22 allocate() took 199.241922ms to make 1000 offers after filtering 23000 offers
> round 23 allocate() took 200.885488ms to make 1000 offers after filtering 24000 offers
> round 24 allocate() took 216.132361ms to make 1000 offers after filtering 25000 offers
> round 25 allocate() took 210.638273ms to make 1000 offers after filtering 26000 offers
> round 26 allocate() took 232.397778ms to make 1000 offers after filtering 27000 offers
> round 27 allocate() took 239.633708ms to make 1000 offers after filtering 28000 offers
> round 28 allocate() took 225.829677ms to make 1000 offers after filtering 29000 offers
> round 29 allocate() took 245.143272ms to make 1000 offers after filtering 30000 offers
> round 30 allocate() took 248.630295ms to make 1000 offers after filtering 31000 offers
> round 31 allocate() took 262.147804ms to make 1000 offers after filtering 32000 offers
> round 32 allocate() took 251.109969ms to make 1000 offers after filtering 33000 offers
> round 33 allocate() took 263.92273ms to make 1000 offers after filtering 34000 offers
> round 34 allocate() took 268.422275ms to make 1000 offers after filtering 35000 offers
> round 35 allocate() took 273.830163ms to make 1000 offers after filtering 36000 offers
> round 36 allocate() took 283.25481ms to make 1000 offers after filtering 37000 offers
> round 37 allocate() took 366.400781ms to make 1000 offers after filtering 38000 offers
> round 38 allocate() took 382.202755ms to make 1000 offers after filtering 39000 offers
> round 39 allocate() took 337.266121ms to make 1000 offers after filtering 40000 offers
> round 40 allocate() took 381.033696ms to make 1000 offers after filtering 41000 offers
> round 41 allocate() took 365.941946ms to make 1000 offers after filtering 42000 offers
> round 42 allocate() took 379.527886ms to make 1000 offers after filtering 43000 offers
> round 43 allocate() took 425.661181ms to make 1000 offers after filtering 44000 offers
> round 44 allocate() took 455.86657ms to make 1000 offers after filtering 45000 offers
> round 45 allocate() took 506.943117ms to make 1000 offers after filtering 46000 offers
> round 46 allocate() took 681.880233ms to make 1000 offers after filtering 47000 offers
> round 47 allocate() took 860.932974ms to make 1000 offers after filtering 48000 offers
> round 48 allocate() took 960.272209ms to make 1000 offers after filtering 49000 offers
> round 49 allocate() took 1.907427057secs to make 0 offers after filtering 50000 offers
> round 50 allocate() took 2.157864654secs to make 0 offers after filtering 50000 offers
>
>
> After:
>
> Using 1000 agents and 1 frameworks
> Added 1 frameworks in 404709ns
> Added 1000 agents in 245.162138ms
> round 0 allocate() took 38.498883ms to make 0 offers after filtering 1000 offers
> round 1 allocate() took 42.234898ms to make 0 offers after filtering 1000 offers
>
> Using 1000 agents and 50 frameworks
> Added 50 frameworks in 1.441281ms
> Added 1000 agents in 261.923711ms
> round 0 allocate() took 121.706074ms to make 1000 offers after filtering 1000 offers
> round 1 allocate() took 150.465925ms to make 1000 offers after filtering 2000 offers
> round 2 allocate() took 273.549893ms to make 1000 offers after filtering 3000 offers
> round 3 allocate() took 199.248799ms to make 1000 offers after filtering 4000 offers
> round 4 allocate() took 133.865752ms to make 1000 offers after filtering 5000 offers
> round 5 allocate() took 130.828599ms to make 1000 offers after filtering 6000 offers
> round 6 allocate() took 134.385597ms to make 1000 offers after filtering 7000 offers
> round 7 allocate() took 135.810198ms to make 1000 offers after filtering 8000 offers
> round 8 allocate() took 126.128592ms to make 1000 offers after filtering 9000 offers
> round 9 allocate() took 144.794829ms to make 1000 offers after filtering 10000 offers
> round 10 allocate() took 162.533506ms to make 1000 offers after filtering 11000 offers
> round 11 allocate() took 159.22141ms to make 1000 offers after filtering 12000 offers
> round 12 allocate() took 174.739823ms to make 1000 offers after filtering 13000 offers
> round 13 allocate() took 171.095423ms to make 1000 offers after filtering 14000 offers
> round 14 allocate() took 186.876661ms to make 1000 offers after filtering 15000 offers
> round 15 allocate() took 177.021603ms to make 1000 offers after filtering 16000 offers
> round 16 allocate() took 165.970722ms to make 1000 offers after filtering 17000 offers
> round 17 allocate() took 162.016338ms to make 1000 offers after filtering 18000 offers
> round 18 allocate() took 138.698917ms to make 1000 offers after filtering 19000 offers
> round 19 allocate() took 144.556913ms to make 1000 offers after filtering 20000 offers
> round 20 allocate() took 155.689926ms to make 1000 offers after filtering 21000 offers
> round 21 allocate() took 149.952025ms to make 1000 offers after filtering 22000 offers
> round 22 allocate() took 135.98823ms to make 1000 offers after filtering 23000 offers
> round 23 allocate() took 132.520992ms to make 1000 offers after filtering 24000 offers
> round 24 allocate() took 143.325635ms to make 1000 offers after filtering 25000 offers
> round 25 allocate() took 153.313423ms to make 1000 offers after filtering 26000 offers
> round 26 allocate() took 169.889066ms to make 1000 offers after filtering 27000 offers
> round 27 allocate() took 188.969694ms to make 1000 offers after filtering 28000 offers
> round 28 allocate() took 176.132259ms to make 1000 offers after filtering 29000 offers
> round 29 allocate() took 186.754676ms to make 1000 offers after filtering 30000 offers
> round 30 allocate() took 166.346508ms to make 1000 offers after filtering 31000 offers
> round 31 allocate() took 172.557665ms to make 1000 offers after filtering 32000 offers
> round 32 allocate() took 169.874406ms to make 1000 offers after filtering 33000 offers
> round 33 allocate() took 190.470692ms to make 1000 offers after filtering 34000 offers
> round 34 allocate() took 184.328221ms to make 1000 offers after filtering 35000 offers
> round 35 allocate() took 222.081892ms to make 1000 offers after filtering 36000 offers
> round 36 allocate() took 203.134216ms to make 1000 offers after filtering 37000 offers
> round 37 allocate() took 217.490016ms to make 1000 offers after filtering 38000 offers
> round 38 allocate() took 257.449904ms to make 1000 offers after filtering 39000 offers
> round 39 allocate() took 252.468529ms to make 1000 offers after filtering 40000 offers
> round 40 allocate() took 229.433398ms to make 1000 offers after filtering 41000 offers
> round 41 allocate() took 251.920859ms to make 1000 offers after filtering 42000 offers
> round 42 allocate() took 273.529747ms to make 1000 offers after filtering 43000 offers
> round 43 allocate() took 315.08445ms to make 1000 offers after filtering 44000 offers
> round 44 allocate() took 354.758003ms to make 1000 offers after filtering 45000 offers
> round 45 allocate() took 350.378463ms to make 1000 offers after filtering 46000 offers
> round 46 allocate() took 415.070355ms to make 1000 offers after filtering 47000 offers
> round 47 allocate() took 519.922944ms to make 1000 offers after filtering 48000 offers
> round 48 allocate() took 710.598546ms to make 1000 offers after filtering 49000 offers
> round 49 allocate() took 1.267395251secs to make 0 offers after filtering 50000 offers
> round 50 allocate() took 1.210188235secs to make 0 offers after filtering 50000 offers
>
>
> Thanks,
>
> Meng Zhu
>
>
Re: Review Request 68490: Optimized `class Resources` with copy-on-write.
Posted by Meng Zhu <mz...@mesosphere.io>.
> On Sept. 3, 2018, 9:59 p.m., Benjamin Mahler wrote:
> > Thanks! Can you also include some commentary on the rest of the allocation benchmarks as well as the overhead within the Resources micro-benchmarks?
Posted the updated patches. Will summarize and post more comprehensive results soon.
> On Sept. 3, 2018, 9:59 p.m., Benjamin Mahler wrote:
> > include/mesos/resources.hpp
> > Lines 678-679 (patched)
> > <https://reviews.apache.org/r/68490/diff/1/?file=2076881#file2076881line680>
> >
> > What's this referring to? Reading this I see it already has an rvalue reference?
Referring to `void add(std::shared_ptr<Resource_>&& that);`. Moved the comment down to clarify it.
> On Sept. 3, 2018, 9:59 p.m., Benjamin Mahler wrote:
> > include/mesos/resources.hpp
> > Lines 682 (patched)
> > <https://reviews.apache.org/r/68490/diff/1/?file=2076881#file2076881line684>
> >
> > Why do we need this? It seems like the callers should just be de-referencing their shared_ptr prior to calling?
Consider adding two non-mergeable `Resources` (e.g. 1 cpu + 1 mem). If we only have `Resources::add(const Resource_& that)`, we will always need to `make_shared` a new `Resource_`. If we pass the `shared_ptr` directly, we can just `push_back` it. So for any non-mergeable `Resource_`, we save one `make_shared`.
One rule of thumb I arrived at is that we should avoid `make_shared` when possible (because it makes a copy). This means that if we already have a `shared_ptr`, we should just use it and pass it around. This also explains why I use `shared_ptr` in the foreach loops.
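To make the trade-off concrete, here is a minimal sketch of the two `add` flavors for the non-mergeable case. It rests on several assumptions: the types are simplified hypothetical stand-ins, merging of compatible resources is omitted entirely, the pointer overload takes its argument by value rather than by `const&` / `&&` as in the patch, and `ptr` is a made-up accessor used only to observe sharing.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for the real `Resource_` wrapper.
struct Resource_ {
  std::string name;
  double value;
};

class Resources {
public:
  // Taking the object forces a `make_shared`, i.e. one allocation
  // plus a copy of the object.
  void add(const Resource_& that) {
    items_.push_back(std::make_shared<Resource_>(that));
  }

  // Taking the pointer shares the existing object: only the
  // reference count changes.
  void add(std::shared_ptr<Resource_> that) {
    items_.push_back(std::move(that));
  }

  // Exposed only so that the example can observe sharing.
  const std::shared_ptr<Resource_>& ptr(std::size_t i) const {
    return items_.at(i);
  }

private:
  std::vector<std::shared_ptr<Resource_>> items_;
};
```

With the pointer overload, adding an already-shared resource to a second `Resources` leaves both containers pointing at the same object. With the by-reference overload, the second container ends up with its own copy.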
> On Sept. 3, 2018, 9:59 p.m., Benjamin Mahler wrote:
> > src/common/resources.cpp
> > Line 1471 (original), 1473 (patched)
> > <https://reviews.apache.org/r/68490/diff/1/?file=2076883#file2076883line1473>
> >
> > I'm a little puzzled about these `const shared_ptr<...>` foreach loops. Why don't the const ones do `const Resource_&` loops?
> >
> > ```
> > foreach (const Resource_& resource_, that) {
> > ```
> >
> > Something I'm missing?
See my comment above regarding `add(const std::shared_ptr<Resource_>& that)`.
Actual benefits aside, I also want to be consistent: inside `class Resources`, we speak `shared_ptr` unless we want to make an implicit copy, as in `push/popReservation`.
> On Sept. 3, 2018, 9:59 p.m., Benjamin Mahler wrote:
> > src/common/resources.cpp
> > Line 1503 (original), 1505 (patched)
> > <https://reviews.apache.org/r/68490/diff/1/?file=2076883#file2076883line1505>
> >
> > It seems ok to slip this in here, but generally please send unrelated cleanups as separate patches. :)
Got it.
> On Sept. 3, 2018, 9:59 p.m., Benjamin Mahler wrote:
> > src/common/resources.cpp
> > Line 1514 (original), 1516-1518 (patched)
> > <https://reviews.apache.org/r/68490/diff/1/?file=2076883#file2076883line1516>
> >
> > I'm not sure how obvious these will be to readers, maybe a comment on all of these?
> >
> > ```
> > // Copy-on-write (if more than 1 reference).
> > if (resource_.use_count() > 1) {
> >   resource_ = make_shared<Resource_>(*resource_);
> > }
> > ```
> >
> > Maybe something like (since we can't use `mutable`):
> >
> > ```
> > make_mutable(resource_);
> > ```
> >
> > Maybe also make it composable?
> >
> > ```
> > make_mutable(resource_)->resource.mutable_allocation_info()->set_role(role);
> > ```
> >
> > Still, seems rather error prone and easy to forget to check for copies prior to mutating? Wonder if there's a way to make it less brittle.
Added the comment. Added a TODO regarding introducing a more controlled mutation interface.
> On Sept. 3, 2018, 9:59 p.m., Benjamin Mahler wrote:
> > src/common/resources.cpp
> > Lines 1667-1670 (patched)
> > <https://reviews.apache.org/r/68490/diff/1/?file=2076883#file2076883line1667>
> >
> > Seems odd, why don't we use `Resource` directly here?
> >
> > ```
> > Resource r = *resource_;
> > r.clear_reservations();
> >
> > result.add(std::move(r));
> > ```
> >
> > (I left a comment above about removing the add that takes a shared_ptr, since it doesn't seem needed?)
See my comments regarding `add` shared_ptr above.
> On Sept. 3, 2018, 9:59 p.m., Benjamin Mahler wrote:
> > src/common/resources.cpp
> > Line 1676 (original), 1692 (patched)
> > <https://reviews.apache.org/r/68490/diff/1/?file=2076883#file2076883line1692>
> >
> > Feel free to pull some of these unrelated improvements out in front of this, it would help make the diff easier for reviewers.
Got it.
> On Sept. 3, 2018, 9:59 p.m., Benjamin Mahler wrote:
> > src/common/resources.cpp
> > Line 1954 (original), 1972-1979 (patched)
> > <https://reviews.apache.org/r/68490/diff/1/?file=2076883#file2076883line1972>
> >
> > Why is this needed? Isn't the point of this change that toUnreserved avoids copying now?
You are right, we can just do
```
Resources unreserved;
unreserved.add(resource_);
unreserved = unreserved.toUnreserved();
```
> On Sept. 3, 2018, 9:59 p.m., Benjamin Mahler wrote:
> > src/common/resources.cpp
> > Lines 1958-1963 (original), 1983-1993 (patched)
> > <https://reviews.apache.org/r/68490/diff/1/?file=2076883#file2076883line1983>
> >
> > Hm.. is there a way to simplify this? This seems to be just adding in the reserved resources? Could we call pushReservation or something instead of touching the resources field directly?
> >
> > Right now `find()` is not performance critical, so it doesn't need to be optimal if it's simpler not to be.
Done. We will just make a copy then.
- Meng
On Sept. 5, 2018, 12:17 p.m., Meng Zhu wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68490/
> -----------------------------------------------------------
>
> (Updated Sept. 5, 2018, 12:17 p.m.)
>
>
> Review request for mesos, Benjamin Mahler and Gastón Kleiman.
>
>
> Bugs: MESOS-6765
> https://issues.apache.org/jira/browse/MESOS-6765
>
>
> Repository: mesos
>
>
> Description
> -------
>
> This patch makes `class Resources` store only shared
> pointers to the underlying resource objects, so that
> read-only filter operations such as `reserved()`,
> `unreserved()`, etc. can avoid copying the whole
> resource objects. Instead, only the shared pointers
> are copied.
>
> In write operations, we check whether there is more than
> one reference to the resource object. If so, a copy is made
> so that the mutation does not affect the other owners.
>
> To maintain the usage abstraction that `class Resources`
> still holds resource objects, we use the
> `boost::indirect_iterator` iterator adapter to dereference
> the shared pointers as we iterate.
>
>
> Diffs
> -----
>
> include/mesos/resources.hpp 6f81b14f8bc090a144eeae8f15639c370366166d
> include/mesos/v1/resources.hpp 09110530da16678abf6bf6b308906dd8ccc8180a
> src/common/resources.cpp 3e63cdedb9261970dbeb9bb9f97eed65819f68a7
> src/v1/resources.cpp 3683a331e0859cd6f2ad061db6ba67112ecfcb0d
>
>
> Diff: https://reviews.apache.org/r/68490/diff/2/
>
>
>
> Thanks,
>
> Meng Zhu
>
>