You are viewing a plain text version of this content. The canonical link for it is here.

Posted to reviews@mesos.apache.org by Benjamin Mahler <bm...@apache.org> on 2018/09/04 04:59:59 UTC

Re: Review Request 68490: Optimized `class Resources` with copy-on-write.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68490/#review208286
-----------------------------------------------------------



Thanks! Can you also include some commentary on the rest of the allocation benchmarks as well as the overhead within the Resources micro-benchmarks?


include/mesos/resources.hpp
Lines 590-591 (patched)
<https://reviews.apache.org/r/68490/#comment292011>

    ```
      // We use `boost::indirect_iterator` to expose `Resource_` iteration,
      // while actually storing `shared_ptr<Resource_>`.
    ```



include/mesos/resources.hpp
Lines 592 (patched)
<https://reviews.apache.org/r/68490/#comment292010>

    Remove the newline?



include/mesos/resources.hpp
Lines 588-590 (original), 597-599 (patched)
<https://reviews.apache.org/r/68490/#comment292009>

    This comment now looks a little weird, how about:
    
    ```
      // NOTE: Non-const `begin()` and `end()` intentionally return const
      // iterators to prevent mutable access to the `Resource` objects.
    ```



include/mesos/resources.hpp
Lines 678-679 (patched)
<https://reviews.apache.org/r/68490/#comment292012>

    What's this referring to? Reading this I see it already has an rvalue reference?



include/mesos/resources.hpp
Lines 682 (patched)
<https://reviews.apache.org/r/68490/#comment292017>

    Why do we need this? It seems like the callers should just be de-referencing their shared_ptr prior to calling?



include/mesos/resources.hpp
Line 676 (original), 691-693 (patched)
<https://reviews.apache.org/r/68490/#comment292013>

    A comment would be helpful for the reader:
    
    ```
    // Resources are stored using copy-on-write:
    //
    //   (1) Copies are done by copying the `shared_ptr`. This
    //       makes read-only filtering (e.g. `unreserved()`)
    //       inexpensive as we do not have to perform copies
    //       of the resource objects.
    //
    //   (2) When a write occurs:
    //      (a) If there's a single reference to the resource
    //          object, we mutate directly.
    //      (b) If there's more than a single reference to the
    //          resource object, we copy first, then mutate the
    //          copy.
    ```



src/common/resources.cpp
Line 1471 (original), 1473 (patched)
<https://reviews.apache.org/r/68490/#comment292020>

    I'm a little puzzled about these `const shared_ptr<...>` foreach loops. Why don't the const ones do `const Resource_&` loops?
    
    ```
    foreach (const Resource_& resource_, that) {
    ```
    
    Something I'm missing?



src/common/resources.cpp
Line 1503 (original), 1505 (patched)
<https://reviews.apache.org/r/68490/#comment292014>

    It seems ok to slip this in here, but generally please send unrelated cleanups as separate patches. :)



src/common/resources.cpp
Line 1514 (original), 1516-1518 (patched)
<https://reviews.apache.org/r/68490/#comment292016>

    I'm not sure how obvious these will be to readers, maybe a comment on all of these?
    
    ```
    // Copy-on-write (if more than 1 reference).
    if (resource_.use_count() > 1) {
      resource_ = make_shared<Resource_>(*resource_);
    }
    ```
    
    Maybe something like (since we can't use `mutable`):
    
    ```
    make_mutable(resource_);
    ```
    
    Maybe also make it composable?
    
    ```
    make_mutable(resource_)->resource.mutable_allocation_info()->set_role(role);
    ```
    
    Still, seems rather error prone and easy to forget to check for copies prior to mutating? Wonder if there's a way to make it less brittle.



src/common/resources.cpp
Lines 1543-1544 (patched)
<https://reviews.apache.org/r/68490/#comment292015>

    Interesting suggestion!



src/common/resources.cpp
Lines 1667-1670 (patched)
<https://reviews.apache.org/r/68490/#comment292018>

    Seems odd, why don't we use `Resource` directly here?
    
    ```
          Resource r = *resource_;
          r.clear_reservations();
    
          result.add(std::move(r));
    ```
    
    (I left a comment above about removing the add that takes a shared_ptr, since it doesn't seem needed?)



src/common/resources.cpp
Line 1676 (original), 1692 (patched)
<https://reviews.apache.org/r/68490/#comment292019>

    Feel free to pull some of these unrelated improvements out in front of this, it would help make the diff easier for reviewers.



src/common/resources.cpp
Lines 1924-1925 (original), 1940-1941 (patched)
<https://reviews.apache.org/r/68490/#comment292021>

    Ditto here and elsewhere for the question above about const shared pointer loops vs `const Resource_&` loops



src/common/resources.cpp
Line 1954 (original), 1972-1979 (patched)
<https://reviews.apache.org/r/68490/#comment292022>

    Why is this needed? Isn't the point of this change that toUnreserved avoids copying now?



src/common/resources.cpp
Lines 1958-1963 (original), 1983-1993 (patched)
<https://reviews.apache.org/r/68490/#comment292023>

    Hm.. is there a way to simplify this? This seems to be just adding in the reserved resources? Could we call pushReservation or something instead of touching the resources field directly?
    
    Right now `find()` is not performance critical, so it doesn't need to be optimal if it's simpler not to be.


- Benjamin Mahler


On Aug. 23, 2018, 10:13 p.m., Meng Zhu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68490/
> -----------------------------------------------------------
> 
> (Updated Aug. 23, 2018, 10:13 p.m.)
> 
> 
> Review request for mesos and Benjamin Mahler.
> 
> 
> Bugs: MESOS-6765
>     https://issues.apache.org/jira/browse/MESOS-6765
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This patch lets `class Resources` only store shared
> pointers to the underlying resource objects, so that
> read-only filter operations such as `reserved()`,
> `unreserved()` and etc. can avoid making copies of
> the whole resource objects. Instead, only shared pointers
> are copied.
> 
> In write operations, we check if there are more than one
> references to the resource object. If so, a copy is made
> for safe mutation without affecting owners.
> 
> To maintain the usage abstraction that `class Resources`
> still holds resource objects, we utilize
> `boost::indirect_iterator` iterator adapter to deference
> the shared pointers as we iterate.
> 
> 
> Diffs
> -----
> 
>   include/mesos/resources.hpp 6f81b14f8bc090a144eeae8f15639c370366166d 
>   include/mesos/v1/resources.hpp 09110530da16678abf6bf6b308906dd8ccc8180a 
>   src/common/resources.cpp 0110c0ee3e810ad1c29dfa5507b13ebd5d0222a2 
>   src/v1/resources.cpp 228a7327ffe7934d37b56ee67b8be9ae1e119ca8 
> 
> 
> Diff: https://reviews.apache.org/r/68490/diff/1/
> 
> 
> Testing
> -------
> 
> make check
> 
> Did a quick test on Mac with an optimized build, running benchmark `HierarchicalAllocator_BENCHMARK_Test.ResourceLabels`, here are the results of comparing "before" and "after". Note, DVFS is not fixed. And we only did a partial run to verify the validity of the patch, full evaluation coming soon.
> 
> **Overall, 33% performance improvement for the 1st steup (1000 agents and 1 frameworks) and 32% improvement for the first 50 iterarions of the 2nd setup (1000 agents and 50 frameworks).**
> 
> Before:
> 
> [ RUN      ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.ResourceLabels/0
> 
> Using 1000 agents and 1 frameworks
> Added 1 frameworks in 1.220022ms
> Added 1000 agents in 465.045382ms
> round 0 allocate() took 54.803773ms to make 0 offers after filtering 1000 offers
> round 1 allocate() took 66.690456ms to make 0 offers after filtering 1000 offers
> 
> Using 1000 agents and 50 frameworks
> Added 50 frameworks in 1.40774ms
> Added 1000 agents in 460.562488ms
> round 0 allocate() took 155.500548ms to make 1000 offers after filtering 1000 offers
> round 1 allocate() took 155.194986ms to make 1000 offers after filtering 2000 offers
> round 2 allocate() took 160.706712ms to make 1000 offers after filtering 3000 offers
> round 3 allocate() took 156.701883ms to make 1000 offers after filtering 4000 offers
> round 4 allocate() took 164.246987ms to make 1000 offers after filtering 5000 offers
> round 5 allocate() took 160.791809ms to make 1000 offers after filtering 6000 offers
> round 6 allocate() took 166.064652ms to make 1000 offers after filtering 7000 offers
> round 7 allocate() took 162.177213ms to make 1000 offers after filtering 8000 offers
> round 8 allocate() took 166.620163ms to make 1000 offers after filtering 9000 offers
> round 9 allocate() took 167.970045ms to make 1000 offers after filtering 10000 offers
> round 10 allocate() took 166.189463ms to make 1000 offers after filtering 11000 offers
> round 11 allocate() took 170.863462ms to make 1000 offers after filtering 12000 offers
> round 12 allocate() took 172.260473ms to make 1000 offers after filtering 13000 offers
> round 13 allocate() took 170.553911ms to make 1000 offers after filtering 14000 offers
> round 14 allocate() took 235.764593ms to make 1000 offers after filtering 15000 offers
> round 15 allocate() took 247.250433ms to make 1000 offers after filtering 16000 offers
> round 16 allocate() took 276.932824ms to make 1000 offers after filtering 17000 offers
> round 17 allocate() took 248.888469ms to make 1000 offers after filtering 18000 offers
> round 18 allocate() took 193.890556ms to make 1000 offers after filtering 19000 offers
> round 19 allocate() took 195.105346ms to make 1000 offers after filtering 20000 offers
> round 20 allocate() took 194.447428ms to make 1000 offers after filtering 21000 offers
> round 21 allocate() took 205.486287ms to make 1000 offers after filtering 22000 offers
> round 22 allocate() took 199.241922ms to make 1000 offers after filtering 23000 offers
> round 23 allocate() took 200.885488ms to make 1000 offers after filtering 24000 offers
> round 24 allocate() took 216.132361ms to make 1000 offers after filtering 25000 offers
> round 25 allocate() took 210.638273ms to make 1000 offers after filtering 26000 offers
> round 26 allocate() took 232.397778ms to make 1000 offers after filtering 27000 offers
> round 27 allocate() took 239.633708ms to make 1000 offers after filtering 28000 offers
> round 28 allocate() took 225.829677ms to make 1000 offers after filtering 29000 offers
> round 29 allocate() took 245.143272ms to make 1000 offers after filtering 30000 offers
> round 30 allocate() took 248.630295ms to make 1000 offers after filtering 31000 offers
> round 31 allocate() took 262.147804ms to make 1000 offers after filtering 32000 offers
> round 32 allocate() took 251.109969ms to make 1000 offers after filtering 33000 offers
> round 33 allocate() took 263.92273ms to make 1000 offers after filtering 34000 offers
> round 34 allocate() took 268.422275ms to make 1000 offers after filtering 35000 offers
> round 35 allocate() took 273.830163ms to make 1000 offers after filtering 36000 offers
> round 36 allocate() took 283.25481ms to make 1000 offers after filtering 37000 offers
> round 37 allocate() took 366.400781ms to make 1000 offers after filtering 38000 offers
> round 38 allocate() took 382.202755ms to make 1000 offers after filtering 39000 offers
> round 39 allocate() took 337.266121ms to make 1000 offers after filtering 40000 offers
> round 40 allocate() took 381.033696ms to make 1000 offers after filtering 41000 offers
> round 41 allocate() took 365.941946ms to make 1000 offers after filtering 42000 offers
> round 42 allocate() took 379.527886ms to make 1000 offers after filtering 43000 offers
> round 43 allocate() took 425.661181ms to make 1000 offers after filtering 44000 offers
> round 44 allocate() took 455.86657ms to make 1000 offers after filtering 45000 offers
> round 45 allocate() took 506.943117ms to make 1000 offers after filtering 46000 offers
> round 46 allocate() took 681.880233ms to make 1000 offers after filtering 47000 offers
> round 47 allocate() took 860.932974ms to make 1000 offers after filtering 48000 offers
> round 48 allocate() took 960.272209ms to make 1000 offers after filtering 49000 offers
> round 49 allocate() took 1.907427057secs to make 0 offers after filtering 50000 offers
> round 50 allocate() took 2.157864654secs to make 0 offers after filtering 50000 offers
> 
> 
> After:
> 
> Using 1000 agents and 1 frameworks
> Added 1 frameworks in 404709ns
> Added 1000 agents in 245.162138ms
> round 0 allocate() took 38.498883ms to make 0 offers after filtering 1000 offers
> round 1 allocate() took 42.234898ms to make 0 offers after filtering 1000 offers
> 
> Using 1000 agents and 50 frameworks
> Added 50 frameworks in 1.441281ms
> Added 1000 agents in 261.923711ms
> round 0 allocate() took 121.706074ms to make 1000 offers after filtering 1000 offers
> round 1 allocate() took 150.465925ms to make 1000 offers after filtering 2000 offers
> round 2 allocate() took 273.549893ms to make 1000 offers after filtering 3000 offers
> round 3 allocate() took 199.248799ms to make 1000 offers after filtering 4000 offers
> round 4 allocate() took 133.865752ms to make 1000 offers after filtering 5000 offers
> round 5 allocate() took 130.828599ms to make 1000 offers after filtering 6000 offers
> round 6 allocate() took 134.385597ms to make 1000 offers after filtering 7000 offers
> round 7 allocate() took 135.810198ms to make 1000 offers after filtering 8000 offers
> round 8 allocate() took 126.128592ms to make 1000 offers after filtering 9000 offers
> round 9 allocate() took 144.794829ms to make 1000 offers after filtering 10000 offers
> round 10 allocate() took 162.533506ms to make 1000 offers after filtering 11000 offers
> round 11 allocate() took 159.22141ms to make 1000 offers after filtering 12000 offers
> round 12 allocate() took 174.739823ms to make 1000 offers after filtering 13000 offers
> round 13 allocate() took 171.095423ms to make 1000 offers after filtering 14000 offers
> round 14 allocate() took 186.876661ms to make 1000 offers after filtering 15000 offers
> round 15 allocate() took 177.021603ms to make 1000 offers after filtering 16000 offers
> round 16 allocate() took 165.970722ms to make 1000 offers after filtering 17000 offers
> round 17 allocate() took 162.016338ms to make 1000 offers after filtering 18000 offers
> round 18 allocate() took 138.698917ms to make 1000 offers after filtering 19000 offers
> round 19 allocate() took 144.556913ms to make 1000 offers after filtering 20000 offers
> round 20 allocate() took 155.689926ms to make 1000 offers after filtering 21000 offers
> round 21 allocate() took 149.952025ms to make 1000 offers after filtering 22000 offers
> round 22 allocate() took 135.98823ms to make 1000 offers after filtering 23000 offers
> round 23 allocate() took 132.520992ms to make 1000 offers after filtering 24000 offers
> round 24 allocate() took 143.325635ms to make 1000 offers after filtering 25000 offers
> round 25 allocate() took 153.313423ms to make 1000 offers after filtering 26000 offers
> round 26 allocate() took 169.889066ms to make 1000 offers after filtering 27000 offers
> round 27 allocate() took 188.969694ms to make 1000 offers after filtering 28000 offers
> round 28 allocate() took 176.132259ms to make 1000 offers after filtering 29000 offers
> round 29 allocate() took 186.754676ms to make 1000 offers after filtering 30000 offers
> round 30 allocate() took 166.346508ms to make 1000 offers after filtering 31000 offers
> round 31 allocate() took 172.557665ms to make 1000 offers after filtering 32000 offers
> round 32 allocate() took 169.874406ms to make 1000 offers after filtering 33000 offers
> round 33 allocate() took 190.470692ms to make 1000 offers after filtering 34000 offers
> round 34 allocate() took 184.328221ms to make 1000 offers after filtering 35000 offers
> round 35 allocate() took 222.081892ms to make 1000 offers after filtering 36000 offers
> round 36 allocate() took 203.134216ms to make 1000 offers after filtering 37000 offers
> round 37 allocate() took 217.490016ms to make 1000 offers after filtering 38000 offers
> round 38 allocate() took 257.449904ms to make 1000 offers after filtering 39000 offers
> round 39 allocate() took 252.468529ms to make 1000 offers after filtering 40000 offers
> round 40 allocate() took 229.433398ms to make 1000 offers after filtering 41000 offers
> round 41 allocate() took 251.920859ms to make 1000 offers after filtering 42000 offers
> round 42 allocate() took 273.529747ms to make 1000 offers after filtering 43000 offers
> round 43 allocate() took 315.08445ms to make 1000 offers after filtering 44000 offers
> round 44 allocate() took 354.758003ms to make 1000 offers after filtering 45000 offers
> round 45 allocate() took 350.378463ms to make 1000 offers after filtering 46000 offers
> round 46 allocate() took 415.070355ms to make 1000 offers after filtering 47000 offers
> round 47 allocate() took 519.922944ms to make 1000 offers after filtering 48000 offers
> round 48 allocate() took 710.598546ms to make 1000 offers after filtering 49000 offers
> round 49 allocate() took 1.267395251secs to make 0 offers after filtering 50000 offers
> round 50 allocate() took 1.210188235secs to make 0 offers after filtering 50000 offers
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>

Re: Review Request 68490: Optimized `class Resources` with copy-on-write.

Posted by Meng Zhu <mz...@mesosphere.io>.


> On Sept. 3, 2018, 9:59 p.m., Benjamin Mahler wrote:
> > Thanks! Can you also include some commentary on the rest of the allocation benchmarks as well as the overhead within the Resources micro-benchmarks?

Posted the updated patches. Will summarize and post more comprehensive results soon.


> On Sept. 3, 2018, 9:59 p.m., Benjamin Mahler wrote:
> > include/mesos/resources.hpp
> > Lines 678-679 (patched)
> > <https://reviews.apache.org/r/68490/diff/1/?file=2076881#file2076881line680>
> >
> >     What's this referring to? Reading this I see it already has an rvalue reference?

Referring to `void add(std::shared_ptr<Resource_>&& that);`. Moved the comment down to clarifiy it.


> On Sept. 3, 2018, 9:59 p.m., Benjamin Mahler wrote:
> > include/mesos/resources.hpp
> > Lines 682 (patched)
> > <https://reviews.apache.org/r/68490/diff/1/?file=2076881#file2076881line684>
> >
> >     Why do we need this? It seems like the callers should just be de-referencing their shared_ptr prior to calling?

Consider adding two non-mergeble `Resources`(e.g. 1 cpu + 1 mem), if we only have `Resources::add(const Resource_& that)`, we will always need to `make_shared(Resource_)`. If we directly pass `shared_ptr`, we can then just `push_back`. For any non-mergable `Resource_`, we will be able to save one `make_shared`.

One rule of thumb I figured is that, we should avoid `make_shared` when possible (because it makes a copy). This means if we already have a `shared_ptr` we should just use it and pass it around. This also explains why I use `shared_ptr` for foreach loops.


> On Sept. 3, 2018, 9:59 p.m., Benjamin Mahler wrote:
> > src/common/resources.cpp
> > Line 1471 (original), 1473 (patched)
> > <https://reviews.apache.org/r/68490/diff/1/?file=2076883#file2076883line1473>
> >
> >     I'm a little puzzled about these `const shared_ptr<...>` foreach loops. Why don't the const ones do `const Resource_&` loops?
> >     
> >     ```
> >     foreach (const Resource_& resource_, that) {
> >     ```
> >     
> >     Something I'm missing?

See my comment above regarding `add(const std::shared_ptr<Resource_>& that)`.

Actual benefits aside, I also want to make it consistent that inside the `class Resources`, we are speaking `shared_ptr` unless we want to make an implicit copy as in `push/popReservation`.


> On Sept. 3, 2018, 9:59 p.m., Benjamin Mahler wrote:
> > src/common/resources.cpp
> > Line 1503 (original), 1505 (patched)
> > <https://reviews.apache.org/r/68490/diff/1/?file=2076883#file2076883line1505>
> >
> >     It seems ok to slip this in here, but generally please send unrelated cleanups as separate patches. :)

Got it.


> On Sept. 3, 2018, 9:59 p.m., Benjamin Mahler wrote:
> > src/common/resources.cpp
> > Line 1514 (original), 1516-1518 (patched)
> > <https://reviews.apache.org/r/68490/diff/1/?file=2076883#file2076883line1516>
> >
> >     I'm not sure how obvious these will be to readers, maybe a comment on all of these?
> >     
> >     ```
> >     // Copy-on-write (if more than 1 reference).
> >     if (resource_.use_count() > 1) {
> >       resource_ = make_shared<Resource_>(*resource_);
> >     }
> >     ```
> >     
> >     Maybe something like (since we can't use `mutable`):
> >     
> >     ```
> >     make_mutable(resource_);
> >     ```
> >     
> >     Maybe also make it composable?
> >     
> >     ```
> >     make_mutable(resource_)->resource.mutable_allocation_info()->set_role(role);
> >     ```
> >     
> >     Still, seems rather error prone and easy to forget to check for copies prior to mutating? Wonder if there's a way to make it less brittle.

Added the comment. Added a TODO regarding introducing a more controlled mutation interface.


> On Sept. 3, 2018, 9:59 p.m., Benjamin Mahler wrote:
> > src/common/resources.cpp
> > Lines 1667-1670 (patched)
> > <https://reviews.apache.org/r/68490/diff/1/?file=2076883#file2076883line1667>
> >
> >     Seems odd, why don't we use `Resource` directly here?
> >     
> >     ```
> >           Resource r = *resource_;
> >           r.clear_reservations();
> >     
> >           result.add(std::move(r));
> >     ```
> >     
> >     (I left a comment above about removing the add that takes a shared_ptr, since it doesn't seem needed?)

See my comments regarding `add` shared_ptr above.


> On Sept. 3, 2018, 9:59 p.m., Benjamin Mahler wrote:
> > src/common/resources.cpp
> > Line 1676 (original), 1692 (patched)
> > <https://reviews.apache.org/r/68490/diff/1/?file=2076883#file2076883line1692>
> >
> >     Feel free to pull some of these unrelated improvements out in front of this, it would help make the diff easier for reviewers.

Got it.


> On Sept. 3, 2018, 9:59 p.m., Benjamin Mahler wrote:
> > src/common/resources.cpp
> > Line 1954 (original), 1972-1979 (patched)
> > <https://reviews.apache.org/r/68490/diff/1/?file=2076883#file2076883line1972>
> >
> >     Why is this needed? Isn't the point of this change that toUnreserved avoids copying now?

You are right, we can just do

```
Resources unreserved;
unreserved.add(resource_);
unreserved = unreserved.toUnreserved();
```


> On Sept. 3, 2018, 9:59 p.m., Benjamin Mahler wrote:
> > src/common/resources.cpp
> > Lines 1958-1963 (original), 1983-1993 (patched)
> > <https://reviews.apache.org/r/68490/diff/1/?file=2076883#file2076883line1983>
> >
> >     Hm.. is there a way to simplify this? This seems to be just adding in the reserved resources? Could we call pushReservation or something instead of touching the resources field directly?
> >     
> >     Right now `find()` is not performance critical, so it doesn't need to be optimal if it's simpler not to be.

Done. We will just make a copy then.


- Meng


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68490/#review208286
-----------------------------------------------------------


On Sept. 5, 2018, 12:17 p.m., Meng Zhu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68490/
> -----------------------------------------------------------
> 
> (Updated Sept. 5, 2018, 12:17 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler and Gastón Kleiman.
> 
> 
> Bugs: MESOS-6765
>     https://issues.apache.org/jira/browse/MESOS-6765
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This patch lets `class Resources` only store shared
> pointers to the underlying resource objects, so that
> read-only filter operations such as `reserved()`,
> `unreserved()` and etc. can avoid making copies of
> the whole resource objects. Instead, only shared pointers
> are copied.
> 
> In write operations, we check if there are more than one
> references to the resource object. If so, a copy is made
> for safe mutation without affecting owners.
> 
> To maintain the usage abstraction that `class Resources`
> still holds resource objects, we utilize
> `boost::indirect_iterator` iterator adapter to deference
> the shared pointers as we iterate.
> 
> 
> Diffs
> -----
> 
>   include/mesos/resources.hpp 6f81b14f8bc090a144eeae8f15639c370366166d 
>   include/mesos/v1/resources.hpp 09110530da16678abf6bf6b308906dd8ccc8180a 
>   src/common/resources.cpp 3e63cdedb9261970dbeb9bb9f97eed65819f68a7 
>   src/v1/resources.cpp 3683a331e0859cd6f2ad061db6ba67112ecfcb0d 
> 
> 
> Diff: https://reviews.apache.org/r/68490/diff/2/
> 
> 
> Testing
> -------
> 
> make check
> 
> Did a quick test on Mac with an optimized build, running benchmark `HierarchicalAllocator_BENCHMARK_Test.ResourceLabels`, here are the results of comparing "before" and "after". Note, DVFS is not fixed. And we only did a partial run to verify the validity of the patch, full evaluation coming soon.
> 
> **Overall, 33% performance improvement for the 1st steup (1000 agents and 1 frameworks) and 32% improvement for the first 50 iterarions of the 2nd setup (1000 agents and 50 frameworks).**
> 
> Before:
> 
> [ RUN      ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.ResourceLabels/0
> 
> Using 1000 agents and 1 frameworks
> Added 1 frameworks in 1.220022ms
> Added 1000 agents in 465.045382ms
> round 0 allocate() took 54.803773ms to make 0 offers after filtering 1000 offers
> round 1 allocate() took 66.690456ms to make 0 offers after filtering 1000 offers
> 
> Using 1000 agents and 50 frameworks
> Added 50 frameworks in 1.40774ms
> Added 1000 agents in 460.562488ms
> round 0 allocate() took 155.500548ms to make 1000 offers after filtering 1000 offers
> round 1 allocate() took 155.194986ms to make 1000 offers after filtering 2000 offers
> round 2 allocate() took 160.706712ms to make 1000 offers after filtering 3000 offers
> round 3 allocate() took 156.701883ms to make 1000 offers after filtering 4000 offers
> round 4 allocate() took 164.246987ms to make 1000 offers after filtering 5000 offers
> round 5 allocate() took 160.791809ms to make 1000 offers after filtering 6000 offers
> round 6 allocate() took 166.064652ms to make 1000 offers after filtering 7000 offers
> round 7 allocate() took 162.177213ms to make 1000 offers after filtering 8000 offers
> round 8 allocate() took 166.620163ms to make 1000 offers after filtering 9000 offers
> round 9 allocate() took 167.970045ms to make 1000 offers after filtering 10000 offers
> round 10 allocate() took 166.189463ms to make 1000 offers after filtering 11000 offers
> round 11 allocate() took 170.863462ms to make 1000 offers after filtering 12000 offers
> round 12 allocate() took 172.260473ms to make 1000 offers after filtering 13000 offers
> round 13 allocate() took 170.553911ms to make 1000 offers after filtering 14000 offers
> round 14 allocate() took 235.764593ms to make 1000 offers after filtering 15000 offers
> round 15 allocate() took 247.250433ms to make 1000 offers after filtering 16000 offers
> round 16 allocate() took 276.932824ms to make 1000 offers after filtering 17000 offers
> round 17 allocate() took 248.888469ms to make 1000 offers after filtering 18000 offers
> round 18 allocate() took 193.890556ms to make 1000 offers after filtering 19000 offers
> round 19 allocate() took 195.105346ms to make 1000 offers after filtering 20000 offers
> round 20 allocate() took 194.447428ms to make 1000 offers after filtering 21000 offers
> round 21 allocate() took 205.486287ms to make 1000 offers after filtering 22000 offers
> round 22 allocate() took 199.241922ms to make 1000 offers after filtering 23000 offers
> round 23 allocate() took 200.885488ms to make 1000 offers after filtering 24000 offers
> round 24 allocate() took 216.132361ms to make 1000 offers after filtering 25000 offers
> round 25 allocate() took 210.638273ms to make 1000 offers after filtering 26000 offers
> round 26 allocate() took 232.397778ms to make 1000 offers after filtering 27000 offers
> round 27 allocate() took 239.633708ms to make 1000 offers after filtering 28000 offers
> round 28 allocate() took 225.829677ms to make 1000 offers after filtering 29000 offers
> round 29 allocate() took 245.143272ms to make 1000 offers after filtering 30000 offers
> round 30 allocate() took 248.630295ms to make 1000 offers after filtering 31000 offers
> round 31 allocate() took 262.147804ms to make 1000 offers after filtering 32000 offers
> round 32 allocate() took 251.109969ms to make 1000 offers after filtering 33000 offers
> round 33 allocate() took 263.92273ms to make 1000 offers after filtering 34000 offers
> round 34 allocate() took 268.422275ms to make 1000 offers after filtering 35000 offers
> round 35 allocate() took 273.830163ms to make 1000 offers after filtering 36000 offers
> round 36 allocate() took 283.25481ms to make 1000 offers after filtering 37000 offers
> round 37 allocate() took 366.400781ms to make 1000 offers after filtering 38000 offers
> round 38 allocate() took 382.202755ms to make 1000 offers after filtering 39000 offers
> round 39 allocate() took 337.266121ms to make 1000 offers after filtering 40000 offers
> round 40 allocate() took 381.033696ms to make 1000 offers after filtering 41000 offers
> round 41 allocate() took 365.941946ms to make 1000 offers after filtering 42000 offers
> round 42 allocate() took 379.527886ms to make 1000 offers after filtering 43000 offers
> round 43 allocate() took 425.661181ms to make 1000 offers after filtering 44000 offers
> round 44 allocate() took 455.86657ms to make 1000 offers after filtering 45000 offers
> round 45 allocate() took 506.943117ms to make 1000 offers after filtering 46000 offers
> round 46 allocate() took 681.880233ms to make 1000 offers after filtering 47000 offers
> round 47 allocate() took 860.932974ms to make 1000 offers after filtering 48000 offers
> round 48 allocate() took 960.272209ms to make 1000 offers after filtering 49000 offers
> round 49 allocate() took 1.907427057secs to make 0 offers after filtering 50000 offers
> round 50 allocate() took 2.157864654secs to make 0 offers after filtering 50000 offers
> 
> 
> After:
> 
> Using 1000 agents and 1 frameworks
> Added 1 frameworks in 404709ns
> Added 1000 agents in 245.162138ms
> round 0 allocate() took 38.498883ms to make 0 offers after filtering 1000 offers
> round 1 allocate() took 42.234898ms to make 0 offers after filtering 1000 offers
> 
> Using 1000 agents and 50 frameworks
> Added 50 frameworks in 1.441281ms
> Added 1000 agents in 261.923711ms
> round 0 allocate() took 121.706074ms to make 1000 offers after filtering 1000 offers
> round 1 allocate() took 150.465925ms to make 1000 offers after filtering 2000 offers
> round 2 allocate() took 273.549893ms to make 1000 offers after filtering 3000 offers
> round 3 allocate() took 199.248799ms to make 1000 offers after filtering 4000 offers
> round 4 allocate() took 133.865752ms to make 1000 offers after filtering 5000 offers
> round 5 allocate() took 130.828599ms to make 1000 offers after filtering 6000 offers
> round 6 allocate() took 134.385597ms to make 1000 offers after filtering 7000 offers
> round 7 allocate() took 135.810198ms to make 1000 offers after filtering 8000 offers
> round 8 allocate() took 126.128592ms to make 1000 offers after filtering 9000 offers
> round 9 allocate() took 144.794829ms to make 1000 offers after filtering 10000 offers
> round 10 allocate() took 162.533506ms to make 1000 offers after filtering 11000 offers
> round 11 allocate() took 159.22141ms to make 1000 offers after filtering 12000 offers
> round 12 allocate() took 174.739823ms to make 1000 offers after filtering 13000 offers
> round 13 allocate() took 171.095423ms to make 1000 offers after filtering 14000 offers
> round 14 allocate() took 186.876661ms to make 1000 offers after filtering 15000 offers
> round 15 allocate() took 177.021603ms to make 1000 offers after filtering 16000 offers
> round 16 allocate() took 165.970722ms to make 1000 offers after filtering 17000 offers
> round 17 allocate() took 162.016338ms to make 1000 offers after filtering 18000 offers
> round 18 allocate() took 138.698917ms to make 1000 offers after filtering 19000 offers
> round 19 allocate() took 144.556913ms to make 1000 offers after filtering 20000 offers
> round 20 allocate() took 155.689926ms to make 1000 offers after filtering 21000 offers
> round 21 allocate() took 149.952025ms to make 1000 offers after filtering 22000 offers
> round 22 allocate() took 135.98823ms to make 1000 offers after filtering 23000 offers
> round 23 allocate() took 132.520992ms to make 1000 offers after filtering 24000 offers
> round 24 allocate() took 143.325635ms to make 1000 offers after filtering 25000 offers
> round 25 allocate() took 153.313423ms to make 1000 offers after filtering 26000 offers
> round 26 allocate() took 169.889066ms to make 1000 offers after filtering 27000 offers
> round 27 allocate() took 188.969694ms to make 1000 offers after filtering 28000 offers
> round 28 allocate() took 176.132259ms to make 1000 offers after filtering 29000 offers
> round 29 allocate() took 186.754676ms to make 1000 offers after filtering 30000 offers
> round 30 allocate() took 166.346508ms to make 1000 offers after filtering 31000 offers
> round 31 allocate() took 172.557665ms to make 1000 offers after filtering 32000 offers
> round 32 allocate() took 169.874406ms to make 1000 offers after filtering 33000 offers
> round 33 allocate() took 190.470692ms to make 1000 offers after filtering 34000 offers
> round 34 allocate() took 184.328221ms to make 1000 offers after filtering 35000 offers
> round 35 allocate() took 222.081892ms to make 1000 offers after filtering 36000 offers
> round 36 allocate() took 203.134216ms to make 1000 offers after filtering 37000 offers
> round 37 allocate() took 217.490016ms to make 1000 offers after filtering 38000 offers
> round 38 allocate() took 257.449904ms to make 1000 offers after filtering 39000 offers
> round 39 allocate() took 252.468529ms to make 1000 offers after filtering 40000 offers
> round 40 allocate() took 229.433398ms to make 1000 offers after filtering 41000 offers
> round 41 allocate() took 251.920859ms to make 1000 offers after filtering 42000 offers
> round 42 allocate() took 273.529747ms to make 1000 offers after filtering 43000 offers
> round 43 allocate() took 315.08445ms to make 1000 offers after filtering 44000 offers
> round 44 allocate() took 354.758003ms to make 1000 offers after filtering 45000 offers
> round 45 allocate() took 350.378463ms to make 1000 offers after filtering 46000 offers
> round 46 allocate() took 415.070355ms to make 1000 offers after filtering 47000 offers
> round 47 allocate() took 519.922944ms to make 1000 offers after filtering 48000 offers
> round 48 allocate() took 710.598546ms to make 1000 offers after filtering 49000 offers
> round 49 allocate() took 1.267395251secs to make 0 offers after filtering 50000 offers
> round 50 allocate() took 1.210188235secs to make 0 offers after filtering 50000 offers
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>