Posted to reviews@mesos.apache.org by Neil Conway <ne...@gmail.com> on 2017/05/18 22:40:32 UTC

Re: Review Request 59355: WIP: Optimized sorter perf for many inactive clients.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59355/
-----------------------------------------------------------

(Updated May 18, 2017, 10:40 p.m.)


Review request for mesos, Benjamin Mahler and Michael Park.


Changes
-------

New approach that further improves performance.


Summary (updated)
-----------------

WIP: Optimized sorter perf for many inactive clients.


Bugs: MESOS-7521
    https://issues.apache.org/jira/browse/MESOS-7521


Repository: mesos


Description (updated)
-------

Rearrange the `children` vector so that inactive leaves are always at
the end of the vector. This makes it easy to skip processing of inactive
leaves.
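
As a rough illustration of the ordering idea (a sketch, not the code in this patch): the `Child` struct, the `Kind` enum, and `forEachActive` below are hypothetical names, and the sketch assumes the invariant that no inactive leaf precedes an active leaf or internal node. Under that assumption, a loop over the children can stop at the first inactive leaf.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for the kinds of sorter tree nodes.
enum class Kind { ACTIVE_LEAF, INACTIVE_LEAF, INTERNAL };

// Hypothetical child record; the real sorter node carries more state.
struct Child {
  std::string name;
  Kind kind;
};

// Applies `fn` to each active leaf or internal node in `children`.
//
// Assumed invariant: every INACTIVE_LEAF is stored after all
// ACTIVE_LEAF and INTERNAL entries, so the loop may stop at the
// first inactive leaf it encounters.
void forEachActive(
    const std::vector<Child>& children,
    const std::function<void(const Child&)>& fn) {
  assert(fn != nullptr);

  for (const Child& child : children) {
    if (child.kind == Kind::INACTIVE_LEAF) {
      break; // Everything from here to the end is also inactive.
    }
    fn(child);
  }
}
```

With children ordered as `[active, internal, inactive, inactive]`, the callback runs for the first two entries only, so the cost of the loop depends on the number of active and internal children rather than on the total number of children.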


Diffs (updated)
-----

  src/master/allocator/sorter/drf/sorter.hpp fee58d6d1f08163e2a06a4a20c891fe535c3dcff 
  src/master/allocator/sorter/drf/sorter.cpp 26b77f578f3235a8792c72d4575d607cdb2c7de7 


Diff: https://reviews.apache.org/r/59355/diff/2/

Changes: https://reviews.apache.org/r/59355/diff/1-2/


Testing
-------

Initial perf testing:

MESOS 1.2.0:
===================
```
[ RUN      ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.SuppressOffers/15
Using 5000 agents and 6000 frameworks
Added 6000 frameworks in 73.164389ms
Added 5000 agents in 39.148832702secs
allocate() took 31.824026984secs to make 5000 offers with 1200 out of 6000 frameworks suppressing offers
allocate() took 23.467935579secs to make 5000 offers with 2400 out of 6000 frameworks suppressing offers
allocate() took 15.528170646secs to make 5000 offers with 3600 out of 6000 frameworks suppressing offers
allocate() took 7.746030551secs to make 5000 offers with 4800 out of 6000 frameworks suppressing offers
allocate() took 25.867454ms to make 0 offers with 6000 out of 6000 frameworks suppressing offers
[       OK ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.SuppressOffers/15 (121894 ms)
[----------] 1 test from SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test (121894 ms total)
```

MESOS in master branch:
===================
```
[ RUN      ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.SuppressOffers/15
Using 5000 agents and 6000 frameworks
Added 6000 frameworks in 282.858346ms
Added 5000 agents in 16.986189021secs
allocate() took 17.262100444secs to make 5000 offers with 1200 out of 6000 frameworks suppressing offers
allocate() took 17.106477051secs to make 5000 offers with 2400 out of 6000 frameworks suppressing offers
allocate() took 17.032456469secs to make 5000 offers with 3600 out of 6000 frameworks suppressing offers
allocate() took 15.375699284secs to make 5000 offers with 4800 out of 6000 frameworks suppressing offers
allocate() took 323.279937ms to make 0 offers with 6000 out of 6000 frameworks suppressing offers
[       OK ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.SuppressOffers/15 (88780 ms)
[----------] 1 test from SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test (88781 ms total)
```

MESOS in master branch + this patch:
===================
```
[ RUN      ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.SuppressOffers/15
Using 5000 agents and 6000 frameworks
Added 6000 frameworks in 290.040808ms
Added 5000 agents in 17.406607193secs
allocate() took 14.893550012secs to make 5000 offers with 1200 out of 6000 frameworks suppressing offers
allocate() took 12.148378927secs to make 5000 offers with 2400 out of 6000 frameworks suppressing offers
allocate() took 10.28024964secs to make 5000 offers with 3600 out of 6000 frameworks suppressing offers
allocate() took 8.455132769secs to make 5000 offers with 4800 out of 6000 frameworks suppressing offers
allocate() took 298.49347ms to make 0 offers with 6000 out of 6000 frameworks suppressing offers
[       OK ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.SuppressOffers/15 (68181 ms)
[----------] 1 test from SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test (68181 ms total)
```


Thanks,

Neil Conway


Re: Review Request 59355: WIP: Optimized sorter perf for many inactive clients.

Posted by Mesos Reviewbot <re...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59355/#review175463
-----------------------------------------------------------



Patch looks great!

Reviews applied: [59383, 59355]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' CONFIGURATION='--verbose' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; ./support/docker-build.sh

- Mesos Reviewbot


On May 18, 2017, 10:40 p.m., Neil Conway wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/59355/
> -----------------------------------------------------------
> 
> (Updated May 18, 2017, 10:40 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler and Michael Park.
> 
> 
> Bugs: MESOS-7521
>     https://issues.apache.org/jira/browse/MESOS-7521
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Rearrange the `children` vector so that inactive leaves are always at
> the end of the vector. This makes it easy to skip processing of inactive
> leaves.
> 
> 
> Diffs
> -----
> 
>   src/master/allocator/sorter/drf/sorter.hpp fee58d6d1f08163e2a06a4a20c891fe535c3dcff 
>   src/master/allocator/sorter/drf/sorter.cpp 26b77f578f3235a8792c72d4575d607cdb2c7de7 
> 
> 
> Diff: https://reviews.apache.org/r/59355/diff/2/
> 
> 
> Testing
> -------
> 
> Initial perf testing:
> 
> MESOS 1.2.0:
> ===================
> ```
> [ RUN      ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.ExtremeSuppressOffers/15
> Using 5000 agents and 6000 frameworks
> Added 6000 frameworks in 90.61248ms
> Added 5000 agents in 38.788639509secs
> allocate() took 1.030826713secs to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
> allocate() took 1.051713631secs to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
> allocate() took 932.748778ms to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
> allocate() took 1.150094679secs to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
> allocate() took 1.052298779secs to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
> [       OK ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.ExtremeSuppressOffers/15 (48234 ms)
> [----------] 1 test from SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test (48235 ms total)
> ```
> 
> MESOS in master branch:
> ===================
> ```
> [ RUN      ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.ExtremeSuppressOffers/15
> Using 5000 agents and 6000 frameworks
> Added 6000 frameworks in 295.603058ms
> Added 5000 agents in 16.676030553secs
> allocate() took 16.258004727secs to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
> allocate() took 16.179602864secs to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
> allocate() took 16.378586621secs to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
> allocate() took 16.394222636secs to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
> allocate() took 16.185625358secs to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
> [       OK ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.ExtremeSuppressOffers/15 (102353 ms)
> [----------] 1 test from SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test (102353 ms total)
> ```
> 
> MESOS in master branch + this patch:
> ===================
> ```
> [ RUN      ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.ExtremeSuppressOffers/15
> Using 5000 agents and 6000 frameworks
> Added 6000 frameworks in 312.507943ms
> Added 5000 agents in 15.793940251secs
> allocate() took 1.170606834secs to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
> allocate() took 1.128574596secs to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
> allocate() took 1.03562006secs to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
> allocate() took 1.229924753secs to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
> allocate() took 1.145860199secs to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
> [       OK ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.ExtremeSuppressOffers/15 (25815 ms)
> [----------] 1 test from SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test (25819 ms total)
> ```
> 
> 
> Thanks,
> 
> Neil Conway
> 
>


Re: Review Request 59355: Optimized sorter performance with many inactive clients.

Posted by Michael Park <mp...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59355/#review175908
-----------------------------------------------------------


Ship It!

- Michael Park


On May 22, 2017, 11:30 p.m., Neil Conway wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/59355/
> -----------------------------------------------------------
> 
> (Updated May 22, 2017, 11:30 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler, James Peach, Michael Park, and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-7521
>     https://issues.apache.org/jira/browse/MESOS-7521
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Unlike in Mesos <= 1.2, the sorter now stores inactive clients in the
> same data structure used to store active clients. This resulted in a
> significant performance regression when the vast majority of sorter
> clients are inactive: when sorting clients and producing the sorted
> order, we iterate over ALL clients (active and inactive), which could be
> much slower than the old active-only implementation.
> 
> This commit revises the sorter to ensure that inactive leaf nodes are
> always stored at the end of their parent's list of child nodes. This
> allows the sorter to stop early (at the first inactive leaf) when
> iterating over a node's children, if we're only interested in applying
> an operation to each active leaf or internal node. This change fixes the
> observed performance regression relative to Mesos 1.2.0.
> 
> 
> Diffs
> -----
> 
>   src/master/allocator/sorter/drf/sorter.hpp fee58d6d1f08163e2a06a4a20c891fe535c3dcff 
>   src/master/allocator/sorter/drf/sorter.cpp 26b77f578f3235a8792c72d4575d607cdb2c7de7 
> 
> 
> Diff: https://reviews.apache.org/r/59355/diff/3/
> 
> 
> Testing
> -------
> 
> Initial perf testing:
> 
> MESOS 1.2.0:
> ===================
> ```
> [ RUN      ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.ExtremeSuppressOffers/15
> Using 5000 agents and 6000 frameworks
> Added 6000 frameworks in 90.61248ms
> Added 5000 agents in 38.788639509secs
> allocate() took 1.030826713secs to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
> allocate() took 1.051713631secs to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
> allocate() took 932.748778ms to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
> allocate() took 1.150094679secs to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
> allocate() took 1.052298779secs to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
> [       OK ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.ExtremeSuppressOffers/15 (48234 ms)
> [----------] 1 test from SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test (48235 ms total)
> ```
> 
> MESOS in master branch:
> ===================
> ```
> [ RUN      ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.ExtremeSuppressOffers/15
> Using 5000 agents and 6000 frameworks
> Added 6000 frameworks in 295.603058ms
> Added 5000 agents in 16.676030553secs
> allocate() took 16.258004727secs to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
> allocate() took 16.179602864secs to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
> allocate() took 16.378586621secs to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
> allocate() took 16.394222636secs to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
> allocate() took 16.185625358secs to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
> [       OK ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.ExtremeSuppressOffers/15 (102353 ms)
> [----------] 1 test from SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test (102353 ms total)
> ```
> 
> MESOS in master branch + this patch:
> ===================
> ```
> [ RUN      ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.ExtremeSuppressOffers/15
> Using 5000 agents and 6000 frameworks
> Added 6000 frameworks in 325.582494ms
> Added 5000 agents in 16.311040922secs
> allocate() took 720.776578ms to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
> allocate() took 822.521925ms to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
> allocate() took 846.866828ms to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
> allocate() took 857.232458ms to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
> allocate() took 880.696907ms to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
> [       OK ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.ExtremeSuppressOffers/15 (24664 ms)
> ```
> 
> 
> Thanks,
> 
> Neil Conway
> 
>


Re: Review Request 59355: Optimized sorter performance with many inactive clients.

Posted by Mesos Reviewbot <re...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59355/#review175781
-----------------------------------------------------------



Patch looks great!

Reviews applied: [59481, 59482, 59383, 59483, 59484, 59355]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' CONFIGURATION='--verbose' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; ./support/docker-build.sh

- Mesos Reviewbot




Re: Review Request 59355: Optimized sorter performance with many inactive clients.

Posted by Mesos Reviewbot Windows <re...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59355/#review175843
-----------------------------------------------------------



Patch looks great!

Reviews applied: [59481, 59482, 59383, 59483, 59484, 59355]

Passed command: support\windows-build.bat

- Mesos Reviewbot Windows




Re: Review Request 59355: Optimized sorter performance with many inactive clients.

Posted by Neil Conway <ne...@gmail.com>.

> On May 23, 2017, 4:56 p.m., James Peach wrote:
> > src/master/allocator/sorter/drf/sorter.hpp
> > Lines 316 (patched)
> > <https://reviews.apache.org/r/59355/diff/3/?file=1727589#file1727589line316>
> >
> >     Did you consider keeping separate vectors for active and inactive children? That would avoid the copying due to the insert() here, and I think it would simplify the sorting code later on.

Yeah, I considered that -- it is definitely a bit awkward to maintain the ordering invariant over a single `children` vector. The downside to keeping two vectors is that the code for inserting, removing, and updating clients becomes more complicated, because we need to search two vectors, check for duplicates in both vectors, etc. My initial preference was for a single vector, but you could definitely argue for two vectors instead.
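
To make the single-vector trade-off concrete, here is a minimal sketch under stated assumptions; it is not the patch itself. `Child`, `Kind`, `addChild`, `removeChild`, `updateKind`, and `invariantHolds` are hypothetical names. The sketch keeps one vector, searches it once per operation, and preserves the ordering invariant by placing inactive leaves at the back and everything else at the front.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

enum class Kind { ACTIVE_LEAF, INACTIVE_LEAF, INTERNAL };

// Hypothetical child record, identified by name for simplicity.
struct Child {
  std::string name;
  Kind kind;
};

// Returns true if no inactive leaf precedes an active leaf or internal node.
bool invariantHolds(const std::vector<Child>& children) {
  bool seenInactive = false;
  for (const Child& c : children) {
    if (c.kind == Kind::INACTIVE_LEAF) {
      seenInactive = true;
    } else if (seenInactive) {
      return false;
    }
  }
  return true;
}

void addChild(std::vector<Child>& children, const Child& child) {
  // A single search is enough to rule out duplicates.
  assert(std::none_of(
      children.begin(), children.end(),
      [&](const Child& c) { return c.name == child.name; }));

  if (child.kind == Kind::INACTIVE_LEAF) {
    children.push_back(child);
  } else {
    // Front insertion shifts every existing element (O(n) copying).
    children.insert(children.begin(), child);
  }
}

void removeChild(std::vector<Child>& children, const std::string& name) {
  auto it = std::find_if(
      children.begin(), children.end(),
      [&](const Child& c) { return c.name == name; });
  assert(it != children.end());

  // erase() keeps the relative order, so the invariant still holds.
  children.erase(it);
}

// Changes a child's kind by removing it and re-adding it in the right place.
void updateKind(
    std::vector<Child>& children, const std::string& name, Kind kind) {
  auto it = std::find_if(
      children.begin(), children.end(),
      [&](const Child& c) { return c.name == name; });
  assert(it != children.end());

  Child updated = *it;
  updated.kind = kind;
  children.erase(it);
  addChild(children, updated);
}
```

The front insertion is the copying cost raised in the review comment; in exchange, lookups, duplicate checks, and removals each touch only one container.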


- Neil


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59355/#review175812
-----------------------------------------------------------




Re: Review Request 59355: Optimized sorter performance with many inactive clients.

Posted by James Peach <jp...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59355/#review175812
-----------------------------------------------------------




src/master/allocator/sorter/drf/sorter.hpp
Lines 316 (patched)
<https://reviews.apache.org/r/59355/#comment249139>

    Did you consider keeping separate vectors for active and inactive children? That would avoid the copying due to the insert() here, and I think it would simplify the sorting code later on.
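
For comparison, the two-vector alternative suggested here might look roughly like the following. This is a hypothetical sketch, not code from the patch or from Mesos: `Children`, `Child`, `Kind`, `contains`, `addChild`, and `removeChild` are illustrative names. Insertion becomes a plain `push_back` with no shifting, but every lookup, duplicate check, and removal has to consider both vectors.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

enum class Kind { ACTIVE_LEAF, INACTIVE_LEAF, INTERNAL };

// Hypothetical child record, identified by name for simplicity.
struct Child {
  std::string name;
  Kind kind;
};

// Hypothetical container that splits children by activity.
struct Children {
  std::vector<Child> active;   // Active leaves and internal nodes.
  std::vector<Child> inactive; // Inactive leaves only.
};

// Searches one vector for a child with the given name.
static std::vector<Child>::iterator findIn(
    std::vector<Child>& v, const std::string& name) {
  return std::find_if(
      v.begin(), v.end(),
      [&](const Child& c) { return c.name == name; });
}

// A membership test must look in both vectors.
bool contains(Children& children, const std::string& name) {
  return findIn(children.active, name) != children.active.end() ||
         findIn(children.inactive, name) != children.inactive.end();
}

void addChild(Children& children, const Child& child) {
  // The duplicate check spans both vectors.
  assert(!contains(children, child.name));

  // No front insertion is needed, so no elements are shifted.
  if (child.kind == Kind::INACTIVE_LEAF) {
    children.inactive.push_back(child);
  } else {
    children.active.push_back(child);
  }
}

// Removes the named child from whichever vector holds it.
// Returns false if the child is in neither vector.
bool removeChild(Children& children, const std::string& name) {
  auto a = findIn(children.active, name);
  if (a != children.active.end()) {
    children.active.erase(a);
    return true;
  }

  auto i = findIn(children.inactive, name);
  if (i != children.inactive.end()) {
    children.inactive.erase(i);
    return true;
  }

  return false;
}
```

Iterating over only the active children then needs no early-exit check at all, since `children.active` never contains an inactive leaf.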


- James Peach


On May 23, 2017, 6:30 a.m., Neil Conway wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/59355/
> -----------------------------------------------------------
> 
> (Updated May 23, 2017, 6:30 a.m.)
> 
> 
> Review request for mesos, Benjamin Mahler, James Peach, Michael Park, and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-7521
>     https://issues.apache.org/jira/browse/MESOS-7521
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Unlike in Mesos <= 1.2, the sorter now stores inactive clients in the
> same data structure used to store active clients. This resulted in a
> significant performance regression when the vast majority of sorter
> clients are inactive: when sorting clients and producing the sorted
> order, we iterate over ALL clients (active and inactive), which could be
> much slower than the old active-only implementation.
> 
> This commit revises the sorter to ensure that inactive leaf nodes are
> always stored at the end of their parent's list of child nodes. This
> allows the sorter to stop early (at the first inactive leaf) when
> iterating over a node's children, if we're only interested in applying
> an operation to each active leaf or internal node. This change fixes the
> observed performance regression relative to Mesos 1.2.0.
> 
> 
> Diffs
> -----
> 
>   src/master/allocator/sorter/drf/sorter.hpp fee58d6d1f08163e2a06a4a20c891fe535c3dcff 
>   src/master/allocator/sorter/drf/sorter.cpp 26b77f578f3235a8792c72d4575d607cdb2c7de7 
> 
> 
> Diff: https://reviews.apache.org/r/59355/diff/3/
> 
> 
> Testing
> -------
> 
> Initial perf testing:
> 
> Mesos 1.2.0:
> ===================
> ```
> [ RUN      ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.ExtremeSuppressOffers/15
> Using 5000 agents and 6000 frameworks
> Added 6000 frameworks in 90.61248ms
> Added 5000 agents in 38.788639509secs
> allocate() took 1.030826713secs to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
> allocate() took 1.051713631secs to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
> allocate() took 932.748778ms to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
> allocate() took 1.150094679secs to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
> allocate() took 1.052298779secs to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
> [       OK ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.ExtremeSuppressOffers/15 (48234 ms)
> [----------] 1 test from SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test (48235 ms total)
> ```
> 
> Mesos in master branch:
> ===================
> ```
> [ RUN      ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.ExtremeSuppressOffers/15
> Using 5000 agents and 6000 frameworks
> Added 6000 frameworks in 295.603058ms
> Added 5000 agents in 16.676030553secs
> allocate() took 16.258004727secs to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
> allocate() took 16.179602864secs to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
> allocate() took 16.378586621secs to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
> allocate() took 16.394222636secs to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
> allocate() took 16.185625358secs to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
> [       OK ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.ExtremeSuppressOffers/15 (102353 ms)
> [----------] 1 test from SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test (102353 ms total)
> ```
> 
> Mesos in master branch + this patch:
> ===================
> ```
> [ RUN      ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.ExtremeSuppressOffers/15
> Using 5000 agents and 6000 frameworks
> Added 6000 frameworks in 325.582494ms
> Added 5000 agents in 16.311040922secs
> allocate() took 720.776578ms to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
> allocate() took 822.521925ms to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
> allocate() took 846.866828ms to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
> allocate() took 857.232458ms to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
> allocate() took 880.696907ms to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
> [       OK ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.ExtremeSuppressOffers/15 (24664 ms)
> ```
> 
> 
> Thanks,
> 
> Neil Conway
> 
>


Re: Review Request 59355: Optimized sorter performance with many inactive clients.

Posted by Neil Conway <ne...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59355/
-----------------------------------------------------------

(Updated May 23, 2017, 6:30 a.m.)


Review request for mesos, Benjamin Mahler, James Peach, Michael Park, and Jiang Yan Xu.


Bugs: MESOS-7521
    https://issues.apache.org/jira/browse/MESOS-7521


Repository: mesos


Description
-------

Unlike in Mesos <= 1.2, the sorter now stores inactive clients in the
same data structure used to store active clients. This resulted in a
significant performance regression when the vast majority of sorter
clients are inactive: when sorting clients and producing the sorted
order, we iterate over ALL clients (active and inactive), which could be
much slower than the old active-only implementation.

This commit revises the sorter to ensure that inactive leaf nodes are
always stored at the end of their parent's list of child nodes. This
allows the sorter to stop early (at the first inactive leaf) when
iterating over a node's children, if we're only interested in applying
an operation to each active leaf or internal node. This change fixes the
observed performance regression relative to Mesos 1.2.0.
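
The early-stop idea described above can be sketched as follows. This is a simplified illustration, not the code in the diff: the `Node` type, its `Kind` enum, and `collectActive` are hypothetical stand-ins for the sorter's real data structures.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified tree node. The real sorter node carries
// more state (shares, allocations, and so on).
struct Node
{
  enum Kind { ACTIVE_LEAF, INACTIVE_LEAF, INTERNAL };

  std::string name;
  Kind kind;
  std::vector<Node> children;
};

// Appends the names of all active leaves under `node` to `out`.
// Relies on the invariant that inactive leaves are stored at the end
// of each `children` vector, so iteration can stop at the first
// inactive leaf instead of visiting every child.
void collectActive(const Node& node, std::vector<std::string>* out)
{
  for (const Node& child : node.children) {
    if (child.kind == Node::INACTIVE_LEAF) {
      break; // All remaining children are inactive leaves as well.
    }

    if (child.kind == Node::ACTIVE_LEAF) {
      out->push_back(child.name);
    } else {
      collectActive(child, out); // Internal node: recurse.
    }
  }
}
```

With 5940 of 6000 clients inactive, as in the benchmark below, a loop of this shape would visit only the 60 active children (plus one inactive leaf) per parent rather than all 6000.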


Diffs
-----

  src/master/allocator/sorter/drf/sorter.hpp fee58d6d1f08163e2a06a4a20c891fe535c3dcff 
  src/master/allocator/sorter/drf/sorter.cpp 26b77f578f3235a8792c72d4575d607cdb2c7de7 


Diff: https://reviews.apache.org/r/59355/diff/3/


Testing (updated)
-------

Initial perf testing:

Mesos 1.2.0:
===================
```
[ RUN      ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.ExtremeSuppressOffers/15
Using 5000 agents and 6000 frameworks
Added 6000 frameworks in 90.61248ms
Added 5000 agents in 38.788639509secs
allocate() took 1.030826713secs to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
allocate() took 1.051713631secs to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
allocate() took 932.748778ms to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
allocate() took 1.150094679secs to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
allocate() took 1.052298779secs to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
[       OK ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.ExtremeSuppressOffers/15 (48234 ms)
[----------] 1 test from SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test (48235 ms total)
```

Mesos in master branch:
===================
```
[ RUN      ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.ExtremeSuppressOffers/15
Using 5000 agents and 6000 frameworks
Added 6000 frameworks in 295.603058ms
Added 5000 agents in 16.676030553secs
allocate() took 16.258004727secs to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
allocate() took 16.179602864secs to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
allocate() took 16.378586621secs to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
allocate() took 16.394222636secs to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
allocate() took 16.185625358secs to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
[       OK ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.ExtremeSuppressOffers/15 (102353 ms)
[----------] 1 test from SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test (102353 ms total)
```

Mesos in master branch + this patch:
===================
```
[ RUN      ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.ExtremeSuppressOffers/15
Using 5000 agents and 6000 frameworks
Added 6000 frameworks in 325.582494ms
Added 5000 agents in 16.311040922secs
allocate() took 720.776578ms to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
allocate() took 822.521925ms to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
allocate() took 846.866828ms to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
allocate() took 857.232458ms to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
allocate() took 880.696907ms to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
[       OK ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.ExtremeSuppressOffers/15 (24664 ms)
```


Thanks,

Neil Conway


Re: Review Request 59355: Optimized sorter performance with many inactive clients.

Posted by Neil Conway <ne...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59355/
-----------------------------------------------------------

(Updated May 23, 2017, 6:28 a.m.)


Review request for mesos, Benjamin Mahler, James Peach, Michael Park, and Jiang Yan Xu.


Changes
-------

New approach.


Summary (updated)
-----------------

Optimized sorter performance with many inactive clients.


Bugs: MESOS-7521
    https://issues.apache.org/jira/browse/MESOS-7521


Repository: mesos


Description (updated)
-------

Unlike in Mesos <= 1.2, the sorter now stores inactive clients in the
same data structure used to store active clients. This resulted in a
significant performance regression when the vast majority of sorter
clients are inactive: when sorting clients and producing the sorted
order, we iterate over ALL clients (active and inactive), which could be
much slower than the old active-only implementation.

This commit revises the sorter to ensure that inactive leaf nodes are
always stored at the end of their parent's list of child nodes. This
allows the sorter to stop early (at the first inactive leaf) when
iterating over a node's children, if we're only interested in applying
an operation to each active leaf or internal node. This change fixes the
observed performance regression relative to Mesos 1.2.0.
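
One way the "inactive leaves last" invariant could be maintained when a client changes state is sketched below. This is an assumption-laden illustration rather than the approach in the diff: `Child`, `deactivate`, `activate`, and `invariantHolds` are hypothetical names, and the sketch simply moves the affected child to the back (on deactivation) or the front (on activation) of the vector with `std::rotate`.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified child entry.
struct Child
{
  std::string name;
  bool inactiveLeaf;
};

// Returns true if no inactive leaf precedes any other kind of child.
bool invariantHolds(const std::vector<Child>& children)
{
  return std::is_partitioned(
      children.begin(),
      children.end(),
      [](const Child& c) { return !c.inactiveLeaf; });
}

// Marks `name` inactive and moves it to the back of the vector.
void deactivate(std::vector<Child>& children, const std::string& name)
{
  auto it = std::find_if(
      children.begin(),
      children.end(),
      [&](const Child& c) { return c.name == name; });

  if (it == children.end()) {
    return;
  }

  it->inactiveLeaf = true;
  std::rotate(it, it + 1, children.end()); // *it becomes the last element.
}

// Marks `name` active and moves it to the front of the vector.
void activate(std::vector<Child>& children, const std::string& name)
{
  auto it = std::find_if(
      children.begin(),
      children.end(),
      [&](const Child& c) { return c.name == name; });

  if (it == children.end()) {
    return;
  }

  it->inactiveLeaf = false;
  std::rotate(children.begin(), it, it + 1); // *it becomes the first element.
}
```

Both operations shift elements and are linear in the number of children, so this sketch trades some cost on state changes for cheaper iteration; the benchmark below exercises iteration, not how often clients are activated or deactivated.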


Diffs (updated)
-----

  src/master/allocator/sorter/drf/sorter.hpp fee58d6d1f08163e2a06a4a20c891fe535c3dcff 
  src/master/allocator/sorter/drf/sorter.cpp 26b77f578f3235a8792c72d4575d607cdb2c7de7 


Diff: https://reviews.apache.org/r/59355/diff/3/

Changes: https://reviews.apache.org/r/59355/diff/2-3/


Testing
-------

Initial perf testing:

Mesos 1.2.0:
===================
```
[ RUN      ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.ExtremeSuppressOffers/15
Using 5000 agents and 6000 frameworks
Added 6000 frameworks in 90.61248ms
Added 5000 agents in 38.788639509secs
allocate() took 1.030826713secs to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
allocate() took 1.051713631secs to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
allocate() took 932.748778ms to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
allocate() took 1.150094679secs to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
allocate() took 1.052298779secs to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
[       OK ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.ExtremeSuppressOffers/15 (48234 ms)
[----------] 1 test from SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test (48235 ms total)
```

Mesos in master branch:
===================
```
[ RUN      ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.ExtremeSuppressOffers/15
Using 5000 agents and 6000 frameworks
Added 6000 frameworks in 295.603058ms
Added 5000 agents in 16.676030553secs
allocate() took 16.258004727secs to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
allocate() took 16.179602864secs to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
allocate() took 16.378586621secs to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
allocate() took 16.394222636secs to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
allocate() took 16.185625358secs to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
[       OK ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.ExtremeSuppressOffers/15 (102353 ms)
[----------] 1 test from SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test (102353 ms total)
```

Mesos in master branch + this patch:
===================
```
[ RUN      ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.ExtremeSuppressOffers/15
Using 5000 agents and 6000 frameworks
Added 6000 frameworks in 312.507943ms
Added 5000 agents in 15.793940251secs
allocate() took 1.170606834secs to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
allocate() took 1.128574596secs to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
allocate() took 1.03562006secs to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
allocate() took 1.229924753secs to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
allocate() took 1.145860199secs to make 5000 offers with 5940 out of 6000 frameworks suppressing offers
[       OK ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.ExtremeSuppressOffers/15 (25815 ms)
[----------] 1 test from SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test (25819 ms total)
```


Thanks,

Neil Conway