You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@mesos.apache.org by Thomas Marshall <tw...@gmail.com> on 2013/03/01 20:48:57 UTC

Re: Review Request: Speed up allocations by keeping a set of slaves with available resources

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/6665/
-----------------------------------------------------------

(Updated March 1, 2013, 7:48 p.m.)


Review request for mesos and Benjamin Hindman.


Changes
-------

Updated to Vinod's review.


Description
-------

Currently, every time we do an allocation we have to traverse the entire list of active slaves an check each one to see if its whitelisted and if it has resources to allocate. This patch keeps a set of all slaves that meet those requirements, and updates it when slaves are added/removed and when resources are allocated/recovered.

Timing comparisons, using test_framework on a local cluster on my laptop (each number is an average over 10 tests):

100 slaves, 100 tasks
without patch: 5.82s
with patch: 5.773s
improvement of about 1%

1000 slaves, 100 tasks
without patch: 8.261s
with patch: 8.07s
improvement of about 2%

Since this was a scalability issue, you'd presumably see a bigger improvement in larger clusters but its difficult to simulate more than 1000 slaves or so locally. If more convincing numbers are needed I can do some testing on EC2 (if nothing else, I'm hoping we'll have a Jenkins build with automated performance tests set up later this semester, so I can test this in the process of setting that up), but at the very least it seems clear that the patch improves the runtime complexity of the allocation algorithm and doesn't slow allocations down even in small clusters. 


Diffs (updated)
-----

  src/local/local.hpp 2633d25 
  src/local/local.cpp 3402029 
  src/master/allocator.hpp b68b67d 
  src/master/drf_sorter.hpp 79566cc 
  src/master/drf_sorter.cpp 33a8ec8 
  src/master/hierarchical_allocator_process.hpp 33e059c 
  src/master/main.cpp ca0abec 
  src/master/master.hpp c9b4b3f 
  src/master/master.cpp 814a6e1 
  src/master/sorter.hpp 73db6b1 
  src/tests/allocator_tests.cpp 44d72ad 
  src/tests/allocator_zookeeper_tests.cpp 5f83dd7 
  src/tests/fault_tolerance_tests.cpp 44eef03 
  src/tests/gc_tests.cpp 411bcc0 
  src/tests/master_detector_tests.cpp 5d09bab 
  src/tests/master_tests.cpp e140f40 
  src/tests/resource_offers_tests.cpp 5981191 
  src/tests/sorter_tests.cpp 61e6038 
  src/tests/utils.hpp 4600c2f 

Diff: https://reviews.apache.org/r/6665/diff/


Testing
-------

make check


Thanks,

Thomas Marshall


Re: Review Request: Speed up allocations by keeping a set of slaves with available resources

Posted by Vinod Kone <vi...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/6665/#review17261
-----------------------------------------------------------

Ship it!



src/local/local.hpp
<https://reviews.apache.org/r/6665/#comment36698>

    thank you.



src/local/local.hpp
<https://reviews.apache.org/r/6665/#comment36706>

    thank you.


- Vinod Kone


On March 1, 2013, 7:48 p.m., Thomas Marshall wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/6665/
> -----------------------------------------------------------
> 
> (Updated March 1, 2013, 7:48 p.m.)
> 
> 
> Review request for mesos and Benjamin Hindman.
> 
> 
> Description
> -------
> 
> Currently, every time we do an allocation we have to traverse the entire list of active slaves an check each one to see if its whitelisted and if it has resources to allocate. This patch keeps a set of all slaves that meet those requirements, and updates it when slaves are added/removed and when resources are allocated/recovered.
> 
> Timing comparisons, using test_framework on a local cluster on my laptop (each number is an average over 10 tests):
> 
> 100 slaves, 100 tasks
> without patch: 5.82s
> with patch: 5.773s
> improvement of about 1%
> 
> 1000 slaves, 100 tasks
> without patch: 8.261s
> with patch: 8.07s
> improvement of about 2%
> 
> Since this was a scalability issue, you'd presumably see a bigger improvement in larger clusters but its difficult to simulate more than 1000 slaves or so locally. If more convincing numbers are needed I can do some testing on EC2 (if nothing else, I'm hoping we'll have a Jenkins build with automated performance tests set up later this semester, so I can test this in the process of setting that up), but at the very least it seems clear that the patch improves the runtime complexity of the allocation algorithm and doesn't slow allocations down even in small clusters. 
> 
> 
> Diffs
> -----
> 
>   src/local/local.hpp 2633d25 
>   src/local/local.cpp 3402029 
>   src/master/allocator.hpp b68b67d 
>   src/master/drf_sorter.hpp 79566cc 
>   src/master/drf_sorter.cpp 33a8ec8 
>   src/master/hierarchical_allocator_process.hpp 33e059c 
>   src/master/main.cpp ca0abec 
>   src/master/master.hpp c9b4b3f 
>   src/master/master.cpp 814a6e1 
>   src/master/sorter.hpp 73db6b1 
>   src/tests/allocator_tests.cpp 44d72ad 
>   src/tests/allocator_zookeeper_tests.cpp 5f83dd7 
>   src/tests/fault_tolerance_tests.cpp 44eef03 
>   src/tests/gc_tests.cpp 411bcc0 
>   src/tests/master_detector_tests.cpp 5d09bab 
>   src/tests/master_tests.cpp e140f40 
>   src/tests/resource_offers_tests.cpp 5981191 
>   src/tests/sorter_tests.cpp 61e6038 
>   src/tests/utils.hpp 4600c2f 
> 
> Diff: https://reviews.apache.org/r/6665/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Thomas Marshall
> 
>


Re: Review Request: Speed up allocations by keeping a set of slaves with available resources

Posted by Ben Mahler <be...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/6665/#review17744
-----------------------------------------------------------


Also, can you update the review subject / description accordingly?

- Ben Mahler


On March 11, 2013, 9:59 p.m., Thomas Marshall wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/6665/
> -----------------------------------------------------------
> 
> (Updated March 11, 2013, 9:59 p.m.)
> 
> 
> Review request for mesos and Benjamin Hindman.
> 
> 
> Description
> -------
> 
> Currently, every time we do an allocation we have to traverse the entire list of active slaves an check each one to see if its whitelisted and if it has resources to allocate. This patch keeps a set of all slaves that meet those requirements, and updates it when slaves are added/removed and when resources are allocated/recovered.
> 
> Timing comparisons, using test_framework on a local cluster on my laptop (each number is an average over 10 tests):
> 
> 100 slaves, 100 tasks
> without patch: 5.82s
> with patch: 5.773s
> improvement of about 1%
> 
> 1000 slaves, 100 tasks
> without patch: 8.261s
> with patch: 8.07s
> improvement of about 2%
> 
> Since this was a scalability issue, you'd presumably see a bigger improvement in larger clusters but its difficult to simulate more than 1000 slaves or so locally. If more convincing numbers are needed I can do some testing on EC2 (if nothing else, I'm hoping we'll have a Jenkins build with automated performance tests set up later this semester, so I can test this in the process of setting that up), but at the very least it seems clear that the patch improves the runtime complexity of the allocation algorithm and doesn't slow allocations down even in small clusters. 
> 
> 
> Diffs
> -----
> 
>   src/local/local.hpp 2633d25 
>   src/local/local.cpp 3402029 
>   src/master/allocator.hpp b68b67d 
>   src/master/drf_sorter.hpp 79566cc 
>   src/master/drf_sorter.cpp fe31a00 
>   src/master/hierarchical_allocator_process.hpp c1d6f54 
>   src/master/main.cpp ca0abec 
>   src/master/master.hpp c9b4b3f 
>   src/master/master.cpp 814a6e1 
>   src/master/sorter.hpp 73db6b1 
>   src/tests/allocator_tests.cpp b953cd1 
>   src/tests/allocator_zookeeper_tests.cpp 3f70202 
>   src/tests/fault_tolerance_tests.cpp 61509f1 
>   src/tests/gc_tests.cpp fbdd6d6 
>   src/tests/master_detector_tests.cpp 787ba19 
>   src/tests/master_tests.cpp 2ba14fc 
>   src/tests/resource_offers_tests.cpp 44eaf0d 
>   src/tests/sorter_tests.cpp 61e6038 
>   src/tests/utils.hpp d3efa58 
>   third_party/libprocess/third_party/stout/include/stout/stringify.hpp 136316d 
> 
> Diff: https://reviews.apache.org/r/6665/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Thomas Marshall
> 
>


Re: Review Request: Clean up HierarchicalAllocatorProcess code

Posted by Ben Mahler <be...@gmail.com>.

> On March 12, 2013, 6:09 p.m., Ben Mahler wrote:
> > Thanks for the patience Thomas, I think you've managed to simplify the allocator code quite nicely here!
> > 
> > As a note, I think we should clean up the crazy filter pointer ownership as well, but in another review.
> 
> Thomas Marshall wrote:
>     Agreed about the filters. It'll go on my allocator todo list.

Perfect!


> On March 12, 2013, 6:09 p.m., Ben Mahler wrote:
> > src/master/hierarchical_allocator_process.hpp, line 769
> > <https://reviews.apache.org/r/6665/diff/9/?file=269278#file269278line769>
> >
> >     So I was looking at resources.hpp, and these are intensive operations at the moment, perhaps less so if / when we make resources first class fields in the protobufs.
> >     
> >     Also, there are duplicate operator implementations defined there, unless I'm seeing things!
> 
> Thomas Marshall wrote:
>     This particular line, 'slaves[slaveId].available -= resources', is deceptive since slaves[slaveId].available will always == resources and all you're really wanting to do is zero out the resources in slaves[slaveId].available. There's not currently an easy way of doing this, but its an operation that I use a lot in implementing static allocations, so I'll be adding some sort of '.clear()' type function to Resources in a later patch and I'll deal with this accordingly then, if that's okay.
>     
>     When you say duplicate operator implementations, are you maybe confusing the singular '-= Resource' and plural '-= Resources' functions?

Yes, an more efficient way to clear the resources would be great!

For the duplicates, I see the following duplicated signatures (each has two implementations..?):
  Resources operator + (const Resources& that) const
  Resources operator - (const Resources& that) const
  Resources& operator += (const Resources& that)
  Resources& operator -= (const Resources& that)


- Ben


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/6665/#review17739
-----------------------------------------------------------


On March 12, 2013, 8:58 p.m., Thomas Marshall wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/6665/
> -----------------------------------------------------------
> 
> (Updated March 12, 2013, 8:58 p.m.)
> 
> 
> Review request for mesos and Benjamin Hindman.
> 
> 
> Description
> -------
> 
> There has been an unfortunate trend in the HierarchicalAllocatorProcess code to use lots of hashmaps between slaveIds or frameworkIds and some attribute of those frameworks/slaves. This patch collapses those into two hashmaps, one for slaves and one for frameworks, mapping ids to structs containing the appropriate attributes. The patch also introduces an allocator namespace.
> 
> 
> Diffs
> -----
> 
>   src/local/local.hpp 2633d25 
>   src/local/local.cpp 3402029 
>   src/master/allocator.hpp b68b67d 
>   src/master/drf_sorter.hpp 79566cc 
>   src/master/drf_sorter.cpp fe31a00 
>   src/master/hierarchical_allocator_process.hpp c1d6f54 
>   src/master/main.cpp ca0abec 
>   src/master/master.hpp c9b4b3f 
>   src/master/master.cpp 814a6e1 
>   src/master/sorter.hpp 73db6b1 
>   src/tests/allocator_tests.cpp b953cd1 
>   src/tests/allocator_zookeeper_tests.cpp 3f70202 
>   src/tests/fault_tolerance_tests.cpp 61509f1 
>   src/tests/gc_tests.cpp fbdd6d6 
>   src/tests/master_detector_tests.cpp 787ba19 
>   src/tests/master_tests.cpp 2ba14fc 
>   src/tests/resource_offers_tests.cpp 44eaf0d 
>   src/tests/sorter_tests.cpp 61e6038 
>   src/tests/utils.hpp d3efa58 
>   third_party/libprocess/third_party/stout/include/stout/stringify.hpp 136316d 
> 
> Diff: https://reviews.apache.org/r/6665/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Thomas Marshall
> 
>


Re: Review Request: Clean up HierarchicalAllocatorProcess code

Posted by Thomas Marshall <tw...@gmail.com>.

> On March 12, 2013, 6:09 p.m., Ben Mahler wrote:
> > Thanks for the patience Thomas, I think you've managed to simplify the allocator code quite nicely here!
> > 
> > As a note, I think we should clean up the crazy filter pointer ownership as well, but in another review.

Agreed about the filters. It'll go on my allocator todo list.


> On March 12, 2013, 6:09 p.m., Ben Mahler wrote:
> > src/master/hierarchical_allocator_process.hpp, line 769
> > <https://reviews.apache.org/r/6665/diff/9/?file=269278#file269278line769>
> >
> >     So I was looking at resources.hpp, and these are intensive operations at the moment, perhaps less so if / when we make resources first class fields in the protobufs.
> >     
> >     Also, there are duplicate operator implementations defined there, unless I'm seeing things!

This particular line, 'slaves[slaveId].available -= resources', is deceptive since slaves[slaveId].available will always == resources and all you're really wanting to do is zero out the resources in slaves[slaveId].available. There's not currently an easy way of doing this, but its an operation that I use a lot in implementing static allocations, so I'll be adding some sort of '.clear()' type function to Resources in a later patch and I'll deal with this accordingly then, if that's okay.

When you say duplicate operator implementations, are you maybe confusing the singular '-= Resource' and plural '-= Resources' functions?


- Thomas


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/6665/#review17739
-----------------------------------------------------------


On March 12, 2013, 8:58 p.m., Thomas Marshall wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/6665/
> -----------------------------------------------------------
> 
> (Updated March 12, 2013, 8:58 p.m.)
> 
> 
> Review request for mesos and Benjamin Hindman.
> 
> 
> Description
> -------
> 
> There has been an unfortunate trend in the HierarchicalAllocatorProcess code to use lots of hashmaps between slaveIds or frameworkIds and some attribute of those frameworks/slaves. This patch collapses those into two hashmaps, one for slaves and one for frameworks, mapping ids to structs containing the appropriate attributes. The patch also introduces an allocator namespace.
> 
> 
> Diffs
> -----
> 
>   src/local/local.hpp 2633d25 
>   src/local/local.cpp 3402029 
>   src/master/allocator.hpp b68b67d 
>   src/master/drf_sorter.hpp 79566cc 
>   src/master/drf_sorter.cpp fe31a00 
>   src/master/hierarchical_allocator_process.hpp c1d6f54 
>   src/master/main.cpp ca0abec 
>   src/master/master.hpp c9b4b3f 
>   src/master/master.cpp 814a6e1 
>   src/master/sorter.hpp 73db6b1 
>   src/tests/allocator_tests.cpp b953cd1 
>   src/tests/allocator_zookeeper_tests.cpp 3f70202 
>   src/tests/fault_tolerance_tests.cpp 61509f1 
>   src/tests/gc_tests.cpp fbdd6d6 
>   src/tests/master_detector_tests.cpp 787ba19 
>   src/tests/master_tests.cpp 2ba14fc 
>   src/tests/resource_offers_tests.cpp 44eaf0d 
>   src/tests/sorter_tests.cpp 61e6038 
>   src/tests/utils.hpp d3efa58 
>   third_party/libprocess/third_party/stout/include/stout/stringify.hpp 136316d 
> 
> Diff: https://reviews.apache.org/r/6665/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Thomas Marshall
> 
>


Re: Review Request: Speed up allocations by keeping a set of slaves with available resources

Posted by Ben Mahler <be...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/6665/#review17739
-----------------------------------------------------------

Ship it!


Thanks for the patience Thomas, I think you've managed to simplify the allocator code quite nicely here!

As a note, I think we should clean up the crazy filter pointer ownership as well, but in another review.


src/master/hierarchical_allocator_process.hpp
<https://reviews.apache.org/r/6665/#comment37656>

    Can you move this comment up to the frameworks.erase() call? And update it to indicate the fact that the framework filter is stored in the framework now.



src/master/hierarchical_allocator_process.hpp
<https://reviews.apache.org/r/6665/#comment37657>

    Ditto, can you move this right above the clear() call?



src/master/hierarchical_allocator_process.hpp
<https://reviews.apache.org/r/6665/#comment37658>

    So I was looking at resources.hpp, and these are intensive operations at the moment, perhaps less so if / when we make resources first class fields in the protobufs.
    
    Also, there are duplicate operator implementations defined there, unless I'm seeing things!



third_party/libprocess/third_party/stout/include/stout/stringify.hpp
<https://reviews.apache.org/r/6665/#comment37655>

    thanks!


- Ben Mahler


On March 11, 2013, 9:59 p.m., Thomas Marshall wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/6665/
> -----------------------------------------------------------
> 
> (Updated March 11, 2013, 9:59 p.m.)
> 
> 
> Review request for mesos and Benjamin Hindman.
> 
> 
> Description
> -------
> 
> Currently, every time we do an allocation we have to traverse the entire list of active slaves an check each one to see if its whitelisted and if it has resources to allocate. This patch keeps a set of all slaves that meet those requirements, and updates it when slaves are added/removed and when resources are allocated/recovered.
> 
> Timing comparisons, using test_framework on a local cluster on my laptop (each number is an average over 10 tests):
> 
> 100 slaves, 100 tasks
> without patch: 5.82s
> with patch: 5.773s
> improvement of about 1%
> 
> 1000 slaves, 100 tasks
> without patch: 8.261s
> with patch: 8.07s
> improvement of about 2%
> 
> Since this was a scalability issue, you'd presumably see a bigger improvement in larger clusters but its difficult to simulate more than 1000 slaves or so locally. If more convincing numbers are needed I can do some testing on EC2 (if nothing else, I'm hoping we'll have a Jenkins build with automated performance tests set up later this semester, so I can test this in the process of setting that up), but at the very least it seems clear that the patch improves the runtime complexity of the allocation algorithm and doesn't slow allocations down even in small clusters. 
> 
> 
> Diffs
> -----
> 
>   src/local/local.hpp 2633d25 
>   src/local/local.cpp 3402029 
>   src/master/allocator.hpp b68b67d 
>   src/master/drf_sorter.hpp 79566cc 
>   src/master/drf_sorter.cpp fe31a00 
>   src/master/hierarchical_allocator_process.hpp c1d6f54 
>   src/master/main.cpp ca0abec 
>   src/master/master.hpp c9b4b3f 
>   src/master/master.cpp 814a6e1 
>   src/master/sorter.hpp 73db6b1 
>   src/tests/allocator_tests.cpp b953cd1 
>   src/tests/allocator_zookeeper_tests.cpp 3f70202 
>   src/tests/fault_tolerance_tests.cpp 61509f1 
>   src/tests/gc_tests.cpp fbdd6d6 
>   src/tests/master_detector_tests.cpp 787ba19 
>   src/tests/master_tests.cpp 2ba14fc 
>   src/tests/resource_offers_tests.cpp 44eaf0d 
>   src/tests/sorter_tests.cpp 61e6038 
>   src/tests/utils.hpp d3efa58 
>   third_party/libprocess/third_party/stout/include/stout/stringify.hpp 136316d 
> 
> Diff: https://reviews.apache.org/r/6665/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Thomas Marshall
> 
>


Re: Review Request: Clean up HierarchicalAllocatorProcess code

Posted by Thomas Marshall <tw...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/6665/
-----------------------------------------------------------

(Updated May 9, 2013, 9:49 p.m.)


Review request for mesos and Benjamin Hindman.


Changes
-------

Updated to trunk.


Description
-------

There has been an unfortunate trend in the HierarchicalAllocatorProcess code to use lots of hashmaps between slaveIds or frameworkIds and some attribute of those frameworks/slaves. This patch collapses those into two hashmaps, one for slaves and one for frameworks, mapping ids to structs containing the appropriate attributes. The patch also introduces an allocator namespace.


Diffs (updated)
-----

  src/local/local.hpp 2633d25 
  src/local/local.cpp f43ab2a 
  src/master/allocator.hpp b68b67d 
  src/master/drf_sorter.hpp 79566cc 
  src/master/drf_sorter.cpp fe31a00 
  src/master/hierarchical_allocator_process.hpp 2422fbd 
  src/master/main.cpp d6e1c73 
  src/master/master.hpp d3790dc 
  src/master/master.cpp 3207157 
  src/master/sorter.hpp 73db6b1 
  src/tests/allocator_tests.cpp 540c05a 
  src/tests/allocator_zookeeper_tests.cpp 2c7deb1 
  src/tests/cluster.hpp 682b7d6 
  src/tests/fault_tolerance_tests.cpp 68cd5fc 
  src/tests/isolator_tests.cpp 435c780 
  src/tests/resource_offers_tests.cpp 5e53a89 
  src/tests/slave_recovery_tests.cpp d963ce1 
  src/tests/sorter_tests.cpp 95e8d28 
  src/tests/utils.hpp ca3ecd7 
  third_party/libprocess/third_party/stout/include/stout/stringify.hpp 136316d 

Diff: https://reviews.apache.org/r/6665/diff/


Testing
-------

On both OSX and Ubuntu (with this patch: https://reviews.apache.org/r/10786/ also applied)

bin/mesos-tests.sh --gtest_break_on_failure --gtest_repeat=2000 --gtest_filter=*Allocator*


Thanks,

Thomas Marshall


Re: Review Request: Clean up HierarchicalAllocatorProcess code

Posted by Thomas Marshall <tw...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/6665/
-----------------------------------------------------------

(Updated April 26, 2013, 12:31 a.m.)


Review request for mesos and Benjamin Hindman.


Changes
-------

Updated to trunk.


Description
-------

There has been an unfortunate trend in the HierarchicalAllocatorProcess code to use lots of hashmaps between slaveIds or frameworkIds and some attribute of those frameworks/slaves. This patch collapses those into two hashmaps, one for slaves and one for frameworks, mapping ids to structs containing the appropriate attributes. The patch also introduces an allocator namespace.


Diffs (updated)
-----

  src/local/local.hpp 2633d25 
  src/local/local.cpp f43ab2a 
  src/master/allocator.hpp b68b67d 
  src/master/drf_sorter.hpp 79566cc 
  src/master/drf_sorter.cpp fe31a00 
  src/master/hierarchical_allocator_process.hpp 2422fbd 
  src/master/main.cpp d6e1c73 
  src/master/master.hpp 9776a7c 
  src/master/master.cpp c3b26b1 
  src/master/sorter.hpp 73db6b1 
  src/tests/allocator_tests.cpp 540c05a 
  src/tests/allocator_zookeeper_tests.cpp 2c7deb1 
  src/tests/cluster.hpp 682b7d6 
  src/tests/fault_tolerance_tests.cpp d3476f7 
  src/tests/isolator_tests.cpp 435c780 
  src/tests/resource_offers_tests.cpp 5e53a89 
  src/tests/slave_recovery_tests.cpp c1d2000 
  src/tests/sorter_tests.cpp 95e8d28 
  src/tests/utils.hpp ca3ecd7 
  third_party/libprocess/third_party/stout/include/stout/stringify.hpp 136316d 

Diff: https://reviews.apache.org/r/6665/diff/


Testing (updated)
-------

On both OSX and Ubuntu (with this patch: https://reviews.apache.org/r/10786/ also applied)

bin/mesos-tests.sh --gtest_break_on_failure --gtest_repeat=2000 --gtest_filter=*Allocator*


Thanks,

Thomas Marshall


Re: Review Request: Clean up HierarchicalAllocatorProcess code

Posted by Thomas Marshall <tw...@gmail.com>.

> On April 17, 2013, 7:21 p.m., Ben Mahler wrote:
> > src/master/hierarchical_allocator_process.hpp, line 99
> > <https://reviews.apache.org/r/6665/diff/14/?file=281663#file281663line99>
> >
> >     Can this be const?

Correct me if I'm wrong, but we can't make this const unless we start saving the Slaves as pointers, since it makes the Slaves unassignable.


> On April 17, 2013, 7:21 p.m., Ben Mahler wrote:
> > src/master/hierarchical_allocator_process.hpp, line 116
> > <https://reviews.apache.org/r/6665/diff/14/?file=281663#file281663line116>
> >
> >     Can this be const?

Ditto.


> On April 17, 2013, 7:21 p.m., Ben Mahler wrote:
> > src/master/hierarchical_allocator_process.hpp, line 423
> > <https://reviews.apache.org/r/6665/diff/14/?file=281663#file281663line423>
> >
> >     Would love to see a cleanup if this tricky filter deletion in a follow up!

It's on my TODO list.


- Thomas


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/6665/#review19340
-----------------------------------------------------------


On April 16, 2013, 9:53 p.m., Thomas Marshall wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/6665/
> -----------------------------------------------------------
> 
> (Updated April 16, 2013, 9:53 p.m.)
> 
> 
> Review request for mesos and Benjamin Hindman.
> 
> 
> Description
> -------
> 
> There has been an unfortunate trend in the HierarchicalAllocatorProcess code to use lots of hashmaps between slaveIds or frameworkIds and some attribute of those frameworks/slaves. This patch collapses those into two hashmaps, one for slaves and one for frameworks, mapping ids to structs containing the appropriate attributes. The patch also introduces an allocator namespace.
> 
> 
> Diffs
> -----
> 
>   src/local/local.hpp 2633d25 
>   src/local/local.cpp f43ab2a 
>   src/master/allocator.hpp b68b67d 
>   src/master/drf_sorter.hpp 79566cc 
>   src/master/drf_sorter.cpp fe31a00 
>   src/master/hierarchical_allocator_process.hpp 2422fbd 
>   src/master/main.cpp d6e1c73 
>   src/master/master.hpp 9776a7c 
>   src/master/master.cpp 5b0e8c0 
>   src/master/sorter.hpp 73db6b1 
>   src/tests/allocator_tests.cpp ca8d4ea 
>   src/tests/allocator_zookeeper_tests.cpp 82a498e 
>   src/tests/cluster.hpp 28fc269 
>   src/tests/fault_tolerance_tests.cpp 0348f20 
>   src/tests/isolator_tests.cpp 4a115c3 
>   src/tests/resource_offers_tests.cpp ad3640f 
>   src/tests/slave_recovery_tests.cpp 7be332b 
>   src/tests/sorter_tests.cpp 61e6038 
>   src/tests/utils.hpp b5c577d 
>   third_party/libprocess/third_party/stout/include/stout/stringify.hpp 136316d 
> 
> Diff: https://reviews.apache.org/r/6665/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Thomas Marshall
> 
>


Re: Review Request: Clean up HierarchicalAllocatorProcess code

Posted by Ben Mahler <be...@gmail.com>.

> On April 17, 2013, 7:21 p.m., Ben Mahler wrote:
> > src/master/hierarchical_allocator_process.hpp, line 99
> > <https://reviews.apache.org/r/6665/diff/14/?file=281663#file281663line99>
> >
> >     Can this be const?
> 
> Thomas Marshall wrote:
>     Correct me if I'm wrong, but we can't make this const unless we start saving the Slaves as pointers, since it makes the Slaves unassignable.

Ah yes, thanks!


- Ben


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/6665/#review19340
-----------------------------------------------------------


On April 26, 2013, 12:31 a.m., Thomas Marshall wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/6665/
> -----------------------------------------------------------
> 
> (Updated April 26, 2013, 12:31 a.m.)
> 
> 
> Review request for mesos and Benjamin Hindman.
> 
> 
> Description
> -------
> 
> There has been an unfortunate trend in the HierarchicalAllocatorProcess code to use lots of hashmaps between slaveIds or frameworkIds and some attribute of those frameworks/slaves. This patch collapses those into two hashmaps, one for slaves and one for frameworks, mapping ids to structs containing the appropriate attributes. The patch also introduces an allocator namespace.
> 
> 
> Diffs
> -----
> 
>   src/local/local.hpp 2633d25 
>   src/local/local.cpp f43ab2a 
>   src/master/allocator.hpp b68b67d 
>   src/master/drf_sorter.hpp 79566cc 
>   src/master/drf_sorter.cpp fe31a00 
>   src/master/hierarchical_allocator_process.hpp 2422fbd 
>   src/master/main.cpp d6e1c73 
>   src/master/master.hpp 9776a7c 
>   src/master/master.cpp c3b26b1 
>   src/master/sorter.hpp 73db6b1 
>   src/tests/allocator_tests.cpp 540c05a 
>   src/tests/allocator_zookeeper_tests.cpp 2c7deb1 
>   src/tests/cluster.hpp 682b7d6 
>   src/tests/fault_tolerance_tests.cpp d3476f7 
>   src/tests/isolator_tests.cpp 435c780 
>   src/tests/resource_offers_tests.cpp 5e53a89 
>   src/tests/slave_recovery_tests.cpp c1d2000 
>   src/tests/sorter_tests.cpp 95e8d28 
>   src/tests/utils.hpp ca3ecd7 
>   third_party/libprocess/third_party/stout/include/stout/stringify.hpp 136316d 
> 
> Diff: https://reviews.apache.org/r/6665/diff/
> 
> 
> Testing
> -------
> 
> On both OSX and Ubuntu (with this patch: https://reviews.apache.org/r/10786/ also applied)
> 
> bin/mesos-tests.sh --gtest_break_on_failure --gtest_repeat=2000 --gtest_filter=*Allocator*
> 
> 
> Thanks,
> 
> Thomas Marshall
> 
>


Re: Review Request: Clean up HierarchicalAllocatorProcess code

Posted by Ben Mahler <be...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/6665/#review19340
-----------------------------------------------------------

Ship it!


This is looking good, can you ensure the allocator tests are all passing on a high number of iterations? Let me know and I can get this committed for you.


src/master/hierarchical_allocator_process.hpp
<https://reviews.apache.org/r/6665/#comment40022>

    Can this be const?



src/master/hierarchical_allocator_process.hpp
<https://reviews.apache.org/r/6665/#comment40023>

    Can this be const?



src/master/hierarchical_allocator_process.hpp
<https://reviews.apache.org/r/6665/#comment40024>

    Would love to see a cleanup if this tricky filter deletion in a follow up!


- Ben Mahler


On April 16, 2013, 9:53 p.m., Thomas Marshall wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/6665/
> -----------------------------------------------------------
> 
> (Updated April 16, 2013, 9:53 p.m.)
> 
> 
> Review request for mesos and Benjamin Hindman.
> 
> 
> Description
> -------
> 
> There has been an unfortunate trend in the HierarchicalAllocatorProcess code to use lots of hashmaps between slaveIds or frameworkIds and some attribute of those frameworks/slaves. This patch collapses those into two hashmaps, one for slaves and one for frameworks, mapping ids to structs containing the appropriate attributes. The patch also introduces an allocator namespace.
> 
> 
> Diffs
> -----
> 
>   src/local/local.hpp 2633d25 
>   src/local/local.cpp f43ab2a 
>   src/master/allocator.hpp b68b67d 
>   src/master/drf_sorter.hpp 79566cc 
>   src/master/drf_sorter.cpp fe31a00 
>   src/master/hierarchical_allocator_process.hpp 2422fbd 
>   src/master/main.cpp d6e1c73 
>   src/master/master.hpp 9776a7c 
>   src/master/master.cpp 5b0e8c0 
>   src/master/sorter.hpp 73db6b1 
>   src/tests/allocator_tests.cpp ca8d4ea 
>   src/tests/allocator_zookeeper_tests.cpp 82a498e 
>   src/tests/cluster.hpp 28fc269 
>   src/tests/fault_tolerance_tests.cpp 0348f20 
>   src/tests/isolator_tests.cpp 4a115c3 
>   src/tests/resource_offers_tests.cpp ad3640f 
>   src/tests/slave_recovery_tests.cpp 7be332b 
>   src/tests/sorter_tests.cpp 61e6038 
>   src/tests/utils.hpp b5c577d 
>   third_party/libprocess/third_party/stout/include/stout/stringify.hpp 136316d 
> 
> Diff: https://reviews.apache.org/r/6665/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Thomas Marshall
> 
>


Re: Review Request: Clean up HierarchicalAllocatorProcess code

Posted by Thomas Marshall <tw...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/6665/
-----------------------------------------------------------

(Updated April 16, 2013, 9:53 p.m.)


Review request for mesos and Benjamin Hindman.


Changes
-------

Minor fix. Actually updated to trunk now.


Description
-------

There has been an unfortunate trend in the HierarchicalAllocatorProcess code to use lots of hashmaps between slaveIds or frameworkIds and some attribute of those frameworks/slaves. This patch collapses those into two hashmaps, one for slaves and one for frameworks, mapping ids to structs containing the appropriate attributes. The patch also introduces an allocator namespace.


Diffs (updated)
-----

  src/local/local.hpp 2633d25 
  src/local/local.cpp f43ab2a 
  src/master/allocator.hpp b68b67d 
  src/master/drf_sorter.hpp 79566cc 
  src/master/drf_sorter.cpp fe31a00 
  src/master/hierarchical_allocator_process.hpp 2422fbd 
  src/master/main.cpp d6e1c73 
  src/master/master.hpp 9776a7c 
  src/master/master.cpp 5b0e8c0 
  src/master/sorter.hpp 73db6b1 
  src/tests/allocator_tests.cpp ca8d4ea 
  src/tests/allocator_zookeeper_tests.cpp 82a498e 
  src/tests/cluster.hpp 28fc269 
  src/tests/fault_tolerance_tests.cpp 0348f20 
  src/tests/isolator_tests.cpp 4a115c3 
  src/tests/resource_offers_tests.cpp ad3640f 
  src/tests/slave_recovery_tests.cpp 7be332b 
  src/tests/sorter_tests.cpp 61e6038 
  src/tests/utils.hpp b5c577d 
  third_party/libprocess/third_party/stout/include/stout/stringify.hpp 136316d 

Diff: https://reviews.apache.org/r/6665/diff/


Testing
-------

make check


Thanks,

Thomas Marshall


Re: Review Request: Clean up HierarchicalAllocatorProcess code

Posted by Thomas Marshall <tw...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/6665/
-----------------------------------------------------------

(Updated April 16, 2013, 9:08 p.m.)


Review request for mesos and Benjamin Hindman.


Changes
-------

Updated to trunk.


Description
-------

There has been an unfortunate trend in the HierarchicalAllocatorProcess code to use lots of hashmaps between slaveIds or frameworkIds and some attribute of those frameworks/slaves. This patch collapses those into two hashmaps, one for slaves and one for frameworks, mapping ids to structs containing the appropriate attributes. The patch also introduces an allocator namespace.


Diffs (updated)
-----

  src/local/local.hpp 2633d25 
  src/local/local.cpp f43ab2a 
  src/master/allocator.hpp b68b67d 
  src/master/drf_sorter.hpp 79566cc 
  src/master/drf_sorter.cpp fe31a00 
  src/master/hierarchical_allocator_process.hpp 2422fbd 
  src/master/main.cpp d6e1c73 
  src/master/master.hpp 9776a7c 
  src/master/master.cpp 5b0e8c0 
  src/master/sorter.hpp 73db6b1 
  src/tests/allocator_tests.cpp ca8d4ea 
  src/tests/allocator_zookeeper_tests.cpp 82a498e 
  src/tests/fault_tolerance_tests.cpp 0348f20 
  src/tests/resource_offers_tests.cpp ad3640f 
  src/tests/slave_recovery_tests.cpp 7be332b 
  src/tests/sorter_tests.cpp 61e6038 
  src/tests/utils.hpp b5c577d 
  third_party/libprocess/third_party/stout/include/stout/stringify.hpp 136316d 

Diff: https://reviews.apache.org/r/6665/diff/


Testing
-------

make check


Thanks,

Thomas Marshall


Re: Review Request: Clean up HierarchicalAllocatorProcess code

Posted by Thomas Marshall <tw...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/6665/
-----------------------------------------------------------

(Updated April 2, 2013, midnight)


Review request for mesos and Benjamin Hindman.


Changes
-------

Updated to trunk.


Description
-------

There has been an unfortunate trend in the HierarchicalAllocatorProcess code to use lots of hashmaps between slaveIds or frameworkIds and some attribute of those frameworks/slaves. This patch collapses those into two hashmaps, one for slaves and one for frameworks, mapping ids to structs containing the appropriate attributes. The patch also introduces an allocator namespace.


Diffs (updated)
-----

  src/local/local.hpp 2633d25 
  src/local/local.cpp f43ab2a 
  src/master/allocator.hpp b68b67d 
  src/master/drf_sorter.hpp 79566cc 
  src/master/drf_sorter.cpp fe31a00 
  src/master/hierarchical_allocator_process.hpp 2422fbd 
  src/master/main.cpp d6e1c73 
  src/master/master.hpp 9776a7c 
  src/master/master.cpp 5b0e8c0 
  src/master/sorter.hpp 73db6b1 
  src/tests/allocator_tests.cpp 04a8581 
  src/tests/allocator_zookeeper_tests.cpp cc6e22e 
  src/tests/fault_tolerance_tests.cpp d02c9b2 
  src/tests/gc_tests.cpp 037fe09 
  src/tests/master_detector_tests.cpp fe3b91f 
  src/tests/master_tests.cpp c74cb5e 
  src/tests/resource_offers_tests.cpp 53b92ab 
  src/tests/slave_recovery_tests.cpp b4276ed 
  src/tests/sorter_tests.cpp 61e6038 
  src/tests/status_update_manager_tests.cpp a355ecd 
  src/tests/utils.hpp 32784a7 
  third_party/libprocess/third_party/stout/include/stout/stringify.hpp 136316d 

Diff: https://reviews.apache.org/r/6665/diff/


Testing
-------

make check


Thanks,

Thomas Marshall


Re: Review Request: Clean up HierarchicalAllocatorProcess code

Posted by Thomas Marshall <tw...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/6665/
-----------------------------------------------------------

(Updated March 21, 2013, 11:10 p.m.)


Review request for mesos and Benjamin Hindman.


Changes
-------

Updated to trunk.


Description
-------

There has been an unfortunate trend in the HierarchicalAllocatorProcess code to use lots of hashmaps between slaveIds or frameworkIds and some attribute of those frameworks/slaves. This patch collapses those into two hashmaps, one for slaves and one for frameworks, mapping ids to structs containing the appropriate attributes. The patch also introduces an allocator namespace.


Diffs (updated)
-----

  src/local/local.hpp 2633d25 
  src/local/local.cpp f43ab2a 
  src/master/allocator.hpp b68b67d 
  src/master/drf_sorter.hpp 79566cc 
  src/master/drf_sorter.cpp fe31a00 
  src/master/hierarchical_allocator_process.hpp 2422fbd 
  src/master/main.cpp d6e1c73 
  src/master/master.hpp 9776a7c 
  src/master/master.cpp 5b0e8c0 
  src/master/sorter.hpp 73db6b1 
  src/tests/allocator_tests.cpp e8b835f 
  src/tests/allocator_zookeeper_tests.cpp 802a508 
  src/tests/fault_tolerance_tests.cpp 9d3f8b1 
  src/tests/gc_tests.cpp 67bf957 
  src/tests/master_detector_tests.cpp fe3b91f 
  src/tests/master_tests.cpp 16f7637 
  src/tests/resource_offers_tests.cpp 53b92ab 
  src/tests/slave_recovery_tests.cpp 47f9b0f 
  src/tests/sorter_tests.cpp 61e6038 
  src/tests/status_update_manager_tests.cpp 9ce47b5 
  src/tests/utils.hpp 5020357 
  third_party/libprocess/third_party/stout/include/stout/stringify.hpp 136316d 

Diff: https://reviews.apache.org/r/6665/diff/


Testing
-------

make check


Thanks,

Thomas Marshall


Re: Review Request: Clean up HierarchicalAllocatorProcess code

Posted by Ben Mahler <be...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/6665/#review17760
-----------------------------------------------------------

Ship it!


Also, if you're worried about performance, feel free to re-run your performance test to check for regressions!

- Ben Mahler


On March 12, 2013, 8:58 p.m., Thomas Marshall wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/6665/
> -----------------------------------------------------------
> 
> (Updated March 12, 2013, 8:58 p.m.)
> 
> 
> Review request for mesos and Benjamin Hindman.
> 
> 
> Description
> -------
> 
> There has been an unfortunate trend in the HierarchicalAllocatorProcess code to use lots of hashmaps between slaveIds or frameworkIds and some attribute of those frameworks/slaves. This patch collapses those into two hashmaps, one for slaves and one for frameworks, mapping ids to structs containing the appropriate attributes. The patch also introduces an allocator namespace.
> 
> 
> Diffs
> -----
> 
>   src/local/local.hpp 2633d25 
>   src/local/local.cpp 3402029 
>   src/master/allocator.hpp b68b67d 
>   src/master/drf_sorter.hpp 79566cc 
>   src/master/drf_sorter.cpp fe31a00 
>   src/master/hierarchical_allocator_process.hpp c1d6f54 
>   src/master/main.cpp ca0abec 
>   src/master/master.hpp c9b4b3f 
>   src/master/master.cpp 814a6e1 
>   src/master/sorter.hpp 73db6b1 
>   src/tests/allocator_tests.cpp b953cd1 
>   src/tests/allocator_zookeeper_tests.cpp 3f70202 
>   src/tests/fault_tolerance_tests.cpp 61509f1 
>   src/tests/gc_tests.cpp fbdd6d6 
>   src/tests/master_detector_tests.cpp 787ba19 
>   src/tests/master_tests.cpp 2ba14fc 
>   src/tests/resource_offers_tests.cpp 44eaf0d 
>   src/tests/sorter_tests.cpp 61e6038 
>   src/tests/utils.hpp d3efa58 
>   third_party/libprocess/third_party/stout/include/stout/stringify.hpp 136316d 
> 
> Diff: https://reviews.apache.org/r/6665/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Thomas Marshall
> 
>


Re: Review Request: Clean up HierarchicalAllocatorProcess code

Posted by Thomas Marshall <tw...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/6665/
-----------------------------------------------------------

(Updated March 12, 2013, 8:58 p.m.)


Review request for mesos and Benjamin Hindman.


Changes
-------

Updated to Ben M's review.


Summary (updated)
-----------------

Clean up HierarchicalAllocatorProcess code


Description (updated)
-------

There has been an unfortunate trend in the HierarchicalAllocatorProcess code to use lots of hashmaps between slaveIds or frameworkIds and some attribute of those frameworks/slaves. This patch collapses those into two hashmaps, one for slaves and one for frameworks, mapping ids to structs containing the appropriate attributes. The patch also introduces an allocator namespace.


Diffs (updated)
-----

  src/local/local.hpp 2633d25 
  src/local/local.cpp 3402029 
  src/master/allocator.hpp b68b67d 
  src/master/drf_sorter.hpp 79566cc 
  src/master/drf_sorter.cpp fe31a00 
  src/master/hierarchical_allocator_process.hpp c1d6f54 
  src/master/main.cpp ca0abec 
  src/master/master.hpp c9b4b3f 
  src/master/master.cpp 814a6e1 
  src/master/sorter.hpp 73db6b1 
  src/tests/allocator_tests.cpp b953cd1 
  src/tests/allocator_zookeeper_tests.cpp 3f70202 
  src/tests/fault_tolerance_tests.cpp 61509f1 
  src/tests/gc_tests.cpp fbdd6d6 
  src/tests/master_detector_tests.cpp 787ba19 
  src/tests/master_tests.cpp 2ba14fc 
  src/tests/resource_offers_tests.cpp 44eaf0d 
  src/tests/sorter_tests.cpp 61e6038 
  src/tests/utils.hpp d3efa58 
  third_party/libprocess/third_party/stout/include/stout/stringify.hpp 136316d 

Diff: https://reviews.apache.org/r/6665/diff/


Testing
-------

make check


Thanks,

Thomas Marshall


Re: Review Request: Speed up allocations by keeping a set of slaves with available resources

Posted by Thomas Marshall <tw...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/6665/
-----------------------------------------------------------

(Updated March 11, 2013, 9:59 p.m.)


Review request for mesos and Benjamin Hindman.


Changes
-------

Updated to Ben M's review.


Description
-------

Currently, every time we do an allocation we have to traverse the entire list of active slaves an check each one to see if its whitelisted and if it has resources to allocate. This patch keeps a set of all slaves that meet those requirements, and updates it when slaves are added/removed and when resources are allocated/recovered.

Timing comparisons, using test_framework on a local cluster on my laptop (each number is an average over 10 tests):

100 slaves, 100 tasks
without patch: 5.82s
with patch: 5.773s
improvement of about 1%

1000 slaves, 100 tasks
without patch: 8.261s
with patch: 8.07s
improvement of about 2%

Since this was a scalability issue, you'd presumably see a bigger improvement in larger clusters but its difficult to simulate more than 1000 slaves or so locally. If more convincing numbers are needed I can do some testing on EC2 (if nothing else, I'm hoping we'll have a Jenkins build with automated performance tests set up later this semester, so I can test this in the process of setting that up), but at the very least it seems clear that the patch improves the runtime complexity of the allocation algorithm and doesn't slow allocations down even in small clusters. 


Diffs (updated)
-----

  src/local/local.hpp 2633d25 
  src/local/local.cpp 3402029 
  src/master/allocator.hpp b68b67d 
  src/master/drf_sorter.hpp 79566cc 
  src/master/drf_sorter.cpp fe31a00 
  src/master/hierarchical_allocator_process.hpp c1d6f54 
  src/master/main.cpp ca0abec 
  src/master/master.hpp c9b4b3f 
  src/master/master.cpp 814a6e1 
  src/master/sorter.hpp 73db6b1 
  src/tests/allocator_tests.cpp b953cd1 
  src/tests/allocator_zookeeper_tests.cpp 3f70202 
  src/tests/fault_tolerance_tests.cpp 61509f1 
  src/tests/gc_tests.cpp fbdd6d6 
  src/tests/master_detector_tests.cpp 787ba19 
  src/tests/master_tests.cpp 2ba14fc 
  src/tests/resource_offers_tests.cpp 44eaf0d 
  src/tests/sorter_tests.cpp 61e6038 
  src/tests/utils.hpp d3efa58 
  third_party/libprocess/third_party/stout/include/stout/stringify.hpp 136316d 

Diff: https://reviews.apache.org/r/6665/diff/


Testing
-------

make check


Thanks,

Thomas Marshall


Re: Review Request: Speed up allocations by keeping a set of slaves with available resources

Posted by Ben Mahler <be...@gmail.com>.

> On March 6, 2013, 7:23 p.m., Ben Mahler wrote:
> > src/master/hierarchical_allocator_process.hpp, line 529
> > <https://reviews.apache.org/r/6665/diff/8/?file=264106#file264106line529>
> >
> >     Looks to me like allocatable can just be a function on the slave struct:
> >     
> >     bool allocatable() const { return whitelisted && isSufficient(available); }
> >     
> >     Seems cleaner, no? That way we only have to update the whitelisted bit here and elsewhere.
> >     
> >     This loop would then become:
> >     
> >     foreachkey (const SlaveID& slaveId, slaves) {
> >       slave[slaveId].whitelisted = isWhitelisted(slaveId);
> >     }
> 
> Thomas Marshall wrote:
>     I implemented this the way that you suggested because it seems reasonable, but I just wanted to note that this means that the title of this review is now meaningless and this patch only accomplishes restructuring the allocator to have clearer code and does not do anything about speeding allocations up, since we are back to checking whitelisting and resource minimums on all slaves for every allocation now (again, as per your suggestions).
>     
>     I'm assuming that your relative lack of concern about efficiency issues means that Twitter isn't experiencing any problems with the allocator running too slowly, so I'm perfectly fine with this.

Great! I don't quite get your comment here. We do indeed now check resource minimums but "checking whitelisting" is *only* checking the newly added whitelisted bit, rather than the hashset, right? (are you referring to line 699?)


> On March 6, 2013, 7:23 p.m., Ben Mahler wrote:
> > src/master/hierarchical_allocator_process.hpp, line 452
> > <https://reviews.apache.org/r/6665/diff/8/?file=264106#file264106line452>
> >
> >     Inline?
> >     
> >     slaves[slaveId] = Slave(slaveInfo);
> >     
> >     Also, maybe you want to add whitelisted as an optional constructor parameter?
> 
> Thomas Marshall wrote:
>     The problem with doing something like 'slaves[slaveId] = Slave(slaveInfo, isWhitelisted(slaveId))' is that you're calling isWhitelisted before slaveId is in slaves, meaning that the 'CHECK(slaves.contains(slaveId)' in isWhitelisted will fail and 'slaves[slaveId].hostname()' won't return anything. We could modify isWhitelisted to take the hostname as the parameter instead of a slaveId, but given that we're also using isWhitelisted in updateWhitelist, I would rather leave isWhitelisted the way it is.

Got it, sounds like a bad abstraction, maybe we should fix that in a later review!


- Ben


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/6665/#review17482
-----------------------------------------------------------


On March 11, 2013, 9:59 p.m., Thomas Marshall wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/6665/
> -----------------------------------------------------------
> 
> (Updated March 11, 2013, 9:59 p.m.)
> 
> 
> Review request for mesos and Benjamin Hindman.
> 
> 
> Description
> -------
> 
> Currently, every time we do an allocation we have to traverse the entire list of active slaves an check each one to see if its whitelisted and if it has resources to allocate. This patch keeps a set of all slaves that meet those requirements, and updates it when slaves are added/removed and when resources are allocated/recovered.
> 
> Timing comparisons, using test_framework on a local cluster on my laptop (each number is an average over 10 tests):
> 
> 100 slaves, 100 tasks
> without patch: 5.82s
> with patch: 5.773s
> improvement of about 1%
> 
> 1000 slaves, 100 tasks
> without patch: 8.261s
> with patch: 8.07s
> improvement of about 2%
> 
> Since this was a scalability issue, you'd presumably see a bigger improvement in larger clusters but its difficult to simulate more than 1000 slaves or so locally. If more convincing numbers are needed I can do some testing on EC2 (if nothing else, I'm hoping we'll have a Jenkins build with automated performance tests set up later this semester, so I can test this in the process of setting that up), but at the very least it seems clear that the patch improves the runtime complexity of the allocation algorithm and doesn't slow allocations down even in small clusters. 
> 
> 
> Diffs
> -----
> 
>   src/local/local.hpp 2633d25 
>   src/local/local.cpp 3402029 
>   src/master/allocator.hpp b68b67d 
>   src/master/drf_sorter.hpp 79566cc 
>   src/master/drf_sorter.cpp fe31a00 
>   src/master/hierarchical_allocator_process.hpp c1d6f54 
>   src/master/main.cpp ca0abec 
>   src/master/master.hpp c9b4b3f 
>   src/master/master.cpp 814a6e1 
>   src/master/sorter.hpp 73db6b1 
>   src/tests/allocator_tests.cpp b953cd1 
>   src/tests/allocator_zookeeper_tests.cpp 3f70202 
>   src/tests/fault_tolerance_tests.cpp 61509f1 
>   src/tests/gc_tests.cpp fbdd6d6 
>   src/tests/master_detector_tests.cpp 787ba19 
>   src/tests/master_tests.cpp 2ba14fc 
>   src/tests/resource_offers_tests.cpp 44eaf0d 
>   src/tests/sorter_tests.cpp 61e6038 
>   src/tests/utils.hpp d3efa58 
>   third_party/libprocess/third_party/stout/include/stout/stringify.hpp 136316d 
> 
> Diff: https://reviews.apache.org/r/6665/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Thomas Marshall
> 
>


Re: Review Request: Clean up HierarchicalAllocatorProcess code

Posted by Ben Mahler <be...@gmail.com>.

> On March 6, 2013, 7:23 p.m., Ben Mahler wrote:
> > src/master/hierarchical_allocator_process.hpp, line 529
> > <https://reviews.apache.org/r/6665/diff/8/?file=264106#file264106line529>
> >
> >     Looks to me like allocatable can just be a function on the slave struct:
> >     
> >     bool allocatable() const { return whitelisted && isSufficient(available); }
> >     
> >     Seems cleaner, no? That way we only have to update the whitelisted bit here and elsewhere.
> >     
> >     This loop would then become:
> >     
> >     foreachkey (const SlaveID& slaveId, slaves) {
> >       slave[slaveId].whitelisted = isWhitelisted(slaveId);
> >     }
> 
> Thomas Marshall wrote:
>     I implemented this the way that you suggested because it seems reasonable, but I just wanted to note that this means that the title of this review is now meaningless and this patch only accomplishes restructuring the allocator to have clearer code and does not do anything about speeding allocations up, since we are back to checking whitelisting and resource minimums on all slaves for every allocation now (again, as per your suggestions).
>     
>     I'm assuming that your relative lack of concern about efficiency issues means that Twitter isn't experiencing any problems with the allocator running too slowly, so I'm perfectly fine with this.
> 
> Ben Mahler wrote:
>     Great! I don't quite get your comment here. We do indeed now check resource minimums but "checking whitelisting" is *only* checking the newly added whitelisted bit, rather than the hashset, right? (are you referring to line 699?)
> 
> Thomas Marshall wrote:
>     Yes, I'm referring to line 699, which now essentially does the work of the giant for loop at the beginning of allocate() eliminating which was the original goal of this patch. It's true that we've changed checking whitelist status from a call to isWhitelisted (which checks against the hashset) to checking the whitelisted bit, but we've reverted from not needing to check resources (because membership in the list we were iterating through indicated sufficient resources), to checking a bit for sufficient resources, back to actually pulling out and checking resources for each slave.
>     
>     Again, this isn't a problem for me, and I'll update the title and description accordingly.

Also, for more background, we undertook some performance analysis at twitter on a large simulated cluster. A large amount of time (~20%) was spent in the allocator, from which significant time was spent working with the resources. Check out this pprof call graph:

https://dl.dropbox.com/u/8265424/pprof.svg


- Ben


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/6665/#review17482
-----------------------------------------------------------


On March 12, 2013, 8:58 p.m., Thomas Marshall wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/6665/
> -----------------------------------------------------------
> 
> (Updated March 12, 2013, 8:58 p.m.)
> 
> 
> Review request for mesos and Benjamin Hindman.
> 
> 
> Description
> -------
> 
> There has been an unfortunate trend in the HierarchicalAllocatorProcess code to use lots of hashmaps between slaveIds or frameworkIds and some attribute of those frameworks/slaves. This patch collapses those into two hashmaps, one for slaves and one for frameworks, mapping ids to structs containing the appropriate attributes. The patch also introduces an allocator namespace.
> 
> 
> Diffs
> -----
> 
>   src/local/local.hpp 2633d25 
>   src/local/local.cpp 3402029 
>   src/master/allocator.hpp b68b67d 
>   src/master/drf_sorter.hpp 79566cc 
>   src/master/drf_sorter.cpp fe31a00 
>   src/master/hierarchical_allocator_process.hpp c1d6f54 
>   src/master/main.cpp ca0abec 
>   src/master/master.hpp c9b4b3f 
>   src/master/master.cpp 814a6e1 
>   src/master/sorter.hpp 73db6b1 
>   src/tests/allocator_tests.cpp b953cd1 
>   src/tests/allocator_zookeeper_tests.cpp 3f70202 
>   src/tests/fault_tolerance_tests.cpp 61509f1 
>   src/tests/gc_tests.cpp fbdd6d6 
>   src/tests/master_detector_tests.cpp 787ba19 
>   src/tests/master_tests.cpp 2ba14fc 
>   src/tests/resource_offers_tests.cpp 44eaf0d 
>   src/tests/sorter_tests.cpp 61e6038 
>   src/tests/utils.hpp d3efa58 
>   third_party/libprocess/third_party/stout/include/stout/stringify.hpp 136316d 
> 
> Diff: https://reviews.apache.org/r/6665/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Thomas Marshall
> 
>


Re: Review Request: Speed up allocations by keeping a set of slaves with available resources

Posted by Thomas Marshall <tw...@gmail.com>.

> On March 6, 2013, 7:23 p.m., Ben Mahler wrote:
> > src/master/hierarchical_allocator_process.hpp, line 529
> > <https://reviews.apache.org/r/6665/diff/8/?file=264106#file264106line529>
> >
> >     Looks to me like allocatable can just be a function on the slave struct:
> >     
> >     bool allocatable() const { return whitelisted && isSufficient(available); }
> >     
> >     Seems cleaner, no? That way we only have to update the whitelisted bit here and elsewhere.
> >     
> >     This loop would then become:
> >     
> >     foreachkey (const SlaveID& slaveId, slaves) {
> >       slave[slaveId].whitelisted = isWhitelisted(slaveId);
> >     }

I implemented this the way that you suggested because it seems reasonable, but I just wanted to note that this means that the title of this review is now meaningless and this patch only accomplishes restructuring the allocator to have clearer code and does not do anything about speeding allocations up, since we are back to checking whitelisting and resource minimums on all slaves for every allocation now (again, as per your suggestions).

I'm assuming that your relative lack of concern about efficiency issues means that Twitter isn't experiencing any problems with the allocator running too slowly, so I'm perfectly fine with this.


> On March 6, 2013, 7:23 p.m., Ben Mahler wrote:
> > src/master/hierarchical_allocator_process.hpp, line 452
> > <https://reviews.apache.org/r/6665/diff/8/?file=264106#file264106line452>
> >
> >     Inline?
> >     
> >     slaves[slaveId] = Slave(slaveInfo);
> >     
> >     Also, maybe you want to add whitelisted as an optional constructor parameter?

The problem with doing something like 'slaves[slaveId] = Slave(slaveInfo, isWhitelisted(slaveId))' is that you're calling isWhitelisted before slaveId is in slaves, meaning that the 'CHECK(slaves.contains(slaveId)' in isWhitelisted will fail and 'slaves[slaveId].hostname()' won't return anything. We could modify isWhitelisted to take the hostname as the parameter instead of a slaveId, but given that we're also using isWhitelisted in updateWhitelist, I would rather leave isWhitelisted the way it is.


> On March 6, 2013, 7:23 p.m., Ben Mahler wrote:
> > src/master/hierarchical_allocator_process.hpp, line 792
> > <https://reviews.apache.org/r/6665/diff/8/?file=264106#file264106line792>
> >
> >     const&?

This causes a seg fault because slaves[slaveId].available is updated below.


- Thomas


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/6665/#review17482
-----------------------------------------------------------


On March 11, 2013, 9:59 p.m., Thomas Marshall wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/6665/
> -----------------------------------------------------------
> 
> (Updated March 11, 2013, 9:59 p.m.)
> 
> 
> Review request for mesos and Benjamin Hindman.
> 
> 
> Description
> -------
> 
> Currently, every time we do an allocation we have to traverse the entire list of active slaves an check each one to see if its whitelisted and if it has resources to allocate. This patch keeps a set of all slaves that meet those requirements, and updates it when slaves are added/removed and when resources are allocated/recovered.
> 
> Timing comparisons, using test_framework on a local cluster on my laptop (each number is an average over 10 tests):
> 
> 100 slaves, 100 tasks
> without patch: 5.82s
> with patch: 5.773s
> improvement of about 1%
> 
> 1000 slaves, 100 tasks
> without patch: 8.261s
> with patch: 8.07s
> improvement of about 2%
> 
> Since this was a scalability issue, you'd presumably see a bigger improvement in larger clusters but its difficult to simulate more than 1000 slaves or so locally. If more convincing numbers are needed I can do some testing on EC2 (if nothing else, I'm hoping we'll have a Jenkins build with automated performance tests set up later this semester, so I can test this in the process of setting that up), but at the very least it seems clear that the patch improves the runtime complexity of the allocation algorithm and doesn't slow allocations down even in small clusters. 
> 
> 
> Diffs
> -----
> 
>   src/local/local.hpp 2633d25 
>   src/local/local.cpp 3402029 
>   src/master/allocator.hpp b68b67d 
>   src/master/drf_sorter.hpp 79566cc 
>   src/master/drf_sorter.cpp fe31a00 
>   src/master/hierarchical_allocator_process.hpp c1d6f54 
>   src/master/main.cpp ca0abec 
>   src/master/master.hpp c9b4b3f 
>   src/master/master.cpp 814a6e1 
>   src/master/sorter.hpp 73db6b1 
>   src/tests/allocator_tests.cpp b953cd1 
>   src/tests/allocator_zookeeper_tests.cpp 3f70202 
>   src/tests/fault_tolerance_tests.cpp 61509f1 
>   src/tests/gc_tests.cpp fbdd6d6 
>   src/tests/master_detector_tests.cpp 787ba19 
>   src/tests/master_tests.cpp 2ba14fc 
>   src/tests/resource_offers_tests.cpp 44eaf0d 
>   src/tests/sorter_tests.cpp 61e6038 
>   src/tests/utils.hpp d3efa58 
>   third_party/libprocess/third_party/stout/include/stout/stringify.hpp 136316d 
> 
> Diff: https://reviews.apache.org/r/6665/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Thomas Marshall
> 
>


Re: Review Request: Clean up HierarchicalAllocatorProcess code

Posted by Thomas Marshall <tw...@gmail.com>.

> On March 6, 2013, 7:23 p.m., Ben Mahler wrote:
> > src/master/hierarchical_allocator_process.hpp, line 529
> > <https://reviews.apache.org/r/6665/diff/8/?file=264106#file264106line529>
> >
> >     Looks to me like allocatable can just be a function on the slave struct:
> >     
> >     bool allocatable() const { return whitelisted && isSufficient(available); }
> >     
> >     Seems cleaner, no? That way we only have to update the whitelisted bit here and elsewhere.
> >     
> >     This loop would then become:
> >     
> >     foreachkey (const SlaveID& slaveId, slaves) {
> >       slave[slaveId].whitelisted = isWhitelisted(slaveId);
> >     }
> 
> Thomas Marshall wrote:
>     I implemented this the way that you suggested because it seems reasonable, but I just wanted to note that this means that the title of this review is now meaningless and this patch only accomplishes restructuring the allocator to have clearer code and does not do anything about speeding allocations up, since we are back to checking whitelisting and resource minimums on all slaves for every allocation now (again, as per your suggestions).
>     
>     I'm assuming that your relative lack of concern about efficiency issues means that Twitter isn't experiencing any problems with the allocator running too slowly, so I'm perfectly fine with this.
> 
> Ben Mahler wrote:
>     Great! I don't quite get your comment here. We do indeed now check resource minimums but "checking whitelisting" is *only* checking the newly added whitelisted bit, rather than the hashset, right? (are you referring to line 699?)

Yes, I'm referring to line 699, which now essentially does the work of the giant for loop at the beginning of allocate() eliminating which was the original goal of this patch. It's true that we've changed checking whitelist status from a call to isWhitelisted (which checks against the hashset) to checking the whitelisted bit, but we've reverted from not needing to check resources (because membership in the list we were iterating through indicated sufficient resources), to checking a bit for sufficient resources, back to actually pulling out and checking resources for each slave.

Again, this isn't a problem for me, and I'll update the title and description accordingly.


- Thomas


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/6665/#review17482
-----------------------------------------------------------


On March 12, 2013, 8:58 p.m., Thomas Marshall wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/6665/
> -----------------------------------------------------------
> 
> (Updated March 12, 2013, 8:58 p.m.)
> 
> 
> Review request for mesos and Benjamin Hindman.
> 
> 
> Description
> -------
> 
> There has been an unfortunate trend in the HierarchicalAllocatorProcess code to use lots of hashmaps between slaveIds or frameworkIds and some attribute of those frameworks/slaves. This patch collapses those into two hashmaps, one for slaves and one for frameworks, mapping ids to structs containing the appropriate attributes. The patch also introduces an allocator namespace.
> 
> 
> Diffs
> -----
> 
>   src/local/local.hpp 2633d25 
>   src/local/local.cpp 3402029 
>   src/master/allocator.hpp b68b67d 
>   src/master/drf_sorter.hpp 79566cc 
>   src/master/drf_sorter.cpp fe31a00 
>   src/master/hierarchical_allocator_process.hpp c1d6f54 
>   src/master/main.cpp ca0abec 
>   src/master/master.hpp c9b4b3f 
>   src/master/master.cpp 814a6e1 
>   src/master/sorter.hpp 73db6b1 
>   src/tests/allocator_tests.cpp b953cd1 
>   src/tests/allocator_zookeeper_tests.cpp 3f70202 
>   src/tests/fault_tolerance_tests.cpp 61509f1 
>   src/tests/gc_tests.cpp fbdd6d6 
>   src/tests/master_detector_tests.cpp 787ba19 
>   src/tests/master_tests.cpp 2ba14fc 
>   src/tests/resource_offers_tests.cpp 44eaf0d 
>   src/tests/sorter_tests.cpp 61e6038 
>   src/tests/utils.hpp d3efa58 
>   third_party/libprocess/third_party/stout/include/stout/stringify.hpp 136316d 
> 
> Diff: https://reviews.apache.org/r/6665/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Thomas Marshall
> 
>


Re: Review Request: Speed up allocations by keeping a set of slaves with available resources

Posted by Ben Mahler <be...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/6665/#review17482
-----------------------------------------------------------


Thanks for the cleanup, this is looking much better! Sorry for all the back and forth =/


src/master/hierarchical_allocator_process.hpp
<https://reviews.apache.org/r/6665/#comment37081>

    I like the defensiveness of the CHECKs, however, I think it's too much complexity, given we already use default constructors all over the place in our map values.
    
    So I think for Slave and Framework, let's
    
    1. kill the 'initialized' member
    2. one-line the 'resources()', 'hostname()', and 'user()' helpers (akin to the duration.hpp one-line methods)
    3. have the default constructor have no initializer list (unless the compiler complains?)
    
    I think that will simplify this code quite a bit!



src/master/hierarchical_allocator_process.hpp
<https://reviews.apache.org/r/6665/#comment37077>

    const



src/master/hierarchical_allocator_process.hpp
<https://reviews.apache.org/r/6665/#comment37078>

    const



src/master/hierarchical_allocator_process.hpp
<https://reviews.apache.org/r/6665/#comment37079>

    const



src/master/hierarchical_allocator_process.hpp
<https://reviews.apache.org/r/6665/#comment37082>

    inline?
    
    frameworks[frameworkId] = Framework(frameworkInfo);



src/master/hierarchical_allocator_process.hpp
<https://reviews.apache.org/r/6665/#comment37083>

    const&?



src/master/hierarchical_allocator_process.hpp
<https://reviews.apache.org/r/6665/#comment37086>

    much nicer! But maybe we need to keep the comment as to why we don't delete the filters (in the Framework destructor)



src/master/hierarchical_allocator_process.hpp
<https://reviews.apache.org/r/6665/#comment37084>

    const&?



src/master/hierarchical_allocator_process.hpp
<https://reviews.apache.org/r/6665/#comment37087>

    Here as well, keep the comment on why we don't delete them.



src/master/hierarchical_allocator_process.hpp
<https://reviews.apache.org/r/6665/#comment37088>

    Inline?
    
    slaves[slaveId] = Slave(slaveInfo);
    
    Also, maybe you want to add whitelisted as an optional constructor parameter?



src/master/hierarchical_allocator_process.hpp
<https://reviews.apache.org/r/6665/#comment37092>

    Looks like stringify will work instead?
    
    LOG(INFO) << "Updated slave white list: " << strinfify(whitelist.get());
    
    We have set utilities in stringify.hpp
    
    template <typename T>
    std::string stringify(const std::set<T>& set)
    {
      std::ostringstream out;
      out << "{ ";
      typename std::set<T>::const_iterator iterator = set.begin();
      while (iterator != set.end()) {
        out << stringify(*iterator);
        if (++iterator != set.end()) {
          out << ", ";
        }
      }
      out << " }";
      return out.str();
    }



src/master/hierarchical_allocator_process.hpp
<https://reviews.apache.org/r/6665/#comment37096>

    Looks to me like allocatable can just be a function on the slave struct:
    
    bool allocatable() const { return whitelisted && isSufficient(available); }
    
    Seems cleaner, no? That way we only have to update the whitelisted bit here and elsewhere.
    
    This loop would then become:
    
    foreachkey (const SlaveID& slaveId, slaves) {
      slave[slaveId].whitelisted = isWhitelisted(slaveId);
    }



src/master/hierarchical_allocator_process.hpp
<https://reviews.apache.org/r/6665/#comment37099>

    else on the same line as closing brace



src/master/hierarchical_allocator_process.hpp
<https://reviews.apache.org/r/6665/#comment37095>

    ditto



src/master/hierarchical_allocator_process.hpp
<https://reviews.apache.org/r/6665/#comment37100>

    const&?



src/master/hierarchical_allocator_process.hpp
<https://reviews.apache.org/r/6665/#comment37101>

    const&?



src/master/hierarchical_allocator_process.hpp
<https://reviews.apache.org/r/6665/#comment37102>

    empty()?



src/master/hierarchical_allocator_process.hpp
<https://reviews.apache.org/r/6665/#comment37103>

    const&?



src/master/hierarchical_allocator_process.hpp
<https://reviews.apache.org/r/6665/#comment37105>

    !empty()?



src/master/hierarchical_allocator_process.hpp
<https://reviews.apache.org/r/6665/#comment37107>

    Supposing you made allocatable a function in Slave as I described above, it would now be returning false as expected right?
    
    Since isSufficient(slave.available) == false?



src/master/hierarchical_allocator_process.hpp
<https://reviews.apache.org/r/6665/#comment37076>

    Just an FYI, Vinod has introduced helpers to resources:
    
    Option<double> cpus = resources.cpus();
    Option<double> mem = resources.mem();
    
    But feel free to leave as is, since we don't currently have an optional default value in Option::get(). So using these helpers here is probably more code than what you're doing now.


- Ben Mahler


On March 1, 2013, 7:48 p.m., Thomas Marshall wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/6665/
> -----------------------------------------------------------
> 
> (Updated March 1, 2013, 7:48 p.m.)
> 
> 
> Review request for mesos and Benjamin Hindman.
> 
> 
> Description
> -------
> 
> Currently, every time we do an allocation we have to traverse the entire list of active slaves an check each one to see if its whitelisted and if it has resources to allocate. This patch keeps a set of all slaves that meet those requirements, and updates it when slaves are added/removed and when resources are allocated/recovered.
> 
> Timing comparisons, using test_framework on a local cluster on my laptop (each number is an average over 10 tests):
> 
> 100 slaves, 100 tasks
> without patch: 5.82s
> with patch: 5.773s
> improvement of about 1%
> 
> 1000 slaves, 100 tasks
> without patch: 8.261s
> with patch: 8.07s
> improvement of about 2%
> 
> Since this was a scalability issue, you'd presumably see a bigger improvement in larger clusters but its difficult to simulate more than 1000 slaves or so locally. If more convincing numbers are needed I can do some testing on EC2 (if nothing else, I'm hoping we'll have a Jenkins build with automated performance tests set up later this semester, so I can test this in the process of setting that up), but at the very least it seems clear that the patch improves the runtime complexity of the allocation algorithm and doesn't slow allocations down even in small clusters. 
> 
> 
> Diffs
> -----
> 
>   src/local/local.hpp 2633d25 
>   src/local/local.cpp 3402029 
>   src/master/allocator.hpp b68b67d 
>   src/master/drf_sorter.hpp 79566cc 
>   src/master/drf_sorter.cpp 33a8ec8 
>   src/master/hierarchical_allocator_process.hpp 33e059c 
>   src/master/main.cpp ca0abec 
>   src/master/master.hpp c9b4b3f 
>   src/master/master.cpp 814a6e1 
>   src/master/sorter.hpp 73db6b1 
>   src/tests/allocator_tests.cpp 44d72ad 
>   src/tests/allocator_zookeeper_tests.cpp 5f83dd7 
>   src/tests/fault_tolerance_tests.cpp 44eef03 
>   src/tests/gc_tests.cpp 411bcc0 
>   src/tests/master_detector_tests.cpp 5d09bab 
>   src/tests/master_tests.cpp e140f40 
>   src/tests/resource_offers_tests.cpp 5981191 
>   src/tests/sorter_tests.cpp 61e6038 
>   src/tests/utils.hpp 4600c2f 
> 
> Diff: https://reviews.apache.org/r/6665/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Thomas Marshall
> 
>