You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@mesos.apache.org by Vinod Kone <vi...@gmail.com> on 2014/08/06 02:49:49 UTC

Review Request 24356: Fixed allocator to do allocations per slave rather than per framework.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24356/
-----------------------------------------------------------

Review request for mesos and Benjamin Hindman.


Bugs: MESOS-1119
    https://issues.apache.org/jira/browse/MESOS-1119


Repository: mesos-git


Description
-------

The allocator now updates the DRF algorithm after every allocation of a slave's resources. This will make the allocation more granular and fair.


Diffs
-----

  src/master/hierarchical_allocator_process.hpp 35d1579bbd665dfdd238932902ae16880dadeb66 
  src/tests/allocator_tests.cpp b920533bbc36a154c02b467673bc7a0d50d65bc7 

Diff: https://reviews.apache.org/r/24356/diff/


Testing
-------

make check

make check GTEST_FILTER="*PerSlaveAllocation*" GTEST_REPEAT=1000 GTEST_BREAK_ON_FAILURE=1 GTEST_VERBOSE=1 MESOS_VERBOSE=1 GLOG_v=1


Thanks,

Vinod Kone


Re: Review Request 24356: Fixed allocator to do allocations per slave rather than per framework.

Posted by Vinod Kone <vi...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24356/
-----------------------------------------------------------

(Updated Aug. 7, 2014, 11:02 p.m.)


Review request for mesos, Adam B and Benjamin Hindman.


Changes
-------

benh's comments.


Bugs: MESOS-1119
    https://issues.apache.org/jira/browse/MESOS-1119


Repository: mesos-git


Description
-------

The allocator now updates the DRF algorithm after every allocation of a slave's resources. This will make the allocation more granular and fair.


Diffs (updated)
-----

  src/master/hierarchical_allocator_process.hpp d81082f7fa2a4ffc13f18c232abb8ad2814bebc1 
  src/tests/allocator_tests.cpp f0226cb3fa4b54f917c0a0f59b986e9115352507 

Diff: https://reviews.apache.org/r/24356/diff/


Testing
-------

make check

make check GTEST_FILTER="*PerSlaveAllocation*" GTEST_REPEAT=1000 GTEST_BREAK_ON_FAILURE=1 GTEST_VERBOSE=1 MESOS_VERBOSE=1 GLOG_v=1


Thanks,

Vinod Kone


Re: Review Request 24356: Fixed allocator to do allocations per slave rather than per framework.

Posted by Vinod Kone <vi...@gmail.com>.

> On Aug. 7, 2014, 3:39 p.m., Adam B wrote:
> > Have you done any simulations or performance tests on a test cluster? I'd love to see some numbers that prove that this is more fair in some scenario than the previous algorithm. Not sure I can blindly trust a unit test.

Not yet. I'll have to think about how to test "fairness". Intuitively though, this is doing more fine-grained resource allocation. Previously we were giving roughly the entire cluster's resources to one framework with the smallest share and waiting for it decline them in order to give them to another framework. With this change, a framework will be allocated enough slaves' resources to make its DRF share on par/better than other frameworks. This way multiple frameworks would be given "some" resources during every allocation, while still honoring DRF. It would also help ameliorate concerns around frameworks holding on to offers and not playing well with DRF.


- Vinod


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24356/#review49901
-----------------------------------------------------------


On Aug. 7, 2014, 11:02 p.m., Vinod Kone wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24356/
> -----------------------------------------------------------
> 
> (Updated Aug. 7, 2014, 11:02 p.m.)
> 
> 
> Review request for mesos, Adam B and Benjamin Hindman.
> 
> 
> Bugs: MESOS-1119
>     https://issues.apache.org/jira/browse/MESOS-1119
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> The allocator now updates the DRF algorithm after every allocation of a slave's resources. This will make the allocation more granular and fair.
> 
> 
> Diffs
> -----
> 
>   src/master/hierarchical_allocator_process.hpp d81082f7fa2a4ffc13f18c232abb8ad2814bebc1 
>   src/tests/allocator_tests.cpp f0226cb3fa4b54f917c0a0f59b986e9115352507 
> 
> Diff: https://reviews.apache.org/r/24356/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> make check GTEST_FILTER="*PerSlaveAllocation*" GTEST_REPEAT=1000 GTEST_BREAK_ON_FAILURE=1 GTEST_VERBOSE=1 MESOS_VERBOSE=1 GLOG_v=1
> 
> 
> Thanks,
> 
> Vinod Kone
> 
>


Re: Review Request 24356: Fixed allocator to do allocations per slave rather than per framework.

Posted by Adam B <ad...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24356/#review49901
-----------------------------------------------------------


Have you done any simulations or performance tests on a test cluster? I'd love to see some numbers that prove that this is more fair in some scenario than the previous algorithm. Not sure I can blindly trust a unit test.

- Adam B


On Aug. 6, 2014, 3:56 p.m., Vinod Kone wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24356/
> -----------------------------------------------------------
> 
> (Updated Aug. 6, 2014, 3:56 p.m.)
> 
> 
> Review request for mesos, Adam B and Benjamin Hindman.
> 
> 
> Bugs: MESOS-1119
>     https://issues.apache.org/jira/browse/MESOS-1119
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> The allocator now updates the DRF algorithm after every allocation of a slave's resources. This will make the allocation more granular and fair.
> 
> 
> Diffs
> -----
> 
>   src/master/hierarchical_allocator_process.hpp c7e689e882e88ef5adb31e72909ef85d8a8c0067 
>   src/tests/allocator_tests.cpp f0226cb3fa4b54f917c0a0f59b986e9115352507 
> 
> Diff: https://reviews.apache.org/r/24356/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> make check GTEST_FILTER="*PerSlaveAllocation*" GTEST_REPEAT=1000 GTEST_BREAK_ON_FAILURE=1 GTEST_VERBOSE=1 MESOS_VERBOSE=1 GLOG_v=1
> 
> 
> Thanks,
> 
> Vinod Kone
> 
>


Re: Review Request 24356: Fixed allocator to do allocations per slave rather than per framework.

Posted by Vinod Kone <vi...@gmail.com>.

> On Aug. 7, 2014, 3:23 p.m., Benjamin Hindman wrote:
> > src/tests/allocator_tests.cpp, lines 405-407
> > <https://reviews.apache.org/r/24356/diff/3/?file=654288#file654288line405>
> >
> >     This makes me feel like you're assuming there is going to be a time delay between when the resources are recovered and the next allocation, otherwise, especially since you set the refuse seconds to 0, these resources should be reallocated right away. I think you want to have the other framework already registered before the first framework declines so that you're guaranteed that there isn't some weird timing event that makes this test be flaky. Or if I'm missing something please leave a big comment explaining why you don't need to do it that way, and any assumptions you're making.

Added a comment.


- Vinod


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24356/#review49898
-----------------------------------------------------------


On Aug. 7, 2014, 11:02 p.m., Vinod Kone wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24356/
> -----------------------------------------------------------
> 
> (Updated Aug. 7, 2014, 11:02 p.m.)
> 
> 
> Review request for mesos, Adam B and Benjamin Hindman.
> 
> 
> Bugs: MESOS-1119
>     https://issues.apache.org/jira/browse/MESOS-1119
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> The allocator now updates the DRF algorithm after every allocation of a slave's resources. This will make the allocation more granular and fair.
> 
> 
> Diffs
> -----
> 
>   src/master/hierarchical_allocator_process.hpp d81082f7fa2a4ffc13f18c232abb8ad2814bebc1 
>   src/tests/allocator_tests.cpp f0226cb3fa4b54f917c0a0f59b986e9115352507 
> 
> Diff: https://reviews.apache.org/r/24356/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> make check GTEST_FILTER="*PerSlaveAllocation*" GTEST_REPEAT=1000 GTEST_BREAK_ON_FAILURE=1 GTEST_VERBOSE=1 MESOS_VERBOSE=1 GLOG_v=1
> 
> 
> Thanks,
> 
> Vinod Kone
> 
>


Re: Review Request 24356: Fixed allocator to do allocations per slave rather than per framework.

Posted by Benjamin Hindman <be...@berkeley.edu>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24356/#review49898
-----------------------------------------------------------

Ship it!


See my comment in the test below, otherwise this looks good to go.


src/tests/allocator_tests.cpp
<https://reviews.apache.org/r/24356/#comment87310>

    Can we still not do Some("role1,role2")? I'm guessing you need Option<string> because the compiler can't go from const char* to Option<string> right?



src/tests/allocator_tests.cpp
<https://reviews.apache.org/r/24356/#comment87311>

    This makes me feel like you're assuming there is going to be a time delay between when the resources are recovered and the next allocation, otherwise, especially since you set the refuse seconds to 0, these resources should be reallocated right away. I think you want to have the other framework already registered before the first framework declines so that you're guaranteed that there isn't some weird timing event that makes this test be flaky. Or if I'm missing something please leave a big comment explaining why you don't need to do it that way, and any assumptions you're making.


- Benjamin Hindman


On Aug. 6, 2014, 10:56 p.m., Vinod Kone wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24356/
> -----------------------------------------------------------
> 
> (Updated Aug. 6, 2014, 10:56 p.m.)
> 
> 
> Review request for mesos, Adam B and Benjamin Hindman.
> 
> 
> Bugs: MESOS-1119
>     https://issues.apache.org/jira/browse/MESOS-1119
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> The allocator now updates the DRF algorithm after every allocation of a slave's resources. This will make the allocation more granular and fair.
> 
> 
> Diffs
> -----
> 
>   src/master/hierarchical_allocator_process.hpp c7e689e882e88ef5adb31e72909ef85d8a8c0067 
>   src/tests/allocator_tests.cpp f0226cb3fa4b54f917c0a0f59b986e9115352507 
> 
> Diff: https://reviews.apache.org/r/24356/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> make check GTEST_FILTER="*PerSlaveAllocation*" GTEST_REPEAT=1000 GTEST_BREAK_ON_FAILURE=1 GTEST_VERBOSE=1 MESOS_VERBOSE=1 GLOG_v=1
> 
> 
> Thanks,
> 
> Vinod Kone
> 
>


Re: Review Request 24356: Fixed allocator to do allocations per slave rather than per framework.

Posted by Vinod Kone <vi...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24356/
-----------------------------------------------------------

(Updated Aug. 6, 2014, 10:56 p.m.)


Review request for mesos, Adam B and Benjamin Hindman.


Changes
-------

rebased. s/resourcesUnused/resourcesRecovered/


Bugs: MESOS-1119
    https://issues.apache.org/jira/browse/MESOS-1119


Repository: mesos-git


Description
-------

The allocator now updates the DRF algorithm after every allocation of a slave's resources. This will make the allocation more granular and fair.


Diffs (updated)
-----

  src/master/hierarchical_allocator_process.hpp c7e689e882e88ef5adb31e72909ef85d8a8c0067 
  src/tests/allocator_tests.cpp f0226cb3fa4b54f917c0a0f59b986e9115352507 

Diff: https://reviews.apache.org/r/24356/diff/


Testing
-------

make check

make check GTEST_FILTER="*PerSlaveAllocation*" GTEST_REPEAT=1000 GTEST_BREAK_ON_FAILURE=1 GTEST_VERBOSE=1 MESOS_VERBOSE=1 GLOG_v=1


Thanks,

Vinod Kone


Re: Review Request 24356: Fixed allocator to do allocations per slave rather than per framework.

Posted by Vinod Kone <vi...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24356/
-----------------------------------------------------------

(Updated Aug. 6, 2014, 10:24 p.m.)


Review request for mesos, Adam B and Benjamin Hindman.


Changes
-------

comments. PTAL.


Bugs: MESOS-1119
    https://issues.apache.org/jira/browse/MESOS-1119


Repository: mesos-git


Description
-------

The allocator now updates the DRF algorithm after every allocation of a slave's resources. This will make the allocation more granular and fair.


Diffs (updated)
-----

  src/master/hierarchical_allocator_process.hpp c7e689e882e88ef5adb31e72909ef85d8a8c0067 
  src/tests/allocator_tests.cpp f0226cb3fa4b54f917c0a0f59b986e9115352507 

Diff: https://reviews.apache.org/r/24356/diff/


Testing
-------

make check

make check GTEST_FILTER="*PerSlaveAllocation*" GTEST_REPEAT=1000 GTEST_BREAK_ON_FAILURE=1 GTEST_VERBOSE=1 MESOS_VERBOSE=1 GLOG_v=1


Thanks,

Vinod Kone


Re: Review Request 24356: Fixed allocator to do allocations per slave rather than per framework.

Posted by Vinod Kone <vi...@gmail.com>.

> On Aug. 6, 2014, 6:37 a.m., Adam B wrote:
> > src/master/hierarchical_allocator_process.hpp, line 729
> > <https://reviews.apache.org/r/24356/diff/1/?file=653338#file653338line729>
> >
> >     In what order are you iterating through the slaves? What order should it be?

good point. right now it's the order in which the keys are stored in hashet which is undefined. maybe one option is to shuffle the slaves during every allocation for some sort of fairness. or we could sort them by available resources on a slave. not sure which one gels well with the DRF algorithm. i'm open to suggestions here.


- Vinod


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24356/#review49700
-----------------------------------------------------------


On Aug. 6, 2014, 12:49 a.m., Vinod Kone wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24356/
> -----------------------------------------------------------
> 
> (Updated Aug. 6, 2014, 12:49 a.m.)
> 
> 
> Review request for mesos and Benjamin Hindman.
> 
> 
> Bugs: MESOS-1119
>     https://issues.apache.org/jira/browse/MESOS-1119
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> The allocator now updates the DRF algorithm after every allocation of a slave's resources. This will make the allocation more granular and fair.
> 
> 
> Diffs
> -----
> 
>   src/master/hierarchical_allocator_process.hpp 35d1579bbd665dfdd238932902ae16880dadeb66 
>   src/tests/allocator_tests.cpp b920533bbc36a154c02b467673bc7a0d50d65bc7 
> 
> Diff: https://reviews.apache.org/r/24356/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> make check GTEST_FILTER="*PerSlaveAllocation*" GTEST_REPEAT=1000 GTEST_BREAK_ON_FAILURE=1 GTEST_VERBOSE=1 MESOS_VERBOSE=1 GLOG_v=1
> 
> 
> Thanks,
> 
> Vinod Kone
> 
>


Re: Review Request 24356: Fixed allocator to do allocations per slave rather than per framework.

Posted by Vinod Kone <vi...@gmail.com>.

> On Aug. 6, 2014, 6:37 a.m., Adam B wrote:
> > src/master/hierarchical_allocator_process.hpp, line 729
> > <https://reviews.apache.org/r/24356/diff/1/?file=653338#file653338line729>
> >
> >     In what order are you iterating through the slaves? What order should it be?
> 
> Vinod Kone wrote:
>     good point. right now it's the order in which the keys are stored in hashet which is undefined. maybe one option is to shuffle the slaves during every allocation for some sort of fairness. or we could sort them by available resources on a slave. not sure which one gels well with the DRF algorithm. i'm open to suggestions here.

briefly chatted w/ benh. decided to go with random shuffling for now with a TODO for a smarter algorithm.


- Vinod


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24356/#review49700
-----------------------------------------------------------


On Aug. 6, 2014, 12:49 a.m., Vinod Kone wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24356/
> -----------------------------------------------------------
> 
> (Updated Aug. 6, 2014, 12:49 a.m.)
> 
> 
> Review request for mesos and Benjamin Hindman.
> 
> 
> Bugs: MESOS-1119
>     https://issues.apache.org/jira/browse/MESOS-1119
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> The allocator now updates the DRF algorithm after every allocation of a slave's resources. This will make the allocation more granular and fair.
> 
> 
> Diffs
> -----
> 
>   src/master/hierarchical_allocator_process.hpp 35d1579bbd665dfdd238932902ae16880dadeb66 
>   src/tests/allocator_tests.cpp b920533bbc36a154c02b467673bc7a0d50d65bc7 
> 
> Diff: https://reviews.apache.org/r/24356/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> make check GTEST_FILTER="*PerSlaveAllocation*" GTEST_REPEAT=1000 GTEST_BREAK_ON_FAILURE=1 GTEST_VERBOSE=1 MESOS_VERBOSE=1 GLOG_v=1
> 
> 
> Thanks,
> 
> Vinod Kone
> 
>


Re: Review Request 24356: Fixed allocator to do allocations per slave rather than per framework.

Posted by Adam B <ad...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24356/#review49700
-----------------------------------------------------------



src/master/hierarchical_allocator_process.hpp
<https://reviews.apache.org/r/24356/#comment86989>

    In what order are you iterating through the slaves? What order should it be?



src/master/hierarchical_allocator_process.hpp
<https://reviews.apache.org/r/24356/#comment86990>

    Why even have an 'allocatedResources' variable if it's always the same as 'unreserved'?


- Adam B


On Aug. 5, 2014, 5:49 p.m., Vinod Kone wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24356/
> -----------------------------------------------------------
> 
> (Updated Aug. 5, 2014, 5:49 p.m.)
> 
> 
> Review request for mesos and Benjamin Hindman.
> 
> 
> Bugs: MESOS-1119
>     https://issues.apache.org/jira/browse/MESOS-1119
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> The allocator now updates the DRF algorithm after every allocation of a slave's resources. This will make the allocation more granular and fair.
> 
> 
> Diffs
> -----
> 
>   src/master/hierarchical_allocator_process.hpp 35d1579bbd665dfdd238932902ae16880dadeb66 
>   src/tests/allocator_tests.cpp b920533bbc36a154c02b467673bc7a0d50d65bc7 
> 
> Diff: https://reviews.apache.org/r/24356/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> make check GTEST_FILTER="*PerSlaveAllocation*" GTEST_REPEAT=1000 GTEST_BREAK_ON_FAILURE=1 GTEST_VERBOSE=1 MESOS_VERBOSE=1 GLOG_v=1
> 
> 
> Thanks,
> 
> Vinod Kone
> 
>