You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@mesos.apache.org by Michael Park <mc...@gmail.com> on 2015/06/20 22:09:44 UTC

Review Request 35702: [WIP] Add /reserve HTTP endpoint to the master.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/
-----------------------------------------------------------

Review request for mesos, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.


Repository: mesos


Description
-------

This is a work in progress. But I'm sharing early to gather some feedback around:

(1) I've added `updateAvailable` to the allocator API. Is this necessary or is there maybe a better way?
(2) To determine the amount of available resources, I used: `available = total - (offered + used)`.
    Is this still correct with oversubscription in the picture?
(3) I'm creating an `Offer.Operation` to perform the necessary updates in the allocator and such.
    Feels weird to use an "offer" operation when there's not an actual offer. Is this fine for now?


Diffs
-----

  include/mesos/master/allocator.hpp 92de1af414321281b00eaa6f129e5e3e2c849448 
  src/Makefile.am e7de0f3d1a5efeaef47d5074defe3b40db94f573 
  src/master/allocator/mesos/allocator.hpp 6cfa04650d91a80211cfbc0809236f9438926c78 
  src/master/allocator/mesos/hierarchical.hpp 646ee8c1c0fb824e1d17150b4e96e6281c65358f 
  src/master/http.cpp b893013ddd052cb58c520ac0328f4a5f0fed862e 
  src/master/master.hpp af83d3e82d2c161b3cc4583e78a8cbbd2f9a4064 
  src/master/master.cpp 0135c155181546d3cb43e9e05bb874af846d928d 
  src/master/validation.hpp 469d6f56c3de28a34177124aae81ce24cb4ad160 
  src/master/validation.cpp 9d128aa1b349b018b8e4a1916434d848761ca051 
  src/tests/mesos.hpp 9157ac079808d2686592e54ea26a26e6a0825ed3 
  src/tests/reserve_tests.cpp PRE-CREATION 

Diff: https://reviews.apache.org/r/35702/diff/


Testing
-------

Added `src/tests/reserve_tests.cpp` which still need to be fleshed out for error cases (missing parameters, unauthorized client, etc).


Thanks,

Michael Park


Re: Review Request 35702: [WIP] Add /reserve HTTP endpoint to the master.

Posted by Mesos ReviewBot <re...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/#review88686
-----------------------------------------------------------


Patch looks great!

Reviews applied: [35702]

All tests passed.

- Mesos ReviewBot


On June 20, 2015, 8:19 p.m., Michael Park wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35702/
> -----------------------------------------------------------
> 
> (Updated June 20, 2015, 8:19 p.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.
> 
> 
> Bugs: MESOS-2600
>     https://issues.apache.org/jira/browse/MESOS-2600
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This is still a work in progress, but I'm sharing to gather some feedback around:
> 
> (1) I've added `updateAvailable` to the allocator API. Is this necessary or is there maybe a better way?
> (2) To determine the amount of available resources, I used: `available = total - (offered + used)`.
>     Is this still correct with oversubscription in the picture?
> (3) I'm creating an `Offer.Operation` to perform the necessary updates in the allocator and such.
>     Feels weird to use an "offer" operation when there's not an actual offer. Is this fine for now?
> 
> 
> Diffs
> -----
> 
>   include/mesos/master/allocator.hpp 92de1af414321281b00eaa6f129e5e3e2c849448 
>   src/Makefile.am e7de0f3d1a5efeaef47d5074defe3b40db94f573 
>   src/master/allocator/mesos/allocator.hpp 6cfa04650d91a80211cfbc0809236f9438926c78 
>   src/master/allocator/mesos/hierarchical.hpp 646ee8c1c0fb824e1d17150b4e96e6281c65358f 
>   src/master/http.cpp b893013ddd052cb58c520ac0328f4a5f0fed862e 
>   src/master/master.hpp af83d3e82d2c161b3cc4583e78a8cbbd2f9a4064 
>   src/master/master.cpp 0135c155181546d3cb43e9e05bb874af846d928d 
>   src/master/validation.hpp 469d6f56c3de28a34177124aae81ce24cb4ad160 
>   src/master/validation.cpp 9d128aa1b349b018b8e4a1916434d848761ca051 
>   src/tests/mesos.hpp 9157ac079808d2686592e54ea26a26e6a0825ed3 
>   src/tests/reserve_tests.cpp PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/35702/diff/
> 
> 
> Testing
> -------
> 
> Added `src/tests/reserve_tests.cpp` which still need to be fleshed out for error cases (missing parameters, unauthorized client, etc).
> 
> 
> Thanks,
> 
> Michael Park
> 
>


Re: Review Request 35702: Added /reserve HTTP endpoint to the master.

Posted by Mesos ReviewBot <re...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/#review89595
-----------------------------------------------------------


Patch looks great!

Reviews applied: [35934, 35939, 35947, 35702]

All tests passed.

- Mesos ReviewBot


On June 26, 2015, 10:56 p.m., Michael Park wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35702/
> -----------------------------------------------------------
> 
> (Updated June 26, 2015, 10:56 p.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.
> 
> 
> Bugs: MESOS-2600
>     https://issues.apache.org/jira/browse/MESOS-2600
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This involved a lot more challenges than I anticipated, I've captured the various approaches and limitations and deal-breakers of those approaches here: [Master Endpoint Implementation Challenges](https://docs.google.com/document/d/1cwVz4aKiCYP9Y4MOwHYZkyaiuEv7fArCye-vPvB2lAI/edit#)
> 
> Key points:
> 
> * This is a stop-gap solution until we shift the offer creation/management logic from the master to the allocator.
> * `updateAvailable` and `updateSlave` are kept separate because
>   (1) `updateAvailable` is allowed to fail whereas `updateSlave` must not.
>   (2) `updateAvailable` returns a `Future` whereas `updateSlave` does not.
>   (3) `updateAvailable` never leaves the allocator in an over-allocated state and must not, whereas `updateSlave` does, and can.
> * The algorithm:
>     * Initially, the master pessimistically assume that what seems like "available" resources will be gone.
>       This is due to the race between the allocator scheduling an `allocate` call to itself vs master's `allocator->updateAvailable` invocation.
>       As such, we first try to satisfy the request only with the offered resources.
>     * We greedily rescind one offer at a time until we've rescinded sufficiently many offers.
>       IMPORTANT: We perform `recoverResources(..., Filters())` rather than `recoverResources(..., None())` so that we can pretty much always win the race against `allocate`.
>                  In the case that we lose, no disaster occurs. We simply fail to satisfy the request.
>     * If we still don't have enough resources after resciding all offers, be optimistic and forward the request to the allocator since there may be available resources to satisfy the request.
>     * If the allocator returns a failure, report the error to the user with `PreconditionFailed`. This could be updated to be `Forbidden`, or `Conflict` maybe as well. We'll pick one eventually.
> 
> This approach is clearly not ideal, since we would prefer to rescind as little offers as possible.
> The challenges of implementing the ideal solution in the current state is described in the document above.
> 
> TODO(mpark): Add more comments and test cases.
> 
> 
> Diffs
> -----
> 
>   src/Makefile.am a064d17a6b62e6e3c8e190135bcc8cbbb0051ed5 
>   src/master/http.cpp 350383362311cfbc830965e1155a8515f0dfb332 
>   src/master/master.hpp af83d3e82d2c161b3cc4583e78a8cbbd2f9a4064 
>   src/master/master.cpp 0782b543b451921d2240958c7ef612a9e30972df 
>   src/master/validation.hpp 469d6f56c3de28a34177124aae81ce24cb4ad160 
>   src/master/validation.cpp 9d128aa1b349b018b8e4a1916434d848761ca051 
>   src/tests/reserve_tests.cpp PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/35702/diff/
> 
> 
> Testing
> -------
> 
> Added `src/tests/reserve_tests.cpp`.
> 
> 
> Thanks,
> 
> Michael Park
> 
>


Re: Review Request 35702: Added /reserve HTTP endpoint to the master.

Posted by Alexander Rukletsov <ru...@gmail.com>.

> On July 13, 2015, 4:46 p.m., Alexander Rukletsov wrote:
> > src/master/http.cpp, line 507
> > <https://reviews.apache.org/r/35702/diff/9/?file=994080#file994080line507>
> >
> >     The code until this line is basically request validation and authorization. Though it's not how we do it now, do you think it makes sense to split the function into smaller logical parts?
> >     
> >     How about something like this:
> >     
> >     ```
> >     Future<Response> Master::Http::reserve(const Request& request) const
> >     {
> >       return Master::Http::reserveValidate();
> >     }
> >     
> >     Future<Response> Master::Http::reserveValidate(const Request& request) const
> >     {
> >       <...>
> >       return Master::Http::reserveAuthorize();
> >     }
> >     
> >     <...>
> >     ```
> 
> Michael Park wrote:
>     Yeah, I think it does make sense to break huge functions down to the smaller logical pieces. I think we can do a more general refactoring for the validation pattern, since they all pretty much do the same thing. But I think we can consider doing that uniformly, outside of this patch. What do you think?

I personally prefer sacrificing consistency, but write new code "right". However, generally we tend to favour consistency over local improvements, so feel free to "fix" the issue by creating a JIRA : ).


> On July 13, 2015, 4:46 p.m., Alexander Rukletsov wrote:
> > src/master/http.cpp, lines 515-516
> > <https://reviews.apache.org/r/35702/diff/9/?file=994080#file994080line515>
> >
> >     It looks like we actually have the role, but it's buried in resources. Do you envision having resources collection with various roles in one request? Maybe it makes sense to add a validation step which ensures there is just one role per request and use it here, also avoiding changes in the `validate()`function.
> 
> Michael Park wrote:
>     I didn't see a good reason to require a "one role per request" condition. The current interface accurately models the fact that an operator does not have a role associated to it like a framework does, and I don't think "avoiding changes in the `validate()` function" should have any influence in deciding how an interface behaves.
>     
>     If we required such a condition, the per-request atomicity guarantee comes with a limitation that it can only be for a single role. While I'm not sure of its value, I'm also not sure what we gain by requiring it from the user's perspective?

I think I'm missing something, my understanding is that each dynamic reservation is associated with a role, regardless, who issues a reservation request. I don't think limiting users to one role per request gives them any benefit, but it looks like we can be closer to framework-issued request if we do so. What am I missing?


- Alexander


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/#review91472
-----------------------------------------------------------


On July 27, 2015, 11:30 p.m., Michael Park wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35702/
> -----------------------------------------------------------
> 
> (Updated July 27, 2015, 11:30 p.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.
> 
> 
> Bugs: MESOS-2600
>     https://issues.apache.org/jira/browse/MESOS-2600
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This involved a lot more challenges than I anticipated, I've captured the various approaches and limitations and deal-breakers of those approaches here: [Master Endpoint Implementation Challenges](https://docs.google.com/document/d/1cwVz4aKiCYP9Y4MOwHYZkyaiuEv7fArCye-vPvB2lAI/edit#)
> 
> Key points:
> 
> * This is a stop-gap solution until we shift the offer creation/management logic from the master to the allocator.
> * `updateAvailable` and `updateSlave` are kept separate because
>   (1) `updateAvailable` is allowed to fail whereas `updateSlave` must not.
>   (2) `updateAvailable` returns a `Future` whereas `updateSlave` does not.
>   (3) `updateAvailable` never leaves the allocator in an over-allocated state and must not, whereas `updateSlave` does, and can.
> * The algorithm:
>     * Initially, the master pessimistically assume that what seems like "available" resources will be gone.
>       This is due to the race between the allocator scheduling an `allocate` call to itself vs master's `allocator->updateAvailable` invocation.
>       As such, we first try to satisfy the request only with the offered resources.
>     * We greedily rescind one offer at a time until we've rescinded sufficiently many offers.
>       IMPORTANT: We perform `recoverResources(..., Filters())` rather than `recoverResources(..., None())` so that we can pretty much always win the race against `allocate`.
>                  In the case that we lose, no disaster occurs. We simply fail to satisfy the request.
>     * If we still don't have enough resources after resciding all offers, be optimistic and forward the request to the allocator since there may be available resources to satisfy the request.
>     * If the allocator returns a failure, report the error to the user with `PreconditionFailed`. This could be updated to be `Forbidden`, or `Conflict` maybe as well. We'll pick one eventually.
> 
> This approach is clearly not ideal, since we would prefer to rescind as little offers as possible.
> The challenges of implementing the ideal solution in the current state is described in the document above.
> 
> TODO(mpark): Add more comments and test cases.
> 
> 
> Diffs
> -----
> 
>   src/master/http.cpp 3a1598fad4db03e5f62fd4a6bd26b2bedeee4070 
>   src/master/master.hpp 827d0d599912b2936beb9615610f627f6c9a2d43 
>   src/master/master.cpp 5b5e3c37d4433c8524db267866aebc0a35a181f1 
>   src/master/validation.hpp 469d6f56c3de28a34177124aae81ce24cb4ad160 
>   src/master/validation.cpp 9d128aa1b349b018b8e4a1916434d848761ca051 
> 
> Diff: https://reviews.apache.org/r/35702/diff/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Michael Park
> 
>


Re: Review Request 35702: Added /reserve HTTP endpoint to the master.

Posted by Michael Park <mc...@gmail.com>.

> On July 13, 2015, 4:46 p.m., Alexander Rukletsov wrote:
> > A high level question: do you think rescinding offers is a big deal for now?

I don't believe it's a big deal for now because frameworks need to deal with rescinded offers regardless, and I imagine the frequency of operators using these endpoints will probably be low. Of course we'll monitor whether these expectations are true as we go forward, but I think for now it's ok.


> On July 13, 2015, 4:46 p.m., Alexander Rukletsov wrote:
> > src/master/http.cpp, line 507
> > <https://reviews.apache.org/r/35702/diff/9/?file=994080#file994080line507>
> >
> >     The code until this line is basically request validation and authorization. Though it's not how we do it now, do you think it makes sense to split the function into smaller logical parts?
> >     
> >     How about something like this:
> >     
> >     ```
> >     Future<Response> Master::Http::reserve(const Request& request) const
> >     {
> >       return Master::Http::reserveValidate();
> >     }
> >     
> >     Future<Response> Master::Http::reserveValidate(const Request& request) const
> >     {
> >       <...>
> >       return Master::Http::reserveAuthorize();
> >     }
> >     
> >     <...>
> >     ```

Yeah, I think it does make sense to break huge functions down to the smaller logical pieces. I think we can do a more general refactoring for the validation pattern, since they all pretty much do the same thing. But I think we can consider doing that uniformly, outside of this patch. What do you think?


> On July 13, 2015, 4:46 p.m., Alexander Rukletsov wrote:
> > src/master/http.cpp, lines 515-516
> > <https://reviews.apache.org/r/35702/diff/9/?file=994080#file994080line515>
> >
> >     It looks like we actually have the role, but it's buried in resources. Do you envision having resources collection with various roles in one request? Maybe it makes sense to add a validation step which ensures there is just one role per request and use it here, also avoiding changes in the `validate()`function.

I didn't see a good reason to require a "one role per request" condition. The current interface accurately models the fact that an operator does not have a role associated to it like a framework does, and I don't think "avoiding changes in the `validate()` function" should have any influence in deciding how an interface behaves.

If we required such a condition, the per-request atomicity guarantee comes with a limitation that it can only be for a single role. While I'm not sure of its value, I'm also not sure what we gain by requiring it from the user's perspective?


- Michael


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/#review91472
-----------------------------------------------------------


On June 28, 2015, 8:36 a.m., Michael Park wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35702/
> -----------------------------------------------------------
> 
> (Updated June 28, 2015, 8:36 a.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.
> 
> 
> Bugs: MESOS-2600
>     https://issues.apache.org/jira/browse/MESOS-2600
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This involved a lot more challenges than I anticipated, I've captured the various approaches and limitations and deal-breakers of those approaches here: [Master Endpoint Implementation Challenges](https://docs.google.com/document/d/1cwVz4aKiCYP9Y4MOwHYZkyaiuEv7fArCye-vPvB2lAI/edit#)
> 
> Key points:
> 
> * This is a stop-gap solution until we shift the offer creation/management logic from the master to the allocator.
> * `updateAvailable` and `updateSlave` are kept separate because
>   (1) `updateAvailable` is allowed to fail whereas `updateSlave` must not.
>   (2) `updateAvailable` returns a `Future` whereas `updateSlave` does not.
>   (3) `updateAvailable` never leaves the allocator in an over-allocated state and must not, whereas `updateSlave` does, and can.
> * The algorithm:
>     * Initially, the master pessimistically assume that what seems like "available" resources will be gone.
>       This is due to the race between the allocator scheduling an `allocate` call to itself vs master's `allocator->updateAvailable` invocation.
>       As such, we first try to satisfy the request only with the offered resources.
>     * We greedily rescind one offer at a time until we've rescinded sufficiently many offers.
>       IMPORTANT: We perform `recoverResources(..., Filters())` rather than `recoverResources(..., None())` so that we can pretty much always win the race against `allocate`.
>                  In the case that we lose, no disaster occurs. We simply fail to satisfy the request.
>     * If we still don't have enough resources after resciding all offers, be optimistic and forward the request to the allocator since there may be available resources to satisfy the request.
>     * If the allocator returns a failure, report the error to the user with `PreconditionFailed`. This could be updated to be `Forbidden`, or `Conflict` maybe as well. We'll pick one eventually.
> 
> This approach is clearly not ideal, since we would prefer to rescind as little offers as possible.
> The challenges of implementing the ideal solution in the current state is described in the document above.
> 
> TODO(mpark): Add more comments and test cases.
> 
> 
> Diffs
> -----
> 
>   src/master/http.cpp 350383362311cfbc830965e1155a8515f0dfb332 
>   src/master/master.hpp af83d3e82d2c161b3cc4583e78a8cbbd2f9a4064 
>   src/master/master.cpp 0782b543b451921d2240958c7ef612a9e30972df 
>   src/master/validation.hpp 469d6f56c3de28a34177124aae81ce24cb4ad160 
>   src/master/validation.cpp 9d128aa1b349b018b8e4a1916434d848761ca051 
> 
> Diff: https://reviews.apache.org/r/35702/diff/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Michael Park
> 
>


Re: Review Request 35702: Added /reserve HTTP endpoint to the master.

Posted by Michael Park <mc...@gmail.com>.

> On July 13, 2015, 4:46 p.m., Alexander Rukletsov wrote:
> > src/master/http.cpp, line 507
> > <https://reviews.apache.org/r/35702/diff/9/?file=994080#file994080line507>
> >
> >     The code until this line is basically request validation and authorization. Though it's not how we do it now, do you think it makes sense to split the function into smaller logical parts?
> >     
> >     How about something like this:
> >     
> >     ```
> >     Future<Response> Master::Http::reserve(const Request& request) const
> >     {
> >       return Master::Http::reserveValidate();
> >     }
> >     
> >     Future<Response> Master::Http::reserveValidate(const Request& request) const
> >     {
> >       <...>
> >       return Master::Http::reserveAuthorize();
> >     }
> >     
> >     <...>
> >     ```
> 
> Michael Park wrote:
>     Yeah, I think it does make sense to break huge functions down to the smaller logical pieces. I think we can do a more general refactoring for the validation pattern, since they all pretty much do the same thing. But I think we can consider doing that uniformly, outside of this patch. What do you think?
> 
> Alexander Rukletsov wrote:
>     I personally prefer sacrificing consistency, but write new code "right". However, generally we tend to favour consistency over local improvements, so feel free to "fix" the issue by creating a JIRA : ).

I've filed a JIRA ticket for this here: https://issues.apache.org/jira/browse/MESOS-3186


- Michael


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/#review91472
-----------------------------------------------------------


On July 31, 2015, 9:56 p.m., Michael Park wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35702/
> -----------------------------------------------------------
> 
> (Updated July 31, 2015, 9:56 p.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.
> 
> 
> Bugs: MESOS-2600
>     https://issues.apache.org/jira/browse/MESOS-2600
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This involved a lot more challenges than I anticipated, I've captured the various approaches and limitations and deal-breakers of those approaches here: [Master Endpoint Implementation Challenges](https://docs.google.com/document/d/1cwVz4aKiCYP9Y4MOwHYZkyaiuEv7fArCye-vPvB2lAI/edit#)
> 
> Key points:
> 
> * This is a stop-gap solution until we shift the offer creation/management logic from the master to the allocator.
> * `updateAvailable` and `updateSlave` are kept separate because
>   (1) `updateAvailable` is allowed to fail whereas `updateSlave` must not.
>   (2) `updateAvailable` returns a `Future` whereas `updateSlave` does not.
>   (3) `updateAvailable` never leaves the allocator in an over-allocated state and must not, whereas `updateSlave` does, and can.
> * The algorithm:
>     * Initially, the master pessimistically assume that what seems like "available" resources will be gone.
>       This is due to the race between the allocator scheduling an `allocate` call to itself vs master's `allocator->updateAvailable` invocation.
>       As such, we first try to satisfy the request only with the offered resources.
>     * We greedily rescind one offer at a time until we've rescinded sufficiently many offers.
>       IMPORTANT: We perform `recoverResources(..., Filters())` rather than `recoverResources(..., None())` so that we can pretty much always win the race against `allocate`.
>                  In the case that we lose, no disaster occurs. We simply fail to satisfy the request.
>     * If we still don't have enough resources after resciding all offers, be optimistic and forward the request to the allocator since there may be available resources to satisfy the request.
>     * If the allocator returns a failure, report the error to the user with `PreconditionFailed`. This could be updated to be `Forbidden`, or `Conflict` maybe as well. We'll pick one eventually.
> 
> This approach is clearly not ideal, since we would prefer to rescind as little offers as possible.
> The challenges of implementing the ideal solution in the current state is described in the document above.
> 
> TODO(mpark): Add more comments and test cases.
> 
> 
> Diffs
> -----
> 
>   src/master/http.cpp 3772e39015a22655dcad00ad844dc5ddc90db43f 
>   src/master/master.hpp ea18c4e0bb0743747401b9cd5ea14ae9b56ae3cc 
>   src/master/master.cpp 351a3c2b5f551ad065682cea601d2436258e4544 
>   src/master/validation.hpp 43b8d84556e7f0a891dddf6185bbce7ca50b360a 
>   src/master/validation.cpp ffb7bf07b8a40d6e14f922eabcf46045462498b5 
> 
> Diff: https://reviews.apache.org/r/35702/diff/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Michael Park
> 
>


Re: Review Request 35702: Added /reserve HTTP endpoint to the master.

Posted by Michael Park <mc...@gmail.com>.

> On July 13, 2015, 4:46 p.m., Alexander Rukletsov wrote:
> > src/master/http.cpp, lines 515-516
> > <https://reviews.apache.org/r/35702/diff/9/?file=994080#file994080line515>
> >
> >     It looks like we actually have the role, but it's buried in resources. Do you envision having resources collection with various roles in one request? Maybe it makes sense to add a validation step which ensures there is just one role per request and use it here, also avoiding changes in the `validate()`function.
> 
> Michael Park wrote:
>     I didn't see a good reason to require a "one role per request" condition. The current interface accurately models the fact that an operator does not have a role associated to it like a framework does, and I don't think "avoiding changes in the `validate()` function" should have any influence in deciding how an interface behaves.
>     
>     If we required such a condition, the per-request atomicity guarantee comes with a limitation that it can only be for a single role. While I'm not sure of its value, I'm also not sure what we gain by requiring it from the user's perspective?
> 
> Alexander Rukletsov wrote:
>     I think I'm missing something, my understanding is that each dynamic reservation is associated with a role, regardless, who issues a reservation request. I don't think limiting users to one role per request gives them any benefit, but it looks like we can be closer to framework-issued request if we do so. What am I missing?
> 
> Michael Park wrote:
>     Your understanding is correct. Aside from the resources being associated with a role, frameworks are also associated with a role. We check that every resource is being reserved for the framework's role because a framework is associated with a role and it wouldn't make sense to allow a framework to reserve resources for a role that does not match its role. On the contrary, the same rule doesn't apply for an operator since there's no such thing as an "operator's role".
> 
> Alexander Rukletsov wrote:
>     That's right. Let me try to reformulate my proposal. If we require an operator to reserve resources for one role per request, it can be interpreted as an "operator role". An advantage here is that `validate()` method doesn't need to be changed, while a disadvantage is that this approach is a bit artificial and can lead to confusion. What do you think?

As I mentioned in my initial comment, I think that the `validate()` function is an implementation detail and shouldn't be a source of motivation in deciding how an API should behave.


- Michael


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/#review91472
-----------------------------------------------------------


On July 28, 2015, 9:03 p.m., Michael Park wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35702/
> -----------------------------------------------------------
> 
> (Updated July 28, 2015, 9:03 p.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.
> 
> 
> Bugs: MESOS-2600
>     https://issues.apache.org/jira/browse/MESOS-2600
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This involved a lot more challenges than I anticipated, I've captured the various approaches and limitations and deal-breakers of those approaches here: [Master Endpoint Implementation Challenges](https://docs.google.com/document/d/1cwVz4aKiCYP9Y4MOwHYZkyaiuEv7fArCye-vPvB2lAI/edit#)
> 
> Key points:
> 
> * This is a stop-gap solution until we shift the offer creation/management logic from the master to the allocator.
> * `updateAvailable` and `updateSlave` are kept separate because
>   (1) `updateAvailable` is allowed to fail whereas `updateSlave` must not.
>   (2) `updateAvailable` returns a `Future` whereas `updateSlave` does not.
>   (3) `updateAvailable` never leaves the allocator in an over-allocated state and must not, whereas `updateSlave` does, and can.
> * The algorithm:
>     * Initially, the master pessimistically assume that what seems like "available" resources will be gone.
>       This is due to the race between the allocator scheduling an `allocate` call to itself vs master's `allocator->updateAvailable` invocation.
>       As such, we first try to satisfy the request only with the offered resources.
>     * We greedily rescind one offer at a time until we've rescinded sufficiently many offers.
>       IMPORTANT: We perform `recoverResources(..., Filters())` rather than `recoverResources(..., None())` so that we can pretty much always win the race against `allocate`.
>                  In the case that we lose, no disaster occurs. We simply fail to satisfy the request.
>     * If we still don't have enough resources after resciding all offers, be optimistic and forward the request to the allocator since there may be available resources to satisfy the request.
>     * If the allocator returns a failure, report the error to the user with `PreconditionFailed`. This could be updated to be `Forbidden`, or `Conflict` maybe as well. We'll pick one eventually.
> 
> This approach is clearly not ideal, since we would prefer to rescind as little offers as possible.
> The challenges of implementing the ideal solution in the current state is described in the document above.
> 
> TODO(mpark): Add more comments and test cases.
> 
> 
> Diffs
> -----
> 
>   src/master/http.cpp 3a1598fad4db03e5f62fd4a6bd26b2bedeee4070 
>   src/master/master.hpp 827d0d599912b2936beb9615610f627f6c9a2d43 
>   src/master/master.cpp 5b5e3c37d4433c8524db267866aebc0a35a181f1 
>   src/master/validation.hpp 469d6f56c3de28a34177124aae81ce24cb4ad160 
>   src/master/validation.cpp 9d128aa1b349b018b8e4a1916434d848761ca051 
> 
> Diff: https://reviews.apache.org/r/35702/diff/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Michael Park
> 
>


Re: Review Request 35702: Added /reserve HTTP endpoint to the master.

Posted by Michael Park <mc...@gmail.com>.

> On July 13, 2015, 4:46 p.m., Alexander Rukletsov wrote:
> > src/master/http.cpp, lines 515-516
> > <https://reviews.apache.org/r/35702/diff/9/?file=994080#file994080line515>
> >
> >     It looks like we actually have the role, but it's buried in resources. Do you envision having resources collection with various roles in one request? Maybe it makes sense to add a validation step which ensures there is just one role per request and use it here, also avoiding changes in the `validate()`function.
> 
> Michael Park wrote:
>     I didn't see a good reason to require a "one role per request" condition. The current interface accurately models the fact that an operator does not have a role associated to it like a framework does, and I don't think "avoiding changes in the `validate()` function" should have any influence in deciding how an interface behaves.
>     
>     If we required such a condition, the per-request atomicity guarantee comes with a limitation that it can only be for a single role. While I'm not sure of its value, I'm also not sure what we gain by requiring it from the user's perspective?
> 
> Alexander Rukletsov wrote:
>     I think I'm missing something, my understanding is that each dynamic reservation is associated with a role, regardless, who issues a reservation request. I don't think limiting users to one role per request gives them any benefit, but it looks like we can be closer to framework-issued request if we do so. What am I missing?

Your understanding is correct. Aside from the resources being associated with a role, frameworks are also associated with a role. We check that every resource is being reserved for the framework's role because a framework is associated with a role and it wouldn't make sense to allow a framework to reserve resources for a role that does not match its role. On the contrary, the same rule doesn't apply for an operator since there's no such thing as an "operator's role".


- Michael


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/#review91472
-----------------------------------------------------------


On July 27, 2015, 11:30 p.m., Michael Park wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35702/
> -----------------------------------------------------------
> 
> (Updated July 27, 2015, 11:30 p.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.
> 
> 
> Bugs: MESOS-2600
>     https://issues.apache.org/jira/browse/MESOS-2600
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This involved a lot more challenges than I anticipated, I've captured the various approaches and limitations and deal-breakers of those approaches here: [Master Endpoint Implementation Challenges](https://docs.google.com/document/d/1cwVz4aKiCYP9Y4MOwHYZkyaiuEv7fArCye-vPvB2lAI/edit#)
> 
> Key points:
> 
> * This is a stop-gap solution until we shift the offer creation/management logic from the master to the allocator.
> * `updateAvailable` and `updateSlave` are kept separate because
>   (1) `updateAvailable` is allowed to fail whereas `updateSlave` must not.
>   (2) `updateAvailable` returns a `Future` whereas `updateSlave` does not.
>   (3) `updateAvailable` never leaves the allocator in an over-allocated state and must not, whereas `updateSlave` does, and can.
> * The algorithm:
>     * Initially, the master pessimistically assume that what seems like "available" resources will be gone.
>       This is due to the race between the allocator scheduling an `allocate` call to itself vs master's `allocator->updateAvailable` invocation.
>       As such, we first try to satisfy the request only with the offered resources.
>     * We greedily rescind one offer at a time until we've rescinded sufficiently many offers.
>       IMPORTANT: We perform `recoverResources(..., Filters())` rather than `recoverResources(..., None())` so that we can pretty much always win the race against `allocate`.
>                  In the case that we lose, no disaster occurs. We simply fail to satisfy the request.
>     * If we still don't have enough resources after resciding all offers, be optimistic and forward the request to the allocator since there may be available resources to satisfy the request.
>     * If the allocator returns a failure, report the error to the user with `PreconditionFailed`. This could be updated to be `Forbidden`, or `Conflict` maybe as well. We'll pick one eventually.
> 
> This approach is clearly not ideal, since we would prefer to rescind as little offers as possible.
> The challenges of implementing the ideal solution in the current state is described in the document above.
> 
> TODO(mpark): Add more comments and test cases.
> 
> 
> Diffs
> -----
> 
>   src/master/http.cpp 3a1598fad4db03e5f62fd4a6bd26b2bedeee4070 
>   src/master/master.hpp 827d0d599912b2936beb9615610f627f6c9a2d43 
>   src/master/master.cpp 5b5e3c37d4433c8524db267866aebc0a35a181f1 
>   src/master/validation.hpp 469d6f56c3de28a34177124aae81ce24cb4ad160 
>   src/master/validation.cpp 9d128aa1b349b018b8e4a1916434d848761ca051 
> 
> Diff: https://reviews.apache.org/r/35702/diff/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Michael Park
> 
>


Re: Review Request 35702: Added /reserve HTTP endpoint to the master.

Posted by Alexander Rukletsov <ru...@gmail.com>.

> On July 13, 2015, 4:46 p.m., Alexander Rukletsov wrote:
> > src/master/http.cpp, lines 515-516
> > <https://reviews.apache.org/r/35702/diff/9/?file=994080#file994080line515>
> >
> >     It looks like we actually have the role, but it's buried in resources. Do you envision having resources collection with various roles in one request? Maybe it makes sense to add a validation step which ensures there is just one role per request and use it here, also avoiding changes in the `validate()`function.
> 
> Michael Park wrote:
>     I didn't see a good reason to require a "one role per request" condition. The current interface accurately models the fact that an operator does not have a role associated to it like a framework does, and I don't think "avoiding changes in the `validate()` function" should have any influence in deciding how an interface behaves.
>     
>     If we required such a condition, the per-request atomicity guarantee comes with a limitation that it can only be for a single role. While I'm not sure of its value, I'm also not sure what we gain by requiring it from the user's perspective?
> 
> Alexander Rukletsov wrote:
>     I think I'm missing something, my understanding is that each dynamic reservation is associated with a role, regardless, who issues a reservation request. I don't think limiting users to one role per request gives them any benefit, but it looks like we can be closer to framework-issued request if we do so. What am I missing?
> 
> Michael Park wrote:
>     Your understanding is correct. Aside from the resources being associated with a role, frameworks are also associated with a role. We check that every resource is being reserved for the framework's role because a framework is associated with a role and it wouldn't make sense to allow a framework to reserve resources for a role that does not match its role. On the contrary, the same rule doesn't apply for an operator since there's no such thing as an "operator's role".

That's right. Let me try to reformulate my proposal. If we require an operator to reserve resources for one role per request, it can be interpreted as an "operator role". An advantage here is that `validate()` method doesn't need to be changed, while a disadvantage is that this approach is a bit artificial and can lead to confusion. What do you think?


- Alexander


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/#review91472
-----------------------------------------------------------


On July 28, 2015, 9:03 p.m., Michael Park wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35702/
> -----------------------------------------------------------
> 
> (Updated July 28, 2015, 9:03 p.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.
> 
> 
> Bugs: MESOS-2600
>     https://issues.apache.org/jira/browse/MESOS-2600
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This involved a lot more challenges than I anticipated, I've captured the various approaches and limitations and deal-breakers of those approaches here: [Master Endpoint Implementation Challenges](https://docs.google.com/document/d/1cwVz4aKiCYP9Y4MOwHYZkyaiuEv7fArCye-vPvB2lAI/edit#)
> 
> Key points:
> 
> * This is a stop-gap solution until we shift the offer creation/management logic from the master to the allocator.
> * `updateAvailable` and `updateSlave` are kept separate because
>   (1) `updateAvailable` is allowed to fail whereas `updateSlave` must not.
>   (2) `updateAvailable` returns a `Future` whereas `updateSlave` does not.
>   (3) `updateAvailable` never leaves the allocator in an over-allocated state and must not, whereas `updateSlave` does, and can.
> * The algorithm:
>     * Initially, the master pessimistically assume that what seems like "available" resources will be gone.
>       This is due to the race between the allocator scheduling an `allocate` call to itself vs master's `allocator->updateAvailable` invocation.
>       As such, we first try to satisfy the request only with the offered resources.
>     * We greedily rescind one offer at a time until we've rescinded sufficiently many offers.
>       IMPORTANT: We perform `recoverResources(..., Filters())` rather than `recoverResources(..., None())` so that we can pretty much always win the race against `allocate`.
>                  In the case that we lose, no disaster occurs. We simply fail to satisfy the request.
>     * If we still don't have enough resources after resciding all offers, be optimistic and forward the request to the allocator since there may be available resources to satisfy the request.
>     * If the allocator returns a failure, report the error to the user with `PreconditionFailed`. This could be updated to be `Forbidden`, or `Conflict` maybe as well. We'll pick one eventually.
> 
> This approach is clearly not ideal, since we would prefer to rescind as little offers as possible.
> The challenges of implementing the ideal solution in the current state is described in the document above.
> 
> TODO(mpark): Add more comments and test cases.
> 
> 
> Diffs
> -----
> 
>   src/master/http.cpp 3a1598fad4db03e5f62fd4a6bd26b2bedeee4070 
>   src/master/master.hpp 827d0d599912b2936beb9615610f627f6c9a2d43 
>   src/master/master.cpp 5b5e3c37d4433c8524db267866aebc0a35a181f1 
>   src/master/validation.hpp 469d6f56c3de28a34177124aae81ce24cb4ad160 
>   src/master/validation.cpp 9d128aa1b349b018b8e4a1916434d848761ca051 
> 
> Diff: https://reviews.apache.org/r/35702/diff/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Michael Park
> 
>


Re: Review Request 35702: Added /reserve HTTP endpoint to the master.

Posted by Alexander Rukletsov <ru...@gmail.com>.

> On July 13, 2015, 4:46 p.m., Alexander Rukletsov wrote:
> > src/master/http.cpp, lines 515-516
> > <https://reviews.apache.org/r/35702/diff/9/?file=994080#file994080line515>
> >
> >     It looks like we actually have the role, but it's buried in resources. Do you envision having resources collection with various roles in one request? Maybe it makes sense to add a validation step which ensures there is just one role per request and use it here, also avoiding changes in the `validate()`function.
> 
> Michael Park wrote:
>     I didn't see a good reason to require a "one role per request" condition. The current interface accurately models the fact that an operator does not have a role associated to it like a framework does, and I don't think "avoiding changes in the `validate()` function" should have any influence in deciding how an interface behaves.
>     
>     If we required such a condition, the per-request atomicity guarantee comes with a limitation that it can only be for a single role. While I'm not sure of its value, I'm also not sure what we gain by requiring it from the user's perspective?
> 
> Alexander Rukletsov wrote:
>     I think I'm missing something, my understanding is that each dynamic reservation is associated with a role, regardless, who issues a reservation request. I don't think limiting users to one role per request gives them any benefit, but it looks like we can be closer to framework-issued request if we do so. What am I missing?
> 
> Michael Park wrote:
>     Your understanding is correct. Aside from the resources being associated with a role, frameworks are also associated with a role. We check that every resource is being reserved for the framework's role because a framework is associated with a role and it wouldn't make sense to allow a framework to reserve resources for a role that does not match its role. On the contrary, the same rule doesn't apply for an operator since there's no such thing as an "operator's role".
> 
> Alexander Rukletsov wrote:
>     That's right. Let me try to reformulate my proposal. If we require an operator to reserve resources for one role per request, it can be interpreted as an "operator role". An advantage here is that `validate()` method doesn't need to be changed, while a disadvantage is that this approach is a bit artificial and can lead to confusion. What do you think?
> 
> Michael Park wrote:
>     As I mentioned in my initial comment, I think that the `validate()` function is an implementation detail and shouldn't be a source of motivation in deciding how an API should behave.

Fair enough, feel free to drop.


- Alexander


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/#review91472
-----------------------------------------------------------


On July 28, 2015, 9:03 p.m., Michael Park wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35702/
> -----------------------------------------------------------
> 
> (Updated July 28, 2015, 9:03 p.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.
> 
> 
> Bugs: MESOS-2600
>     https://issues.apache.org/jira/browse/MESOS-2600
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This involved a lot more challenges than I anticipated, I've captured the various approaches and limitations and deal-breakers of those approaches here: [Master Endpoint Implementation Challenges](https://docs.google.com/document/d/1cwVz4aKiCYP9Y4MOwHYZkyaiuEv7fArCye-vPvB2lAI/edit#)
> 
> Key points:
> 
> * This is a stop-gap solution until we shift the offer creation/management logic from the master to the allocator.
> * `updateAvailable` and `updateSlave` are kept separate because
>   (1) `updateAvailable` is allowed to fail whereas `updateSlave` must not.
>   (2) `updateAvailable` returns a `Future` whereas `updateSlave` does not.
>   (3) `updateAvailable` never leaves the allocator in an over-allocated state and must not, whereas `updateSlave` does, and can.
> * The algorithm:
>     * Initially, the master pessimistically assume that what seems like "available" resources will be gone.
>       This is due to the race between the allocator scheduling an `allocate` call to itself vs master's `allocator->updateAvailable` invocation.
>       As such, we first try to satisfy the request only with the offered resources.
>     * We greedily rescind one offer at a time until we've rescinded sufficiently many offers.
>       IMPORTANT: We perform `recoverResources(..., Filters())` rather than `recoverResources(..., None())` so that we can pretty much always win the race against `allocate`.
>                  In the case that we lose, no disaster occurs. We simply fail to satisfy the request.
>     * If we still don't have enough resources after resciding all offers, be optimistic and forward the request to the allocator since there may be available resources to satisfy the request.
>     * If the allocator returns a failure, report the error to the user with `PreconditionFailed`. This could be updated to be `Forbidden`, or `Conflict` maybe as well. We'll pick one eventually.
> 
> This approach is clearly not ideal, since we would prefer to rescind as little offers as possible.
> The challenges of implementing the ideal solution in the current state is described in the document above.
> 
> TODO(mpark): Add more comments and test cases.
> 
> 
> Diffs
> -----
> 
>   src/master/http.cpp 3a1598fad4db03e5f62fd4a6bd26b2bedeee4070 
>   src/master/master.hpp 827d0d599912b2936beb9615610f627f6c9a2d43 
>   src/master/master.cpp 5b5e3c37d4433c8524db267866aebc0a35a181f1 
>   src/master/validation.hpp 469d6f56c3de28a34177124aae81ce24cb4ad160 
>   src/master/validation.cpp 9d128aa1b349b018b8e4a1916434d848761ca051 
> 
> Diff: https://reviews.apache.org/r/35702/diff/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Michael Park
> 
>


Re: Review Request 35702: Added /reserve HTTP endpoint to the master.

Posted by Alexander Rukletsov <ru...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/#review91472
-----------------------------------------------------------


A high level question: do you think rescinding offers is a big deal for now?


src/master/http.cpp (line 507)
<https://reviews.apache.org/r/35702/#comment144877>

    The code until this line is basically request validation and authorization. Though it's not how we do it now, do you think it makes sense to split the function into smaller logical parts?
    
    How about something like this:
    
    ```
    Future<Response> Master::Http::reserve(const Request& request) const
    {
      return Master::Http::reserveValidate();
    }
    
    Future<Response> Master::Http::reserveValidate(const Request& request) const
    {
      <...>
      return Master::Http::reserveAuthorize();
    }
    
    <...>
    ```



src/master/http.cpp (lines 515 - 516)
<https://reviews.apache.org/r/35702/#comment144878>

    It looks like we actually have the role, but it's buried in resources. Do you envision having resources collection with various roles in one request? Maybe it makes sense to add a validation step which ensures there is just one role per request and use it here, also avoiding changes in the `validate()`function.



src/master/http.cpp (lines 523 - 524)
<https://reviews.apache.org/r/35702/#comment144880>

    Let's leave a comment here, that we do want to defer the decision completely to an allocator, but can do it currently because offers are issued and handled by the master.



src/master/http.cpp (lines 541 - 545)
<https://reviews.apache.org/r/35702/#comment144879>

    We basically defer the decision whether request can be granted or not to an allocator (up to rescinding). Let's capture it in a comment!



src/master/master.hpp (line 962)
<https://reviews.apache.org/r/35702/#comment144851>

    I see that you extracted this function for code reusal, but let's document it. You may want to add a comment for `applyResourceOperation()`and update the comment for `applyOfferOperation()` as well.


- Alexander Rukletsov


On June 28, 2015, 8:36 a.m., Michael Park wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35702/
> -----------------------------------------------------------
> 
> (Updated June 28, 2015, 8:36 a.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.
> 
> 
> Bugs: MESOS-2600
>     https://issues.apache.org/jira/browse/MESOS-2600
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This involved a lot more challenges than I anticipated, I've captured the various approaches and limitations and deal-breakers of those approaches here: [Master Endpoint Implementation Challenges](https://docs.google.com/document/d/1cwVz4aKiCYP9Y4MOwHYZkyaiuEv7fArCye-vPvB2lAI/edit#)
> 
> Key points:
> 
> * This is a stop-gap solution until we shift the offer creation/management logic from the master to the allocator.
> * `updateAvailable` and `updateSlave` are kept separate because
>   (1) `updateAvailable` is allowed to fail whereas `updateSlave` must not.
>   (2) `updateAvailable` returns a `Future` whereas `updateSlave` does not.
>   (3) `updateAvailable` never leaves the allocator in an over-allocated state and must not, whereas `updateSlave` does, and can.
> * The algorithm:
>     * Initially, the master pessimistically assume that what seems like "available" resources will be gone.
>       This is due to the race between the allocator scheduling an `allocate` call to itself vs master's `allocator->updateAvailable` invocation.
>       As such, we first try to satisfy the request only with the offered resources.
>     * We greedily rescind one offer at a time until we've rescinded sufficiently many offers.
>       IMPORTANT: We perform `recoverResources(..., Filters())` rather than `recoverResources(..., None())` so that we can pretty much always win the race against `allocate`.
>                  In the case that we lose, no disaster occurs. We simply fail to satisfy the request.
>     * If we still don't have enough resources after resciding all offers, be optimistic and forward the request to the allocator since there may be available resources to satisfy the request.
>     * If the allocator returns a failure, report the error to the user with `PreconditionFailed`. This could be updated to be `Forbidden`, or `Conflict` maybe as well. We'll pick one eventually.
> 
> This approach is clearly not ideal, since we would prefer to rescind as little offers as possible.
> The challenges of implementing the ideal solution in the current state is described in the document above.
> 
> TODO(mpark): Add more comments and test cases.
> 
> 
> Diffs
> -----
> 
>   src/master/http.cpp 350383362311cfbc830965e1155a8515f0dfb332 
>   src/master/master.hpp af83d3e82d2c161b3cc4583e78a8cbbd2f9a4064 
>   src/master/master.cpp 0782b543b451921d2240958c7ef612a9e30972df 
>   src/master/validation.hpp 469d6f56c3de28a34177124aae81ce24cb4ad160 
>   src/master/validation.cpp 9d128aa1b349b018b8e4a1916434d848761ca051 
> 
> Diff: https://reviews.apache.org/r/35702/diff/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Michael Park
> 
>


Re: Review Request 35702: Added /reserve HTTP endpoint to the master.

Posted by Guangya Liu <gy...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/#review97881
-----------------------------------------------------------

Ship it!


Ship It!

- Guangya Liu


On 九月 4, 2015, 11:19 p.m., Michael Park wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35702/
> -----------------------------------------------------------
> 
> (Updated 九月 4, 2015, 11:19 p.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.
> 
> 
> Bugs: MESOS-2600
>     https://issues.apache.org/jira/browse/MESOS-2600
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This involved a lot more challenges than I anticipated, I've captured the various approaches and limitations and deal-breakers of those approaches here: [Master Endpoint Implementation Challenges](https://docs.google.com/document/d/1cwVz4aKiCYP9Y4MOwHYZkyaiuEv7fArCye-vPvB2lAI/edit#)
> 
> Key points:
> 
> * This is a stop-gap solution until we shift the offer creation/management logic from the master to the allocator.
> * `updateAvailable` and `updateSlave` are kept separate because
>   (1) `updateAvailable` is allowed to fail whereas `updateSlave` must not.
>   (2) `updateAvailable` returns a `Future` whereas `updateSlave` does not.
>   (3) `updateAvailable` never leaves the allocator in an over-allocated state and must not, whereas `updateSlave` does, and can.
> * The algorithm:
>     * Initially, the master pessimistically assume that what seems like "available" resources will be gone.
>       This is due to the race between the allocator scheduling an `allocate` call to itself vs master's
>       `allocator->updateAvailable` invocation.
>       As such, we first try to satisfy the request only with the offered resources.
>     * We greedily rescind one offer at a time until we've rescinded sufficiently many offers.
>       IMPORTANT: We perform `recoverResources(..., Filters())` which has a default `refuse_sec` of 5 seconds,
>       rather than `recoverResources(..., None())` so that we can virtually always win the race against `allocate`.
>       In the rare case that we do lose, no disaster occurs. We simply fail to satisfy the request.
>     * If we still don't have enough resources after resciding all offers, be semi-optimistic and forward the
>       request to the allocator since there may be available resources to satisfy the request.
>     * If the allocator returns a failure, report the error to the user with `Conflict`.
> 
> This approach is clearly not ideal, since we would prefer to rescind as little offers as possible.
> 
> 
> Diffs
> -----
> 
>   src/master/http.cpp 94e97a2898106579434e8cdec04b7b0e130a810e 
>   src/master/master.hpp e1331851c19e3372a4a525dcfd7ba2a01c3e97a6 
>   src/master/master.cpp 5589eca4317b597de509f3387cfc349083b361ac 
>   src/master/validation.hpp 43b8d84556e7f0a891dddf6185bbce7ca50b360a 
>   src/master/validation.cpp ffb7bf07b8a40d6e14f922eabcf46045462498b5 
> 
> Diff: https://reviews.apache.org/r/35702/diff/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Michael Park
> 
>


Re: Review Request 35702: Added /reserve HTTP endpoint to the master.

Posted by Michael Park <mc...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/
-----------------------------------------------------------

(Updated Sept. 4, 2015, 11:19 p.m.)


Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.


Changes
-------

Rebased. NNFR.


Bugs: MESOS-2600
    https://issues.apache.org/jira/browse/MESOS-2600


Repository: mesos


Description
-------

This involved a lot more challenges than I anticipated, I've captured the various approaches and limitations and deal-breakers of those approaches here: [Master Endpoint Implementation Challenges](https://docs.google.com/document/d/1cwVz4aKiCYP9Y4MOwHYZkyaiuEv7fArCye-vPvB2lAI/edit#)

Key points:

* This is a stop-gap solution until we shift the offer creation/management logic from the master to the allocator.
* `updateAvailable` and `updateSlave` are kept separate because
  (1) `updateAvailable` is allowed to fail whereas `updateSlave` must not.
  (2) `updateAvailable` returns a `Future` whereas `updateSlave` does not.
  (3) `updateAvailable` never leaves the allocator in an over-allocated state and must not, whereas `updateSlave` does, and can.
* The algorithm:
    * Initially, the master pessimistically assume that what seems like "available" resources will be gone.
      This is due to the race between the allocator scheduling an `allocate` call to itself vs master's
      `allocator->updateAvailable` invocation.
      As such, we first try to satisfy the request only with the offered resources.
    * We greedily rescind one offer at a time until we've rescinded sufficiently many offers.
      IMPORTANT: We perform `recoverResources(..., Filters())` which has a default `refuse_sec` of 5 seconds,
      rather than `recoverResources(..., None())` so that we can virtually always win the race against `allocate`.
      In the rare case that we do lose, no disaster occurs. We simply fail to satisfy the request.
    * If we still don't have enough resources after resciding all offers, be semi-optimistic and forward the
      request to the allocator since there may be available resources to satisfy the request.
    * If the allocator returns a failure, report the error to the user with `Conflict`.

This approach is clearly not ideal, since we would prefer to rescind as little offers as possible.


Diffs (updated)
-----

  src/master/http.cpp 94e97a2898106579434e8cdec04b7b0e130a810e 
  src/master/master.hpp e1331851c19e3372a4a525dcfd7ba2a01c3e97a6 
  src/master/master.cpp 5589eca4317b597de509f3387cfc349083b361ac 
  src/master/validation.hpp 43b8d84556e7f0a891dddf6185bbce7ca50b360a 
  src/master/validation.cpp ffb7bf07b8a40d6e14f922eabcf46045462498b5 

Diff: https://reviews.apache.org/r/35702/diff/


Testing
-------

`make check`


Thanks,

Michael Park


Re: Review Request 35702: Added /reserve HTTP endpoint to the master.

Posted by Michael Park <mc...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/
-----------------------------------------------------------

(Updated Aug. 12, 2015, 2:44 a.m.)


Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.


Changes
-------

Discarded [r36987](https://reviews.apache.org/r/36987/) and rebased accordingly.


Bugs: MESOS-2600
    https://issues.apache.org/jira/browse/MESOS-2600


Repository: mesos


Description
-------

This involved a lot more challenges than I anticipated, I've captured the various approaches and limitations and deal-breakers of those approaches here: [Master Endpoint Implementation Challenges](https://docs.google.com/document/d/1cwVz4aKiCYP9Y4MOwHYZkyaiuEv7fArCye-vPvB2lAI/edit#)

Key points:

* This is a stop-gap solution until we shift the offer creation/management logic from the master to the allocator.
* `updateAvailable` and `updateSlave` are kept separate because
  (1) `updateAvailable` is allowed to fail whereas `updateSlave` must not.
  (2) `updateAvailable` returns a `Future` whereas `updateSlave` does not.
  (3) `updateAvailable` never leaves the allocator in an over-allocated state and must not, whereas `updateSlave` does, and can.
* The algorithm:
    * Initially, the master pessimistically assume that what seems like "available" resources will be gone.
      This is due to the race between the allocator scheduling an `allocate` call to itself vs master's
      `allocator->updateAvailable` invocation.
      As such, we first try to satisfy the request only with the offered resources.
    * We greedily rescind one offer at a time until we've rescinded sufficiently many offers.
      IMPORTANT: We perform `recoverResources(..., Filters())` which has a default `refuse_sec` of 5 seconds,
      rather than `recoverResources(..., None())` so that we can virtually always win the race against `allocate`.
      In the rare case that we do lose, no disaster occurs. We simply fail to satisfy the request.
    * If we still don't have enough resources after resciding all offers, be semi-optimistic and forward the
      request to the allocator since there may be available resources to satisfy the request.
    * If the allocator returns a failure, report the error to the user with `Conflict`.

This approach is clearly not ideal, since we would prefer to rescind as little offers as possible.


Diffs (updated)
-----

  src/master/http.cpp 7c650555ed15d310ad28e68239cbd56580fbae37 
  src/master/master.hpp bb7c8e9e05be829a6b9aa3100a714b2359854d96 
  src/master/master.cpp 398203d9367f85340166e66ecc34b9a33dd81048 
  src/master/validation.hpp 43b8d84556e7f0a891dddf6185bbce7ca50b360a 
  src/master/validation.cpp ffb7bf07b8a40d6e14f922eabcf46045462498b5 

Diff: https://reviews.apache.org/r/35702/diff/


Testing
-------

`make check`


Thanks,

Michael Park


Re: Review Request 35702: Added /reserve HTTP endpoint to the master.

Posted by Michael Park <mc...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/
-----------------------------------------------------------

(Updated Aug. 5, 2015, 7:12 p.m.)


Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.


Changes
-------

Removed `Nothing`


Bugs: MESOS-2600
    https://issues.apache.org/jira/browse/MESOS-2600


Repository: mesos


Description (updated)
-------

This involved a lot more challenges than I anticipated, I've captured the various approaches and limitations and deal-breakers of those approaches here: [Master Endpoint Implementation Challenges](https://docs.google.com/document/d/1cwVz4aKiCYP9Y4MOwHYZkyaiuEv7fArCye-vPvB2lAI/edit#)

Key points:

* This is a stop-gap solution until we shift the offer creation/management logic from the master to the allocator.
* `updateAvailable` and `updateSlave` are kept separate because
  (1) `updateAvailable` is allowed to fail whereas `updateSlave` must not.
  (2) `updateAvailable` returns a `Future` whereas `updateSlave` does not.
  (3) `updateAvailable` never leaves the allocator in an over-allocated state and must not, whereas `updateSlave` does, and can.
* The algorithm:
    * Initially, the master pessimistically assume that what seems like "available" resources will be gone.
      This is due to the race between the allocator scheduling an `allocate` call to itself vs master's
      `allocator->updateAvailable` invocation.
      As such, we first try to satisfy the request only with the offered resources.
    * We greedily rescind one offer at a time until we've rescinded sufficiently many offers.
      IMPORTANT: We perform `recoverResources(..., Filters())` which has a default `refuse_sec` of 5 seconds,
      rather than `recoverResources(..., None())` so that we can virtually always win the race against `allocate`.
      In the rare case that we do lose, no disaster occurs. We simply fail to satisfy the request.
    * If we still don't have enough resources after resciding all offers, be semi-optimistic and forward the
      request to the allocator since there may be available resources to satisfy the request.
    * If the allocator returns a failure, report the error to the user with `Conflict`.

This approach is clearly not ideal, since we would prefer to rescind as little offers as possible.


Diffs (updated)
-----

  src/master/http.cpp 76e70801925041f08bc94f0ca18c86f1a573b2b3 
  src/master/master.hpp e44174976aa64176916827bec4c911333c9a91db 
  src/master/master.cpp 50b98248463fc4cd48962890c14c7ad64f2b6f43 
  src/master/validation.hpp 43b8d84556e7f0a891dddf6185bbce7ca50b360a 
  src/master/validation.cpp ffb7bf07b8a40d6e14f922eabcf46045462498b5 

Diff: https://reviews.apache.org/r/35702/diff/


Testing
-------

`make check`


Thanks,

Michael Park


Re: Review Request 35702: Added /reserve HTTP endpoint to the master.

Posted by Michael Park <mc...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/
-----------------------------------------------------------

(Updated Aug. 5, 2015, 10:44 a.m.)


Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.


Bugs: MESOS-2600
    https://issues.apache.org/jira/browse/MESOS-2600


Repository: mesos


Description
-------

This involved a lot more challenges than I anticipated, I've captured the various approaches and limitations and deal-breakers of those approaches here: [Master Endpoint Implementation Challenges](https://docs.google.com/document/d/1cwVz4aKiCYP9Y4MOwHYZkyaiuEv7fArCye-vPvB2lAI/edit#)

Key points:

* This is a stop-gap solution until we shift the offer creation/management logic from the master to the allocator.
* `updateAvailable` and `updateSlave` are kept separate because
  (1) `updateAvailable` is allowed to fail whereas `updateSlave` must not.
  (2) `updateAvailable` returns a `Future` whereas `updateSlave` does not.
  (3) `updateAvailable` never leaves the allocator in an over-allocated state and must not, whereas `updateSlave` does, and can.
* The algorithm:
    * Initially, the master pessimistically assume that what seems like "available" resources will be gone.
      This is due to the race between the allocator scheduling an `allocate` call to itself vs master's `allocator->updateAvailable` invocation.
      As such, we first try to satisfy the request only with the offered resources.
    * We greedily rescind one offer at a time until we've rescinded sufficiently many offers.
      IMPORTANT: We perform `recoverResources(..., Filters())` rather than `recoverResources(..., None())` so that we can pretty much always win the race against `allocate`.
                 In the case that we lose, no disaster occurs. We simply fail to satisfy the request.
    * If we still don't have enough resources after resciding all offers, be optimistic and forward the request to the allocator since there may be available resources to satisfy the request.
    * If the allocator returns a failure, report the error to the user with `PreconditionFailed`. This could be updated to be `Forbidden`, or `Conflict` maybe as well. We'll pick one eventually.

This approach is clearly not ideal, since we would prefer to rescind as little offers as possible.
The challenges of implementing the ideal solution in the current state is described in the document above.


Diffs (updated)
-----

  src/master/http.cpp 76e70801925041f08bc94f0ca18c86f1a573b2b3 
  src/master/master.hpp e44174976aa64176916827bec4c911333c9a91db 
  src/master/master.cpp 5aa0a5410804fe16abd50b6953f1ffe46a019ecf 
  src/master/validation.hpp 43b8d84556e7f0a891dddf6185bbce7ca50b360a 
  src/master/validation.cpp ffb7bf07b8a40d6e14f922eabcf46045462498b5 

Diff: https://reviews.apache.org/r/35702/diff/


Testing
-------

`make check`


Thanks,

Michael Park


Re: Review Request 35702: Added /reserve HTTP endpoint to the master.

Posted by Michael Park <mc...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/
-----------------------------------------------------------

(Updated Aug. 5, 2015, 9:55 a.m.)


Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.


Bugs: MESOS-2600
    https://issues.apache.org/jira/browse/MESOS-2600


Repository: mesos


Description (updated)
-------

This involved a lot more challenges than I anticipated, I've captured the various approaches and limitations and deal-breakers of those approaches here: [Master Endpoint Implementation Challenges](https://docs.google.com/document/d/1cwVz4aKiCYP9Y4MOwHYZkyaiuEv7fArCye-vPvB2lAI/edit#)

Key points:

* This is a stop-gap solution until we shift the offer creation/management logic from the master to the allocator.
* `updateAvailable` and `updateSlave` are kept separate because
  (1) `updateAvailable` is allowed to fail whereas `updateSlave` must not.
  (2) `updateAvailable` returns a `Future` whereas `updateSlave` does not.
  (3) `updateAvailable` never leaves the allocator in an over-allocated state and must not, whereas `updateSlave` does, and can.
* The algorithm:
    * Initially, the master pessimistically assume that what seems like "available" resources will be gone.
      This is due to the race between the allocator scheduling an `allocate` call to itself vs master's `allocator->updateAvailable` invocation.
      As such, we first try to satisfy the request only with the offered resources.
    * We greedily rescind one offer at a time until we've rescinded sufficiently many offers.
      IMPORTANT: We perform `recoverResources(..., Filters())` rather than `recoverResources(..., None())` so that we can pretty much always win the race against `allocate`.
                 In the case that we lose, no disaster occurs. We simply fail to satisfy the request.
    * If we still don't have enough resources after resciding all offers, be optimistic and forward the request to the allocator since there may be available resources to satisfy the request.
    * If the allocator returns a failure, report the error to the user with `PreconditionFailed`. This could be updated to be `Forbidden`, or `Conflict` maybe as well. We'll pick one eventually.

This approach is clearly not ideal, since we would prefer to rescind as little offers as possible.
The challenges of implementing the ideal solution in the current state is described in the document above.


Diffs
-----

  src/master/http.cpp 76e70801925041f08bc94f0ca18c86f1a573b2b3 
  src/master/master.hpp e44174976aa64176916827bec4c911333c9a91db 
  src/master/master.cpp 5aa0a5410804fe16abd50b6953f1ffe46a019ecf 
  src/master/validation.hpp 43b8d84556e7f0a891dddf6185bbce7ca50b360a 
  src/master/validation.cpp ffb7bf07b8a40d6e14f922eabcf46045462498b5 

Diff: https://reviews.apache.org/r/35702/diff/


Testing
-------

`make check`


Thanks,

Michael Park


Re: Review Request 35702: Added /reserve HTTP endpoint to the master.

Posted by Michael Park <mc...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/
-----------------------------------------------------------

(Updated Aug. 5, 2015, 9:51 a.m.)


Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.


Changes
-------

Addressed Jie's comments.


Bugs: MESOS-2600
    https://issues.apache.org/jira/browse/MESOS-2600


Repository: mesos


Description
-------

This involved a lot more challenges than I anticipated, I've captured the various approaches and limitations and deal-breakers of those approaches here: [Master Endpoint Implementation Challenges](https://docs.google.com/document/d/1cwVz4aKiCYP9Y4MOwHYZkyaiuEv7fArCye-vPvB2lAI/edit#)

Key points:

* This is a stop-gap solution until we shift the offer creation/management logic from the master to the allocator.
* `updateAvailable` and `updateSlave` are kept separate because
  (1) `updateAvailable` is allowed to fail whereas `updateSlave` must not.
  (2) `updateAvailable` returns a `Future` whereas `updateSlave` does not.
  (3) `updateAvailable` never leaves the allocator in an over-allocated state and must not, whereas `updateSlave` does, and can.
* The algorithm:
    * Initially, the master pessimistically assume that what seems like "available" resources will be gone.
      This is due to the race between the allocator scheduling an `allocate` call to itself vs master's `allocator->updateAvailable` invocation.
      As such, we first try to satisfy the request only with the offered resources.
    * We greedily rescind one offer at a time until we've rescinded sufficiently many offers.
      IMPORTANT: We perform `recoverResources(..., Filters())` rather than `recoverResources(..., None())` so that we can pretty much always win the race against `allocate`.
                 In the case that we lose, no disaster occurs. We simply fail to satisfy the request.
    * If we still don't have enough resources after resciding all offers, be optimistic and forward the request to the allocator since there may be available resources to satisfy the request.
    * If the allocator returns a failure, report the error to the user with `PreconditionFailed`. This could be updated to be `Forbidden`, or `Conflict` maybe as well. We'll pick one eventually.

This approach is clearly not ideal, since we would prefer to rescind as little offers as possible.
The challenges of implementing the ideal solution in the current state is described in the document above.

TODO(mpark): Add more comments and test cases.


Diffs (updated)
-----

  src/master/http.cpp 76e70801925041f08bc94f0ca18c86f1a573b2b3 
  src/master/master.hpp e44174976aa64176916827bec4c911333c9a91db 
  src/master/master.cpp 5aa0a5410804fe16abd50b6953f1ffe46a019ecf 
  src/master/validation.hpp 43b8d84556e7f0a891dddf6185bbce7ca50b360a 
  src/master/validation.cpp ffb7bf07b8a40d6e14f922eabcf46045462498b5 

Diff: https://reviews.apache.org/r/35702/diff/


Testing
-------

`make check`


Thanks,

Michael Park


Re: Review Request 35702: Added /reserve HTTP endpoint to the master.

Posted by Jie Yu <yu...@gmail.com>.

> On Aug. 5, 2015, 5:46 a.m., Jie Yu wrote:
> > src/master/http.cpp, line 573
> > <https://reviews.apache.org/r/35702/diff/12/?file=1026443#file1026443line573>
> >
> >     What is 'Nothing' here?
> 
> Michael Park wrote:
>     The `Nothing` here comes from the result of `master->apply` which returns a `Future<Nothing>`. But I feel like you're not actually asking for an answer here?
>     
>     What would you like to see?
>     
>     What I have currently is a comment above the code which reads:
>     
>     ```cpp
>     // Propogate the 'Future<Nothing>' as 'Future<Response>' where
>     // 'Nothing' -> 'OK' and Failed -> 'Conflict'.
>     ```

I mean if we remove 'Nothing' here, will it compile?


- Jie


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/#review94154
-----------------------------------------------------------


On Aug. 5, 2015, 10:44 a.m., Michael Park wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35702/
> -----------------------------------------------------------
> 
> (Updated Aug. 5, 2015, 10:44 a.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.
> 
> 
> Bugs: MESOS-2600
>     https://issues.apache.org/jira/browse/MESOS-2600
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This involved a lot more challenges than I anticipated, I've captured the various approaches and limitations and deal-breakers of those approaches here: [Master Endpoint Implementation Challenges](https://docs.google.com/document/d/1cwVz4aKiCYP9Y4MOwHYZkyaiuEv7fArCye-vPvB2lAI/edit#)
> 
> Key points:
> 
> * This is a stop-gap solution until we shift the offer creation/management logic from the master to the allocator.
> * `updateAvailable` and `updateSlave` are kept separate because
>   (1) `updateAvailable` is allowed to fail whereas `updateSlave` must not.
>   (2) `updateAvailable` returns a `Future` whereas `updateSlave` does not.
>   (3) `updateAvailable` never leaves the allocator in an over-allocated state and must not, whereas `updateSlave` does, and can.
> * The algorithm:
>     * Initially, the master pessimistically assume that what seems like "available" resources will be gone.
>       This is due to the race between the allocator scheduling an `allocate` call to itself vs master's `allocator->updateAvailable` invocation.
>       As such, we first try to satisfy the request only with the offered resources.
>     * We greedily rescind one offer at a time until we've rescinded sufficiently many offers.
>       IMPORTANT: We perform `recoverResources(..., Filters())` rather than `recoverResources(..., None())` so that we can pretty much always win the race against `allocate`.
>                  In the case that we lose, no disaster occurs. We simply fail to satisfy the request.
>     * If we still don't have enough resources after resciding all offers, be optimistic and forward the request to the allocator since there may be available resources to satisfy the request.
>     * If the allocator returns a failure, report the error to the user with `PreconditionFailed`. This could be updated to be `Forbidden`, or `Conflict` maybe as well. We'll pick one eventually.
> 
> This approach is clearly not ideal, since we would prefer to rescind as little offers as possible.
> The challenges of implementing the ideal solution in the current state is described in the document above.
> 
> 
> Diffs
> -----
> 
>   src/master/http.cpp 76e70801925041f08bc94f0ca18c86f1a573b2b3 
>   src/master/master.hpp e44174976aa64176916827bec4c911333c9a91db 
>   src/master/master.cpp 5aa0a5410804fe16abd50b6953f1ffe46a019ecf 
>   src/master/validation.hpp 43b8d84556e7f0a891dddf6185bbce7ca50b360a 
>   src/master/validation.cpp ffb7bf07b8a40d6e14f922eabcf46045462498b5 
> 
> Diff: https://reviews.apache.org/r/35702/diff/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Michael Park
> 
>


Re: Review Request 35702: Added /reserve HTTP endpoint to the master.

Posted by Michael Park <mc...@gmail.com>.

> On Aug. 5, 2015, 5:46 a.m., Jie Yu wrote:
> > src/master/http.cpp, line 475
> > <https://reviews.apache.org/r/35702/diff/12/?file=1026443#file1026443line475>
> >
> >     We typically use leading undescore for temp variables. The tailing underscore is for class members (following google style).
> >     
> >     In fact, I think the temp variable here is not necessary. There are only two places where this temp variable is used. I would rather use 'values.get().at("...")', but this is up to you.

Removed temporary variable and used `values.get().at(...)` instead.

The reason why I did it this way is because I've been following the general pattern of:

```cpp
Try<Obj> _obj = getObj(...);
if (_obj.isError()) {
  return SomeError(_obj.error());
}

const Obj& obj = _obj.get();

// proceed with 'obj'.
```

This is a very common pattern for us that I would like to eventually explore for a cleaner solution.


> On Aug. 5, 2015, 5:46 a.m., Jie Yu wrote:
> > src/master/http.cpp, line 498
> > <https://reviews.apache.org/r/35702/diff/12/?file=1026443#file1026443line498>
> >
> >     Do you need this temp variable. Looks like you can just do
> >     ```
> >     foreach (.. value, parse.get().values) {...
> >     ```

Fixed this to use your suggestion. This was another instance of the pattern I described above.


> On Aug. 5, 2015, 5:46 a.m., Jie Yu wrote:
> > src/master/http.cpp, line 534
> > <https://reviews.apache.org/r/35702/diff/12/?file=1026443#file1026443line534>
> >
> >     I don't like the name 'flatten' :(
> >     
> >     Could you at least be more explicit about it (i.e., emphasize that 'remaining' only has unreserved resources). 
> >     
> >     ```
> >     Resources remaining = resources.flatten('*');
> >     ```

I don't like it either, but we currently have 9 instances of `flatten()` but no instances of `flatten("*")`. Do you think it's worth breaking consistency here? As far as I know, we seem to favor consistency.


> On Aug. 5, 2015, 5:46 a.m., Jie Yu wrote:
> > src/master/http.cpp, line 573
> > <https://reviews.apache.org/r/35702/diff/12/?file=1026443#file1026443line573>
> >
> >     What is 'Nothing' here?

The `Nothing` here comes from the result of `master->apply` which returns a `Future<Nothing>`. But I feel like you're not actually asking for an answer here?

What would you like to see?

What I have currently is a comment above the code which reads:

```cpp
// Propogate the 'Future<Nothing>' as 'Future<Response>' where
// 'Nothing' -> 'OK' and Failed -> 'Conflict'.
```


> On Aug. 5, 2015, 5:46 a.m., Jie Yu wrote:
> > src/master/master.cpp, line 5482
> > <https://reviews.apache.org/r/35702/diff/12/?file=1026445#file1026445line5482>
> >
> >     The name sounds weired. You are passing in an offer operation but the function name is called 'applyResourceOperation'.
> >     
> >     I would suggest we create two 'Master::apply' overloads and don't worry about the code duplication.
> >     
> >     ```
> >     void apply(framework, slave, opeartion);
> >     Future<Nothing> apply(slave, operation);
> >     ```

I've introduced the overloaded `Master::apply` as suggested. I renamed the original `Master::apply` to `Master::_apply` since I wanted to use it as the continuation for `Master::apply(Slave*, const Offer::Operation&)`, then realized I could also use it at the end of `Master::apply(Framework*, Slave*, const Offer::Operation&)` (the same way it was before). So in the end, functions were renamed.


- Michael


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/#review94154
-----------------------------------------------------------


On Aug. 5, 2015, 10:44 a.m., Michael Park wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35702/
> -----------------------------------------------------------
> 
> (Updated Aug. 5, 2015, 10:44 a.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.
> 
> 
> Bugs: MESOS-2600
>     https://issues.apache.org/jira/browse/MESOS-2600
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This involved a lot more challenges than I anticipated, I've captured the various approaches and limitations and deal-breakers of those approaches here: [Master Endpoint Implementation Challenges](https://docs.google.com/document/d/1cwVz4aKiCYP9Y4MOwHYZkyaiuEv7fArCye-vPvB2lAI/edit#)
> 
> Key points:
> 
> * This is a stop-gap solution until we shift the offer creation/management logic from the master to the allocator.
> * `updateAvailable` and `updateSlave` are kept separate because
>   (1) `updateAvailable` is allowed to fail whereas `updateSlave` must not.
>   (2) `updateAvailable` returns a `Future` whereas `updateSlave` does not.
>   (3) `updateAvailable` never leaves the allocator in an over-allocated state and must not, whereas `updateSlave` does, and can.
> * The algorithm:
>     * Initially, the master pessimistically assume that what seems like "available" resources will be gone.
>       This is due to the race between the allocator scheduling an `allocate` call to itself vs master's `allocator->updateAvailable` invocation.
>       As such, we first try to satisfy the request only with the offered resources.
>     * We greedily rescind one offer at a time until we've rescinded sufficiently many offers.
>       IMPORTANT: We perform `recoverResources(..., Filters())` rather than `recoverResources(..., None())` so that we can pretty much always win the race against `allocate`.
>                  In the case that we lose, no disaster occurs. We simply fail to satisfy the request.
>     * If we still don't have enough resources after resciding all offers, be optimistic and forward the request to the allocator since there may be available resources to satisfy the request.
>     * If the allocator returns a failure, report the error to the user with `PreconditionFailed`. This could be updated to be `Forbidden`, or `Conflict` maybe as well. We'll pick one eventually.
> 
> This approach is clearly not ideal, since we would prefer to rescind as little offers as possible.
> The challenges of implementing the ideal solution in the current state is described in the document above.
> 
> 
> Diffs
> -----
> 
>   src/master/http.cpp 76e70801925041f08bc94f0ca18c86f1a573b2b3 
>   src/master/master.hpp e44174976aa64176916827bec4c911333c9a91db 
>   src/master/master.cpp 5aa0a5410804fe16abd50b6953f1ffe46a019ecf 
>   src/master/validation.hpp 43b8d84556e7f0a891dddf6185bbce7ca50b360a 
>   src/master/validation.cpp ffb7bf07b8a40d6e14f922eabcf46045462498b5 
> 
> Diff: https://reviews.apache.org/r/35702/diff/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Michael Park
> 
>


Re: Review Request 35702: Added /reserve HTTP endpoint to the master.

Posted by Jie Yu <yu...@gmail.com>.

> On Aug. 5, 2015, 5:46 a.m., Jie Yu wrote:
> > src/master/http.cpp, line 534
> > <https://reviews.apache.org/r/35702/diff/12/?file=1026443#file1026443line534>
> >
> >     I don't like the name 'flatten' :(
> >     
> >     Could you at least be more explicit about it (i.e., emphasize that 'remaining' only has unreserved resources). 
> >     
> >     ```
> >     Resources remaining = resources.flatten('*');
> >     ```
> 
> Michael Park wrote:
>     I don't like it either, but we currently have 9 instances of `flatten()` but no instances of `flatten("*")`. Do you think it's worth breaking consistency here? As far as I know, we seem to favor consistency.

OK, fair enough. Given the comment you just added, I think it's much more clear not.


- Jie


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/#review94154
-----------------------------------------------------------


On Aug. 5, 2015, 10:44 a.m., Michael Park wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35702/
> -----------------------------------------------------------
> 
> (Updated Aug. 5, 2015, 10:44 a.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.
> 
> 
> Bugs: MESOS-2600
>     https://issues.apache.org/jira/browse/MESOS-2600
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This involved a lot more challenges than I anticipated, I've captured the various approaches and limitations and deal-breakers of those approaches here: [Master Endpoint Implementation Challenges](https://docs.google.com/document/d/1cwVz4aKiCYP9Y4MOwHYZkyaiuEv7fArCye-vPvB2lAI/edit#)
> 
> Key points:
> 
> * This is a stop-gap solution until we shift the offer creation/management logic from the master to the allocator.
> * `updateAvailable` and `updateSlave` are kept separate because
>   (1) `updateAvailable` is allowed to fail whereas `updateSlave` must not.
>   (2) `updateAvailable` returns a `Future` whereas `updateSlave` does not.
>   (3) `updateAvailable` never leaves the allocator in an over-allocated state and must not, whereas `updateSlave` does, and can.
> * The algorithm:
>     * Initially, the master pessimistically assume that what seems like "available" resources will be gone.
>       This is due to the race between the allocator scheduling an `allocate` call to itself vs master's `allocator->updateAvailable` invocation.
>       As such, we first try to satisfy the request only with the offered resources.
>     * We greedily rescind one offer at a time until we've rescinded sufficiently many offers.
>       IMPORTANT: We perform `recoverResources(..., Filters())` rather than `recoverResources(..., None())` so that we can pretty much always win the race against `allocate`.
>                  In the case that we lose, no disaster occurs. We simply fail to satisfy the request.
>     * If we still don't have enough resources after resciding all offers, be optimistic and forward the request to the allocator since there may be available resources to satisfy the request.
>     * If the allocator returns a failure, report the error to the user with `PreconditionFailed`. This could be updated to be `Forbidden`, or `Conflict` maybe as well. We'll pick one eventually.
> 
> This approach is clearly not ideal, since we would prefer to rescind as little offers as possible.
> The challenges of implementing the ideal solution in the current state is described in the document above.
> 
> 
> Diffs
> -----
> 
>   src/master/http.cpp 76e70801925041f08bc94f0ca18c86f1a573b2b3 
>   src/master/master.hpp e44174976aa64176916827bec4c911333c9a91db 
>   src/master/master.cpp 5aa0a5410804fe16abd50b6953f1ffe46a019ecf 
>   src/master/validation.hpp 43b8d84556e7f0a891dddf6185bbce7ca50b360a 
>   src/master/validation.cpp ffb7bf07b8a40d6e14f922eabcf46045462498b5 
> 
> Diff: https://reviews.apache.org/r/35702/diff/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Michael Park
> 
>


Re: Review Request 35702: Added /reserve HTTP endpoint to the master.

Posted by Michael Park <mc...@gmail.com>.

> On Aug. 5, 2015, 5:46 a.m., Jie Yu wrote:
> > src/master/http.cpp, line 573
> > <https://reviews.apache.org/r/35702/diff/12/?file=1026443#file1026443line573>
> >
> >     What is 'Nothing' here?
> 
> Michael Park wrote:
>     The `Nothing` here comes from the result of `master->apply` which returns a `Future<Nothing>`. But I feel like you're not actually asking for an answer here?
>     
>     What would you like to see?
>     
>     What I have currently is a comment above the code which reads:
>     
>     ```cpp
>     // Propogate the 'Future<Nothing>' as 'Future<Response>' where
>     // 'Nothing' -> 'OK' and Failed -> 'Conflict'.
>     ```
> 
> Jie Yu wrote:
>     I mean if we remove 'Nothing' here, will it compile?

Oh, it does. I forgot about this behavior. Thanks, updated.


- Michael


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/#review94154
-----------------------------------------------------------


On Aug. 5, 2015, 7:12 p.m., Michael Park wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35702/
> -----------------------------------------------------------
> 
> (Updated Aug. 5, 2015, 7:12 p.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.
> 
> 
> Bugs: MESOS-2600
>     https://issues.apache.org/jira/browse/MESOS-2600
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This involved a lot more challenges than I anticipated, I've captured the various approaches and limitations and deal-breakers of those approaches here: [Master Endpoint Implementation Challenges](https://docs.google.com/document/d/1cwVz4aKiCYP9Y4MOwHYZkyaiuEv7fArCye-vPvB2lAI/edit#)
> 
> Key points:
> 
> * This is a stop-gap solution until we shift the offer creation/management logic from the master to the allocator.
> * `updateAvailable` and `updateSlave` are kept separate because
>   (1) `updateAvailable` is allowed to fail whereas `updateSlave` must not.
>   (2) `updateAvailable` returns a `Future` whereas `updateSlave` does not.
>   (3) `updateAvailable` never leaves the allocator in an over-allocated state and must not, whereas `updateSlave` does, and can.
> * The algorithm:
>     * Initially, the master pessimistically assume that what seems like "available" resources will be gone.
>       This is due to the race between the allocator scheduling an `allocate` call to itself vs master's
>       `allocator->updateAvailable` invocation.
>       As such, we first try to satisfy the request only with the offered resources.
>     * We greedily rescind one offer at a time until we've rescinded sufficiently many offers.
>       IMPORTANT: We perform `recoverResources(..., Filters())` which has a default `refuse_sec` of 5 seconds,
>       rather than `recoverResources(..., None())` so that we can virtually always win the race against `allocate`.
>       In the rare case that we do lose, no disaster occurs. We simply fail to satisfy the request.
>     * If we still don't have enough resources after resciding all offers, be semi-optimistic and forward the
>       request to the allocator since there may be available resources to satisfy the request.
>     * If the allocator returns a failure, report the error to the user with `Conflict`.
> 
> This approach is clearly not ideal, since we would prefer to rescind as little offers as possible.
> 
> 
> Diffs
> -----
> 
>   src/master/http.cpp 76e70801925041f08bc94f0ca18c86f1a573b2b3 
>   src/master/master.hpp e44174976aa64176916827bec4c911333c9a91db 
>   src/master/master.cpp 50b98248463fc4cd48962890c14c7ad64f2b6f43 
>   src/master/validation.hpp 43b8d84556e7f0a891dddf6185bbce7ca50b360a 
>   src/master/validation.cpp ffb7bf07b8a40d6e14f922eabcf46045462498b5 
> 
> Diff: https://reviews.apache.org/r/35702/diff/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Michael Park
> 
>


Re: Review Request 35702: Added /reserve HTTP endpoint to the master.

Posted by Jie Yu <yu...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/#review94154
-----------------------------------------------------------

Ship it!


LGTM overall. Please address the remaining issues and commit yourself!


src/master/http.cpp (line 475)
<https://reviews.apache.org/r/35702/#comment148677>

    We typically use leading undescore for temp variables. The tailing underscore is for class members (following google style).
    
    In fact, I think the temp variable here is not necessary. There are only two places where this temp variable is used. I would rather use 'values.get().at("...")', but this is up to you.



src/master/http.cpp (line 498)
<https://reviews.apache.org/r/35702/#comment148678>

    Do you need this temp variable. Looks like you can just do
    ```
    foreach (.. value, parse.get().values) {...
    ```



src/master/http.cpp (line 511)
<https://reviews.apache.org/r/35702/#comment148679>

    Kill this line.



src/master/http.cpp (line 533)
<https://reviews.apache.org/r/35702/#comment148685>

    What does 'recovered' mean and what does remaining mean? It'll be great if you comment on that to improve readability.
    
    For example,
    
    ```
    // The resources recovered by recinding outstanding offers.
    ...
    // The unreserved resources needed to satify the RESERVE operation.
    ```
    
    IIUC, the variable 'remaining' is only used for optimization, right? Could you please mention that in the comments that keep this variable is for optimization (i.e., avoid rescinding unnecessary offers).



src/master/http.cpp (line 534)
<https://reviews.apache.org/r/35702/#comment148683>

    I don't like the name 'flatten' :(
    
    Could you at least be more explicit about it (i.e., emphasize that 'remaining' only has unreserved resources). 
    
    ```
    Resources remaining = resources.flatten('*');
    ```



src/master/http.cpp (line 553)
<https://reviews.apache.org/r/35702/#comment148686>

    you win the race because the default filter refuse_sec is 5 seconds? Worth mentioning that in the comment.



src/master/http.cpp (line 573)
<https://reviews.apache.org/r/35702/#comment148727>

    What is 'Nothing' here?



src/master/master.cpp (line 5472)
<https://reviews.apache.org/r/35702/#comment148729>

    The name sounds weired. You are passing in an offer operation but the function name is called 'applyResourceOperation'.
    
    I would suggest we create two 'Master::apply' overloads and don't worry about the code duplication.
    
    ```
    void apply(framework, slave, opeartion);
    Future<Nothing> apply(slave, operation);
    ```


- Jie Yu


On July 31, 2015, 9:56 p.m., Michael Park wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35702/
> -----------------------------------------------------------
> 
> (Updated July 31, 2015, 9:56 p.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.
> 
> 
> Bugs: MESOS-2600
>     https://issues.apache.org/jira/browse/MESOS-2600
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This involved a lot more challenges than I anticipated, I've captured the various approaches and limitations and deal-breakers of those approaches here: [Master Endpoint Implementation Challenges](https://docs.google.com/document/d/1cwVz4aKiCYP9Y4MOwHYZkyaiuEv7fArCye-vPvB2lAI/edit#)
> 
> Key points:
> 
> * This is a stop-gap solution until we shift the offer creation/management logic from the master to the allocator.
> * `updateAvailable` and `updateSlave` are kept separate because
>   (1) `updateAvailable` is allowed to fail whereas `updateSlave` must not.
>   (2) `updateAvailable` returns a `Future` whereas `updateSlave` does not.
>   (3) `updateAvailable` never leaves the allocator in an over-allocated state and must not, whereas `updateSlave` does, and can.
> * The algorithm:
>     * Initially, the master pessimistically assume that what seems like "available" resources will be gone.
>       This is due to the race between the allocator scheduling an `allocate` call to itself vs master's `allocator->updateAvailable` invocation.
>       As such, we first try to satisfy the request only with the offered resources.
>     * We greedily rescind one offer at a time until we've rescinded sufficiently many offers.
>       IMPORTANT: We perform `recoverResources(..., Filters())` rather than `recoverResources(..., None())` so that we can pretty much always win the race against `allocate`.
>                  In the case that we lose, no disaster occurs. We simply fail to satisfy the request.
>     * If we still don't have enough resources after resciding all offers, be optimistic and forward the request to the allocator since there may be available resources to satisfy the request.
>     * If the allocator returns a failure, report the error to the user with `PreconditionFailed`. This could be updated to be `Forbidden`, or `Conflict` maybe as well. We'll pick one eventually.
> 
> This approach is clearly not ideal, since we would prefer to rescind as little offers as possible.
> The challenges of implementing the ideal solution in the current state is described in the document above.
> 
> TODO(mpark): Add more comments and test cases.
> 
> 
> Diffs
> -----
> 
>   src/master/http.cpp 3772e39015a22655dcad00ad844dc5ddc90db43f 
>   src/master/master.hpp ea18c4e0bb0743747401b9cd5ea14ae9b56ae3cc 
>   src/master/master.cpp 351a3c2b5f551ad065682cea601d2436258e4544 
>   src/master/validation.hpp 43b8d84556e7f0a891dddf6185bbce7ca50b360a 
>   src/master/validation.cpp ffb7bf07b8a40d6e14f922eabcf46045462498b5 
> 
> Diff: https://reviews.apache.org/r/35702/diff/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Michael Park
> 
>


Re: Review Request 35702: Added /reserve HTTP endpoint to the master.

Posted by Michael Park <mc...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/
-----------------------------------------------------------

(Updated July 31, 2015, 9:56 p.m.)


Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.


Changes
-------

Rebased.


Bugs: MESOS-2600
    https://issues.apache.org/jira/browse/MESOS-2600


Repository: mesos


Description
-------

This involved a lot more challenges than I anticipated, I've captured the various approaches and limitations and deal-breakers of those approaches here: [Master Endpoint Implementation Challenges](https://docs.google.com/document/d/1cwVz4aKiCYP9Y4MOwHYZkyaiuEv7fArCye-vPvB2lAI/edit#)

Key points:

* This is a stop-gap solution until we shift the offer creation/management logic from the master to the allocator.
* `updateAvailable` and `updateSlave` are kept separate because
  (1) `updateAvailable` is allowed to fail whereas `updateSlave` must not.
  (2) `updateAvailable` returns a `Future` whereas `updateSlave` does not.
  (3) `updateAvailable` never leaves the allocator in an over-allocated state and must not, whereas `updateSlave` does, and can.
* The algorithm:
    * Initially, the master pessimistically assume that what seems like "available" resources will be gone.
      This is due to the race between the allocator scheduling an `allocate` call to itself vs master's `allocator->updateAvailable` invocation.
      As such, we first try to satisfy the request only with the offered resources.
    * We greedily rescind one offer at a time until we've rescinded sufficiently many offers.
      IMPORTANT: We perform `recoverResources(..., Filters())` rather than `recoverResources(..., None())` so that we can pretty much always win the race against `allocate`.
                 In the case that we lose, no disaster occurs. We simply fail to satisfy the request.
    * If we still don't have enough resources after resciding all offers, be optimistic and forward the request to the allocator since there may be available resources to satisfy the request.
    * If the allocator returns a failure, report the error to the user with `PreconditionFailed`. This could be updated to be `Forbidden`, or `Conflict` maybe as well. We'll pick one eventually.

This approach is clearly not ideal, since we would prefer to rescind as little offers as possible.
The challenges of implementing the ideal solution in the current state is described in the document above.

TODO(mpark): Add more comments and test cases.


Diffs (updated)
-----

  src/master/http.cpp 3772e39015a22655dcad00ad844dc5ddc90db43f 
  src/master/master.hpp ea18c4e0bb0743747401b9cd5ea14ae9b56ae3cc 
  src/master/master.cpp 351a3c2b5f551ad065682cea601d2436258e4544 
  src/master/validation.hpp 43b8d84556e7f0a891dddf6185bbce7ca50b360a 
  src/master/validation.cpp ffb7bf07b8a40d6e14f922eabcf46045462498b5 

Diff: https://reviews.apache.org/r/35702/diff/


Testing
-------

`make check`


Thanks,

Michael Park


Re: Review Request 35702: Added /reserve HTTP endpoint to the master.

Posted by Michael Park <mc...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/
-----------------------------------------------------------

(Updated July 28, 2015, 9:03 p.m.)


Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.


Changes
-------

Only rescind offers if rescinding the offer will contribute in satisfying the request


Bugs: MESOS-2600
    https://issues.apache.org/jira/browse/MESOS-2600


Repository: mesos


Description
-------

This involved a lot more challenges than I anticipated, I've captured the various approaches and limitations and deal-breakers of those approaches here: [Master Endpoint Implementation Challenges](https://docs.google.com/document/d/1cwVz4aKiCYP9Y4MOwHYZkyaiuEv7fArCye-vPvB2lAI/edit#)

Key points:

* This is a stop-gap solution until we shift the offer creation/management logic from the master to the allocator.
* `updateAvailable` and `updateSlave` are kept separate because
  (1) `updateAvailable` is allowed to fail whereas `updateSlave` must not.
  (2) `updateAvailable` returns a `Future` whereas `updateSlave` does not.
  (3) `updateAvailable` never leaves the allocator in an over-allocated state and must not, whereas `updateSlave` does, and can.
* The algorithm:
    * Initially, the master pessimistically assume that what seems like "available" resources will be gone.
      This is due to the race between the allocator scheduling an `allocate` call to itself vs master's `allocator->updateAvailable` invocation.
      As such, we first try to satisfy the request only with the offered resources.
    * We greedily rescind one offer at a time until we've rescinded sufficiently many offers.
      IMPORTANT: We perform `recoverResources(..., Filters())` rather than `recoverResources(..., None())` so that we can pretty much always win the race against `allocate`.
                 In the case that we lose, no disaster occurs. We simply fail to satisfy the request.
    * If we still don't have enough resources after resciding all offers, be optimistic and forward the request to the allocator since there may be available resources to satisfy the request.
    * If the allocator returns a failure, report the error to the user with `PreconditionFailed`. This could be updated to be `Forbidden`, or `Conflict` maybe as well. We'll pick one eventually.

This approach is clearly not ideal, since we would prefer to rescind as little offers as possible.
The challenges of implementing the ideal solution in the current state is described in the document above.

TODO(mpark): Add more comments and test cases.


Diffs (updated)
-----

  src/master/http.cpp 3a1598fad4db03e5f62fd4a6bd26b2bedeee4070 
  src/master/master.hpp 827d0d599912b2936beb9615610f627f6c9a2d43 
  src/master/master.cpp 5b5e3c37d4433c8524db267866aebc0a35a181f1 
  src/master/validation.hpp 469d6f56c3de28a34177124aae81ce24cb4ad160 
  src/master/validation.cpp 9d128aa1b349b018b8e4a1916434d848761ca051 

Diff: https://reviews.apache.org/r/35702/diff/


Testing
-------

`make check`


Thanks,

Michael Park


Re: Review Request 35702: Added /reserve HTTP endpoint to the master.

Posted by Michael Park <mc...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/
-----------------------------------------------------------

(Updated July 27, 2015, 11:30 p.m.)


Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.


Changes
-------

Addressed AlexR's comments.


Bugs: MESOS-2600
    https://issues.apache.org/jira/browse/MESOS-2600


Repository: mesos


Description
-------

This involved a lot more challenges than I anticipated, I've captured the various approaches and limitations and deal-breakers of those approaches here: [Master Endpoint Implementation Challenges](https://docs.google.com/document/d/1cwVz4aKiCYP9Y4MOwHYZkyaiuEv7fArCye-vPvB2lAI/edit#)

Key points:

* This is a stop-gap solution until we shift the offer creation/management logic from the master to the allocator.
* `updateAvailable` and `updateSlave` are kept separate because
  (1) `updateAvailable` is allowed to fail whereas `updateSlave` must not.
  (2) `updateAvailable` returns a `Future` whereas `updateSlave` does not.
  (3) `updateAvailable` never leaves the allocator in an over-allocated state and must not, whereas `updateSlave` does, and can.
* The algorithm:
    * Initially, the master pessimistically assume that what seems like "available" resources will be gone.
      This is due to the race between the allocator scheduling an `allocate` call to itself vs master's `allocator->updateAvailable` invocation.
      As such, we first try to satisfy the request only with the offered resources.
    * We greedily rescind one offer at a time until we've rescinded sufficiently many offers.
      IMPORTANT: We perform `recoverResources(..., Filters())` rather than `recoverResources(..., None())` so that we can pretty much always win the race against `allocate`.
                 In the case that we lose, no disaster occurs. We simply fail to satisfy the request.
    * If we still don't have enough resources after resciding all offers, be optimistic and forward the request to the allocator since there may be available resources to satisfy the request.
    * If the allocator returns a failure, report the error to the user with `PreconditionFailed`. This could be updated to be `Forbidden`, or `Conflict` maybe as well. We'll pick one eventually.

This approach is clearly not ideal, since we would prefer to rescind as little offers as possible.
The challenges of implementing the ideal solution in the current state is described in the document above.

TODO(mpark): Add more comments and test cases.


Diffs (updated)
-----

  src/master/http.cpp 3a1598fad4db03e5f62fd4a6bd26b2bedeee4070 
  src/master/master.hpp 827d0d599912b2936beb9615610f627f6c9a2d43 
  src/master/master.cpp 5b5e3c37d4433c8524db267866aebc0a35a181f1 
  src/master/validation.hpp 469d6f56c3de28a34177124aae81ce24cb4ad160 
  src/master/validation.cpp 9d128aa1b349b018b8e4a1916434d848761ca051 

Diff: https://reviews.apache.org/r/35702/diff/


Testing
-------

`make check`


Thanks,

Michael Park


Re: Review Request 35702: Added /reserve HTTP endpoint to the master.

Posted by Alexander Rukletsov <ru...@gmail.com>.

> On July 16, 2015, 2:54 p.m., Alexander Rukletsov wrote:
> > src/master/http.cpp, line 447
> > <https://reviews.apache.org/r/35702/diff/9/?file=994080#file994080line447>
> >
> >     Not directly related to endpoints, but to dynamic reservations in general. Do you think it makes sense to bookkeep dynamic reservation or have an aggregating method in `mesos::internal::master::Role`?
> 
> Michael Park wrote:
>     We have a `Role::resources()` function which aggregates all resources, and we can filter for dynamically reserved ones by doing something like: `resources.filter(Resources::isDynamicallyReserved)`. Is this sufficient for what you're asking about? or is there more?

Good point! I think this is close to what I had in mind. However, one thing still bothers me: how can we hint somebody who is not very familiar with the codebase, that they can do tricks like this? Maybe a comment in `Role` struct like

```
NOTE: You can use filters to extract specific resources, e.g. Role::resources().filter(Resources::isDynamicallyReserved).
```
But maybe it's too much (why putting such comment into the `Role` struct), what do you think?


- Alexander


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/#review91889
-----------------------------------------------------------


On July 27, 2015, 11:30 p.m., Michael Park wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35702/
> -----------------------------------------------------------
> 
> (Updated July 27, 2015, 11:30 p.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.
> 
> 
> Bugs: MESOS-2600
>     https://issues.apache.org/jira/browse/MESOS-2600
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This involved a lot more challenges than I anticipated, I've captured the various approaches and limitations and deal-breakers of those approaches here: [Master Endpoint Implementation Challenges](https://docs.google.com/document/d/1cwVz4aKiCYP9Y4MOwHYZkyaiuEv7fArCye-vPvB2lAI/edit#)
> 
> Key points:
> 
> * This is a stop-gap solution until we shift the offer creation/management logic from the master to the allocator.
> * `updateAvailable` and `updateSlave` are kept separate because
>   (1) `updateAvailable` is allowed to fail whereas `updateSlave` must not.
>   (2) `updateAvailable` returns a `Future` whereas `updateSlave` does not.
>   (3) `updateAvailable` never leaves the allocator in an over-allocated state and must not, whereas `updateSlave` does, and can.
> * The algorithm:
>     * Initially, the master pessimistically assume that what seems like "available" resources will be gone.
>       This is due to the race between the allocator scheduling an `allocate` call to itself vs master's `allocator->updateAvailable` invocation.
>       As such, we first try to satisfy the request only with the offered resources.
>     * We greedily rescind one offer at a time until we've rescinded sufficiently many offers.
>       IMPORTANT: We perform `recoverResources(..., Filters())` rather than `recoverResources(..., None())` so that we can pretty much always win the race against `allocate`.
>                  In the case that we lose, no disaster occurs. We simply fail to satisfy the request.
>     * If we still don't have enough resources after resciding all offers, be optimistic and forward the request to the allocator since there may be available resources to satisfy the request.
>     * If the allocator returns a failure, report the error to the user with `PreconditionFailed`. This could be updated to be `Forbidden`, or `Conflict` maybe as well. We'll pick one eventually.
> 
> This approach is clearly not ideal, since we would prefer to rescind as little offers as possible.
> The challenges of implementing the ideal solution in the current state is described in the document above.
> 
> TODO(mpark): Add more comments and test cases.
> 
> 
> Diffs
> -----
> 
>   src/master/http.cpp 3a1598fad4db03e5f62fd4a6bd26b2bedeee4070 
>   src/master/master.hpp 827d0d599912b2936beb9615610f627f6c9a2d43 
>   src/master/master.cpp 5b5e3c37d4433c8524db267866aebc0a35a181f1 
>   src/master/validation.hpp 469d6f56c3de28a34177124aae81ce24cb4ad160 
>   src/master/validation.cpp 9d128aa1b349b018b8e4a1916434d848761ca051 
> 
> Diff: https://reviews.apache.org/r/35702/diff/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Michael Park
> 
>


Re: Review Request 35702: Added /reserve HTTP endpoint to the master.

Posted by Michael Park <mc...@gmail.com>.

> On July 16, 2015, 2:54 p.m., Alexander Rukletsov wrote:
> > src/master/http.cpp, line 447
> > <https://reviews.apache.org/r/35702/diff/9/?file=994080#file994080line447>
> >
> >     Not directly related to endpoints, but to dynamic reservations in general. Do you think it makes sense to bookkeep dynamic reservation or have an aggregating method in `mesos::internal::master::Role`?

We have a `Role::resources()` function which aggregates all resources, and we can filter for dynamically reserved ones by doing something like: `resources.filter(Resources::isDynamicallyReserved)`. Is this sufficient for what you're asking about? or is there more?


- Michael


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/#review91889
-----------------------------------------------------------


On July 27, 2015, 11:30 p.m., Michael Park wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35702/
> -----------------------------------------------------------
> 
> (Updated July 27, 2015, 11:30 p.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.
> 
> 
> Bugs: MESOS-2600
>     https://issues.apache.org/jira/browse/MESOS-2600
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This involved a lot more challenges than I anticipated, I've captured the various approaches and limitations and deal-breakers of those approaches here: [Master Endpoint Implementation Challenges](https://docs.google.com/document/d/1cwVz4aKiCYP9Y4MOwHYZkyaiuEv7fArCye-vPvB2lAI/edit#)
> 
> Key points:
> 
> * This is a stop-gap solution until we shift the offer creation/management logic from the master to the allocator.
> * `updateAvailable` and `updateSlave` are kept separate because
>   (1) `updateAvailable` is allowed to fail whereas `updateSlave` must not.
>   (2) `updateAvailable` returns a `Future` whereas `updateSlave` does not.
>   (3) `updateAvailable` never leaves the allocator in an over-allocated state and must not, whereas `updateSlave` does, and can.
> * The algorithm:
>     * Initially, the master pessimistically assume that what seems like "available" resources will be gone.
>       This is due to the race between the allocator scheduling an `allocate` call to itself vs master's `allocator->updateAvailable` invocation.
>       As such, we first try to satisfy the request only with the offered resources.
>     * We greedily rescind one offer at a time until we've rescinded sufficiently many offers.
>       IMPORTANT: We perform `recoverResources(..., Filters())` rather than `recoverResources(..., None())` so that we can pretty much always win the race against `allocate`.
>                  In the case that we lose, no disaster occurs. We simply fail to satisfy the request.
>     * If we still don't have enough resources after resciding all offers, be optimistic and forward the request to the allocator since there may be available resources to satisfy the request.
>     * If the allocator returns a failure, report the error to the user with `PreconditionFailed`. This could be updated to be `Forbidden`, or `Conflict` maybe as well. We'll pick one eventually.
> 
> This approach is clearly not ideal, since we would prefer to rescind as little offers as possible.
> The challenges of implementing the ideal solution in the current state is described in the document above.
> 
> TODO(mpark): Add more comments and test cases.
> 
> 
> Diffs
> -----
> 
>   src/master/http.cpp 3a1598fad4db03e5f62fd4a6bd26b2bedeee4070 
>   src/master/master.hpp 827d0d599912b2936beb9615610f627f6c9a2d43 
>   src/master/master.cpp 5b5e3c37d4433c8524db267866aebc0a35a181f1 
>   src/master/validation.hpp 469d6f56c3de28a34177124aae81ce24cb4ad160 
>   src/master/validation.cpp 9d128aa1b349b018b8e4a1916434d848761ca051 
> 
> Diff: https://reviews.apache.org/r/35702/diff/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Michael Park
> 
>


Re: Review Request 35702: Added /reserve HTTP endpoint to the master.

Posted by Michael Park <mc...@gmail.com>.

> On July 16, 2015, 2:54 p.m., Alexander Rukletsov wrote:
> > src/master/http.cpp, line 447
> > <https://reviews.apache.org/r/35702/diff/9/?file=994080#file994080line447>
> >
> >     Not directly related to endpoints, but to dynamic reservations in general. Do you think it makes sense to bookkeep dynamic reservation or have an aggregating method in `mesos::internal::master::Role`?
> 
> Michael Park wrote:
>     We have a `Role::resources()` function which aggregates all resources, and we can filter for dynamically reserved ones by doing something like: `resources.filter(Resources::isDynamicallyReserved)`. Is this sufficient for what you're asking about? or is there more?
> 
> Alexander Rukletsov wrote:
>     Good point! I think this is close to what I had in mind. However, one thing still bothers me: how can we hint somebody who is not very familiar with the codebase, that they can do tricks like this? Maybe a comment in `Role` struct like
>     
>     ```
>     NOTE: You can use filters to extract specific resources, e.g. Role::resources().filter(Resources::isDynamicallyReserved).
>     ```
>     But maybe it's too much (why putting such comment into the `Role` struct), what do you think?

Yeah, I'm not sure about a seemingly unrelated comment living at the `Role` struct. But it would be useful if it's something that people ask about often.


- Michael


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/#review91889
-----------------------------------------------------------


On July 27, 2015, 11:30 p.m., Michael Park wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35702/
> -----------------------------------------------------------
> 
> (Updated July 27, 2015, 11:30 p.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.
> 
> 
> Bugs: MESOS-2600
>     https://issues.apache.org/jira/browse/MESOS-2600
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This involved a lot more challenges than I anticipated, I've captured the various approaches and limitations and deal-breakers of those approaches here: [Master Endpoint Implementation Challenges](https://docs.google.com/document/d/1cwVz4aKiCYP9Y4MOwHYZkyaiuEv7fArCye-vPvB2lAI/edit#)
> 
> Key points:
> 
> * This is a stop-gap solution until we shift the offer creation/management logic from the master to the allocator.
> * `updateAvailable` and `updateSlave` are kept separate because
>   (1) `updateAvailable` is allowed to fail whereas `updateSlave` must not.
>   (2) `updateAvailable` returns a `Future` whereas `updateSlave` does not.
>   (3) `updateAvailable` never leaves the allocator in an over-allocated state and must not, whereas `updateSlave` does, and can.
> * The algorithm:
>     * Initially, the master pessimistically assume that what seems like "available" resources will be gone.
>       This is due to the race between the allocator scheduling an `allocate` call to itself vs master's `allocator->updateAvailable` invocation.
>       As such, we first try to satisfy the request only with the offered resources.
>     * We greedily rescind one offer at a time until we've rescinded sufficiently many offers.
>       IMPORTANT: We perform `recoverResources(..., Filters())` rather than `recoverResources(..., None())` so that we can pretty much always win the race against `allocate`.
>                  In the case that we lose, no disaster occurs. We simply fail to satisfy the request.
>     * If we still don't have enough resources after resciding all offers, be optimistic and forward the request to the allocator since there may be available resources to satisfy the request.
>     * If the allocator returns a failure, report the error to the user with `PreconditionFailed`. This could be updated to be `Forbidden`, or `Conflict` maybe as well. We'll pick one eventually.
> 
> This approach is clearly not ideal, since we would prefer to rescind as little offers as possible.
> The challenges of implementing the ideal solution in the current state is described in the document above.
> 
> TODO(mpark): Add more comments and test cases.
> 
> 
> Diffs
> -----
> 
>   src/master/http.cpp 3a1598fad4db03e5f62fd4a6bd26b2bedeee4070 
>   src/master/master.hpp 827d0d599912b2936beb9615610f627f6c9a2d43 
>   src/master/master.cpp 5b5e3c37d4433c8524db267866aebc0a35a181f1 
>   src/master/validation.hpp 469d6f56c3de28a34177124aae81ce24cb4ad160 
>   src/master/validation.cpp 9d128aa1b349b018b8e4a1916434d848761ca051 
> 
> Diff: https://reviews.apache.org/r/35702/diff/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Michael Park
> 
>


Re: Review Request 35702: Added /reserve HTTP endpoint to the master.

Posted by Alexander Rukletsov <ru...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/#review91889
-----------------------------------------------------------



src/master/http.cpp (line 447)
<https://reviews.apache.org/r/35702/#comment145598>

    Not directly related to endpoints, but to dynamic reservations in general. Do you think it makes sense to bookkeep dynamic reservation or have an aggregating method in `mesos::internal::master::Role`?


- Alexander Rukletsov


On June 28, 2015, 8:36 a.m., Michael Park wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35702/
> -----------------------------------------------------------
> 
> (Updated June 28, 2015, 8:36 a.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.
> 
> 
> Bugs: MESOS-2600
>     https://issues.apache.org/jira/browse/MESOS-2600
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This involved a lot more challenges than I anticipated, I've captured the various approaches and limitations and deal-breakers of those approaches here: [Master Endpoint Implementation Challenges](https://docs.google.com/document/d/1cwVz4aKiCYP9Y4MOwHYZkyaiuEv7fArCye-vPvB2lAI/edit#)
> 
> Key points:
> 
> * This is a stop-gap solution until we shift the offer creation/management logic from the master to the allocator.
> * `updateAvailable` and `updateSlave` are kept separate because
>   (1) `updateAvailable` is allowed to fail whereas `updateSlave` must not.
>   (2) `updateAvailable` returns a `Future` whereas `updateSlave` does not.
>   (3) `updateAvailable` never leaves the allocator in an over-allocated state and must not, whereas `updateSlave` does, and can.
> * The algorithm:
>     * Initially, the master pessimistically assume that what seems like "available" resources will be gone.
>       This is due to the race between the allocator scheduling an `allocate` call to itself vs master's `allocator->updateAvailable` invocation.
>       As such, we first try to satisfy the request only with the offered resources.
>     * We greedily rescind one offer at a time until we've rescinded sufficiently many offers.
>       IMPORTANT: We perform `recoverResources(..., Filters())` rather than `recoverResources(..., None())` so that we can pretty much always win the race against `allocate`.
>                  In the case that we lose, no disaster occurs. We simply fail to satisfy the request.
>     * If we still don't have enough resources after resciding all offers, be optimistic and forward the request to the allocator since there may be available resources to satisfy the request.
>     * If the allocator returns a failure, report the error to the user with `PreconditionFailed`. This could be updated to be `Forbidden`, or `Conflict` maybe as well. We'll pick one eventually.
> 
> This approach is clearly not ideal, since we would prefer to rescind as little offers as possible.
> The challenges of implementing the ideal solution in the current state is described in the document above.
> 
> TODO(mpark): Add more comments and test cases.
> 
> 
> Diffs
> -----
> 
>   src/master/http.cpp 350383362311cfbc830965e1155a8515f0dfb332 
>   src/master/master.hpp af83d3e82d2c161b3cc4583e78a8cbbd2f9a4064 
>   src/master/master.cpp 0782b543b451921d2240958c7ef612a9e30972df 
>   src/master/validation.hpp 469d6f56c3de28a34177124aae81ce24cb4ad160 
>   src/master/validation.cpp 9d128aa1b349b018b8e4a1916434d848761ca051 
> 
> Diff: https://reviews.apache.org/r/35702/diff/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Michael Park
> 
>


Re: Review Request 35702: Added /reserve HTTP endpoint to the master.

Posted by Michael Park <mc...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/
-----------------------------------------------------------

(Updated June 28, 2015, 8:36 a.m.)


Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.


Bugs: MESOS-2600
    https://issues.apache.org/jira/browse/MESOS-2600


Repository: mesos


Description
-------

This involved a lot more challenges than I anticipated, I've captured the various approaches and limitations and deal-breakers of those approaches here: [Master Endpoint Implementation Challenges](https://docs.google.com/document/d/1cwVz4aKiCYP9Y4MOwHYZkyaiuEv7fArCye-vPvB2lAI/edit#)

Key points:

* This is a stop-gap solution until we shift the offer creation/management logic from the master to the allocator.
* `updateAvailable` and `updateSlave` are kept separate because
  (1) `updateAvailable` is allowed to fail whereas `updateSlave` must not.
  (2) `updateAvailable` returns a `Future` whereas `updateSlave` does not.
  (3) `updateAvailable` never leaves the allocator in an over-allocated state and must not, whereas `updateSlave` does, and can.
* The algorithm:
    * Initially, the master pessimistically assume that what seems like "available" resources will be gone.
      This is due to the race between the allocator scheduling an `allocate` call to itself vs master's `allocator->updateAvailable` invocation.
      As such, we first try to satisfy the request only with the offered resources.
    * We greedily rescind one offer at a time until we've rescinded sufficiently many offers.
      IMPORTANT: We perform `recoverResources(..., Filters())` rather than `recoverResources(..., None())` so that we can pretty much always win the race against `allocate`.
                 In the case that we lose, no disaster occurs. We simply fail to satisfy the request.
    * If we still don't have enough resources after resciding all offers, be optimistic and forward the request to the allocator since there may be available resources to satisfy the request.
    * If the allocator returns a failure, report the error to the user with `PreconditionFailed`. This could be updated to be `Forbidden`, or `Conflict` maybe as well. We'll pick one eventually.

This approach is clearly not ideal, since we would prefer to rescind as little offers as possible.
The challenges of implementing the ideal solution in the current state is described in the document above.

TODO(mpark): Add more comments and test cases.


Diffs
-----

  src/master/http.cpp 350383362311cfbc830965e1155a8515f0dfb332 
  src/master/master.hpp af83d3e82d2c161b3cc4583e78a8cbbd2f9a4064 
  src/master/master.cpp 0782b543b451921d2240958c7ef612a9e30972df 
  src/master/validation.hpp 469d6f56c3de28a34177124aae81ce24cb4ad160 
  src/master/validation.cpp 9d128aa1b349b018b8e4a1916434d848761ca051 

Diff: https://reviews.apache.org/r/35702/diff/


Testing (updated)
-------

`make check`


Thanks,

Michael Park


Re: Review Request 35702: Added /reserve HTTP endpoint to the master.

Posted by Michael Park <mc...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/
-----------------------------------------------------------

(Updated June 28, 2015, 8:35 a.m.)


Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.


Changes
-------

Split tests out.


Bugs: MESOS-2600
    https://issues.apache.org/jira/browse/MESOS-2600


Repository: mesos


Description
-------

This involved a lot more challenges than I anticipated, I've captured the various approaches and limitations and deal-breakers of those approaches here: [Master Endpoint Implementation Challenges](https://docs.google.com/document/d/1cwVz4aKiCYP9Y4MOwHYZkyaiuEv7fArCye-vPvB2lAI/edit#)

Key points:

* This is a stop-gap solution until we shift the offer creation/management logic from the master to the allocator.
* `updateAvailable` and `updateSlave` are kept separate because
  (1) `updateAvailable` is allowed to fail whereas `updateSlave` must not.
  (2) `updateAvailable` returns a `Future` whereas `updateSlave` does not.
  (3) `updateAvailable` never leaves the allocator in an over-allocated state and must not, whereas `updateSlave` does, and can.
* The algorithm:
    * Initially, the master pessimistically assume that what seems like "available" resources will be gone.
      This is due to the race between the allocator scheduling an `allocate` call to itself vs master's `allocator->updateAvailable` invocation.
      As such, we first try to satisfy the request only with the offered resources.
    * We greedily rescind one offer at a time until we've rescinded sufficiently many offers.
      IMPORTANT: We perform `recoverResources(..., Filters())` rather than `recoverResources(..., None())` so that we can pretty much always win the race against `allocate`.
                 In the case that we lose, no disaster occurs. We simply fail to satisfy the request.
    * If we still don't have enough resources after resciding all offers, be optimistic and forward the request to the allocator since there may be available resources to satisfy the request.
    * If the allocator returns a failure, report the error to the user with `PreconditionFailed`. This could be updated to be `Forbidden`, or `Conflict` maybe as well. We'll pick one eventually.

This approach is clearly not ideal, since we would prefer to rescind as little offers as possible.
The challenges of implementing the ideal solution in the current state is described in the document above.

TODO(mpark): Add more comments and test cases.


Diffs (updated)
-----

  src/master/http.cpp 350383362311cfbc830965e1155a8515f0dfb332 
  src/master/master.hpp af83d3e82d2c161b3cc4583e78a8cbbd2f9a4064 
  src/master/master.cpp 0782b543b451921d2240958c7ef612a9e30972df 
  src/master/validation.hpp 469d6f56c3de28a34177124aae81ce24cb4ad160 
  src/master/validation.cpp 9d128aa1b349b018b8e4a1916434d848761ca051 

Diff: https://reviews.apache.org/r/35702/diff/


Testing
-------

(1) Added `src/tests/reserve_tests.cpp`.
(2) `make check`


Thanks,

Michael Park


Re: Review Request 35702: Added /reserve HTTP endpoint to the master.

Posted by Michael Park <mc...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/
-----------------------------------------------------------

(Updated June 27, 2015, 2:23 a.m.)


Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.


Bugs: MESOS-2600
    https://issues.apache.org/jira/browse/MESOS-2600


Repository: mesos


Description
-------

This involved a lot more challenges than I anticipated, I've captured the various approaches and limitations and deal-breakers of those approaches here: [Master Endpoint Implementation Challenges](https://docs.google.com/document/d/1cwVz4aKiCYP9Y4MOwHYZkyaiuEv7fArCye-vPvB2lAI/edit#)

Key points:

* This is a stop-gap solution until we shift the offer creation/management logic from the master to the allocator.
* `updateAvailable` and `updateSlave` are kept separate because
  (1) `updateAvailable` is allowed to fail whereas `updateSlave` must not.
  (2) `updateAvailable` returns a `Future` whereas `updateSlave` does not.
  (3) `updateAvailable` never leaves the allocator in an over-allocated state and must not, whereas `updateSlave` does, and can.
* The algorithm:
    * Initially, the master pessimistically assume that what seems like "available" resources will be gone.
      This is due to the race between the allocator scheduling an `allocate` call to itself vs master's `allocator->updateAvailable` invocation.
      As such, we first try to satisfy the request only with the offered resources.
    * We greedily rescind one offer at a time until we've rescinded sufficiently many offers.
      IMPORTANT: We perform `recoverResources(..., Filters())` rather than `recoverResources(..., None())` so that we can pretty much always win the race against `allocate`.
                 In the case that we lose, no disaster occurs. We simply fail to satisfy the request.
    * If we still don't have enough resources after resciding all offers, be optimistic and forward the request to the allocator since there may be available resources to satisfy the request.
    * If the allocator returns a failure, report the error to the user with `PreconditionFailed`. This could be updated to be `Forbidden`, or `Conflict` maybe as well. We'll pick one eventually.

This approach is clearly not ideal, since we would prefer to rescind as little offers as possible.
The challenges of implementing the ideal solution in the current state is described in the document above.

TODO(mpark): Add more comments and test cases.


Diffs
-----

  src/Makefile.am a064d17a6b62e6e3c8e190135bcc8cbbb0051ed5 
  src/master/http.cpp 350383362311cfbc830965e1155a8515f0dfb332 
  src/master/master.hpp af83d3e82d2c161b3cc4583e78a8cbbd2f9a4064 
  src/master/master.cpp 0782b543b451921d2240958c7ef612a9e30972df 
  src/master/validation.hpp 469d6f56c3de28a34177124aae81ce24cb4ad160 
  src/master/validation.cpp 9d128aa1b349b018b8e4a1916434d848761ca051 
  src/tests/reserve_tests.cpp PRE-CREATION 

Diff: https://reviews.apache.org/r/35702/diff/


Testing (updated)
-------

(1) Added `src/tests/reserve_tests.cpp`.
(2) `make check`


Thanks,

Michael Park


Re: Review Request 35702: Added /reserve HTTP endpoint to the master.

Posted by Michael Park <mc...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/
-----------------------------------------------------------

(Updated June 26, 2015, 10:56 p.m.)


Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.


Changes
-------

Ready for review.


Summary (updated)
-----------------

Added /reserve HTTP endpoint to the master.


Bugs: MESOS-2600
    https://issues.apache.org/jira/browse/MESOS-2600


Repository: mesos


Description
-------

This involved a lot more challenges than I anticipated, I've captured the various approaches and limitations and deal-breakers of those approaches here: [Master Endpoint Implementation Challenges](https://docs.google.com/document/d/1cwVz4aKiCYP9Y4MOwHYZkyaiuEv7fArCye-vPvB2lAI/edit#)

Key points:

* This is a stop-gap solution until we shift the offer creation/management logic from the master to the allocator.
* `updateAvailable` and `updateSlave` are kept separate because
  (1) `updateAvailable` is allowed to fail whereas `updateSlave` must not.
  (2) `updateAvailable` returns a `Future` whereas `updateSlave` does not.
  (3) `updateAvailable` never leaves the allocator in an over-allocated state and must not, whereas `updateSlave` does, and can.
* The algorithm:
    * Initially, the master pessimistically assume that what seems like "available" resources will be gone.
      This is due to the race between the allocator scheduling an `allocate` call to itself vs master's `allocator->updateAvailable` invocation.
      As such, we first try to satisfy the request only with the offered resources.
    * We greedily rescind one offer at a time until we've rescinded sufficiently many offers.
      IMPORTANT: We perform `recoverResources(..., Filters())` rather than `recoverResources(..., None())` so that we can pretty much always win the race against `allocate`.
                 In the case that we lose, no disaster occurs. We simply fail to satisfy the request.
    * If we still don't have enough resources after resciding all offers, be optimistic and forward the request to the allocator since there may be available resources to satisfy the request.
    * If the allocator returns a failure, report the error to the user with `PreconditionFailed`. This could be updated to be `Forbidden`, or `Conflict` maybe as well. We'll pick one eventually.

This approach is clearly not ideal, since we would prefer to rescind as little offers as possible.
The challenges of implementing the ideal solution in the current state is described in the document above.

TODO(mpark): Add more comments and test cases.


Diffs
-----

  src/Makefile.am a064d17a6b62e6e3c8e190135bcc8cbbb0051ed5 
  src/master/http.cpp 350383362311cfbc830965e1155a8515f0dfb332 
  src/master/master.hpp af83d3e82d2c161b3cc4583e78a8cbbd2f9a4064 
  src/master/master.cpp 0782b543b451921d2240958c7ef612a9e30972df 
  src/master/validation.hpp 469d6f56c3de28a34177124aae81ce24cb4ad160 
  src/master/validation.cpp 9d128aa1b349b018b8e4a1916434d848761ca051 
  src/tests/reserve_tests.cpp PRE-CREATION 

Diff: https://reviews.apache.org/r/35702/diff/


Testing
-------

Added `src/tests/reserve_tests.cpp`.


Thanks,

Michael Park


Re: Review Request 35702: [WIP] Added /reserve HTTP endpoint to the master.

Posted by Michael Park <mc...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/
-----------------------------------------------------------

(Updated June 26, 2015, 10:55 p.m.)


Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.


Changes
-------

Finished TODO.


Bugs: MESOS-2600
    https://issues.apache.org/jira/browse/MESOS-2600


Repository: mesos


Description (updated)
-------

This involved a lot more challenges than I anticipated, I've captured the various approaches and limitations and deal-breakers of those approaches here: [Master Endpoint Implementation Challenges](https://docs.google.com/document/d/1cwVz4aKiCYP9Y4MOwHYZkyaiuEv7fArCye-vPvB2lAI/edit#)

Key points:

* This is a stop-gap solution until we shift the offer creation/management logic from the master to the allocator.
* `updateAvailable` and `updateSlave` are kept separate because
  (1) `updateAvailable` is allowed to fail whereas `updateSlave` must not.
  (2) `updateAvailable` returns a `Future` whereas `updateSlave` does not.
  (3) `updateAvailable` never leaves the allocator in an over-allocated state and must not, whereas `updateSlave` does, and can.
* The algorithm:
    * Initially, the master pessimistically assume that what seems like "available" resources will be gone.
      This is due to the race between the allocator scheduling an `allocate` call to itself vs master's `allocator->updateAvailable` invocation.
      As such, we first try to satisfy the request only with the offered resources.
    * We greedily rescind one offer at a time until we've rescinded sufficiently many offers.
      IMPORTANT: We perform `recoverResources(..., Filters())` rather than `recoverResources(..., None())` so that we can pretty much always win the race against `allocate`.
                 In the case that we lose, no disaster occurs. We simply fail to satisfy the request.
    * If we still don't have enough resources after resciding all offers, be optimistic and forward the request to the allocator since there may be available resources to satisfy the request.
    * If the allocator returns a failure, report the error to the user with `PreconditionFailed`. This could be updated to be `Forbidden`, or `Conflict` maybe as well. We'll pick one eventually.

This approach is clearly not ideal, since we would prefer to rescind as little offers as possible.
The challenges of implementing the ideal solution in the current state is described in the document above.

TODO(mpark): Add more comments and test cases.


Diffs
-----

  src/Makefile.am a064d17a6b62e6e3c8e190135bcc8cbbb0051ed5 
  src/master/http.cpp 350383362311cfbc830965e1155a8515f0dfb332 
  src/master/master.hpp af83d3e82d2c161b3cc4583e78a8cbbd2f9a4064 
  src/master/master.cpp 0782b543b451921d2240958c7ef612a9e30972df 
  src/master/validation.hpp 469d6f56c3de28a34177124aae81ce24cb4ad160 
  src/master/validation.cpp 9d128aa1b349b018b8e4a1916434d848761ca051 
  src/tests/reserve_tests.cpp PRE-CREATION 

Diff: https://reviews.apache.org/r/35702/diff/


Testing
-------

Added `src/tests/reserve_tests.cpp`.


Thanks,

Michael Park


Re: Review Request 35702: [WIP] Added /reserve HTTP endpoint to the master.

Posted by Michael Park <mc...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/
-----------------------------------------------------------

(Updated June 26, 2015, 10:55 p.m.)


Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.


Changes
-------

Split out the allocator changes.


Bugs: MESOS-2600
    https://issues.apache.org/jira/browse/MESOS-2600


Repository: mesos


Description
-------

This involved a lot more challenges than I anticipated, I've captured the various approaches and limitations and deal-breakers of those approaches here: [Master Endpoint Implementation Challenges](https://docs.google.com/document/d/1cwVz4aKiCYP9Y4MOwHYZkyaiuEv7fArCye-vPvB2lAI/edit#)

Key points:

* This is a stop-gap solution until we shift the offer creation/management logic from the master to the allocator.
* `updateAvailable` and `updateSlave` are kept separate because
  (1) `updateAvailable` is allowed to fail whereas `updateSlave` must not.
  (2) `updateAvailable` returns a `Future` whereas `updateSlave` does not.
  (3) `updateAvailable` never leaves the allocator in an over-allocated state and must not, whereas `updateSlave` does, and can.
* The algorithm:
    * Initially, the master pessimistically assume that what seems like "available" resources will be gone.
      This is due to the race between the allocator scheduling an `allocate` call to itself vs master's `allocator->updateAvailable` invocation.
      As such, we first try to satisfy the request only with the offered resources.
    * We greedily rescind one offer at a time until we've rescinded sufficiently many offers.
      IMPORTANT: We perform `recoverResources(..., Filters())` rather than `recoverResources(..., None())` so that we can pretty much always win the race against `allocate`.
                 In the case that we lose, no disaster occurs. We simply fail to satisfy the request.
    * If we still don't have enough resources after resciding all offers, be optimistic and forward the request to the allocator since there may be available resources to satisfy the request.
    * If the allocator returns a failure, report the error to the user with `PreconditionFailed`. This could be updated to be `Forbidden`, or `Conflict` maybe as well. We'll pick one eventually.

This approach is clearly not ideal, since we would prefer to rescind as little offers as possible.
The challenges of implementing the ideal solution in the current state is described in the document above.

TODO(mpark): Add more comments and test cases.
TODO(mpark): Return a `Future<Nothing>` rather than `Future<Try<Nothing>>` and use `Future::repair` to propagate the failure state.


Diffs (updated)
-----

  src/Makefile.am a064d17a6b62e6e3c8e190135bcc8cbbb0051ed5 
  src/master/http.cpp 350383362311cfbc830965e1155a8515f0dfb332 
  src/master/master.hpp af83d3e82d2c161b3cc4583e78a8cbbd2f9a4064 
  src/master/master.cpp 0782b543b451921d2240958c7ef612a9e30972df 
  src/master/validation.hpp 469d6f56c3de28a34177124aae81ce24cb4ad160 
  src/master/validation.cpp 9d128aa1b349b018b8e4a1916434d848761ca051 
  src/tests/reserve_tests.cpp PRE-CREATION 

Diff: https://reviews.apache.org/r/35702/diff/


Testing
-------

Added `src/tests/reserve_tests.cpp`.


Thanks,

Michael Park


Re: Review Request 35702: [WIP] Added /reserve HTTP endpoint to the master.

Posted by Mesos ReviewBot <re...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/#review89471
-----------------------------------------------------------


Patch looks great!

Reviews applied: [35714, 35702]

All tests passed.

- Mesos ReviewBot


On June 26, 2015, 4:44 a.m., Michael Park wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35702/
> -----------------------------------------------------------
> 
> (Updated June 26, 2015, 4:44 a.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.
> 
> 
> Bugs: MESOS-2600
>     https://issues.apache.org/jira/browse/MESOS-2600
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This involved a lot more challenges than I anticipated, I've captured the various approaches and limitations and deal-breakers of those approaches here: [Master Endpoint Implementation Challenges](https://docs.google.com/document/d/1cwVz4aKiCYP9Y4MOwHYZkyaiuEv7fArCye-vPvB2lAI/edit#)
> 
> Key points:
> 
> * This is a stop-gap solution until we shift the offer creation/management logic from the master to the allocator.
> * `updateAvailable` and `updateSlave` are kept separate because
>   (1) `updateAvailable` is allowed to fail whereas `updateSlave` must not.
>   (2) `updateAvailable` returns a `Future` whereas `updateSlave` does not.
>   (3) `updateAvailable` never leaves the allocator in an over-allocated state and must not, whereas `updateSlave` does, and can.
> * The algorithm:
>     * Initially, the master pessimistically assume that what seems like "available" resources will be gone.
>       This is due to the race between the allocator scheduling an `allocate` call to itself vs master's `allocator->updateAvailable` invocation.
>       As such, we first try to satisfy the request only with the offered resources.
>     * We greedily rescind one offer at a time until we've rescinded sufficiently many offers.
>       IMPORTANT: We perform `recoverResources(..., Filters())` rather than `recoverResources(..., None())` so that we can pretty much always win the race against `allocate`.
>                  In the case that we lose, no disaster occurs. We simply fail to satisfy the request.
>     * If we still don't have enough resources after resciding all offers, be optimistic and forward the request to the allocator since there may be available resources to satisfy the request.
>     * If the allocator returns a failure, report the error to the user with `PreconditionFailed`. This could be updated to be `Forbidden`, or `Conflict` maybe as well. We'll pick one eventually.
> 
> This approach is clearly not ideal, since we would prefer to rescind as little offers as possible.
> The challenges of implementing the ideal solution in the current state is described in the document above.
> 
> TODO(mpark): Add more comments and test cases.
> TODO(mpark): Return a `Future<Nothing>` rather than `Future<Try<Nothing>>` and use `Future::repair` to propagate the failure state.
> 
> 
> Diffs
> -----
> 
>   include/mesos/master/allocator.hpp 22992c0c77058af4fcd28aa8e4a1191693a16f44 
>   src/Makefile.am a064d17a6b62e6e3c8e190135bcc8cbbb0051ed5 
>   src/master/allocator/mesos/allocator.hpp 72470ec7f56f84a9a9815c09adb88def90ef672f 
>   src/master/allocator/mesos/hierarchical.hpp 3264d145d52b48852878abf7ab9be29ab98208cc 
>   src/master/http.cpp 350383362311cfbc830965e1155a8515f0dfb332 
>   src/master/master.hpp af83d3e82d2c161b3cc4583e78a8cbbd2f9a4064 
>   src/master/master.cpp 0782b543b451921d2240958c7ef612a9e30972df 
>   src/master/validation.hpp 469d6f56c3de28a34177124aae81ce24cb4ad160 
>   src/master/validation.cpp 9d128aa1b349b018b8e4a1916434d848761ca051 
>   src/tests/mesos.hpp 9157ac079808d2686592e54ea26a26e6a0825ed3 
>   src/tests/reserve_tests.cpp PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/35702/diff/
> 
> 
> Testing
> -------
> 
> Added `src/tests/reserve_tests.cpp`.
> 
> 
> Thanks,
> 
> Michael Park
> 
>


Re: Review Request 35702: [WIP] Added /reserve HTTP endpoint to the master.

Posted by Michael Park <mc...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/
-----------------------------------------------------------

(Updated June 26, 2015, 4:44 a.m.)


Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.


Bugs: MESOS-2600
    https://issues.apache.org/jira/browse/MESOS-2600


Repository: mesos


Description (updated)
-------

This involved a lot more challenges than I anticipated, I've captured the various approaches and limitations and deal-breakers of those approaches here: [Master Endpoint Implementation Challenges](https://docs.google.com/document/d/1cwVz4aKiCYP9Y4MOwHYZkyaiuEv7fArCye-vPvB2lAI/edit#)

Key points:

* This is a stop-gap solution until we shift the offer creation/management logic from the master to the allocator.
* `updateAvailable` and `updateSlave` are kept separate because
  (1) `updateAvailable` is allowed to fail whereas `updateSlave` must not.
  (2) `updateAvailable` returns a `Future` whereas `updateSlave` does not.
  (3) `updateAvailable` never leaves the allocator in an over-allocated state and must not, whereas `updateSlave` does, and can.
* The algorithm:
    * Initially, the master pessimistically assume that what seems like "available" resources will be gone.
      This is due to the race between the allocator scheduling an `allocate` call to itself vs master's `allocator->updateAvailable` invocation.
      As such, we first try to satisfy the request only with the offered resources.
    * We greedily rescind one offer at a time until we've rescinded sufficiently many offers.
      IMPORTANT: We perform `recoverResources(..., Filters())` rather than `recoverResources(..., None())` so that we can pretty much always win the race against `allocate`.
                 In the case that we lose, no disaster occurs. We simply fail to satisfy the request.
    * If we still don't have enough resources after resciding all offers, be optimistic and forward the request to the allocator since there may be available resources to satisfy the request.
    * If the allocator returns a failure, report the error to the user with `PreconditionFailed`. This could be updated to be `Forbidden`, or `Conflict` maybe as well. We'll pick one eventually.

This approach is clearly not ideal, since we would prefer to rescind as little offers as possible.
The challenges of implementing the ideal solution in the current state is described in the document above.

TODO(mpark): Add more comments and test cases.
TODO(mpark): Return a `Future<Nothing>` rather than `Future<Try<Nothing>>` and use `Future::repair` to propagate the failure state.


Diffs (updated)
-----

  include/mesos/master/allocator.hpp 22992c0c77058af4fcd28aa8e4a1191693a16f44 
  src/Makefile.am a064d17a6b62e6e3c8e190135bcc8cbbb0051ed5 
  src/master/allocator/mesos/allocator.hpp 72470ec7f56f84a9a9815c09adb88def90ef672f 
  src/master/allocator/mesos/hierarchical.hpp 3264d145d52b48852878abf7ab9be29ab98208cc 
  src/master/http.cpp 350383362311cfbc830965e1155a8515f0dfb332 
  src/master/master.hpp af83d3e82d2c161b3cc4583e78a8cbbd2f9a4064 
  src/master/master.cpp 0782b543b451921d2240958c7ef612a9e30972df 
  src/master/validation.hpp 469d6f56c3de28a34177124aae81ce24cb4ad160 
  src/master/validation.cpp 9d128aa1b349b018b8e4a1916434d848761ca051 
  src/tests/mesos.hpp 9157ac079808d2686592e54ea26a26e6a0825ed3 
  src/tests/reserve_tests.cpp PRE-CREATION 

Diff: https://reviews.apache.org/r/35702/diff/


Testing
-------

Added `src/tests/reserve_tests.cpp`.


Thanks,

Michael Park


Re: Review Request 35702: [WIP] Added /reserve HTTP endpoint to the master.

Posted by Michael Park <mc...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/
-----------------------------------------------------------

(Updated June 24, 2015, 3:39 p.m.)


Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.


Bugs: MESOS-2600
    https://issues.apache.org/jira/browse/MESOS-2600


Repository: mesos


Description (updated)
-------

Work currently in progress, incorporating feedback. Please hold off on reviewing this patch for now.

(1) Q: I've added `updateAvailable` to the allocator API. Is this necessary or is there maybe a better way?
    A: I think we shold be able to generalize `updateSlave` to support not only `oversubscription`, but `reserve`, `unreserve`, `create` and `destroy`.
(2) Q: To determine the amount of available resources, I used: `available = total - (offered + used)`.
       Is this still correct with oversubscription in the picture?
    A: I think this decision needs to be made in the allocator.
(3) Q: I'm creating an `Offer.Operation` to perform the necessary updates in the allocator and such.
       Feels weird to use an "offer" operation when there's not an actual offer. Is this fine for now?
    A: I think we'll want to introduce something like `Resource.Operation`.


Diffs
-----

  include/mesos/master/allocator.hpp 92de1af414321281b00eaa6f129e5e3e2c849448 
  src/Makefile.am dfebd2b14c9cb45c437509809fdf5ac3b0c8838c 
  src/master/allocator/mesos/allocator.hpp 6cfa04650d91a80211cfbc0809236f9438926c78 
  src/master/allocator/mesos/hierarchical.hpp 7097482fc0adad1c177c15c35edd51c10754f89c 
  src/master/http.cpp b893013ddd052cb58c520ac0328f4a5f0fed862e 
  src/master/master.hpp af83d3e82d2c161b3cc4583e78a8cbbd2f9a4064 
  src/master/master.cpp 0135c155181546d3cb43e9e05bb874af846d928d 
  src/master/validation.hpp 469d6f56c3de28a34177124aae81ce24cb4ad160 
  src/master/validation.cpp 9d128aa1b349b018b8e4a1916434d848761ca051 
  src/tests/mesos.hpp 9157ac079808d2686592e54ea26a26e6a0825ed3 
  src/tests/reserve_tests.cpp PRE-CREATION 

Diff: https://reviews.apache.org/r/35702/diff/


Testing
-------

Added `src/tests/reserve_tests.cpp`.


Thanks,

Michael Park


Re: Review Request 35702: [WIP] Added /reserve HTTP endpoint to the master.

Posted by Mesos ReviewBot <re...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/#review88733
-----------------------------------------------------------


Patch looks great!

Reviews applied: [35714, 35702]

All tests passed.

- Mesos ReviewBot


On June 22, 2015, 5:29 a.m., Michael Park wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35702/
> -----------------------------------------------------------
> 
> (Updated June 22, 2015, 5:29 a.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.
> 
> 
> Bugs: MESOS-2600
>     https://issues.apache.org/jira/browse/MESOS-2600
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This is still a work in progress, but I'm sharing to gather some feedback around:
> 
> (1) I've added `updateAvailable` to the allocator API. Is this necessary or is there maybe a better way?
> (2) To determine the amount of available resources, I used: `available = total - (offered + used)`.
>     Is this still correct with oversubscription in the picture?
> (3) I'm creating an `Offer.Operation` to perform the necessary updates in the allocator and such.
>     Feels weird to use an "offer" operation when there's not an actual offer. Is this fine for now?
> 
> 
> Diffs
> -----
> 
>   include/mesos/master/allocator.hpp 92de1af414321281b00eaa6f129e5e3e2c849448 
>   src/Makefile.am dfebd2b14c9cb45c437509809fdf5ac3b0c8838c 
>   src/master/allocator/mesos/allocator.hpp 6cfa04650d91a80211cfbc0809236f9438926c78 
>   src/master/allocator/mesos/hierarchical.hpp 7097482fc0adad1c177c15c35edd51c10754f89c 
>   src/master/http.cpp b893013ddd052cb58c520ac0328f4a5f0fed862e 
>   src/master/master.hpp af83d3e82d2c161b3cc4583e78a8cbbd2f9a4064 
>   src/master/master.cpp 0135c155181546d3cb43e9e05bb874af846d928d 
>   src/master/validation.hpp 469d6f56c3de28a34177124aae81ce24cb4ad160 
>   src/master/validation.cpp 9d128aa1b349b018b8e4a1916434d848761ca051 
>   src/tests/mesos.hpp 9157ac079808d2686592e54ea26a26e6a0825ed3 
>   src/tests/reserve_tests.cpp PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/35702/diff/
> 
> 
> Testing
> -------
> 
> Added `src/tests/reserve_tests.cpp`.
> 
> 
> Thanks,
> 
> Michael Park
> 
>


Re: Review Request 35702: Added /reserve HTTP endpoint to the master.

Posted by Michael Park <mc...@gmail.com>.

> On June 22, 2015, 1:32 p.m., Alexander Rukletsov wrote:
> > src/master/master.cpp, line 749
> > <https://reviews.apache.org/r/35702/diff/6/?file=989449#file989449line749>
> >
> >     I think reserve is too abstract and may collide with future actions (think quota). How about `/dynamic/reserve`?
> 
> Alexander Rukletsov wrote:
>     Though we currently do not support slashes in endpoints, I think we should fix that first before introducing a `/reserve` endpoint, given these endpoint are not targeted for 0.23.
> 
> Joris Van Remoortere wrote:
>     Cody had some patches for enabling sub namespaces in endpoints (as in enabling slashes). Might be worth pulling those in.
> 
> Alexander Rukletsov wrote:
>     Yep, it's https://issues.apache.org/jira/browse/MESOS-2130, I plan to bring up the discussion today at the community sync.
> 
> Michael Park wrote:
>     The concensus for now seems that (1) we introduce the allocator changes, but address the allocator refactor sooner rather than later, (2) go with `/reserve` for now and update them once the HTTP API folks get to supporting the nested endpoint stuff.
> 
> Alexander Rukletsov wrote:
>     And (3) we update endpoints names before the following release, i.e. there is no Mesos release, where `/reserve` will exist. Correct?
> 
> Michael Park wrote:
>     That is the ideal outcome. But if we commit this now/soon, whether we can update the names before 0.24.0 gets out entirely depends on whether the nested endpoint names capabilities get committed on time.
> 
> Alexander Rukletsov wrote:
>     I think we agreed that we should, in order to avoid deprecation cycle for "old" endpoints.

I synced with BenH and Jie regarding this topic, they both suggested that we should get this in as is and update later. Jie suggested that if the HTTP API doesn't make it in in time, we can either not mention it in the `CHANGELOG` or mention it as an alpha feature that is subject to change.


- Michael


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/#review88781
-----------------------------------------------------------


On July 28, 2015, 9:03 p.m., Michael Park wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35702/
> -----------------------------------------------------------
> 
> (Updated July 28, 2015, 9:03 p.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.
> 
> 
> Bugs: MESOS-2600
>     https://issues.apache.org/jira/browse/MESOS-2600
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This involved a lot more challenges than I anticipated, I've captured the various approaches and limitations and deal-breakers of those approaches here: [Master Endpoint Implementation Challenges](https://docs.google.com/document/d/1cwVz4aKiCYP9Y4MOwHYZkyaiuEv7fArCye-vPvB2lAI/edit#)
> 
> Key points:
> 
> * This is a stop-gap solution until we shift the offer creation/management logic from the master to the allocator.
> * `updateAvailable` and `updateSlave` are kept separate because
>   (1) `updateAvailable` is allowed to fail whereas `updateSlave` must not.
>   (2) `updateAvailable` returns a `Future` whereas `updateSlave` does not.
>   (3) `updateAvailable` never leaves the allocator in an over-allocated state and must not, whereas `updateSlave` does, and can.
> * The algorithm:
>     * Initially, the master pessimistically assume that what seems like "available" resources will be gone.
>       This is due to the race between the allocator scheduling an `allocate` call to itself vs master's `allocator->updateAvailable` invocation.
>       As such, we first try to satisfy the request only with the offered resources.
>     * We greedily rescind one offer at a time until we've rescinded sufficiently many offers.
>       IMPORTANT: We perform `recoverResources(..., Filters())` rather than `recoverResources(..., None())` so that we can pretty much always win the race against `allocate`.
>                  In the case that we lose, no disaster occurs. We simply fail to satisfy the request.
>     * If we still don't have enough resources after resciding all offers, be optimistic and forward the request to the allocator since there may be available resources to satisfy the request.
>     * If the allocator returns a failure, report the error to the user with `PreconditionFailed`. This could be updated to be `Forbidden`, or `Conflict` maybe as well. We'll pick one eventually.
> 
> This approach is clearly not ideal, since we would prefer to rescind as little offers as possible.
> The challenges of implementing the ideal solution in the current state is described in the document above.
> 
> TODO(mpark): Add more comments and test cases.
> 
> 
> Diffs
> -----
> 
>   src/master/http.cpp 3a1598fad4db03e5f62fd4a6bd26b2bedeee4070 
>   src/master/master.hpp 827d0d599912b2936beb9615610f627f6c9a2d43 
>   src/master/master.cpp 5b5e3c37d4433c8524db267866aebc0a35a181f1 
>   src/master/validation.hpp 469d6f56c3de28a34177124aae81ce24cb4ad160 
>   src/master/validation.cpp 9d128aa1b349b018b8e4a1916434d848761ca051 
> 
> Diff: https://reviews.apache.org/r/35702/diff/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Michael Park
> 
>


Re: Review Request 35702: Added /reserve HTTP endpoint to the master.

Posted by Joris Van Remoortere <jo...@gmail.com>.

> On June 22, 2015, 1:32 p.m., Alexander Rukletsov wrote:
> > src/master/master.cpp, line 749
> > <https://reviews.apache.org/r/35702/diff/6/?file=989449#file989449line749>
> >
> >     I think reserve is too abstract and may collide with future actions (think quota). How about `/dynamic/reserve`?
> 
> Alexander Rukletsov wrote:
>     Though we currently do not support slashes in endpoints, I think we should fix that first before introducing a `/reserve` endpoint, given these endpoint are not targeted for 0.23.

Cody had some patches for enabling sub namespaces in endpoints (as in enabling slashes). Might be worth pulling those in.


- Joris


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/#review88781
-----------------------------------------------------------


On June 28, 2015, 8:36 a.m., Michael Park wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35702/
> -----------------------------------------------------------
> 
> (Updated June 28, 2015, 8:36 a.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.
> 
> 
> Bugs: MESOS-2600
>     https://issues.apache.org/jira/browse/MESOS-2600
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This involved a lot more challenges than I anticipated, I've captured the various approaches and limitations and deal-breakers of those approaches here: [Master Endpoint Implementation Challenges](https://docs.google.com/document/d/1cwVz4aKiCYP9Y4MOwHYZkyaiuEv7fArCye-vPvB2lAI/edit#)
> 
> Key points:
> 
> * This is a stop-gap solution until we shift the offer creation/management logic from the master to the allocator.
> * `updateAvailable` and `updateSlave` are kept separate because
>   (1) `updateAvailable` is allowed to fail whereas `updateSlave` must not.
>   (2) `updateAvailable` returns a `Future` whereas `updateSlave` does not.
>   (3) `updateAvailable` never leaves the allocator in an over-allocated state and must not, whereas `updateSlave` does, and can.
> * The algorithm:
>     * Initially, the master pessimistically assume that what seems like "available" resources will be gone.
>       This is due to the race between the allocator scheduling an `allocate` call to itself vs master's `allocator->updateAvailable` invocation.
>       As such, we first try to satisfy the request only with the offered resources.
>     * We greedily rescind one offer at a time until we've rescinded sufficiently many offers.
>       IMPORTANT: We perform `recoverResources(..., Filters())` rather than `recoverResources(..., None())` so that we can pretty much always win the race against `allocate`.
>                  In the case that we lose, no disaster occurs. We simply fail to satisfy the request.
>     * If we still don't have enough resources after resciding all offers, be optimistic and forward the request to the allocator since there may be available resources to satisfy the request.
>     * If the allocator returns a failure, report the error to the user with `PreconditionFailed`. This could be updated to be `Forbidden`, or `Conflict` maybe as well. We'll pick one eventually.
> 
> This approach is clearly not ideal, since we would prefer to rescind as little offers as possible.
> The challenges of implementing the ideal solution in the current state is described in the document above.
> 
> TODO(mpark): Add more comments and test cases.
> 
> 
> Diffs
> -----
> 
>   src/master/http.cpp 350383362311cfbc830965e1155a8515f0dfb332 
>   src/master/master.hpp af83d3e82d2c161b3cc4583e78a8cbbd2f9a4064 
>   src/master/master.cpp 0782b543b451921d2240958c7ef612a9e30972df 
>   src/master/validation.hpp 469d6f56c3de28a34177124aae81ce24cb4ad160 
>   src/master/validation.cpp 9d128aa1b349b018b8e4a1916434d848761ca051 
> 
> Diff: https://reviews.apache.org/r/35702/diff/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Michael Park
> 
>


Re: Review Request 35702: Added /reserve HTTP endpoint to the master.

Posted by Alexander Rukletsov <ru...@gmail.com>.

> On June 22, 2015, 1:32 p.m., Alexander Rukletsov wrote:
> > src/master/master.cpp, line 749
> > <https://reviews.apache.org/r/35702/diff/6/?file=989449#file989449line749>
> >
> >     I think reserve is too abstract and may collide with future actions (think quota). How about `/dynamic/reserve`?
> 
> Alexander Rukletsov wrote:
>     Though we currently do not support slashes in endpoints, I think we should fix that first before introducing a `/reserve` endpoint, given these endpoint are not targeted for 0.23.
> 
> Joris Van Remoortere wrote:
>     Cody had some patches for enabling sub namespaces in endpoints (as in enabling slashes). Might be worth pulling those in.
> 
> Alexander Rukletsov wrote:
>     Yep, it's https://issues.apache.org/jira/browse/MESOS-2130, I plan to bring up the discussion today at the community sync.
> 
> Michael Park wrote:
>     The concensus for now seems that (1) we introduce the allocator changes, but address the allocator refactor sooner rather than later, (2) go with `/reserve` for now and update them once the HTTP API folks get to supporting the nested endpoint stuff.
> 
> Alexander Rukletsov wrote:
>     And (3) we update endpoints names before the following release, i.e. there is no Mesos release, where `/reserve` will exist. Correct?
> 
> Michael Park wrote:
>     That is the ideal outcome. But if we commit this now/soon, whether we can update the names before 0.24.0 gets out entirely depends on whether the nested endpoint names capabilities get committed on time.

I think we agreed that we should, in order to avoid deprecation cycle for "old" endpoints.


- Alexander


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/#review88781
-----------------------------------------------------------


On July 28, 2015, 9:03 p.m., Michael Park wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35702/
> -----------------------------------------------------------
> 
> (Updated July 28, 2015, 9:03 p.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.
> 
> 
> Bugs: MESOS-2600
>     https://issues.apache.org/jira/browse/MESOS-2600
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This involved a lot more challenges than I anticipated, I've captured the various approaches and limitations and deal-breakers of those approaches here: [Master Endpoint Implementation Challenges](https://docs.google.com/document/d/1cwVz4aKiCYP9Y4MOwHYZkyaiuEv7fArCye-vPvB2lAI/edit#)
> 
> Key points:
> 
> * This is a stop-gap solution until we shift the offer creation/management logic from the master to the allocator.
> * `updateAvailable` and `updateSlave` are kept separate because
>   (1) `updateAvailable` is allowed to fail whereas `updateSlave` must not.
>   (2) `updateAvailable` returns a `Future` whereas `updateSlave` does not.
>   (3) `updateAvailable` never leaves the allocator in an over-allocated state and must not, whereas `updateSlave` does, and can.
> * The algorithm:
>     * Initially, the master pessimistically assume that what seems like "available" resources will be gone.
>       This is due to the race between the allocator scheduling an `allocate` call to itself vs master's `allocator->updateAvailable` invocation.
>       As such, we first try to satisfy the request only with the offered resources.
>     * We greedily rescind one offer at a time until we've rescinded sufficiently many offers.
>       IMPORTANT: We perform `recoverResources(..., Filters())` rather than `recoverResources(..., None())` so that we can pretty much always win the race against `allocate`.
>                  In the case that we lose, no disaster occurs. We simply fail to satisfy the request.
>     * If we still don't have enough resources after resciding all offers, be optimistic and forward the request to the allocator since there may be available resources to satisfy the request.
>     * If the allocator returns a failure, report the error to the user with `PreconditionFailed`. This could be updated to be `Forbidden`, or `Conflict` maybe as well. We'll pick one eventually.
> 
> This approach is clearly not ideal, since we would prefer to rescind as little offers as possible.
> The challenges of implementing the ideal solution in the current state is described in the document above.
> 
> TODO(mpark): Add more comments and test cases.
> 
> 
> Diffs
> -----
> 
>   src/master/http.cpp 3a1598fad4db03e5f62fd4a6bd26b2bedeee4070 
>   src/master/master.hpp 827d0d599912b2936beb9615610f627f6c9a2d43 
>   src/master/master.cpp 5b5e3c37d4433c8524db267866aebc0a35a181f1 
>   src/master/validation.hpp 469d6f56c3de28a34177124aae81ce24cb4ad160 
>   src/master/validation.cpp 9d128aa1b349b018b8e4a1916434d848761ca051 
> 
> Diff: https://reviews.apache.org/r/35702/diff/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Michael Park
> 
>


Re: Review Request 35702: Added /reserve HTTP endpoint to the master.

Posted by Michael Park <mc...@gmail.com>.

> On June 22, 2015, 1:32 p.m., Alexander Rukletsov wrote:
> > src/master/master.cpp, line 749
> > <https://reviews.apache.org/r/35702/diff/6/?file=989449#file989449line749>
> >
> >     I think reserve is too abstract and may collide with future actions (think quota). How about `/dynamic/reserve`?
> 
> Alexander Rukletsov wrote:
>     Though we currently do not support slashes in endpoints, I think we should fix that first before introducing a `/reserve` endpoint, given these endpoint are not targeted for 0.23.
> 
> Joris Van Remoortere wrote:
>     Cody had some patches for enabling sub namespaces in endpoints (as in enabling slashes). Might be worth pulling those in.
> 
> Alexander Rukletsov wrote:
>     Yep, it's https://issues.apache.org/jira/browse/MESOS-2130, I plan to bring up the discussion today at the community sync.
> 
> Michael Park wrote:
>     The concensus for now seems that (1) we introduce the allocator changes, but address the allocator refactor sooner rather than later, (2) go with `/reserve` for now and update them once the HTTP API folks get to supporting the nested endpoint stuff.
> 
> Alexander Rukletsov wrote:
>     And (3) we update endpoints names before the following release, i.e. there is no Mesos release, where `/reserve` will exist. Correct?

That is the ideal outcome. But if we commit this now/soon, whether we can update the names before 0.24.0 gets out entirely depends on whether the nested endpoint names capabilities get committed on time.


- Michael


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/#review88781
-----------------------------------------------------------


On July 28, 2015, 9:03 p.m., Michael Park wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35702/
> -----------------------------------------------------------
> 
> (Updated July 28, 2015, 9:03 p.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.
> 
> 
> Bugs: MESOS-2600
>     https://issues.apache.org/jira/browse/MESOS-2600
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This involved a lot more challenges than I anticipated, I've captured the various approaches and limitations and deal-breakers of those approaches here: [Master Endpoint Implementation Challenges](https://docs.google.com/document/d/1cwVz4aKiCYP9Y4MOwHYZkyaiuEv7fArCye-vPvB2lAI/edit#)
> 
> Key points:
> 
> * This is a stop-gap solution until we shift the offer creation/management logic from the master to the allocator.
> * `updateAvailable` and `updateSlave` are kept separate because
>   (1) `updateAvailable` is allowed to fail whereas `updateSlave` must not.
>   (2) `updateAvailable` returns a `Future` whereas `updateSlave` does not.
>   (3) `updateAvailable` never leaves the allocator in an over-allocated state and must not, whereas `updateSlave` does, and can.
> * The algorithm:
>     * Initially, the master pessimistically assume that what seems like "available" resources will be gone.
>       This is due to the race between the allocator scheduling an `allocate` call to itself vs master's `allocator->updateAvailable` invocation.
>       As such, we first try to satisfy the request only with the offered resources.
>     * We greedily rescind one offer at a time until we've rescinded sufficiently many offers.
>       IMPORTANT: We perform `recoverResources(..., Filters())` rather than `recoverResources(..., None())` so that we can pretty much always win the race against `allocate`.
>                  In the case that we lose, no disaster occurs. We simply fail to satisfy the request.
>     * If we still don't have enough resources after resciding all offers, be optimistic and forward the request to the allocator since there may be available resources to satisfy the request.
>     * If the allocator returns a failure, report the error to the user with `PreconditionFailed`. This could be updated to be `Forbidden`, or `Conflict` maybe as well. We'll pick one eventually.
> 
> This approach is clearly not ideal, since we would prefer to rescind as little offers as possible.
> The challenges of implementing the ideal solution in the current state is described in the document above.
> 
> TODO(mpark): Add more comments and test cases.
> 
> 
> Diffs
> -----
> 
>   src/master/http.cpp 3a1598fad4db03e5f62fd4a6bd26b2bedeee4070 
>   src/master/master.hpp 827d0d599912b2936beb9615610f627f6c9a2d43 
>   src/master/master.cpp 5b5e3c37d4433c8524db267866aebc0a35a181f1 
>   src/master/validation.hpp 469d6f56c3de28a34177124aae81ce24cb4ad160 
>   src/master/validation.cpp 9d128aa1b349b018b8e4a1916434d848761ca051 
> 
> Diff: https://reviews.apache.org/r/35702/diff/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Michael Park
> 
>


Re: Review Request 35702: Added /reserve HTTP endpoint to the master.

Posted by Michael Park <mc...@gmail.com>.

> On June 22, 2015, 1:32 p.m., Alexander Rukletsov wrote:
> > src/master/master.cpp, line 749
> > <https://reviews.apache.org/r/35702/diff/6/?file=989449#file989449line749>
> >
> >     I think reserve is too abstract and may collide with future actions (think quota). How about `/dynamic/reserve`?
> 
> Alexander Rukletsov wrote:
>     Though we currently do not support slashes in endpoints, I think we should fix that first before introducing a `/reserve` endpoint, given these endpoint are not targeted for 0.23.
> 
> Joris Van Remoortere wrote:
>     Cody had some patches for enabling sub namespaces in endpoints (as in enabling slashes). Might be worth pulling those in.
> 
> Alexander Rukletsov wrote:
>     Yep, it's https://issues.apache.org/jira/browse/MESOS-2130, I plan to bring up the discussion today at the community sync.

The concensus for now seems that (1) we introduce the allocator changes, but address the allocator refactor sooner rather than later, (2) go with `/reserve` for now and update them once the HTTP API folks get to supporting the nested endpoint stuff.


- Michael


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/#review88781
-----------------------------------------------------------


On June 28, 2015, 8:36 a.m., Michael Park wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35702/
> -----------------------------------------------------------
> 
> (Updated June 28, 2015, 8:36 a.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.
> 
> 
> Bugs: MESOS-2600
>     https://issues.apache.org/jira/browse/MESOS-2600
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This involved a lot more challenges than I anticipated, I've captured the various approaches and limitations and deal-breakers of those approaches here: [Master Endpoint Implementation Challenges](https://docs.google.com/document/d/1cwVz4aKiCYP9Y4MOwHYZkyaiuEv7fArCye-vPvB2lAI/edit#)
> 
> Key points:
> 
> * This is a stop-gap solution until we shift the offer creation/management logic from the master to the allocator.
> * `updateAvailable` and `updateSlave` are kept separate because
>   (1) `updateAvailable` is allowed to fail whereas `updateSlave` must not.
>   (2) `updateAvailable` returns a `Future` whereas `updateSlave` does not.
>   (3) `updateAvailable` never leaves the allocator in an over-allocated state and must not, whereas `updateSlave` does, and can.
> * The algorithm:
>     * Initially, the master pessimistically assume that what seems like "available" resources will be gone.
>       This is due to the race between the allocator scheduling an `allocate` call to itself vs master's `allocator->updateAvailable` invocation.
>       As such, we first try to satisfy the request only with the offered resources.
>     * We greedily rescind one offer at a time until we've rescinded sufficiently many offers.
>       IMPORTANT: We perform `recoverResources(..., Filters())` rather than `recoverResources(..., None())` so that we can pretty much always win the race against `allocate`.
>                  In the case that we lose, no disaster occurs. We simply fail to satisfy the request.
>     * If we still don't have enough resources after resciding all offers, be optimistic and forward the request to the allocator since there may be available resources to satisfy the request.
>     * If the allocator returns a failure, report the error to the user with `PreconditionFailed`. This could be updated to be `Forbidden`, or `Conflict` maybe as well. We'll pick one eventually.
> 
> This approach is clearly not ideal, since we would prefer to rescind as little offers as possible.
> The challenges of implementing the ideal solution in the current state is described in the document above.
> 
> TODO(mpark): Add more comments and test cases.
> 
> 
> Diffs
> -----
> 
>   src/master/http.cpp 350383362311cfbc830965e1155a8515f0dfb332 
>   src/master/master.hpp af83d3e82d2c161b3cc4583e78a8cbbd2f9a4064 
>   src/master/master.cpp 0782b543b451921d2240958c7ef612a9e30972df 
>   src/master/validation.hpp 469d6f56c3de28a34177124aae81ce24cb4ad160 
>   src/master/validation.cpp 9d128aa1b349b018b8e4a1916434d848761ca051 
> 
> Diff: https://reviews.apache.org/r/35702/diff/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Michael Park
> 
>


Re: Review Request 35702: [WIP] Added /reserve HTTP endpoint to the master.

Posted by Michael Park <mc...@gmail.com>.

> On June 22, 2015, 1:32 p.m., Alexander Rukletsov wrote:
> > src/master/http.cpp, lines 519-520
> > <https://reviews.apache.org/r/35702/diff/6/?file=989447#file989447line519>
> >
> >     Some food for thought here. I think the ultimate decision whether to grant a request or not, should be taken by the allocator. Imagine a funky allocator module that is utterly optimistic and prefers granting requests and then killing tasks.
> >     
> >     With quota (what we have discussed offline) things go even further: a master cannot decide wether a quota can be fulfilled. We should either expose a lot of stats from allocator so that master can make a decision or we should change allocator interface so that it allows querying.

This is a really good point! I think we'll have to change many of the existing mechanisms to defer their decision to the allocator. Let's capture this in the quota design doc and discuss it separately.


- Michael


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/#review88781
-----------------------------------------------------------


On June 22, 2015, 5:29 a.m., Michael Park wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35702/
> -----------------------------------------------------------
> 
> (Updated June 22, 2015, 5:29 a.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.
> 
> 
> Bugs: MESOS-2600
>     https://issues.apache.org/jira/browse/MESOS-2600
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This is still a work in progress, but I'm sharing to gather some feedback around:
> 
> (1) I've added `updateAvailable` to the allocator API. Is this necessary or is there maybe a better way?
> (2) To determine the amount of available resources, I used: `available = total - (offered + used)`.
>     Is this still correct with oversubscription in the picture?
> (3) I'm creating an `Offer.Operation` to perform the necessary updates in the allocator and such.
>     Feels weird to use an "offer" operation when there's not an actual offer. Is this fine for now?
> 
> 
> Diffs
> -----
> 
>   include/mesos/master/allocator.hpp 92de1af414321281b00eaa6f129e5e3e2c849448 
>   src/Makefile.am dfebd2b14c9cb45c437509809fdf5ac3b0c8838c 
>   src/master/allocator/mesos/allocator.hpp 6cfa04650d91a80211cfbc0809236f9438926c78 
>   src/master/allocator/mesos/hierarchical.hpp 7097482fc0adad1c177c15c35edd51c10754f89c 
>   src/master/http.cpp b893013ddd052cb58c520ac0328f4a5f0fed862e 
>   src/master/master.hpp af83d3e82d2c161b3cc4583e78a8cbbd2f9a4064 
>   src/master/master.cpp 0135c155181546d3cb43e9e05bb874af846d928d 
>   src/master/validation.hpp 469d6f56c3de28a34177124aae81ce24cb4ad160 
>   src/master/validation.cpp 9d128aa1b349b018b8e4a1916434d848761ca051 
>   src/tests/mesos.hpp 9157ac079808d2686592e54ea26a26e6a0825ed3 
>   src/tests/reserve_tests.cpp PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/35702/diff/
> 
> 
> Testing
> -------
> 
> Added `src/tests/reserve_tests.cpp`.
> 
> 
> Thanks,
> 
> Michael Park
> 
>


Re: Review Request 35702: Added /reserve HTTP endpoint to the master.

Posted by Alexander Rukletsov <ru...@gmail.com>.

> On June 22, 2015, 1:32 p.m., Alexander Rukletsov wrote:
> > src/master/master.cpp, line 749
> > <https://reviews.apache.org/r/35702/diff/6/?file=989449#file989449line749>
> >
> >     I think reserve is too abstract and may collide with future actions (think quota). How about `/dynamic/reserve`?
> 
> Alexander Rukletsov wrote:
>     Though we currently do not support slashes in endpoints, I think we should fix that first before introducing a `/reserve` endpoint, given these endpoint are not targeted for 0.23.
> 
> Joris Van Remoortere wrote:
>     Cody had some patches for enabling sub namespaces in endpoints (as in enabling slashes). Might be worth pulling those in.

Yep, it's https://issues.apache.org/jira/browse/MESOS-2130, I plan to bring up the discussion today at the community sync.


- Alexander


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/#review88781
-----------------------------------------------------------


On June 28, 2015, 8:36 a.m., Michael Park wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35702/
> -----------------------------------------------------------
> 
> (Updated June 28, 2015, 8:36 a.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.
> 
> 
> Bugs: MESOS-2600
>     https://issues.apache.org/jira/browse/MESOS-2600
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This involved a lot more challenges than I anticipated, I've captured the various approaches and limitations and deal-breakers of those approaches here: [Master Endpoint Implementation Challenges](https://docs.google.com/document/d/1cwVz4aKiCYP9Y4MOwHYZkyaiuEv7fArCye-vPvB2lAI/edit#)
> 
> Key points:
> 
> * This is a stop-gap solution until we shift the offer creation/management logic from the master to the allocator.
> * `updateAvailable` and `updateSlave` are kept separate because
>   (1) `updateAvailable` is allowed to fail whereas `updateSlave` must not.
>   (2) `updateAvailable` returns a `Future` whereas `updateSlave` does not.
>   (3) `updateAvailable` never leaves the allocator in an over-allocated state and must not, whereas `updateSlave` does, and can.
> * The algorithm:
>     * Initially, the master pessimistically assume that what seems like "available" resources will be gone.
>       This is due to the race between the allocator scheduling an `allocate` call to itself vs master's `allocator->updateAvailable` invocation.
>       As such, we first try to satisfy the request only with the offered resources.
>     * We greedily rescind one offer at a time until we've rescinded sufficiently many offers.
>       IMPORTANT: We perform `recoverResources(..., Filters())` rather than `recoverResources(..., None())` so that we can pretty much always win the race against `allocate`.
>                  In the case that we lose, no disaster occurs. We simply fail to satisfy the request.
>     * If we still don't have enough resources after resciding all offers, be optimistic and forward the request to the allocator since there may be available resources to satisfy the request.
>     * If the allocator returns a failure, report the error to the user with `PreconditionFailed`. This could be updated to be `Forbidden`, or `Conflict` maybe as well. We'll pick one eventually.
> 
> This approach is clearly not ideal, since we would prefer to rescind as little offers as possible.
> The challenges of implementing the ideal solution in the current state is described in the document above.
> 
> TODO(mpark): Add more comments and test cases.
> 
> 
> Diffs
> -----
> 
>   src/master/http.cpp 350383362311cfbc830965e1155a8515f0dfb332 
>   src/master/master.hpp af83d3e82d2c161b3cc4583e78a8cbbd2f9a4064 
>   src/master/master.cpp 0782b543b451921d2240958c7ef612a9e30972df 
>   src/master/validation.hpp 469d6f56c3de28a34177124aae81ce24cb4ad160 
>   src/master/validation.cpp 9d128aa1b349b018b8e4a1916434d848761ca051 
> 
> Diff: https://reviews.apache.org/r/35702/diff/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Michael Park
> 
>


Re: Review Request 35702: Added /reserve HTTP endpoint to the master.

Posted by Alexander Rukletsov <ru...@gmail.com>.

> On June 22, 2015, 1:32 p.m., Alexander Rukletsov wrote:
> > src/master/master.cpp, line 749
> > <https://reviews.apache.org/r/35702/diff/6/?file=989449#file989449line749>
> >
> >     I think reserve is too abstract and may collide with future actions (think quota). How about `/dynamic/reserve`?

Though we currently do not support slashes in endpoints, I think we should fix that first before introducing a `/reserve` endpoint, given these endpoint are not targeted for 0.23.


- Alexander


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/#review88781
-----------------------------------------------------------


On June 28, 2015, 8:36 a.m., Michael Park wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35702/
> -----------------------------------------------------------
> 
> (Updated June 28, 2015, 8:36 a.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.
> 
> 
> Bugs: MESOS-2600
>     https://issues.apache.org/jira/browse/MESOS-2600
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This involved a lot more challenges than I anticipated, I've captured the various approaches and limitations and deal-breakers of those approaches here: [Master Endpoint Implementation Challenges](https://docs.google.com/document/d/1cwVz4aKiCYP9Y4MOwHYZkyaiuEv7fArCye-vPvB2lAI/edit#)
> 
> Key points:
> 
> * This is a stop-gap solution until we shift the offer creation/management logic from the master to the allocator.
> * `updateAvailable` and `updateSlave` are kept separate because
>   (1) `updateAvailable` is allowed to fail whereas `updateSlave` must not.
>   (2) `updateAvailable` returns a `Future` whereas `updateSlave` does not.
>   (3) `updateAvailable` never leaves the allocator in an over-allocated state and must not, whereas `updateSlave` does, and can.
> * The algorithm:
>     * Initially, the master pessimistically assume that what seems like "available" resources will be gone.
>       This is due to the race between the allocator scheduling an `allocate` call to itself vs master's `allocator->updateAvailable` invocation.
>       As such, we first try to satisfy the request only with the offered resources.
>     * We greedily rescind one offer at a time until we've rescinded sufficiently many offers.
>       IMPORTANT: We perform `recoverResources(..., Filters())` rather than `recoverResources(..., None())` so that we can pretty much always win the race against `allocate`.
>                  In the case that we lose, no disaster occurs. We simply fail to satisfy the request.
>     * If we still don't have enough resources after resciding all offers, be optimistic and forward the request to the allocator since there may be available resources to satisfy the request.
>     * If the allocator returns a failure, report the error to the user with `PreconditionFailed`. This could be updated to be `Forbidden`, or `Conflict` maybe as well. We'll pick one eventually.
> 
> This approach is clearly not ideal, since we would prefer to rescind as little offers as possible.
> The challenges of implementing the ideal solution in the current state is described in the document above.
> 
> TODO(mpark): Add more comments and test cases.
> 
> 
> Diffs
> -----
> 
>   src/master/http.cpp 350383362311cfbc830965e1155a8515f0dfb332 
>   src/master/master.hpp af83d3e82d2c161b3cc4583e78a8cbbd2f9a4064 
>   src/master/master.cpp 0782b543b451921d2240958c7ef612a9e30972df 
>   src/master/validation.hpp 469d6f56c3de28a34177124aae81ce24cb4ad160 
>   src/master/validation.cpp 9d128aa1b349b018b8e4a1916434d848761ca051 
> 
> Diff: https://reviews.apache.org/r/35702/diff/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Michael Park
> 
>


Re: Review Request 35702: Added /reserve HTTP endpoint to the master.

Posted by Alexander Rukletsov <ru...@gmail.com>.

> On June 22, 2015, 1:32 p.m., Alexander Rukletsov wrote:
> > src/master/master.cpp, line 749
> > <https://reviews.apache.org/r/35702/diff/6/?file=989449#file989449line749>
> >
> >     I think reserve is too abstract and may collide with future actions (think quota). How about `/dynamic/reserve`?
> 
> Alexander Rukletsov wrote:
>     Though we currently do not support slashes in endpoints, I think we should fix that first before introducing a `/reserve` endpoint, given these endpoint are not targeted for 0.23.
> 
> Joris Van Remoortere wrote:
>     Cody had some patches for enabling sub namespaces in endpoints (as in enabling slashes). Might be worth pulling those in.
> 
> Alexander Rukletsov wrote:
>     Yep, it's https://issues.apache.org/jira/browse/MESOS-2130, I plan to bring up the discussion today at the community sync.
> 
> Michael Park wrote:
>     The concensus for now seems that (1) we introduce the allocator changes, but address the allocator refactor sooner rather than later, (2) go with `/reserve` for now and update them once the HTTP API folks get to supporting the nested endpoint stuff.

And (3) we update endpoints names before the following release, i.e. there is no Mesos release, where `/reserve` will exist. Correct?


- Alexander


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/#review88781
-----------------------------------------------------------


On July 27, 2015, 11:30 p.m., Michael Park wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35702/
> -----------------------------------------------------------
> 
> (Updated July 27, 2015, 11:30 p.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.
> 
> 
> Bugs: MESOS-2600
>     https://issues.apache.org/jira/browse/MESOS-2600
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This involved a lot more challenges than I anticipated, I've captured the various approaches and limitations and deal-breakers of those approaches here: [Master Endpoint Implementation Challenges](https://docs.google.com/document/d/1cwVz4aKiCYP9Y4MOwHYZkyaiuEv7fArCye-vPvB2lAI/edit#)
> 
> Key points:
> 
> * This is a stop-gap solution until we shift the offer creation/management logic from the master to the allocator.
> * `updateAvailable` and `updateSlave` are kept separate because
>   (1) `updateAvailable` is allowed to fail whereas `updateSlave` must not.
>   (2) `updateAvailable` returns a `Future` whereas `updateSlave` does not.
>   (3) `updateAvailable` never leaves the allocator in an over-allocated state and must not, whereas `updateSlave` does, and can.
> * The algorithm:
>     * Initially, the master pessimistically assume that what seems like "available" resources will be gone.
>       This is due to the race between the allocator scheduling an `allocate` call to itself vs master's `allocator->updateAvailable` invocation.
>       As such, we first try to satisfy the request only with the offered resources.
>     * We greedily rescind one offer at a time until we've rescinded sufficiently many offers.
>       IMPORTANT: We perform `recoverResources(..., Filters())` rather than `recoverResources(..., None())` so that we can pretty much always win the race against `allocate`.
>                  In the case that we lose, no disaster occurs. We simply fail to satisfy the request.
>     * If we still don't have enough resources after resciding all offers, be optimistic and forward the request to the allocator since there may be available resources to satisfy the request.
>     * If the allocator returns a failure, report the error to the user with `PreconditionFailed`. This could be updated to be `Forbidden`, or `Conflict` maybe as well. We'll pick one eventually.
> 
> This approach is clearly not ideal, since we would prefer to rescind as little offers as possible.
> The challenges of implementing the ideal solution in the current state is described in the document above.
> 
> TODO(mpark): Add more comments and test cases.
> 
> 
> Diffs
> -----
> 
>   src/master/http.cpp 3a1598fad4db03e5f62fd4a6bd26b2bedeee4070 
>   src/master/master.hpp 827d0d599912b2936beb9615610f627f6c9a2d43 
>   src/master/master.cpp 5b5e3c37d4433c8524db267866aebc0a35a181f1 
>   src/master/validation.hpp 469d6f56c3de28a34177124aae81ce24cb4ad160 
>   src/master/validation.cpp 9d128aa1b349b018b8e4a1916434d848761ca051 
> 
> Diff: https://reviews.apache.org/r/35702/diff/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Michael Park
> 
>


Re: Review Request 35702: [WIP] Added /reserve HTTP endpoint to the master.

Posted by Alexander Rukletsov <ru...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/#review88781
-----------------------------------------------------------



src/master/http.cpp (lines 519 - 520)
<https://reviews.apache.org/r/35702/#comment141370>

    Some food for thought here. I think the ultimate decision whether to grant a request or not, should be taken by the allocator. Imagine a funky allocator module that is utterly optimistic and prefers granting requests and then killing tasks.
    
    With quota (what we have discussed offline) things go even further: a master cannot decide wether a quota can be fulfilled. We should either expose a lot of stats from allocator so that master can make a decision or we should change allocator interface so that it allows querying.



src/master/master.cpp (line 749)
<https://reviews.apache.org/r/35702/#comment141371>

    I think reserve is too abstract and may collide with future actions (think quota). How about `/dynamic/reserve`?


- Alexander Rukletsov


On June 22, 2015, 5:29 a.m., Michael Park wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35702/
> -----------------------------------------------------------
> 
> (Updated June 22, 2015, 5:29 a.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.
> 
> 
> Bugs: MESOS-2600
>     https://issues.apache.org/jira/browse/MESOS-2600
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This is still a work in progress, but I'm sharing to gather some feedback around:
> 
> (1) I've added `updateAvailable` to the allocator API. Is this necessary or is there maybe a better way?
> (2) To determine the amount of available resources, I used: `available = total - (offered + used)`.
>     Is this still correct with oversubscription in the picture?
> (3) I'm creating an `Offer.Operation` to perform the necessary updates in the allocator and such.
>     Feels weird to use an "offer" operation when there's not an actual offer. Is this fine for now?
> 
> 
> Diffs
> -----
> 
>   include/mesos/master/allocator.hpp 92de1af414321281b00eaa6f129e5e3e2c849448 
>   src/Makefile.am dfebd2b14c9cb45c437509809fdf5ac3b0c8838c 
>   src/master/allocator/mesos/allocator.hpp 6cfa04650d91a80211cfbc0809236f9438926c78 
>   src/master/allocator/mesos/hierarchical.hpp 7097482fc0adad1c177c15c35edd51c10754f89c 
>   src/master/http.cpp b893013ddd052cb58c520ac0328f4a5f0fed862e 
>   src/master/master.hpp af83d3e82d2c161b3cc4583e78a8cbbd2f9a4064 
>   src/master/master.cpp 0135c155181546d3cb43e9e05bb874af846d928d 
>   src/master/validation.hpp 469d6f56c3de28a34177124aae81ce24cb4ad160 
>   src/master/validation.cpp 9d128aa1b349b018b8e4a1916434d848761ca051 
>   src/tests/mesos.hpp 9157ac079808d2686592e54ea26a26e6a0825ed3 
>   src/tests/reserve_tests.cpp PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/35702/diff/
> 
> 
> Testing
> -------
> 
> Added `src/tests/reserve_tests.cpp`.
> 
> 
> Thanks,
> 
> Michael Park
> 
>


Re: Review Request 35702: [WIP] Added /reserve HTTP endpoint to the master.

Posted by Michael Park <mc...@gmail.com>.

> On June 22, 2015, 10:20 a.m., Alexander Rukletsov wrote:
> > Before making a thorough review, let's discuss one high level question.
> > 
> > Here is the problem how I understand it: we reserve for roles, but our code works mostly with frameworks (allocator methods, Offer protobuf). To workaround this you decided to drop role from the reservation request at all. Is it right? Don't we want to update at least the role sorter?
> > 
> > If you decide to add the role, we can preserve validation function for reservation and split `updateAllocation()` into two parts: role-specific and framework specific and use just role-specific variant for both operator-initiated reservations and framework-initiated reservations.
> > 
> > Thoughts?

Thanks for the questions!
> Here is the problem how I understand it: we reserve for roles, but our code works mostly with frameworks (allocator methods, Offer protobuf). To workaround this you decided to drop role from the reservation request at all. Is it right?

No, the desired roles are specified in the `resources` query parameter. If you're wondering why the validation for `Reserve` now takes an optional role, that's because the master endpoint doesn't have a `role` to which all `resources` need to match. Recall that a framework always has a `role` to which all specified resources need to match.
> Don't we want to update at least the role sorter?

Yes, you're right about this. I do need to update the role sorter's total.
> If you decide to add the role, we can preserve validation function for reservation

Hopefully my reply above clarifies the reason for the changes made to the `validate` function.
> ... and split `updateAllocation()` into two parts: role-specific and framework specific and use just role-specific variant for both operator-initiated reservations and framework-initiated reservations.

There's a fundamental difference between operator-initiated reservations and framework-initiated reservations, in that operator-initiated reservations modify the available (unallocated) resources, whereas framework-initiated reservations modify the allocated resources. `updateAvailble` and `updateAllocation` handle each of the cases.

The following are some of the differing characteristics:

`updateAvailable` handles the former case:
  (1) the `available` resources change
  (2) `frameworkSorter` is unaffected
  (3) only the total of `roleSorter` is affected, the allocations are not.
  
`updateAllocation` handles the latter case:
  (1) the `available` resources do not change.
  (2) both `frameworkSorter` and `roleSorter` are affected.


- Michael


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/#review88760
-----------------------------------------------------------


On June 22, 2015, 5:29 a.m., Michael Park wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35702/
> -----------------------------------------------------------
> 
> (Updated June 22, 2015, 5:29 a.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.
> 
> 
> Bugs: MESOS-2600
>     https://issues.apache.org/jira/browse/MESOS-2600
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This is still a work in progress, but I'm sharing to gather some feedback around:
> 
> (1) I've added `updateAvailable` to the allocator API. Is this necessary or is there maybe a better way?
> (2) To determine the amount of available resources, I used: `available = total - (offered + used)`.
>     Is this still correct with oversubscription in the picture?
> (3) I'm creating an `Offer.Operation` to perform the necessary updates in the allocator and such.
>     Feels weird to use an "offer" operation when there's not an actual offer. Is this fine for now?
> 
> 
> Diffs
> -----
> 
>   include/mesos/master/allocator.hpp 92de1af414321281b00eaa6f129e5e3e2c849448 
>   src/Makefile.am dfebd2b14c9cb45c437509809fdf5ac3b0c8838c 
>   src/master/allocator/mesos/allocator.hpp 6cfa04650d91a80211cfbc0809236f9438926c78 
>   src/master/allocator/mesos/hierarchical.hpp 7097482fc0adad1c177c15c35edd51c10754f89c 
>   src/master/http.cpp b893013ddd052cb58c520ac0328f4a5f0fed862e 
>   src/master/master.hpp af83d3e82d2c161b3cc4583e78a8cbbd2f9a4064 
>   src/master/master.cpp 0135c155181546d3cb43e9e05bb874af846d928d 
>   src/master/validation.hpp 469d6f56c3de28a34177124aae81ce24cb4ad160 
>   src/master/validation.cpp 9d128aa1b349b018b8e4a1916434d848761ca051 
>   src/tests/mesos.hpp 9157ac079808d2686592e54ea26a26e6a0825ed3 
>   src/tests/reserve_tests.cpp PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/35702/diff/
> 
> 
> Testing
> -------
> 
> Added `src/tests/reserve_tests.cpp`.
> 
> 
> Thanks,
> 
> Michael Park
> 
>


Re: Review Request 35702: [WIP] Added /reserve HTTP endpoint to the master.

Posted by Alexander Rukletsov <ru...@gmail.com>.

> On June 22, 2015, 10:20 a.m., Alexander Rukletsov wrote:
> > Before making a thorough review, let's discuss one high level question.
> > 
> > Here is the problem how I understand it: we reserve for roles, but our code works mostly with frameworks (allocator methods, Offer protobuf). To workaround this you decided to drop role from the reservation request at all. Is it right? Don't we want to update at least the role sorter?
> > 
> > If you decide to add the role, we can preserve validation function for reservation and split `updateAllocation()` into two parts: role-specific and framework specific and use just role-specific variant for both operator-initiated reservations and framework-initiated reservations.
> > 
> > Thoughts?
> 
> Michael Park wrote:
>     Thanks for the questions!
>     > Here is the problem how I understand it: we reserve for roles, but our code works mostly with frameworks (allocator methods, Offer protobuf). To workaround this you decided to drop role from the reservation request at all. Is it right?
>     
>     No, the desired roles are specified in the `resources` query parameter. If you're wondering why the validation for `Reserve` now takes an optional role, that's because the master endpoint doesn't have a `role` to which all `resources` need to match. Recall that a framework always has a `role` to which all specified resources need to match.
>     > Don't we want to update at least the role sorter?
>     
>     Yes, you're right about this. I do need to update the role sorter's total.
>     > If you decide to add the role, we can preserve validation function for reservation
>     
>     Hopefully my reply above clarifies the reason for the changes made to the `validate` function.
>     > ... and split `updateAllocation()` into two parts: role-specific and framework specific and use just role-specific variant for both operator-initiated reservations and framework-initiated reservations.
>     
>     There's a fundamental difference between operator-initiated reservations and framework-initiated reservations, in that operator-initiated reservations modify the available (unallocated) resources, whereas framework-initiated reservations modify the allocated resources. `updateAvailble` and `updateAllocation` handle each of the cases.
>     
>     The following are some of the differing characteristics:
>     
>     `updateAvailable` handles the former case:
>       (1) the `available` resources change
>       (2) `frameworkSorter` is unaffected
>       (3) only the total of `roleSorter` is affected, the allocations are not.
>       
>     `updateAllocation` handles the latter case:
>       (1) the `available` resources do not change.
>       (2) both `frameworkSorter` and `roleSorter` are affected.

> No, the desired roles are specified in the resources query parameter. If you're wondering why the validation for Reserve now takes an optional role, that's because the master endpoint doesn't have a role to which all resources need to match. Recall that a framework always has a role to which all specified resources need to match.

If I understand you correctly, you mean that an operator may reserve resources for multiple roles in one request (while the framework can't). First, you don't have a test for this case : ). Second, if we group resources by roles and call `applyOfferOperation()` per role, can we avoid modifying validation function? I think you'll need to do the grouping at some point in order to update `roleSorter`, so why not doing it earlier and therefore keep framework-initiated and operator-initiated requests closer to each other? You already group them implicitly by `SlaveID`.

> There's a fundamental difference between operator-initiated reservations and framework-initiated reservations, in that operator-initiated reservations modify the available (unallocated) resources, whereas framework-initiated reservations modify the allocated resources. updateAvailble and updateAllocation handle each of the cases.

Let's capture this in the user guide! It also means that if all frameworks in the role are beyond their fare share, they are not able to dynamically reserve (because they are not getting offers), but an operator can always reserve, right? I also try to understand, whether we can refactor the code in a way that we do not introduce one more method on the allocator. Remember, it's a public interface and folks are supposed to write their own implementations, so we should be careful when changing the API.


- Alexander


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/#review88760
-----------------------------------------------------------


On June 22, 2015, 5:29 a.m., Michael Park wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35702/
> -----------------------------------------------------------
> 
> (Updated June 22, 2015, 5:29 a.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.
> 
> 
> Bugs: MESOS-2600
>     https://issues.apache.org/jira/browse/MESOS-2600
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This is still a work in progress, but I'm sharing to gather some feedback around:
> 
> (1) I've added `updateAvailable` to the allocator API. Is this necessary or is there maybe a better way?
> (2) To determine the amount of available resources, I used: `available = total - (offered + used)`.
>     Is this still correct with oversubscription in the picture?
> (3) I'm creating an `Offer.Operation` to perform the necessary updates in the allocator and such.
>     Feels weird to use an "offer" operation when there's not an actual offer. Is this fine for now?
> 
> 
> Diffs
> -----
> 
>   include/mesos/master/allocator.hpp 92de1af414321281b00eaa6f129e5e3e2c849448 
>   src/Makefile.am dfebd2b14c9cb45c437509809fdf5ac3b0c8838c 
>   src/master/allocator/mesos/allocator.hpp 6cfa04650d91a80211cfbc0809236f9438926c78 
>   src/master/allocator/mesos/hierarchical.hpp 7097482fc0adad1c177c15c35edd51c10754f89c 
>   src/master/http.cpp b893013ddd052cb58c520ac0328f4a5f0fed862e 
>   src/master/master.hpp af83d3e82d2c161b3cc4583e78a8cbbd2f9a4064 
>   src/master/master.cpp 0135c155181546d3cb43e9e05bb874af846d928d 
>   src/master/validation.hpp 469d6f56c3de28a34177124aae81ce24cb4ad160 
>   src/master/validation.cpp 9d128aa1b349b018b8e4a1916434d848761ca051 
>   src/tests/mesos.hpp 9157ac079808d2686592e54ea26a26e6a0825ed3 
>   src/tests/reserve_tests.cpp PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/35702/diff/
> 
> 
> Testing
> -------
> 
> Added `src/tests/reserve_tests.cpp`.
> 
> 
> Thanks,
> 
> Michael Park
> 
>


Re: Review Request 35702: [WIP] Added /reserve HTTP endpoint to the master.

Posted by Alexander Rukletsov <ru...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/#review88760
-----------------------------------------------------------


Before making a thorough review, let's discuss one high level question.

Here is the problem how I understand it: we reserve for roles, but our code works mostly with frameworks (allocator methods, Offer protobuf). To workaround this you decided to drop role from the reservation request at all. Is it right? Don't we want to update at least the role sorter?

If you decide to add the role, we can preserve validation function for reservation and split `updateAllocation()` into two parts: role-specific and framework specific and use just role-specific variant for both operator-initiated reservations and framework-initiated reservations.

Thoughts?


src/master/http.cpp (line 553)
<https://reviews.apache.org/r/35702/#comment141344>

    s/recinding/rescinding



src/master/http.cpp (line 560)
<https://reviews.apache.org/r/35702/#comment141345>

    s/Recinding/rescinding/


- Alexander Rukletsov


On June 22, 2015, 5:29 a.m., Michael Park wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35702/
> -----------------------------------------------------------
> 
> (Updated June 22, 2015, 5:29 a.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.
> 
> 
> Bugs: MESOS-2600
>     https://issues.apache.org/jira/browse/MESOS-2600
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This is still a work in progress, but I'm sharing to gather some feedback around:
> 
> (1) I've added `updateAvailable` to the allocator API. Is this necessary or is there maybe a better way?
> (2) To determine the amount of available resources, I used: `available = total - (offered + used)`.
>     Is this still correct with oversubscription in the picture?
> (3) I'm creating an `Offer.Operation` to perform the necessary updates in the allocator and such.
>     Feels weird to use an "offer" operation when there's not an actual offer. Is this fine for now?
> 
> 
> Diffs
> -----
> 
>   include/mesos/master/allocator.hpp 92de1af414321281b00eaa6f129e5e3e2c849448 
>   src/Makefile.am dfebd2b14c9cb45c437509809fdf5ac3b0c8838c 
>   src/master/allocator/mesos/allocator.hpp 6cfa04650d91a80211cfbc0809236f9438926c78 
>   src/master/allocator/mesos/hierarchical.hpp 7097482fc0adad1c177c15c35edd51c10754f89c 
>   src/master/http.cpp b893013ddd052cb58c520ac0328f4a5f0fed862e 
>   src/master/master.hpp af83d3e82d2c161b3cc4583e78a8cbbd2f9a4064 
>   src/master/master.cpp 0135c155181546d3cb43e9e05bb874af846d928d 
>   src/master/validation.hpp 469d6f56c3de28a34177124aae81ce24cb4ad160 
>   src/master/validation.cpp 9d128aa1b349b018b8e4a1916434d848761ca051 
>   src/tests/mesos.hpp 9157ac079808d2686592e54ea26a26e6a0825ed3 
>   src/tests/reserve_tests.cpp PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/35702/diff/
> 
> 
> Testing
> -------
> 
> Added `src/tests/reserve_tests.cpp`.
> 
> 
> Thanks,
> 
> Michael Park
> 
>


Re: Review Request 35702: [WIP] Added /reserve HTTP endpoint to the master.

Posted by Michael Park <mc...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/
-----------------------------------------------------------

(Updated June 22, 2015, 5:29 a.m.)


Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.


Bugs: MESOS-2600
    https://issues.apache.org/jira/browse/MESOS-2600


Repository: mesos


Description
-------

This is still a work in progress, but I'm sharing to gather some feedback around:

(1) I've added `updateAvailable` to the allocator API. Is this necessary or is there maybe a better way?
(2) To determine the amount of available resources, I used: `available = total - (offered + used)`.
    Is this still correct with oversubscription in the picture?
(3) I'm creating an `Offer.Operation` to perform the necessary updates in the allocator and such.
    Feels weird to use an "offer" operation when there's not an actual offer. Is this fine for now?


Diffs (updated)
-----

  include/mesos/master/allocator.hpp 92de1af414321281b00eaa6f129e5e3e2c849448 
  src/Makefile.am dfebd2b14c9cb45c437509809fdf5ac3b0c8838c 
  src/master/allocator/mesos/allocator.hpp 6cfa04650d91a80211cfbc0809236f9438926c78 
  src/master/allocator/mesos/hierarchical.hpp 7097482fc0adad1c177c15c35edd51c10754f89c 
  src/master/http.cpp b893013ddd052cb58c520ac0328f4a5f0fed862e 
  src/master/master.hpp af83d3e82d2c161b3cc4583e78a8cbbd2f9a4064 
  src/master/master.cpp 0135c155181546d3cb43e9e05bb874af846d928d 
  src/master/validation.hpp 469d6f56c3de28a34177124aae81ce24cb4ad160 
  src/master/validation.cpp 9d128aa1b349b018b8e4a1916434d848761ca051 
  src/tests/mesos.hpp 9157ac079808d2686592e54ea26a26e6a0825ed3 
  src/tests/reserve_tests.cpp PRE-CREATION 

Diff: https://reviews.apache.org/r/35702/diff/


Testing
-------

Added `src/tests/reserve_tests.cpp`.


Thanks,

Michael Park


Re: Review Request 35702: [WIP] Added /reserve HTTP endpoint to the master.

Posted by Michael Park <mc...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/
-----------------------------------------------------------

(Updated June 22, 2015, 5:11 a.m.)


Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.


Changes
-------

Rebased.


Bugs: MESOS-2600
    https://issues.apache.org/jira/browse/MESOS-2600


Repository: mesos


Description
-------

This is still a work in progress, but I'm sharing to gather some feedback around:

(1) I've added `updateAvailable` to the allocator API. Is this necessary or is there maybe a better way?
(2) To determine the amount of available resources, I used: `available = total - (offered + used)`.
    Is this still correct with oversubscription in the picture?
(3) I'm creating an `Offer.Operation` to perform the necessary updates in the allocator and such.
    Feels weird to use an "offer" operation when there's not an actual offer. Is this fine for now?


Diffs (updated)
-----

  include/mesos/master/allocator.hpp 92de1af414321281b00eaa6f129e5e3e2c849448 
  src/Makefile.am dfebd2b14c9cb45c437509809fdf5ac3b0c8838c 
  src/master/allocator/mesos/allocator.hpp 6cfa04650d91a80211cfbc0809236f9438926c78 
  src/master/allocator/mesos/hierarchical.hpp 7097482fc0adad1c177c15c35edd51c10754f89c 
  src/master/http.cpp b893013ddd052cb58c520ac0328f4a5f0fed862e 
  src/master/master.hpp af83d3e82d2c161b3cc4583e78a8cbbd2f9a4064 
  src/master/master.cpp 0135c155181546d3cb43e9e05bb874af846d928d 
  src/master/validation.hpp 469d6f56c3de28a34177124aae81ce24cb4ad160 
  src/master/validation.cpp 9d128aa1b349b018b8e4a1916434d848761ca051 
  src/tests/mesos.hpp 9157ac079808d2686592e54ea26a26e6a0825ed3 
  src/tests/reserve_tests.cpp PRE-CREATION 

Diff: https://reviews.apache.org/r/35702/diff/


Testing
-------

Added `src/tests/reserve_tests.cpp`.


Thanks,

Michael Park


Re: Review Request 35702: [WIP] Added /reserve HTTP endpoint to the master.

Posted by Michael Park <mc...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/
-----------------------------------------------------------

(Updated June 22, 2015, 4:50 a.m.)


Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.


Bugs: MESOS-2600
    https://issues.apache.org/jira/browse/MESOS-2600


Repository: mesos


Description
-------

This is still a work in progress, but I'm sharing to gather some feedback around:

(1) I've added `updateAvailable` to the allocator API. Is this necessary or is there maybe a better way?
(2) To determine the amount of available resources, I used: `available = total - (offered + used)`.
    Is this still correct with oversubscription in the picture?
(3) I'm creating an `Offer.Operation` to perform the necessary updates in the allocator and such.
    Feels weird to use an "offer" operation when there's not an actual offer. Is this fine for now?


Diffs (updated)
-----

  include/mesos/master/allocator.hpp 92de1af414321281b00eaa6f129e5e3e2c849448 
  src/Makefile.am e7de0f3d1a5efeaef47d5074defe3b40db94f573 
  src/master/allocator/mesos/allocator.hpp 6cfa04650d91a80211cfbc0809236f9438926c78 
  src/master/allocator/mesos/hierarchical.hpp 646ee8c1c0fb824e1d17150b4e96e6281c65358f 
  src/master/http.cpp b893013ddd052cb58c520ac0328f4a5f0fed862e 
  src/master/master.hpp af83d3e82d2c161b3cc4583e78a8cbbd2f9a4064 
  src/master/master.cpp 0135c155181546d3cb43e9e05bb874af846d928d 
  src/master/validation.hpp 469d6f56c3de28a34177124aae81ce24cb4ad160 
  src/master/validation.cpp 9d128aa1b349b018b8e4a1916434d848761ca051 
  src/tests/mesos.hpp 9157ac079808d2686592e54ea26a26e6a0825ed3 
  src/tests/reserve_tests.cpp PRE-CREATION 

Diff: https://reviews.apache.org/r/35702/diff/


Testing (updated)
-------

Added `src/tests/reserve_tests.cpp`.


Thanks,

Michael Park


Re: Review Request 35702: [WIP] Added /reserve HTTP endpoint to the master.

Posted by Michael Park <mc...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/
-----------------------------------------------------------

(Updated June 22, 2015, 2:35 a.m.)


Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.


Changes
-------

Added tests for the error cases.


Bugs: MESOS-2600
    https://issues.apache.org/jira/browse/MESOS-2600


Repository: mesos


Description
-------

This is still a work in progress, but I'm sharing to gather some feedback around:

(1) I've added `updateAvailable` to the allocator API. Is this necessary or is there maybe a better way?
(2) To determine the amount of available resources, I used: `available = total - (offered + used)`.
    Is this still correct with oversubscription in the picture?
(3) I'm creating an `Offer.Operation` to perform the necessary updates in the allocator and such.
    Feels weird to use an "offer" operation when there's not an actual offer. Is this fine for now?


Diffs (updated)
-----

  include/mesos/master/allocator.hpp 92de1af414321281b00eaa6f129e5e3e2c849448 
  src/Makefile.am e7de0f3d1a5efeaef47d5074defe3b40db94f573 
  src/master/allocator/mesos/allocator.hpp 6cfa04650d91a80211cfbc0809236f9438926c78 
  src/master/allocator/mesos/hierarchical.hpp 646ee8c1c0fb824e1d17150b4e96e6281c65358f 
  src/master/http.cpp b893013ddd052cb58c520ac0328f4a5f0fed862e 
  src/master/master.hpp af83d3e82d2c161b3cc4583e78a8cbbd2f9a4064 
  src/master/master.cpp 0135c155181546d3cb43e9e05bb874af846d928d 
  src/master/validation.hpp 469d6f56c3de28a34177124aae81ce24cb4ad160 
  src/master/validation.cpp 9d128aa1b349b018b8e4a1916434d848761ca051 
  src/tests/mesos.hpp 9157ac079808d2686592e54ea26a26e6a0825ed3 
  src/tests/reserve_tests.cpp PRE-CREATION 

Diff: https://reviews.apache.org/r/35702/diff/


Testing
-------

Added `src/tests/reserve_tests.cpp` which still need to be fleshed out for error cases (missing parameters, unauthorized client, etc).


Thanks,

Michael Park


Re: Review Request 35702: [WIP] Added /reserve HTTP endpoint to the master.

Posted by Mesos ReviewBot <re...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/#review88701
-----------------------------------------------------------


Patch looks great!

Reviews applied: [35714, 35702]

All tests passed.

- Mesos ReviewBot


On June 21, 2015, 4:16 p.m., Michael Park wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35702/
> -----------------------------------------------------------
> 
> (Updated June 21, 2015, 4:16 p.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.
> 
> 
> Bugs: MESOS-2600
>     https://issues.apache.org/jira/browse/MESOS-2600
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This is still a work in progress, but I'm sharing to gather some feedback around:
> 
> (1) I've added `updateAvailable` to the allocator API. Is this necessary or is there maybe a better way?
> (2) To determine the amount of available resources, I used: `available = total - (offered + used)`.
>     Is this still correct with oversubscription in the picture?
> (3) I'm creating an `Offer.Operation` to perform the necessary updates in the allocator and such.
>     Feels weird to use an "offer" operation when there's not an actual offer. Is this fine for now?
> 
> 
> Diffs
> -----
> 
>   include/mesos/master/allocator.hpp 92de1af414321281b00eaa6f129e5e3e2c849448 
>   src/Makefile.am e7de0f3d1a5efeaef47d5074defe3b40db94f573 
>   src/master/allocator/mesos/allocator.hpp 6cfa04650d91a80211cfbc0809236f9438926c78 
>   src/master/allocator/mesos/hierarchical.hpp 646ee8c1c0fb824e1d17150b4e96e6281c65358f 
>   src/master/http.cpp b893013ddd052cb58c520ac0328f4a5f0fed862e 
>   src/master/master.hpp af83d3e82d2c161b3cc4583e78a8cbbd2f9a4064 
>   src/master/master.cpp 0135c155181546d3cb43e9e05bb874af846d928d 
>   src/master/validation.hpp 469d6f56c3de28a34177124aae81ce24cb4ad160 
>   src/master/validation.cpp 9d128aa1b349b018b8e4a1916434d848761ca051 
>   src/tests/mesos.hpp 9157ac079808d2686592e54ea26a26e6a0825ed3 
>   src/tests/reserve_tests.cpp PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/35702/diff/
> 
> 
> Testing
> -------
> 
> Added `src/tests/reserve_tests.cpp` which still need to be fleshed out for error cases (missing parameters, unauthorized client, etc).
> 
> 
> Thanks,
> 
> Michael Park
> 
>


Re: Review Request 35702: [WIP] Added /reserve HTTP endpoint to the master.

Posted by Michael Park <mc...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/
-----------------------------------------------------------

(Updated June 21, 2015, 4:16 p.m.)


Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.


Summary (updated)
-----------------

[WIP] Added /reserve HTTP endpoint to the master.


Bugs: MESOS-2600
    https://issues.apache.org/jira/browse/MESOS-2600


Repository: mesos


Description
-------

This is still a work in progress, but I'm sharing to gather some feedback around:

(1) I've added `updateAvailable` to the allocator API. Is this necessary or is there maybe a better way?
(2) To determine the amount of available resources, I used: `available = total - (offered + used)`.
    Is this still correct with oversubscription in the picture?
(3) I'm creating an `Offer.Operation` to perform the necessary updates in the allocator and such.
    Feels weird to use an "offer" operation when there's not an actual offer. Is this fine for now?


Diffs (updated)
-----

  include/mesos/master/allocator.hpp 92de1af414321281b00eaa6f129e5e3e2c849448 
  src/Makefile.am e7de0f3d1a5efeaef47d5074defe3b40db94f573 
  src/master/allocator/mesos/allocator.hpp 6cfa04650d91a80211cfbc0809236f9438926c78 
  src/master/allocator/mesos/hierarchical.hpp 646ee8c1c0fb824e1d17150b4e96e6281c65358f 
  src/master/http.cpp b893013ddd052cb58c520ac0328f4a5f0fed862e 
  src/master/master.hpp af83d3e82d2c161b3cc4583e78a8cbbd2f9a4064 
  src/master/master.cpp 0135c155181546d3cb43e9e05bb874af846d928d 
  src/master/validation.hpp 469d6f56c3de28a34177124aae81ce24cb4ad160 
  src/master/validation.cpp 9d128aa1b349b018b8e4a1916434d848761ca051 
  src/tests/mesos.hpp 9157ac079808d2686592e54ea26a26e6a0825ed3 
  src/tests/reserve_tests.cpp PRE-CREATION 

Diff: https://reviews.apache.org/r/35702/diff/


Testing
-------

Added `src/tests/reserve_tests.cpp` which still need to be fleshed out for error cases (missing parameters, unauthorized client, etc).


Thanks,

Michael Park


Re: Review Request 35702: [WIP] Add /reserve HTTP endpoint to the master.

Posted by Michael Park <mc...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/
-----------------------------------------------------------

(Updated June 20, 2015, 8:19 p.m.)


Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.


Bugs: MESOS-2600
    https://issues.apache.org/jira/browse/MESOS-2600


Repository: mesos


Description (updated)
-------

This is still a work in progress, but I'm sharing to gather some feedback around:

(1) I've added `updateAvailable` to the allocator API. Is this necessary or is there maybe a better way?
(2) To determine the amount of available resources, I used: `available = total - (offered + used)`.
    Is this still correct with oversubscription in the picture?
(3) I'm creating an `Offer.Operation` to perform the necessary updates in the allocator and such.
    Feels weird to use an "offer" operation when there's not an actual offer. Is this fine for now?


Diffs
-----

  include/mesos/master/allocator.hpp 92de1af414321281b00eaa6f129e5e3e2c849448 
  src/Makefile.am e7de0f3d1a5efeaef47d5074defe3b40db94f573 
  src/master/allocator/mesos/allocator.hpp 6cfa04650d91a80211cfbc0809236f9438926c78 
  src/master/allocator/mesos/hierarchical.hpp 646ee8c1c0fb824e1d17150b4e96e6281c65358f 
  src/master/http.cpp b893013ddd052cb58c520ac0328f4a5f0fed862e 
  src/master/master.hpp af83d3e82d2c161b3cc4583e78a8cbbd2f9a4064 
  src/master/master.cpp 0135c155181546d3cb43e9e05bb874af846d928d 
  src/master/validation.hpp 469d6f56c3de28a34177124aae81ce24cb4ad160 
  src/master/validation.cpp 9d128aa1b349b018b8e4a1916434d848761ca051 
  src/tests/mesos.hpp 9157ac079808d2686592e54ea26a26e6a0825ed3 
  src/tests/reserve_tests.cpp PRE-CREATION 

Diff: https://reviews.apache.org/r/35702/diff/


Testing
-------

Added `src/tests/reserve_tests.cpp` which still need to be fleshed out for error cases (missing parameters, unauthorized client, etc).


Thanks,

Michael Park


Re: Review Request 35702: [WIP] Add /reserve HTTP endpoint to the master.

Posted by Michael Park <mc...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/
-----------------------------------------------------------

(Updated June 20, 2015, 8:10 p.m.)


Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.


Changes
-------

Added JIRA ticket and added AdamB as a reviewer.


Bugs: MESOS-2600
    https://issues.apache.org/jira/browse/MESOS-2600


Repository: mesos


Description
-------

This is a work in progress. But I'm sharing early to gather some feedback around:

(1) I've added `updateAvailable` to the allocator API. Is this necessary or is there maybe a better way?
(2) To determine the amount of available resources, I used: `available = total - (offered + used)`.
    Is this still correct with oversubscription in the picture?
(3) I'm creating an `Offer.Operation` to perform the necessary updates in the allocator and such.
    Feels weird to use an "offer" operation when there's not an actual offer. Is this fine for now?


Diffs
-----

  include/mesos/master/allocator.hpp 92de1af414321281b00eaa6f129e5e3e2c849448 
  src/Makefile.am e7de0f3d1a5efeaef47d5074defe3b40db94f573 
  src/master/allocator/mesos/allocator.hpp 6cfa04650d91a80211cfbc0809236f9438926c78 
  src/master/allocator/mesos/hierarchical.hpp 646ee8c1c0fb824e1d17150b4e96e6281c65358f 
  src/master/http.cpp b893013ddd052cb58c520ac0328f4a5f0fed862e 
  src/master/master.hpp af83d3e82d2c161b3cc4583e78a8cbbd2f9a4064 
  src/master/master.cpp 0135c155181546d3cb43e9e05bb874af846d928d 
  src/master/validation.hpp 469d6f56c3de28a34177124aae81ce24cb4ad160 
  src/master/validation.cpp 9d128aa1b349b018b8e4a1916434d848761ca051 
  src/tests/mesos.hpp 9157ac079808d2686592e54ea26a26e6a0825ed3 
  src/tests/reserve_tests.cpp PRE-CREATION 

Diff: https://reviews.apache.org/r/35702/diff/


Testing
-------

Added `src/tests/reserve_tests.cpp` which still need to be fleshed out for error cases (missing parameters, unauthorized client, etc).


Thanks,

Michael Park