You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@mesos.apache.org by Alexander Rukletsov <ru...@gmail.com> on 2015/07/13 18:46:20 UTC

Re: Review Request 35702: Added /reserve HTTP endpoint to the master.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/#review91472
-----------------------------------------------------------


A high level question: do you think rescinding offers is a big deal for now?


src/master/http.cpp (line 507)
<https://reviews.apache.org/r/35702/#comment144877>

    The code until this line is basically request validation and authorization. Though it's not how we do it now, do you think it makes sense to split the function into smaller logical parts?
    
    How about something like this:
    
    ```
    Future<Response> Master::Http::reserve(const Request& request) const
    {
      return Master::Http::reserveValidate();
    }
    
    Future<Response> Master::Http::reserveValidate(const Request& request) const
    {
      <...>
      return Master::Http::reserveAuthorize();
    }
    
    <...>
    ```



src/master/http.cpp (lines 515 - 516)
<https://reviews.apache.org/r/35702/#comment144878>

    It looks like we actually have the role, but it's buried in resources. Do you envision having resources collection with various roles in one request? Maybe it makes sense to add a validation step which ensures there is just one role per request and use it here, also avoiding changes in the `validate()`function.



src/master/http.cpp (lines 523 - 524)
<https://reviews.apache.org/r/35702/#comment144880>

    Let's leave a comment here, that we do want to defer the decision completely to an allocator, but can do it currently because offers are issued and handled by the master.



src/master/http.cpp (lines 541 - 545)
<https://reviews.apache.org/r/35702/#comment144879>

    We basically defer the decision whether request can be granted or not to an allocator (up to rescinding). Let's capture it in a comment!



src/master/master.hpp (line 962)
<https://reviews.apache.org/r/35702/#comment144851>

    I see that you extracted this function for code reusal, but let's document it. You may want to add a comment for `applyResourceOperation()`and update the comment for `applyOfferOperation()` as well.


- Alexander Rukletsov


On June 28, 2015, 8:36 a.m., Michael Park wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35702/
> -----------------------------------------------------------
> 
> (Updated June 28, 2015, 8:36 a.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.
> 
> 
> Bugs: MESOS-2600
>     https://issues.apache.org/jira/browse/MESOS-2600
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This involved a lot more challenges than I anticipated, I've captured the various approaches and limitations and deal-breakers of those approaches here: [Master Endpoint Implementation Challenges](https://docs.google.com/document/d/1cwVz4aKiCYP9Y4MOwHYZkyaiuEv7fArCye-vPvB2lAI/edit#)
> 
> Key points:
> 
> * This is a stop-gap solution until we shift the offer creation/management logic from the master to the allocator.
> * `updateAvailable` and `updateSlave` are kept separate because
>   (1) `updateAvailable` is allowed to fail whereas `updateSlave` must not.
>   (2) `updateAvailable` returns a `Future` whereas `updateSlave` does not.
>   (3) `updateAvailable` never leaves the allocator in an over-allocated state and must not, whereas `updateSlave` does, and can.
> * The algorithm:
>     * Initially, the master pessimistically assume that what seems like "available" resources will be gone.
>       This is due to the race between the allocator scheduling an `allocate` call to itself vs master's `allocator->updateAvailable` invocation.
>       As such, we first try to satisfy the request only with the offered resources.
>     * We greedily rescind one offer at a time until we've rescinded sufficiently many offers.
>       IMPORTANT: We perform `recoverResources(..., Filters())` rather than `recoverResources(..., None())` so that we can pretty much always win the race against `allocate`.
>                  In the case that we lose, no disaster occurs. We simply fail to satisfy the request.
>     * If we still don't have enough resources after resciding all offers, be optimistic and forward the request to the allocator since there may be available resources to satisfy the request.
>     * If the allocator returns a failure, report the error to the user with `PreconditionFailed`. This could be updated to be `Forbidden`, or `Conflict` maybe as well. We'll pick one eventually.
> 
> This approach is clearly not ideal, since we would prefer to rescind as little offers as possible.
> The challenges of implementing the ideal solution in the current state is described in the document above.
> 
> TODO(mpark): Add more comments and test cases.
> 
> 
> Diffs
> -----
> 
>   src/master/http.cpp 350383362311cfbc830965e1155a8515f0dfb332 
>   src/master/master.hpp af83d3e82d2c161b3cc4583e78a8cbbd2f9a4064 
>   src/master/master.cpp 0782b543b451921d2240958c7ef612a9e30972df 
>   src/master/validation.hpp 469d6f56c3de28a34177124aae81ce24cb4ad160 
>   src/master/validation.cpp 9d128aa1b349b018b8e4a1916434d848761ca051 
> 
> Diff: https://reviews.apache.org/r/35702/diff/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Michael Park
> 
>


Re: Review Request 35702: Added /reserve HTTP endpoint to the master.

Posted by Alexander Rukletsov <ru...@gmail.com>.

> On July 13, 2015, 4:46 p.m., Alexander Rukletsov wrote:
> > src/master/http.cpp, line 507
> > <https://reviews.apache.org/r/35702/diff/9/?file=994080#file994080line507>
> >
> >     The code until this line is basically request validation and authorization. Though it's not how we do it now, do you think it makes sense to split the function into smaller logical parts?
> >     
> >     How about something like this:
> >     
> >     ```
> >     Future<Response> Master::Http::reserve(const Request& request) const
> >     {
> >       return Master::Http::reserveValidate();
> >     }
> >     
> >     Future<Response> Master::Http::reserveValidate(const Request& request) const
> >     {
> >       <...>
> >       return Master::Http::reserveAuthorize();
> >     }
> >     
> >     <...>
> >     ```
> 
> Michael Park wrote:
>     Yeah, I think it does make sense to break huge functions down to the smaller logical pieces. I think we can do a more general refactoring for the validation pattern, since they all pretty much do the same thing. But I think we can consider doing that uniformly, outside of this patch. What do you think?

I personally prefer sacrificing consistency, but write new code "right". However, generally we tend to favour consistency over local improvements, so feel free to "fix" the issue by creating a JIRA : ).


> On July 13, 2015, 4:46 p.m., Alexander Rukletsov wrote:
> > src/master/http.cpp, lines 515-516
> > <https://reviews.apache.org/r/35702/diff/9/?file=994080#file994080line515>
> >
> >     It looks like we actually have the role, but it's buried in resources. Do you envision having resources collection with various roles in one request? Maybe it makes sense to add a validation step which ensures there is just one role per request and use it here, also avoiding changes in the `validate()`function.
> 
> Michael Park wrote:
>     I didn't see a good reason to require a "one role per request" condition. The current interface accurately models the fact that an operator does not have a role associated to it like a framework does, and I don't think "avoiding changes in the `validate()` function" should have any influence in deciding how an interface behaves.
>     
>     If we required such a condition, the per-request atomicity guarantee comes with a limitation that it can only be for a single role. While I'm not sure of its value, I'm also not sure what we gain by requiring it from the user's perspective?

I think I'm missing something, my understanding is that each dynamic reservation is associated with a role, regardless, who issues a reservation request. I don't think limiting users to one role per request gives them any benefit, but it looks like we can be closer to framework-issued request if we do so. What am I missing?


- Alexander


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/#review91472
-----------------------------------------------------------


On July 27, 2015, 11:30 p.m., Michael Park wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35702/
> -----------------------------------------------------------
> 
> (Updated July 27, 2015, 11:30 p.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.
> 
> 
> Bugs: MESOS-2600
>     https://issues.apache.org/jira/browse/MESOS-2600
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This involved a lot more challenges than I anticipated, I've captured the various approaches and limitations and deal-breakers of those approaches here: [Master Endpoint Implementation Challenges](https://docs.google.com/document/d/1cwVz4aKiCYP9Y4MOwHYZkyaiuEv7fArCye-vPvB2lAI/edit#)
> 
> Key points:
> 
> * This is a stop-gap solution until we shift the offer creation/management logic from the master to the allocator.
> * `updateAvailable` and `updateSlave` are kept separate because
>   (1) `updateAvailable` is allowed to fail whereas `updateSlave` must not.
>   (2) `updateAvailable` returns a `Future` whereas `updateSlave` does not.
>   (3) `updateAvailable` never leaves the allocator in an over-allocated state and must not, whereas `updateSlave` does, and can.
> * The algorithm:
>     * Initially, the master pessimistically assume that what seems like "available" resources will be gone.
>       This is due to the race between the allocator scheduling an `allocate` call to itself vs master's `allocator->updateAvailable` invocation.
>       As such, we first try to satisfy the request only with the offered resources.
>     * We greedily rescind one offer at a time until we've rescinded sufficiently many offers.
>       IMPORTANT: We perform `recoverResources(..., Filters())` rather than `recoverResources(..., None())` so that we can pretty much always win the race against `allocate`.
>                  In the case that we lose, no disaster occurs. We simply fail to satisfy the request.
>     * If we still don't have enough resources after resciding all offers, be optimistic and forward the request to the allocator since there may be available resources to satisfy the request.
>     * If the allocator returns a failure, report the error to the user with `PreconditionFailed`. This could be updated to be `Forbidden`, or `Conflict` maybe as well. We'll pick one eventually.
> 
> This approach is clearly not ideal, since we would prefer to rescind as little offers as possible.
> The challenges of implementing the ideal solution in the current state is described in the document above.
> 
> TODO(mpark): Add more comments and test cases.
> 
> 
> Diffs
> -----
> 
>   src/master/http.cpp 3a1598fad4db03e5f62fd4a6bd26b2bedeee4070 
>   src/master/master.hpp 827d0d599912b2936beb9615610f627f6c9a2d43 
>   src/master/master.cpp 5b5e3c37d4433c8524db267866aebc0a35a181f1 
>   src/master/validation.hpp 469d6f56c3de28a34177124aae81ce24cb4ad160 
>   src/master/validation.cpp 9d128aa1b349b018b8e4a1916434d848761ca051 
> 
> Diff: https://reviews.apache.org/r/35702/diff/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Michael Park
> 
>


Re: Review Request 35702: Added /reserve HTTP endpoint to the master.

Posted by Michael Park <mc...@gmail.com>.

> On July 13, 2015, 4:46 p.m., Alexander Rukletsov wrote:
> > A high level question: do you think rescinding offers is a big deal for now?

I don't believe it's a big deal for now because frameworks need to deal with rescinded offers regardless, and I imagine the frequency of operators using these endpoints will probably be low. Of course we'll monitor whether these expectations are true as we go forward, but I think for now it's ok.


> On July 13, 2015, 4:46 p.m., Alexander Rukletsov wrote:
> > src/master/http.cpp, line 507
> > <https://reviews.apache.org/r/35702/diff/9/?file=994080#file994080line507>
> >
> >     The code until this line is basically request validation and authorization. Though it's not how we do it now, do you think it makes sense to split the function into smaller logical parts?
> >     
> >     How about something like this:
> >     
> >     ```
> >     Future<Response> Master::Http::reserve(const Request& request) const
> >     {
> >       return Master::Http::reserveValidate();
> >     }
> >     
> >     Future<Response> Master::Http::reserveValidate(const Request& request) const
> >     {
> >       <...>
> >       return Master::Http::reserveAuthorize();
> >     }
> >     
> >     <...>
> >     ```

Yeah, I think it does make sense to break huge functions down to the smaller logical pieces. I think we can do a more general refactoring for the validation pattern, since they all pretty much do the same thing. But I think we can consider doing that uniformly, outside of this patch. What do you think?


> On July 13, 2015, 4:46 p.m., Alexander Rukletsov wrote:
> > src/master/http.cpp, lines 515-516
> > <https://reviews.apache.org/r/35702/diff/9/?file=994080#file994080line515>
> >
> >     It looks like we actually have the role, but it's buried in resources. Do you envision having resources collection with various roles in one request? Maybe it makes sense to add a validation step which ensures there is just one role per request and use it here, also avoiding changes in the `validate()`function.

I didn't see a good reason to require a "one role per request" condition. The current interface accurately models the fact that an operator does not have a role associated to it like a framework does, and I don't think "avoiding changes in the `validate()` function" should have any influence in deciding how an interface behaves.

If we required such a condition, the per-request atomicity guarantee comes with a limitation that it can only be for a single role. While I'm not sure of its value, I'm also not sure what we gain by requiring it from the user's perspective?


- Michael


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/#review91472
-----------------------------------------------------------


On June 28, 2015, 8:36 a.m., Michael Park wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35702/
> -----------------------------------------------------------
> 
> (Updated June 28, 2015, 8:36 a.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.
> 
> 
> Bugs: MESOS-2600
>     https://issues.apache.org/jira/browse/MESOS-2600
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This involved a lot more challenges than I anticipated, I've captured the various approaches and limitations and deal-breakers of those approaches here: [Master Endpoint Implementation Challenges](https://docs.google.com/document/d/1cwVz4aKiCYP9Y4MOwHYZkyaiuEv7fArCye-vPvB2lAI/edit#)
> 
> Key points:
> 
> * This is a stop-gap solution until we shift the offer creation/management logic from the master to the allocator.
> * `updateAvailable` and `updateSlave` are kept separate because
>   (1) `updateAvailable` is allowed to fail whereas `updateSlave` must not.
>   (2) `updateAvailable` returns a `Future` whereas `updateSlave` does not.
>   (3) `updateAvailable` never leaves the allocator in an over-allocated state and must not, whereas `updateSlave` does, and can.
> * The algorithm:
>     * Initially, the master pessimistically assume that what seems like "available" resources will be gone.
>       This is due to the race between the allocator scheduling an `allocate` call to itself vs master's `allocator->updateAvailable` invocation.
>       As such, we first try to satisfy the request only with the offered resources.
>     * We greedily rescind one offer at a time until we've rescinded sufficiently many offers.
>       IMPORTANT: We perform `recoverResources(..., Filters())` rather than `recoverResources(..., None())` so that we can pretty much always win the race against `allocate`.
>                  In the case that we lose, no disaster occurs. We simply fail to satisfy the request.
>     * If we still don't have enough resources after resciding all offers, be optimistic and forward the request to the allocator since there may be available resources to satisfy the request.
>     * If the allocator returns a failure, report the error to the user with `PreconditionFailed`. This could be updated to be `Forbidden`, or `Conflict` maybe as well. We'll pick one eventually.
> 
> This approach is clearly not ideal, since we would prefer to rescind as little offers as possible.
> The challenges of implementing the ideal solution in the current state is described in the document above.
> 
> TODO(mpark): Add more comments and test cases.
> 
> 
> Diffs
> -----
> 
>   src/master/http.cpp 350383362311cfbc830965e1155a8515f0dfb332 
>   src/master/master.hpp af83d3e82d2c161b3cc4583e78a8cbbd2f9a4064 
>   src/master/master.cpp 0782b543b451921d2240958c7ef612a9e30972df 
>   src/master/validation.hpp 469d6f56c3de28a34177124aae81ce24cb4ad160 
>   src/master/validation.cpp 9d128aa1b349b018b8e4a1916434d848761ca051 
> 
> Diff: https://reviews.apache.org/r/35702/diff/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Michael Park
> 
>


Re: Review Request 35702: Added /reserve HTTP endpoint to the master.

Posted by Michael Park <mc...@gmail.com>.

> On July 13, 2015, 4:46 p.m., Alexander Rukletsov wrote:
> > src/master/http.cpp, line 507
> > <https://reviews.apache.org/r/35702/diff/9/?file=994080#file994080line507>
> >
> >     The code until this line is basically request validation and authorization. Though it's not how we do it now, do you think it makes sense to split the function into smaller logical parts?
> >     
> >     How about something like this:
> >     
> >     ```
> >     Future<Response> Master::Http::reserve(const Request& request) const
> >     {
> >       return Master::Http::reserveValidate();
> >     }
> >     
> >     Future<Response> Master::Http::reserveValidate(const Request& request) const
> >     {
> >       <...>
> >       return Master::Http::reserveAuthorize();
> >     }
> >     
> >     <...>
> >     ```
> 
> Michael Park wrote:
>     Yeah, I think it does make sense to break huge functions down to the smaller logical pieces. I think we can do a more general refactoring for the validation pattern, since they all pretty much do the same thing. But I think we can consider doing that uniformly, outside of this patch. What do you think?
> 
> Alexander Rukletsov wrote:
>     I personally prefer sacrificing consistency, but write new code "right". However, generally we tend to favour consistency over local improvements, so feel free to "fix" the issue by creating a JIRA : ).

I've filed a JIRA ticket for this here: https://issues.apache.org/jira/browse/MESOS-3186


- Michael


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/#review91472
-----------------------------------------------------------


On July 31, 2015, 9:56 p.m., Michael Park wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35702/
> -----------------------------------------------------------
> 
> (Updated July 31, 2015, 9:56 p.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.
> 
> 
> Bugs: MESOS-2600
>     https://issues.apache.org/jira/browse/MESOS-2600
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This involved a lot more challenges than I anticipated, I've captured the various approaches and limitations and deal-breakers of those approaches here: [Master Endpoint Implementation Challenges](https://docs.google.com/document/d/1cwVz4aKiCYP9Y4MOwHYZkyaiuEv7fArCye-vPvB2lAI/edit#)
> 
> Key points:
> 
> * This is a stop-gap solution until we shift the offer creation/management logic from the master to the allocator.
> * `updateAvailable` and `updateSlave` are kept separate because
>   (1) `updateAvailable` is allowed to fail whereas `updateSlave` must not.
>   (2) `updateAvailable` returns a `Future` whereas `updateSlave` does not.
>   (3) `updateAvailable` never leaves the allocator in an over-allocated state and must not, whereas `updateSlave` does, and can.
> * The algorithm:
>     * Initially, the master pessimistically assume that what seems like "available" resources will be gone.
>       This is due to the race between the allocator scheduling an `allocate` call to itself vs master's `allocator->updateAvailable` invocation.
>       As such, we first try to satisfy the request only with the offered resources.
>     * We greedily rescind one offer at a time until we've rescinded sufficiently many offers.
>       IMPORTANT: We perform `recoverResources(..., Filters())` rather than `recoverResources(..., None())` so that we can pretty much always win the race against `allocate`.
>                  In the case that we lose, no disaster occurs. We simply fail to satisfy the request.
>     * If we still don't have enough resources after resciding all offers, be optimistic and forward the request to the allocator since there may be available resources to satisfy the request.
>     * If the allocator returns a failure, report the error to the user with `PreconditionFailed`. This could be updated to be `Forbidden`, or `Conflict` maybe as well. We'll pick one eventually.
> 
> This approach is clearly not ideal, since we would prefer to rescind as little offers as possible.
> The challenges of implementing the ideal solution in the current state is described in the document above.
> 
> TODO(mpark): Add more comments and test cases.
> 
> 
> Diffs
> -----
> 
>   src/master/http.cpp 3772e39015a22655dcad00ad844dc5ddc90db43f 
>   src/master/master.hpp ea18c4e0bb0743747401b9cd5ea14ae9b56ae3cc 
>   src/master/master.cpp 351a3c2b5f551ad065682cea601d2436258e4544 
>   src/master/validation.hpp 43b8d84556e7f0a891dddf6185bbce7ca50b360a 
>   src/master/validation.cpp ffb7bf07b8a40d6e14f922eabcf46045462498b5 
> 
> Diff: https://reviews.apache.org/r/35702/diff/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Michael Park
> 
>


Re: Review Request 35702: Added /reserve HTTP endpoint to the master.

Posted by Michael Park <mc...@gmail.com>.

> On July 13, 2015, 4:46 p.m., Alexander Rukletsov wrote:
> > src/master/http.cpp, lines 515-516
> > <https://reviews.apache.org/r/35702/diff/9/?file=994080#file994080line515>
> >
> >     It looks like we actually have the role, but it's buried in resources. Do you envision having resources collection with various roles in one request? Maybe it makes sense to add a validation step which ensures there is just one role per request and use it here, also avoiding changes in the `validate()`function.
> 
> Michael Park wrote:
>     I didn't see a good reason to require a "one role per request" condition. The current interface accurately models the fact that an operator does not have a role associated to it like a framework does, and I don't think "avoiding changes in the `validate()` function" should have any influence in deciding how an interface behaves.
>     
>     If we required such a condition, the per-request atomicity guarantee comes with a limitation that it can only be for a single role. While I'm not sure of its value, I'm also not sure what we gain by requiring it from the user's perspective?
> 
> Alexander Rukletsov wrote:
>     I think I'm missing something, my understanding is that each dynamic reservation is associated with a role, regardless, who issues a reservation request. I don't think limiting users to one role per request gives them any benefit, but it looks like we can be closer to framework-issued request if we do so. What am I missing?
> 
> Michael Park wrote:
>     Your understanding is correct. Aside from the resources being associated with a role, frameworks are also associated with a role. We check that every resource is being reserved for the framework's role because a framework is associated with a role and it wouldn't make sense to allow a framework to reserve resources for a role that does not match its role. On the contrary, the same rule doesn't apply for an operator since there's no such thing as an "operator's role".
> 
> Alexander Rukletsov wrote:
>     That's right. Let me try to reformulate my proposal. If we require an operator to reserve resources for one role per request, it can be interpreted as an "operator role". An advantage here is that `validate()` method doesn't need to be changed, while a disadvantage is that this approach is a bit artificial and can lead to confusion. What do you think?

As I mentioned in my initial comment, I think that the `validate()` function is an implementation detail and shouldn't be a source of motivation in deciding how an API should behave.


- Michael


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/#review91472
-----------------------------------------------------------


On July 28, 2015, 9:03 p.m., Michael Park wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35702/
> -----------------------------------------------------------
> 
> (Updated July 28, 2015, 9:03 p.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.
> 
> 
> Bugs: MESOS-2600
>     https://issues.apache.org/jira/browse/MESOS-2600
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This involved a lot more challenges than I anticipated, I've captured the various approaches and limitations and deal-breakers of those approaches here: [Master Endpoint Implementation Challenges](https://docs.google.com/document/d/1cwVz4aKiCYP9Y4MOwHYZkyaiuEv7fArCye-vPvB2lAI/edit#)
> 
> Key points:
> 
> * This is a stop-gap solution until we shift the offer creation/management logic from the master to the allocator.
> * `updateAvailable` and `updateSlave` are kept separate because
>   (1) `updateAvailable` is allowed to fail whereas `updateSlave` must not.
>   (2) `updateAvailable` returns a `Future` whereas `updateSlave` does not.
>   (3) `updateAvailable` never leaves the allocator in an over-allocated state and must not, whereas `updateSlave` does, and can.
> * The algorithm:
>     * Initially, the master pessimistically assume that what seems like "available" resources will be gone.
>       This is due to the race between the allocator scheduling an `allocate` call to itself vs master's `allocator->updateAvailable` invocation.
>       As such, we first try to satisfy the request only with the offered resources.
>     * We greedily rescind one offer at a time until we've rescinded sufficiently many offers.
>       IMPORTANT: We perform `recoverResources(..., Filters())` rather than `recoverResources(..., None())` so that we can pretty much always win the race against `allocate`.
>                  In the case that we lose, no disaster occurs. We simply fail to satisfy the request.
>     * If we still don't have enough resources after resciding all offers, be optimistic and forward the request to the allocator since there may be available resources to satisfy the request.
>     * If the allocator returns a failure, report the error to the user with `PreconditionFailed`. This could be updated to be `Forbidden`, or `Conflict` maybe as well. We'll pick one eventually.
> 
> This approach is clearly not ideal, since we would prefer to rescind as little offers as possible.
> The challenges of implementing the ideal solution in the current state is described in the document above.
> 
> TODO(mpark): Add more comments and test cases.
> 
> 
> Diffs
> -----
> 
>   src/master/http.cpp 3a1598fad4db03e5f62fd4a6bd26b2bedeee4070 
>   src/master/master.hpp 827d0d599912b2936beb9615610f627f6c9a2d43 
>   src/master/master.cpp 5b5e3c37d4433c8524db267866aebc0a35a181f1 
>   src/master/validation.hpp 469d6f56c3de28a34177124aae81ce24cb4ad160 
>   src/master/validation.cpp 9d128aa1b349b018b8e4a1916434d848761ca051 
> 
> Diff: https://reviews.apache.org/r/35702/diff/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Michael Park
> 
>


Re: Review Request 35702: Added /reserve HTTP endpoint to the master.

Posted by Michael Park <mc...@gmail.com>.

> On July 13, 2015, 4:46 p.m., Alexander Rukletsov wrote:
> > src/master/http.cpp, lines 515-516
> > <https://reviews.apache.org/r/35702/diff/9/?file=994080#file994080line515>
> >
> >     It looks like we actually have the role, but it's buried in resources. Do you envision having resources collection with various roles in one request? Maybe it makes sense to add a validation step which ensures there is just one role per request and use it here, also avoiding changes in the `validate()`function.
> 
> Michael Park wrote:
>     I didn't see a good reason to require a "one role per request" condition. The current interface accurately models the fact that an operator does not have a role associated to it like a framework does, and I don't think "avoiding changes in the `validate()` function" should have any influence in deciding how an interface behaves.
>     
>     If we required such a condition, the per-request atomicity guarantee comes with a limitation that it can only be for a single role. While I'm not sure of its value, I'm also not sure what we gain by requiring it from the user's perspective?
> 
> Alexander Rukletsov wrote:
>     I think I'm missing something, my understanding is that each dynamic reservation is associated with a role, regardless, who issues a reservation request. I don't think limiting users to one role per request gives them any benefit, but it looks like we can be closer to framework-issued request if we do so. What am I missing?

Your understanding is correct. Aside from the resources being associated with a role, frameworks are also associated with a role. We check that every resource is being reserved for the framework's role because a framework is associated with a role and it wouldn't make sense to allow a framework to reserve resources for a role that does not match its role. On the contrary, the same rule doesn't apply for an operator since there's no such thing as an "operator's role".


- Michael


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/#review91472
-----------------------------------------------------------


On July 27, 2015, 11:30 p.m., Michael Park wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35702/
> -----------------------------------------------------------
> 
> (Updated July 27, 2015, 11:30 p.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.
> 
> 
> Bugs: MESOS-2600
>     https://issues.apache.org/jira/browse/MESOS-2600
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This involved a lot more challenges than I anticipated, I've captured the various approaches and limitations and deal-breakers of those approaches here: [Master Endpoint Implementation Challenges](https://docs.google.com/document/d/1cwVz4aKiCYP9Y4MOwHYZkyaiuEv7fArCye-vPvB2lAI/edit#)
> 
> Key points:
> 
> * This is a stop-gap solution until we shift the offer creation/management logic from the master to the allocator.
> * `updateAvailable` and `updateSlave` are kept separate because
>   (1) `updateAvailable` is allowed to fail whereas `updateSlave` must not.
>   (2) `updateAvailable` returns a `Future` whereas `updateSlave` does not.
>   (3) `updateAvailable` never leaves the allocator in an over-allocated state and must not, whereas `updateSlave` does, and can.
> * The algorithm:
>     * Initially, the master pessimistically assume that what seems like "available" resources will be gone.
>       This is due to the race between the allocator scheduling an `allocate` call to itself vs master's `allocator->updateAvailable` invocation.
>       As such, we first try to satisfy the request only with the offered resources.
>     * We greedily rescind one offer at a time until we've rescinded sufficiently many offers.
>       IMPORTANT: We perform `recoverResources(..., Filters())` rather than `recoverResources(..., None())` so that we can pretty much always win the race against `allocate`.
>                  In the case that we lose, no disaster occurs. We simply fail to satisfy the request.
>     * If we still don't have enough resources after resciding all offers, be optimistic and forward the request to the allocator since there may be available resources to satisfy the request.
>     * If the allocator returns a failure, report the error to the user with `PreconditionFailed`. This could be updated to be `Forbidden`, or `Conflict` maybe as well. We'll pick one eventually.
> 
> This approach is clearly not ideal, since we would prefer to rescind as little offers as possible.
> The challenges of implementing the ideal solution in the current state is described in the document above.
> 
> TODO(mpark): Add more comments and test cases.
> 
> 
> Diffs
> -----
> 
>   src/master/http.cpp 3a1598fad4db03e5f62fd4a6bd26b2bedeee4070 
>   src/master/master.hpp 827d0d599912b2936beb9615610f627f6c9a2d43 
>   src/master/master.cpp 5b5e3c37d4433c8524db267866aebc0a35a181f1 
>   src/master/validation.hpp 469d6f56c3de28a34177124aae81ce24cb4ad160 
>   src/master/validation.cpp 9d128aa1b349b018b8e4a1916434d848761ca051 
> 
> Diff: https://reviews.apache.org/r/35702/diff/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Michael Park
> 
>


Re: Review Request 35702: Added /reserve HTTP endpoint to the master.

Posted by Alexander Rukletsov <ru...@gmail.com>.

> On July 13, 2015, 4:46 p.m., Alexander Rukletsov wrote:
> > src/master/http.cpp, lines 515-516
> > <https://reviews.apache.org/r/35702/diff/9/?file=994080#file994080line515>
> >
> >     It looks like we actually have the role, but it's buried in resources. Do you envision having resources collection with various roles in one request? Maybe it makes sense to add a validation step which ensures there is just one role per request and use it here, also avoiding changes in the `validate()`function.
> 
> Michael Park wrote:
>     I didn't see a good reason to require a "one role per request" condition. The current interface accurately models the fact that an operator does not have a role associated to it like a framework does, and I don't think "avoiding changes in the `validate()` function" should have any influence in deciding how an interface behaves.
>     
>     If we required such a condition, the per-request atomicity guarantee comes with a limitation that it can only be for a single role. While I'm not sure of its value, I'm also not sure what we gain by requiring it from the user's perspective?
> 
> Alexander Rukletsov wrote:
>     I think I'm missing something, my understanding is that each dynamic reservation is associated with a role, regardless, who issues a reservation request. I don't think limiting users to one role per request gives them any benefit, but it looks like we can be closer to framework-issued request if we do so. What am I missing?
> 
> Michael Park wrote:
>     Your understanding is correct. Aside from the resources being associated with a role, frameworks are also associated with a role. We check that every resource is being reserved for the framework's role because a framework is associated with a role and it wouldn't make sense to allow a framework to reserve resources for a role that does not match its role. On the contrary, the same rule doesn't apply for an operator since there's no such thing as an "operator's role".

That's right. Let me try to reformulate my proposal. If we require an operator to reserve resources for one role per request, it can be interpreted as an "operator role". An advantage here is that `validate()` method doesn't need to be changed, while a disadvantage is that this approach is a bit artificial and can lead to confusion. What do you think?


- Alexander


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/#review91472
-----------------------------------------------------------


On July 28, 2015, 9:03 p.m., Michael Park wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35702/
> -----------------------------------------------------------
> 
> (Updated July 28, 2015, 9:03 p.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.
> 
> 
> Bugs: MESOS-2600
>     https://issues.apache.org/jira/browse/MESOS-2600
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This involved a lot more challenges than I anticipated, I've captured the various approaches and limitations and deal-breakers of those approaches here: [Master Endpoint Implementation Challenges](https://docs.google.com/document/d/1cwVz4aKiCYP9Y4MOwHYZkyaiuEv7fArCye-vPvB2lAI/edit#)
> 
> Key points:
> 
> * This is a stop-gap solution until we shift the offer creation/management logic from the master to the allocator.
> * `updateAvailable` and `updateSlave` are kept separate because
>   (1) `updateAvailable` is allowed to fail whereas `updateSlave` must not.
>   (2) `updateAvailable` returns a `Future` whereas `updateSlave` does not.
>   (3) `updateAvailable` never leaves the allocator in an over-allocated state and must not, whereas `updateSlave` does, and can.
> * The algorithm:
>     * Initially, the master pessimistically assume that what seems like "available" resources will be gone.
>       This is due to the race between the allocator scheduling an `allocate` call to itself vs master's `allocator->updateAvailable` invocation.
>       As such, we first try to satisfy the request only with the offered resources.
>     * We greedily rescind one offer at a time until we've rescinded sufficiently many offers.
>       IMPORTANT: We perform `recoverResources(..., Filters())` rather than `recoverResources(..., None())` so that we can pretty much always win the race against `allocate`.
>                  In the case that we lose, no disaster occurs. We simply fail to satisfy the request.
>     * If we still don't have enough resources after resciding all offers, be optimistic and forward the request to the allocator since there may be available resources to satisfy the request.
>     * If the allocator returns a failure, report the error to the user with `PreconditionFailed`. This could be updated to be `Forbidden`, or `Conflict` maybe as well. We'll pick one eventually.
> 
> This approach is clearly not ideal, since we would prefer to rescind as little offers as possible.
> The challenges of implementing the ideal solution in the current state is described in the document above.
> 
> TODO(mpark): Add more comments and test cases.
> 
> 
> Diffs
> -----
> 
>   src/master/http.cpp 3a1598fad4db03e5f62fd4a6bd26b2bedeee4070 
>   src/master/master.hpp 827d0d599912b2936beb9615610f627f6c9a2d43 
>   src/master/master.cpp 5b5e3c37d4433c8524db267866aebc0a35a181f1 
>   src/master/validation.hpp 469d6f56c3de28a34177124aae81ce24cb4ad160 
>   src/master/validation.cpp 9d128aa1b349b018b8e4a1916434d848761ca051 
> 
> Diff: https://reviews.apache.org/r/35702/diff/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Michael Park
> 
>


Re: Review Request 35702: Added /reserve HTTP endpoint to the master.

Posted by Alexander Rukletsov <ru...@gmail.com>.

> On July 13, 2015, 4:46 p.m., Alexander Rukletsov wrote:
> > src/master/http.cpp, lines 515-516
> > <https://reviews.apache.org/r/35702/diff/9/?file=994080#file994080line515>
> >
> >     It looks like we actually have the role, but it's buried in resources. Do you envision having resources collection with various roles in one request? Maybe it makes sense to add a validation step which ensures there is just one role per request and use it here, also avoiding changes in the `validate()`function.
> 
> Michael Park wrote:
>     I didn't see a good reason to require a "one role per request" condition. The current interface accurately models the fact that an operator does not have a role associated to it like a framework does, and I don't think "avoiding changes in the `validate()` function" should have any influence in deciding how an interface behaves.
>     
>     If we required such a condition, the per-request atomicity guarantee comes with a limitation that it can only be for a single role. While I'm not sure of its value, I'm also not sure what we gain by requiring it from the user's perspective?
> 
> Alexander Rukletsov wrote:
>     I think I'm missing something, my understanding is that each dynamic reservation is associated with a role, regardless, who issues a reservation request. I don't think limiting users to one role per request gives them any benefit, but it looks like we can be closer to framework-issued request if we do so. What am I missing?
> 
> Michael Park wrote:
>     Your understanding is correct. Aside from the resources being associated with a role, frameworks are also associated with a role. We check that every resource is being reserved for the framework's role because a framework is associated with a role and it wouldn't make sense to allow a framework to reserve resources for a role that does not match its role. On the contrary, the same rule doesn't apply for an operator since there's no such thing as an "operator's role".
> 
> Alexander Rukletsov wrote:
>     That's right. Let me try to reformulate my proposal. If we require an operator to reserve resources for one role per request, it can be interpreted as an "operator role". An advantage here is that `validate()` method doesn't need to be changed, while a disadvantage is that this approach is a bit artificial and can lead to confusion. What do you think?
> 
> Michael Park wrote:
>     As I mentioned in my initial comment, I think that the `validate()` function is an implementation detail and shouldn't be a source of motivation in deciding how an API should behave.

Fair enough, feel free to drop.


- Alexander


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/#review91472
-----------------------------------------------------------


On July 28, 2015, 9:03 p.m., Michael Park wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35702/
> -----------------------------------------------------------
> 
> (Updated July 28, 2015, 9:03 p.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris Van Remoortere, and Vinod Kone.
> 
> 
> Bugs: MESOS-2600
>     https://issues.apache.org/jira/browse/MESOS-2600
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This involved a lot more challenges than I anticipated, I've captured the various approaches and limitations and deal-breakers of those approaches here: [Master Endpoint Implementation Challenges](https://docs.google.com/document/d/1cwVz4aKiCYP9Y4MOwHYZkyaiuEv7fArCye-vPvB2lAI/edit#)
> 
> Key points:
> 
> * This is a stop-gap solution until we shift the offer creation/management logic from the master to the allocator.
> * `updateAvailable` and `updateSlave` are kept separate because
>   (1) `updateAvailable` is allowed to fail whereas `updateSlave` must not.
>   (2) `updateAvailable` returns a `Future` whereas `updateSlave` does not.
>   (3) `updateAvailable` never leaves the allocator in an over-allocated state and must not, whereas `updateSlave` does, and can.
> * The algorithm:
>     * Initially, the master pessimistically assume that what seems like "available" resources will be gone.
>       This is due to the race between the allocator scheduling an `allocate` call to itself vs master's `allocator->updateAvailable` invocation.
>       As such, we first try to satisfy the request only with the offered resources.
>     * We greedily rescind one offer at a time until we've rescinded sufficiently many offers.
>       IMPORTANT: We perform `recoverResources(..., Filters())` rather than `recoverResources(..., None())` so that we can pretty much always win the race against `allocate`.
>                  In the case that we lose, no disaster occurs. We simply fail to satisfy the request.
>     * If we still don't have enough resources after resciding all offers, be optimistic and forward the request to the allocator since there may be available resources to satisfy the request.
>     * If the allocator returns a failure, report the error to the user with `PreconditionFailed`. This could be updated to be `Forbidden`, or `Conflict` maybe as well. We'll pick one eventually.
> 
> This approach is clearly not ideal, since we would prefer to rescind as little offers as possible.
> The challenges of implementing the ideal solution in the current state is described in the document above.
> 
> TODO(mpark): Add more comments and test cases.
> 
> 
> Diffs
> -----
> 
>   src/master/http.cpp 3a1598fad4db03e5f62fd4a6bd26b2bedeee4070 
>   src/master/master.hpp 827d0d599912b2936beb9615610f627f6c9a2d43 
>   src/master/master.cpp 5b5e3c37d4433c8524db267866aebc0a35a181f1 
>   src/master/validation.hpp 469d6f56c3de28a34177124aae81ce24cb4ad160 
>   src/master/validation.cpp 9d128aa1b349b018b8e4a1916434d848761ca051 
> 
> Diff: https://reviews.apache.org/r/35702/diff/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Michael Park
> 
>