You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@mesos.apache.org by Alexander Rukletsov <ru...@gmail.com> on 2016/01/13 00:06:27 UTC

Review Request 42222: Added a comment on allocator recovery.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/42222/
-----------------------------------------------------------

Review request for mesos, Ben Mahler and Joris Van Remoortere.


Repository: mesos


Description
-------

See summary.


Diffs
-----

  src/master/allocator/mesos/hierarchical.cpp df8bccaf2b8cfc0cb5ca18d4867371ae7a84c12f 

Diff: https://reviews.apache.org/r/42222/diff/


Testing
-------

None: not a functional change.


Thanks,

Alexander Rukletsov


Re: Review Request 42222: Added a comment on allocator recovery.

Posted by Klaus Ma <kl...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/42222/#review114218
-----------------------------------------------------------

Ship it!


Ship It!

- Klaus Ma


On Jan. 13, 2016, 7:06 a.m., Alexander Rukletsov wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/42222/
> -----------------------------------------------------------
> 
> (Updated Jan. 13, 2016, 7:06 a.m.)
> 
> 
> Review request for mesos, Ben Mahler and Joris Van Remoortere.
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> See summary.
> 
> 
> Diffs
> -----
> 
>   src/master/allocator/mesos/hierarchical.cpp df8bccaf2b8cfc0cb5ca18d4867371ae7a84c12f 
> 
> Diff: https://reviews.apache.org/r/42222/diff/
> 
> 
> Testing
> -------
> 
> None: not a functional change.
> 
> 
> Thanks,
> 
> Alexander Rukletsov
> 
>


Re: Review Request 42222: Added a comment on allocator recovery.

Posted by Alexander Rukletsov <ru...@gmail.com>.

> On Jan. 15, 2016, 6:50 a.m., Ben Mahler wrote:
> > Thanks Alex! I ended up going over the structure of the recover code and left some higher level comments. There also appears to be a bug that will crash the master that I marked as an issue :)
> > 
> > Have we convinced ourselves that we want to act on partial state when quota is *not* set? It seems to complicate things here, and acting on partial state w/o quota has some issues as well?

Thanks a bunch for a thorough review, Ben! I have opened a separate ticket to track the bug and other comments you had left: https://issues.apache.org/jira/browse/MESOS-4417.


> On Jan. 15, 2016, 6:50 a.m., Ben Mahler wrote:
> > src/master/allocator/mesos/hierarchical.cpp, lines 205-206
> > <https://reviews.apache.org/r/42222/diff/1/?file=1195000#file1195000line205>
> >
> >     It looks like if we trip the `resume` call in `addSlave`, this delayed resume will crash the master due to the `CHECK(paused)` that currently resides in `resume`.
> >     
> >     Something I'm missing?
> 
> Joris Van Remoortere wrote:
>     Thanks for your review Ben!
>     I'm committing just the comment change for this patch.
>     I'll leave this issue open as we want to address it in a separate patch.

Filed: https://issues.apache.org/jira/browse/MESOS-4417


- Alexander


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/42222/#review114628
-----------------------------------------------------------


On Jan. 12, 2016, 11:06 p.m., Alexander Rukletsov wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/42222/
> -----------------------------------------------------------
> 
> (Updated Jan. 12, 2016, 11:06 p.m.)
> 
> 
> Review request for mesos, Ben Mahler and Joris Van Remoortere.
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> See summary.
> 
> 
> Diffs
> -----
> 
>   src/master/allocator/mesos/hierarchical.cpp df8bccaf2b8cfc0cb5ca18d4867371ae7a84c12f 
> 
> Diff: https://reviews.apache.org/r/42222/diff/
> 
> 
> Testing
> -------
> 
> None: not a functional change.
> 
> 
> Thanks,
> 
> Alexander Rukletsov
> 
>


Re: Review Request 42222: Added a comment on allocator recovery.

Posted by Joris Van Remoortere <jo...@gmail.com>.

> On Jan. 15, 2016, 6:50 a.m., Ben Mahler wrote:
> > src/master/allocator/mesos/hierarchical.cpp, lines 205-206
> > <https://reviews.apache.org/r/42222/diff/1/?file=1195000#file1195000line205>
> >
> >     It looks like if we trip the `resume` call in `addSlave`, this delayed resume will crash the master due to the `CHECK(paused)` that currently resides in `resume`.
> >     
> >     Something I'm missing?

Thanks for your review Ben!
I'm committing just the comment change for this patch.
I'll leave this issue open as we want to address it in a separate patch.


- Joris


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/42222/#review114628
-----------------------------------------------------------


On Jan. 12, 2016, 11:06 p.m., Alexander Rukletsov wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/42222/
> -----------------------------------------------------------
> 
> (Updated Jan. 12, 2016, 11:06 p.m.)
> 
> 
> Review request for mesos, Ben Mahler and Joris Van Remoortere.
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> See summary.
> 
> 
> Diffs
> -----
> 
>   src/master/allocator/mesos/hierarchical.cpp df8bccaf2b8cfc0cb5ca18d4867371ae7a84c12f 
> 
> Diff: https://reviews.apache.org/r/42222/diff/
> 
> 
> Testing
> -------
> 
> None: not a functional change.
> 
> 
> Thanks,
> 
> Alexander Rukletsov
> 
>


Re: Review Request 42222: Added a comment on allocator recovery.

Posted by Alexander Rukletsov <ru...@gmail.com>.

> On Jan. 15, 2016, 6:50 a.m., Ben Mahler wrote:
> > Thanks Alex! I ended up going over the structure of the recover code and left some higher level comments. There also appears to be a bug that will crash the master that I marked as an issue :)
> > 
> > Have we convinced ourselves that we want to act on partial state when quota is *not* set? It seems to complicate things here, and acting on partial state w/o quota has some issues as well?
> 
> Alexander Rukletsov wrote:
>     Thanks a bunch for a thorough review, Ben! I have opened a separate ticket to track the bug and other comments you had left: https://issues.apache.org/jira/browse/MESOS-4417.

> Have we convinced ourselves that we want to act on partial state when quota is not set?

I think it's fine, because the allocator does not provide any guarantees without quota, neither it enforces the sharing scheme computed by the sorter. There can be other scenarios other than recovery that may lead to biased cluster sharing. Do you have any specific use cases in mind why shall we *always* recover?


> On Jan. 15, 2016, 6:50 a.m., Ben Mahler wrote:
> > src/master/allocator/mesos/hierarchical.cpp, line 149
> > <https://reviews.apache.org/r/42222/diff/1/?file=1195000#file1195000line149>
> >
> >     It now seems odd to start the allocation loop within initialize rather than within recover, we'll do the following:
> >     
> >     
> >     intialize -> unpauses, starts allocation loop but the allocation loop should do nothing
> >     recover -> pauses allocations, waits for condition to be met, resumes allocation.
> >     
> >     
> >     It seems a simpler way to think about this would be to just start allocating after the recovery condition is met. This would consist of (1) started = true* and (2) the initial delay to 'batch' that starts the delay loop.
> >     
> >     * Although keeping the pause concept may be useful if we want to support pausing/resuming allocation from endpoints.

When we were thinking about how to implement it in the most consistent way, we have figured out, that the widely used pattern is:
`initialize()` -> [optional] `recover()`. Both `initialize()` and `recover()` are called from the client code. Hence we decided it would be great to ensure things work if `recover()` has not been called. We can also always call `recover()` even though there is no quota to recover.

Anyway I favour to keep the pause concept regardless which workflow we opt for.


> On Jan. 15, 2016, 6:50 a.m., Ben Mahler wrote:
> > src/master/allocator/mesos/hierarchical.cpp, line 168
> > <https://reviews.apache.org/r/42222/diff/1/?file=1195000#file1195000line168>
> >
> >     Should this comment be moved down?
> >     
> >     I'm also not quite sure what it's expressing, is it a re-hash of the lifecycle documentation?
> >     
> >     ```
> >        * Because it is hard to define recovery for a running allocator, this
> >        * method should be called after `initialize()`, but before actual
> >        * allocation starts (i.e. `addSlave()` is called).
> >     ```
> >     
> >     For example:
> >     
> >     ```
> >     CHECK(initialized);
> >     
> >     // Recovery should be called after initialize and before other calls.
> >     CHECK_EQ(0u, slaves.size());
> >     CHECK_EQ(0, quotaRoleSorter->count());
> >     ```
> >     
> >     Could we extend our existing 'initialized' state boolean into an enum to express UNINITIALIZED, INITIALIZED, RECOVERED? As far as I can tell, the existing methods with `CHECK(initialized)` are also interested in checking that recover was called?

I think the question here is whether we want recovery be optional or not. If it is optional, tracking state transitions may become tedious.


> On Jan. 15, 2016, 6:50 a.m., Ben Mahler wrote:
> > src/master/allocator/mesos/hierarchical.cpp, lines 171-172
> > <https://reviews.apache.org/r/42222/diff/1/?file=1195000#file1195000line171>
> >
> >     Not for this review, but looks like we should update `Sorter::count` to return a size_t.
> >     
> >     The following CHECK implies that `_expectedAgentCount` should just be a size_t?
> >     
> >     ```
> >     CHECK(_expectedAgentCount >= 0);
> >     ```
> >     
> >     With a size_t we don't have to do this kind of check :)

This is a general question: shall we use `size_t` for non-negative integers or follow google style guide (and protobufs!) and use int. I'm not sure we have an agreement on this one.


> On Jan. 15, 2016, 6:50 a.m., Ben Mahler wrote:
> > src/master/allocator/mesos/hierarchical.cpp, line 468
> > <https://reviews.apache.org/r/42222/diff/1/?file=1195000#file1195000line468>
> >
> >     Hm.. did you need the static cast here?

I believe so, one is `size_t` and the other one is `int`.


- Alexander


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/42222/#review114628
-----------------------------------------------------------


On Jan. 12, 2016, 11:06 p.m., Alexander Rukletsov wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/42222/
> -----------------------------------------------------------
> 
> (Updated Jan. 12, 2016, 11:06 p.m.)
> 
> 
> Review request for mesos, Ben Mahler and Joris Van Remoortere.
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> See summary.
> 
> 
> Diffs
> -----
> 
>   src/master/allocator/mesos/hierarchical.cpp df8bccaf2b8cfc0cb5ca18d4867371ae7a84c12f 
> 
> Diff: https://reviews.apache.org/r/42222/diff/
> 
> 
> Testing
> -------
> 
> None: not a functional change.
> 
> 
> Thanks,
> 
> Alexander Rukletsov
> 
>


Re: Review Request 42222: Added a comment on allocator recovery.

Posted by Ben Mahler <be...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/42222/#review114628
-----------------------------------------------------------


Thanks Alex! I ended up going over the structure of the recover code and left some higher level comments. There also appears to be a bug that will crash the master that I marked as an issue :)

Have we convinced ourselves that we want to act on partial state when quota is *not* set? It seems to complicate things here, and acting on partial state w/o quota has some issues as well?


src/master/allocator/mesos/hierarchical.cpp (line 149)
<https://reviews.apache.org/r/42222/#comment175526>

    It now seems odd to start the allocation loop within initialize rather than within recover, we'll do the following:
    
    intialize -> unpauses, starts allocation loop but the allocation loop should do nothing
    recover -> pauses allocations, waits for condition to be met, resumes allocation.
    
    It seems a simpler way to think about this would be to just start allocating after the recovery condition is met. This would consist of (1) started = true* and (2) the initial delay to 'batch' that starts the delay loop.
    
    * Although keeping the pause concept may be useful if we want to support pausing/resuming allocation from endpoints.



src/master/allocator/mesos/hierarchical.cpp (lines 153 - 163)
<https://reviews.apache.org/r/42222/#comment175480>

    It seems a little odd to document at the function level when this interface function is already documented:
    
    ```
      /**
       * Informs the allocator of the recovered state from the master.
       *
       * Because it is hard to define recovery for a running allocator, this
       * method should be called after `initialize()`, but before actual
       * allocation starts (i.e. `addSlave()` is called).
       *
       * TODO(alexr): Consider extending the signature with expected
       * frameworks count once it is available upon the master failover.
       *
       * @param quotas A (possibly empty) collection of quotas, keyed by
       *     their role, known to the master.
       */
      virtual void recover(
          const int expectedAgentCount,
          const hashmap<std::string, Quota>& quotas) = 0;
    ```
    
    How about moving it in to the body after the checks, with some slight rephrasing:
    
    ```
    void HierarchicalAllocatorProcess::recover(
        const int _expectedAgentCount,
        const hashmap<string, Quota>& quotas)
    {
      // Recovery should start before actual allocation starts.
      CHECK(initialized);
      CHECK_EQ(0u, slaves.size());
      CHECK_EQ(0, quotaRoleSorter->count());
      CHECK(_expectedAgentCount >= 0);
    
      // If there is no quota, recovery is a no-op. Otherwise, we need
      // to delay allocations while agents are re-registering because 
      // otherwise we perform allocations on a partial view of resources!
      // We would consequently perform unnecessary allocations to satisfy
      // quota constraints, which can over-allocate non-revocable resources
      // to roles using quota. Then, frameworks in roles without quota can
      // be unnecessarily deprived of resources. We may also be unable to
      // satisfy all of the quota constraints. Repeated master failovers
      // exacerbate the issue.
      
      if (quotas.empty()) {
        VLOG(1) << "Skipping recovery of hierarchical allocator: "
                << "nothing to recover";
    
        return;
      }
      
      ...
      
    }
    ```
    
    It's great that we went through the exercise of writing this comment because it seems we should consider whether we want to avoid acting on partial state even when no quota is set.



src/master/allocator/mesos/hierarchical.cpp (line 165)
<https://reviews.apache.org/r/42222/#comment175485>

    I wonder if we can use a better word than "expected" here; something that conveys that this represents the count of agents that were registered prior to failover / persisted in the registry. For example, the following would be really clear:
    
    ```
    void HierarchicalAllocatorProcess::recover(
        const registry::Slaves& recoveredSlaves,
        const hashmap<string, Quota>& quotas)
    ```



src/master/allocator/mesos/hierarchical.cpp (line 168)
<https://reviews.apache.org/r/42222/#comment175510>

    Should this comment be moved down?
    
    I'm also not quite sure what it's expressing, is it a re-hash of the lifecycle documentation?
    
    ```
       * Because it is hard to define recovery for a running allocator, this
       * method should be called after `initialize()`, but before actual
       * allocation starts (i.e. `addSlave()` is called).
    ```
    
    For example:
    
    ```
    CHECK(initialized);
    
    // Recovery should be called after initialize and before other calls.
    CHECK_EQ(0u, slaves.size());
    CHECK_EQ(0, quotaRoleSorter->count());
    ```
    
    Could we extend our existing 'initialized' state boolean into an enum to express UNINITIALIZED, INITIALIZED, RECOVERED? As far as I can tell, the existing methods with `CHECK(initialized)` are also interested in checking that recover was called?



src/master/allocator/mesos/hierarchical.cpp (lines 171 - 172)
<https://reviews.apache.org/r/42222/#comment175481>

    Not for this review, but looks like we should update `Sorter::count` to return a size_t.
    
    The following CHECK implies that `_expectedAgentCount` should just be a size_t?
    
    ```
    CHECK(_expectedAgentCount >= 0);
    ```
    
    With a size_t we don't have to do this kind of check :)



src/master/allocator/mesos/hierarchical.cpp (lines 174 - 199)
<https://reviews.apache.org/r/42222/#comment175483>

    As a terminology thing, it seems a little strange to say that we're skipping recovery. Rather, we always recover the allocator, but in certain cases recovery might do different things. Still, not sure that we want to have the differing behavior depending on whether quota is set. For example, we could do the following:
    
    ```
    void HierarchicalAllocatorProcess::recover(
        const registry::Slaves& slaves,
        const hashmap<string, Quota>& quotas)
    {
      CHECK_EQ(INITIALIZED, state);
     
      recoveredSlaveCount = slaves.slaves().size();
     
      foreachpair (const string& role, const Quota& quota, quotas) {
        setQuota(role, quota.info);
      }
      
      // We'll start allocating once a sufficient percentage
      // of agents re-register (or the timeout elapses) to
      // avoid acting on partial state (which can lead to
      // a number of issues!).
      
      Duration recoveryTimeout = ALLOCATOR_RECOVERY_TIMEOUT;
     
      LOG(INFO) << "Allocation will start within " << recoveryTimeout
                << " once " << ALLOCATOR_RECOVERY_AGENT_THRESHOLD_PERCENT << "%"
                << " of the " << slaves.slaves().size()
                << " recovered slaves re-register";
    
      delay(recoveryTimeout, self(), &Self::_recover);
    
      state = RECOVERING;
    }
    ```
    
    We'll need to either check the threshold condition in addSlave (as is currently done) to trigger the completion of recovery, have some abstraction that satisfies a future when a threshold is met. The latter is tricky if we want to finish recovery when slaves.size() reaches the threshold (gauge-like abstraction but we have to know when to check the gauge) rather than when the count of addSlave calls reaches the threshold (counter-like abstraction, easier).



src/master/allocator/mesos/hierarchical.cpp (lines 205 - 206)
<https://reviews.apache.org/r/42222/#comment175525>

    It looks like if we trip the `resume` call in `addSlave`, this delayed resume will crash the master due to the `CHECK(paused)` that currently resides in `resume`.
    
    Something I'm missing?



src/master/allocator/mesos/hierarchical.cpp (line 468)
<https://reviews.apache.org/r/42222/#comment175533>

    Hm.. did you need the static cast here?


- Ben Mahler


On Jan. 12, 2016, 11:06 p.m., Alexander Rukletsov wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/42222/
> -----------------------------------------------------------
> 
> (Updated Jan. 12, 2016, 11:06 p.m.)
> 
> 
> Review request for mesos, Ben Mahler and Joris Van Remoortere.
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> See summary.
> 
> 
> Diffs
> -----
> 
>   src/master/allocator/mesos/hierarchical.cpp df8bccaf2b8cfc0cb5ca18d4867371ae7a84c12f 
> 
> Diff: https://reviews.apache.org/r/42222/diff/
> 
> 
> Testing
> -------
> 
> None: not a functional change.
> 
> 
> Thanks,
> 
> Alexander Rukletsov
> 
>


Re: Review Request 42222: Added a comment on allocator recovery.

Posted by Joris Van Remoortere <jo...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/42222/#review114888
-----------------------------------------------------------

Ship it!


I've rephrased and moved the comment as per BenM's suggestion.
Let's tackle the other comments in separate reviews. I think it is valuable to add this comment now regardless.

- Joris Van Remoortere


On Jan. 12, 2016, 11:06 p.m., Alexander Rukletsov wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/42222/
> -----------------------------------------------------------
> 
> (Updated Jan. 12, 2016, 11:06 p.m.)
> 
> 
> Review request for mesos, Ben Mahler and Joris Van Remoortere.
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> See summary.
> 
> 
> Diffs
> -----
> 
>   src/master/allocator/mesos/hierarchical.cpp df8bccaf2b8cfc0cb5ca18d4867371ae7a84c12f 
> 
> Diff: https://reviews.apache.org/r/42222/diff/
> 
> 
> Testing
> -------
> 
> None: not a functional change.
> 
> 
> Thanks,
> 
> Alexander Rukletsov
> 
>


Re: Review Request 42222: Added a comment on allocator recovery.

Posted by Guangya Liu <gy...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/42222/#review114125
-----------------------------------------------------------

Ship it!


Ship It!

- Guangya Liu


On 一月 12, 2016, 11:06 p.m., Alexander Rukletsov wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/42222/
> -----------------------------------------------------------
> 
> (Updated 一月 12, 2016, 11:06 p.m.)
> 
> 
> Review request for mesos, Ben Mahler and Joris Van Remoortere.
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> See summary.
> 
> 
> Diffs
> -----
> 
>   src/master/allocator/mesos/hierarchical.cpp df8bccaf2b8cfc0cb5ca18d4867371ae7a84c12f 
> 
> Diff: https://reviews.apache.org/r/42222/diff/
> 
> 
> Testing
> -------
> 
> None: not a functional change.
> 
> 
> Thanks,
> 
> Alexander Rukletsov
> 
>


Re: Review Request 42222: Added a comment on allocator recovery.

Posted by Mesos ReviewBot <re...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/42222/#review114161
-----------------------------------------------------------


Patch looks great!

Reviews applied: [42221, 42222]

Passed command: export OS=ubuntu:14.04;export CONFIGURATION="--verbose";export COMPILER=gcc; ./support/docker_build.sh

- Mesos ReviewBot


On Jan. 12, 2016, 11:06 p.m., Alexander Rukletsov wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/42222/
> -----------------------------------------------------------
> 
> (Updated Jan. 12, 2016, 11:06 p.m.)
> 
> 
> Review request for mesos, Ben Mahler and Joris Van Remoortere.
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> See summary.
> 
> 
> Diffs
> -----
> 
>   src/master/allocator/mesos/hierarchical.cpp df8bccaf2b8cfc0cb5ca18d4867371ae7a84c12f 
> 
> Diff: https://reviews.apache.org/r/42222/diff/
> 
> 
> Testing
> -------
> 
> None: not a functional change.
> 
> 
> Thanks,
> 
> Alexander Rukletsov
> 
>