You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@mesos.apache.org by Benjamin Bannier <bb...@apache.org> on 2019/06/26 23:55:11 UTC

Review Request 70959: Cleared agent drain state when draining is finished.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70959/
-----------------------------------------------------------

Review request for mesos, Greg Mann and Joseph Wu.


Bugs: MESOS-9860
    https://issues.apache.org/jira/browse/MESOS-9860


Repository: mesos


Description
-------

Once a draining agent has neither frameworks with pending tasks nor any
executors with either queued or launched tasks it has finished draining.
This patch adds handling of that case which clears both the in-memory
and persisted drain configuration.


Diffs
-----

  src/slave/slave.cpp 30039b0857a4d85b4b96fa95d7f8724d57cdec6e 


Diff: https://reviews.apache.org/r/70959/diff/1/


Testing
-------

`make check`

tested as part of https://reviews.apache.org/r/70960/


Thanks,

Benjamin Bannier


Re: Review Request 70959: Cleared agent drain state when draining is finished.

Posted by Benjamin Bannier <bb...@apache.org>.

> On June 28, 2019, 3:38 a.m., Greg Mann wrote:
> > src/slave/slave.cpp
> > Lines 9783-9795 (patched)
> > <https://reviews.apache.org/r/70959/diff/3/?file=2152676#file2152676line9783>
> >
> >     I don't think it's sufficient to check `framework->idle()`, since `Framework::idle()` does not check `operations` at all. Could this block be replaced by:
> >     
> >     ```
> >     if (!frameworks.empty() || !operations.empty()) {
> >       return;
> >     }
> >     ```
> >     ?

The first check this expression performs is actually `operations.empty()`, but this is a good point about `idle`: one scenario we call this function from is for `idle` frameworks which are `removeFramework`'d, i.e., it is sufficient to check that `frameworks.empty()`.

I have rewritten the code as suggested by you, but kept the temporary for readability.


- Benjamin


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70959/#review216218
-----------------------------------------------------------


On June 28, 2019, 11:27 a.m., Benjamin Bannier wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70959/
> -----------------------------------------------------------
> 
> (Updated June 28, 2019, 11:27 a.m.)
> 
> 
> Review request for mesos, Greg Mann and Joseph Wu.
> 
> 
> Bugs: MESOS-9860
>     https://issues.apache.org/jira/browse/MESOS-9860
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Once a draining agent has neither frameworks with pending tasks nor any
> executors with either queued or launched tasks it has finished draining.
> This patch adds handling of that case which clears both the in-memory
> and persisted drain configuration.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 6954f53ff1531b9fcb688ef76acddf6a3d849a41 
>   src/slave/slave.cpp 30039b0857a4d85b4b96fa95d7f8724d57cdec6e 
> 
> 
> Diff: https://reviews.apache.org/r/70959/diff/4/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> tested as part of https://reviews.apache.org/r/70960/
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>


Re: Review Request 70959: Cleared agent drain state when draining is finished.

Posted by Greg Mann <gr...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70959/#review216218
-----------------------------------------------------------




src/slave/slave.cpp
Lines 9783-9795 (patched)
<https://reviews.apache.org/r/70959/#comment303326>

    I don't think it's sufficient to check `framework->idle()`, since `Framework::idle()` does not check `operations` at all. Could this block be replaced by:
    
    ```
    if (!frameworks.empty() || !operations.empty()) {
      return;
    }
    ```
    ?


- Greg Mann


On June 27, 2019, 11:11 p.m., Benjamin Bannier wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70959/
> -----------------------------------------------------------
> 
> (Updated June 27, 2019, 11:11 p.m.)
> 
> 
> Review request for mesos, Greg Mann and Joseph Wu.
> 
> 
> Bugs: MESOS-9860
>     https://issues.apache.org/jira/browse/MESOS-9860
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Once a draining agent has neither frameworks with pending tasks nor any
> executors with either queued or launched tasks it has finished draining.
> This patch adds handling of that case which clears both the in-memory
> and persisted drain configuration.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 6954f53ff1531b9fcb688ef76acddf6a3d849a41 
>   src/slave/slave.cpp 30039b0857a4d85b4b96fa95d7f8724d57cdec6e 
> 
> 
> Diff: https://reviews.apache.org/r/70959/diff/3/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> tested as part of https://reviews.apache.org/r/70960/
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>


Re: Review Request 70959: Cleared agent drain state when draining is finished.

Posted by Greg Mann <gr...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70959/#review216611
-----------------------------------------------------------


Ship it!




Ship It!

- Greg Mann


On June 28, 2019, 9:27 a.m., Benjamin Bannier wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70959/
> -----------------------------------------------------------
> 
> (Updated June 28, 2019, 9:27 a.m.)
> 
> 
> Review request for mesos, Greg Mann and Joseph Wu.
> 
> 
> Bugs: MESOS-9860
>     https://issues.apache.org/jira/browse/MESOS-9860
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Once a draining agent has neither frameworks with pending tasks nor any
> executors with either queued or launched tasks it has finished draining.
> This patch adds handling of that case which clears both the in-memory
> and persisted drain configuration.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 6954f53ff1531b9fcb688ef76acddf6a3d849a41 
>   src/slave/slave.cpp 30039b0857a4d85b4b96fa95d7f8724d57cdec6e 
> 
> 
> Diff: https://reviews.apache.org/r/70959/diff/4/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> tested as part of https://reviews.apache.org/r/70960/
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>


Re: Review Request 70959: Cleared agent drain state when draining is finished.

Posted by Benjamin Bannier <bb...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70959/
-----------------------------------------------------------

(Updated June 28, 2019, 11:27 a.m.)


Review request for mesos, Greg Mann and Joseph Wu.


Changes
-------

Address comment from Greg, fix formatting


Bugs: MESOS-9860
    https://issues.apache.org/jira/browse/MESOS-9860


Repository: mesos


Description
-------

Once a draining agent has neither frameworks with pending tasks nor any
executors with either queued or launched tasks it has finished draining.
This patch adds handling of that case which clears both the in-memory
and persisted drain configuration.


Diffs (updated)
-----

  src/slave/slave.hpp 6954f53ff1531b9fcb688ef76acddf6a3d849a41 
  src/slave/slave.cpp 30039b0857a4d85b4b96fa95d7f8724d57cdec6e 


Diff: https://reviews.apache.org/r/70959/diff/4/

Changes: https://reviews.apache.org/r/70959/diff/3-4/


Testing
-------

`make check`

tested as part of https://reviews.apache.org/r/70960/


Thanks,

Benjamin Bannier


Re: Review Request 70959: Cleared agent drain state when draining is finished.

Posted by Benjamin Bannier <bb...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70959/
-----------------------------------------------------------

(Updated June 28, 2019, 1:11 a.m.)


Review request for mesos, Greg Mann and Joseph Wu.


Changes
-------

Move draining to removal of operations and frameworks instead of being tied to acknowledgements


Bugs: MESOS-9860
    https://issues.apache.org/jira/browse/MESOS-9860


Repository: mesos


Description
-------

Once a draining agent has neither frameworks with pending tasks nor any
executors with either queued or launched tasks it has finished draining.
This patch adds handling of that case which clears both the in-memory
and persisted drain configuration.


Diffs (updated)
-----

  src/slave/slave.hpp 6954f53ff1531b9fcb688ef76acddf6a3d849a41 
  src/slave/slave.cpp 30039b0857a4d85b4b96fa95d7f8724d57cdec6e 


Diff: https://reviews.apache.org/r/70959/diff/3/

Changes: https://reviews.apache.org/r/70959/diff/2-3/


Testing
-------

`make check`

tested as part of https://reviews.apache.org/r/70960/


Thanks,

Benjamin Bannier


Re: Review Request 70959: Cleared agent drain state when draining is finished.

Posted by Benjamin Bannier <bb...@apache.org>.

> On June 27, 2019, 10:17 p.m., Vinod Kone wrote:
> > src/slave/slave.cpp
> > Lines 4876 (patched)
> > <https://reviews.apache.org/r/70959/diff/2/?file=2152613#file2152613line4876>
> >
> >     there are several places in this file where `removeFramework` is called, why is this only being called here? i think it would be less error prone and future proof to call this from within `removeFramework()` ?

I have moved this into `removeFramework` and similar moved the operation-related trigger to `removeOperation`.


> On June 27, 2019, 10:17 p.m., Vinod Kone wrote:
> > src/slave/slave.cpp
> > Lines 9791-9801 (patched)
> > <https://reviews.apache.org/r/70959/diff/2/?file=2152613#file2152613line9791>
> >
> >     can this be replaced with `return framework->idle()`?

This was not have been possible with the function being called from the status update continuations since in that case executors would only have been removed once the containerizer had finished waiting for it. Now that this is invoked during framework (or operation) removal the executors are gone, and this works. Fixed.


- Benjamin


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70959/#review216201
-----------------------------------------------------------


On June 28, 2019, 1:11 a.m., Benjamin Bannier wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70959/
> -----------------------------------------------------------
> 
> (Updated June 28, 2019, 1:11 a.m.)
> 
> 
> Review request for mesos, Greg Mann and Joseph Wu.
> 
> 
> Bugs: MESOS-9860
>     https://issues.apache.org/jira/browse/MESOS-9860
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Once a draining agent has neither frameworks with pending tasks nor any
> executors with either queued or launched tasks it has finished draining.
> This patch adds handling of that case which clears both the in-memory
> and persisted drain configuration.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 6954f53ff1531b9fcb688ef76acddf6a3d849a41 
>   src/slave/slave.cpp 30039b0857a4d85b4b96fa95d7f8724d57cdec6e 
> 
> 
> Diff: https://reviews.apache.org/r/70959/diff/3/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> tested as part of https://reviews.apache.org/r/70960/
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>


Re: Review Request 70959: Cleared agent drain state when draining is finished.

Posted by Vinod Kone <vi...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70959/#review216201
-----------------------------------------------------------




src/slave/slave.cpp
Lines 4876 (patched)
<https://reviews.apache.org/r/70959/#comment303285>

    there are several places in this file where `removeFramework` is called, why is this only being called here? i think it would be less error prone and future proof to call this from within `removeFramework()` ?



src/slave/slave.cpp
Lines 9791-9801 (patched)
<https://reviews.apache.org/r/70959/#comment303284>

    can this be replaced with `return framework->idle()`?


- Vinod Kone


On June 27, 2019, 3:24 p.m., Benjamin Bannier wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70959/
> -----------------------------------------------------------
> 
> (Updated June 27, 2019, 3:24 p.m.)
> 
> 
> Review request for mesos, Greg Mann and Joseph Wu.
> 
> 
> Bugs: MESOS-9860
>     https://issues.apache.org/jira/browse/MESOS-9860
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Once a draining agent has neither frameworks with pending tasks nor any
> executors with either queued or launched tasks it has finished draining.
> This patch adds handling of that case which clears both the in-memory
> and persisted drain configuration.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 6954f53ff1531b9fcb688ef76acddf6a3d849a41 
>   src/slave/slave.cpp 30039b0857a4d85b4b96fa95d7f8724d57cdec6e 
> 
> 
> Diff: https://reviews.apache.org/r/70959/diff/2/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> tested as part of https://reviews.apache.org/r/70960/
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>


Re: Review Request 70959: Cleared agent drain state when draining is finished.

Posted by Benjamin Bannier <bb...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70959/
-----------------------------------------------------------

(Updated June 27, 2019, 5:24 p.m.)


Review request for mesos, Greg Mann and Joseph Wu.


Changes
-------

Include pending operations when checking whether draining is done


Bugs: MESOS-9860
    https://issues.apache.org/jira/browse/MESOS-9860


Repository: mesos


Description
-------

Once a draining agent has neither frameworks with pending tasks nor any
executors with either queued or launched tasks it has finished draining.
This patch adds handling of that case which clears both the in-memory
and persisted drain configuration.


Diffs (updated)
-----

  src/slave/slave.hpp 6954f53ff1531b9fcb688ef76acddf6a3d849a41 
  src/slave/slave.cpp 30039b0857a4d85b4b96fa95d7f8724d57cdec6e 


Diff: https://reviews.apache.org/r/70959/diff/2/

Changes: https://reviews.apache.org/r/70959/diff/1-2/


Testing
-------

`make check`

tested as part of https://reviews.apache.org/r/70960/


Thanks,

Benjamin Bannier


Re: Review Request 70959: Cleared agent drain state when draining is finished.

Posted by Greg Mann <gr...@mesosphere.io>.

> On June 27, 2019, 8:20 a.m., Greg Mann wrote:
> > src/slave/slave.cpp
> > Lines 4879-4880 (patched)
> > <https://reviews.apache.org/r/70959/diff/1/?file=2152405#file2152405line4879>
> >
> >     I'm not sure, but will any frameworks exist on the agent once all tasks are terminal and ACKed? I'm wondering if we can just check whether or not `frameworks` is empty to determine if all tasks are gone?
> 
> Benjamin Bannier wrote:
>     This is exactly the point where all tasks could be ack'ed, but frameworks might still not be `idle` so `frameworks` wouldn't be empty. Do we care about including whether there are still executors running on the agent when checking whether we are done or not? In the former case we'd need to perform this check elsewhere.

Ohh I see good point, executors will still be running. Nope I think it's find if executors are running, we don't need to wait for them to be shutdown.


- Greg


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70959/#review216181
-----------------------------------------------------------


On June 27, 2019, 3:24 p.m., Benjamin Bannier wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70959/
> -----------------------------------------------------------
> 
> (Updated June 27, 2019, 3:24 p.m.)
> 
> 
> Review request for mesos, Greg Mann and Joseph Wu.
> 
> 
> Bugs: MESOS-9860
>     https://issues.apache.org/jira/browse/MESOS-9860
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Once a draining agent has neither frameworks with pending tasks nor any
> executors with either queued or launched tasks it has finished draining.
> This patch adds handling of that case which clears both the in-memory
> and persisted drain configuration.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 6954f53ff1531b9fcb688ef76acddf6a3d849a41 
>   src/slave/slave.cpp 30039b0857a4d85b4b96fa95d7f8724d57cdec6e 
> 
> 
> Diff: https://reviews.apache.org/r/70959/diff/2/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> tested as part of https://reviews.apache.org/r/70960/
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>


Re: Review Request 70959: Cleared agent drain state when draining is finished.

Posted by Benjamin Bannier <bb...@apache.org>.

> On June 27, 2019, 10:20 a.m., Greg Mann wrote:
> > src/slave/slave.cpp
> > Lines 4879-4880 (patched)
> > <https://reviews.apache.org/r/70959/diff/1/?file=2152405#file2152405line4879>
> >
> >     I'm not sure, but will any frameworks exist on the agent once all tasks are terminal and ACKed? I'm wondering if we can just check whether or not `frameworks` is empty to determine if all tasks are gone?

This is exactly the point where all tasks could be ack'ed, but frameworks might still not be `idle` so `frameworks` wouldn't be empty. Do we care about including whether there are still executors running on the agent when checking whether we are done or not? In the former case we'd need to perform this check elsewhere.


- Benjamin


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70959/#review216181
-----------------------------------------------------------


On June 27, 2019, 5:24 p.m., Benjamin Bannier wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70959/
> -----------------------------------------------------------
> 
> (Updated June 27, 2019, 5:24 p.m.)
> 
> 
> Review request for mesos, Greg Mann and Joseph Wu.
> 
> 
> Bugs: MESOS-9860
>     https://issues.apache.org/jira/browse/MESOS-9860
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Once a draining agent has neither frameworks with pending tasks nor any
> executors with either queued or launched tasks it has finished draining.
> This patch adds handling of that case which clears both the in-memory
> and persisted drain configuration.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 6954f53ff1531b9fcb688ef76acddf6a3d849a41 
>   src/slave/slave.cpp 30039b0857a4d85b4b96fa95d7f8724d57cdec6e 
> 
> 
> Diff: https://reviews.apache.org/r/70959/diff/2/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> tested as part of https://reviews.apache.org/r/70960/
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>


Re: Review Request 70959: Cleared agent drain state when draining is finished.

Posted by Greg Mann <gr...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70959/#review216181
-----------------------------------------------------------



High level comment: the agent is only finished draining once all terminal task status updates are ACKed, AND once all terminal operation updates are ACKed. So we need similar code in the operation update acknowledgement path, and we need to have a way to verify that both tasks and operations are all finished.


src/slave/slave.cpp
Lines 4879-4880 (patched)
<https://reviews.apache.org/r/70959/#comment303244>

    I'm not sure, but will any frameworks exist on the agent once all tasks are terminal and ACKed? I'm wondering if we can just check whether or not `frameworks` is empty to determine if all tasks are gone?


- Greg Mann


On June 26, 2019, 11:55 p.m., Benjamin Bannier wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70959/
> -----------------------------------------------------------
> 
> (Updated June 26, 2019, 11:55 p.m.)
> 
> 
> Review request for mesos, Greg Mann and Joseph Wu.
> 
> 
> Bugs: MESOS-9860
>     https://issues.apache.org/jira/browse/MESOS-9860
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Once a draining agent has neither frameworks with pending tasks nor any
> executors with either queued or launched tasks it has finished draining.
> This patch adds handling of that case which clears both the in-memory
> and persisted drain configuration.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.cpp 30039b0857a4d85b4b96fa95d7f8724d57cdec6e 
> 
> 
> Diff: https://reviews.apache.org/r/70959/diff/1/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> tested as part of https://reviews.apache.org/r/70960/
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>