You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@mesos.apache.org by Adam B <ad...@mesosphere.io> on 2014/03/11 03:18:05 UTC

Review Request 19006: Moved slave deactivation code out of Master::exited() into new deactivate(Slave).

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19006/
-----------------------------------------------------------

Review request for mesos and Vinod Kone.


Bugs: MESOS-804
    https://issues.apache.org/jira/browse/MESOS-804


Repository: mesos-git


Description
-------

Moved slave deactivation code out of Master::exited() into new deactivate(Slave).

Still need to reconcile this code with deactivateSlave(SlaveId) which right now just calls removeSlave(Slave).
>From what I can tell, checkpointing slaves can just be deactivated, but non-checkpointing slaves get removed altogether. Correct?
I welcome thoughts/suggestions on how to clean this up.


Diffs
-----

  src/master/master.hpp 49a3e15 
  src/master/master.cpp 2a40333 

Diff: https://reviews.apache.org/r/19006/diff/


Testing
-------

make check


Thanks,

Adam B


Re: Review Request 19006: Moved slave deactivation code out of Master::exited() into new deactivate(Slave).

Posted by Adam B <ad...@mesosphere.io>.

> On March 18, 2014, 9:39 p.m., Ben Mahler wrote:
> > src/master/master.cpp, line 1114
> > <https://reviews.apache.org/r/19006/diff/2/?file=526925#file526925line1114>
> >
> >     I applaud the effort here to clean this up but it looks like the meaning of the words used for slave states are inconsistent now:
> >     
> >     1. The SlaveObserver "disconnects" a slave, which shuts the slave down, and marks it as "deactivated" once removed.
> >     
> >     2. We have "deactivate", which marks the slave as "disconnected".
> >     
> >     Seems like we should either keep the terminology consistent or choose better terminology if we feel "activated" and "deactivated" were a poor choice in retrospect.

There's definitely some confusing language around various slave states (and framework states).
In response to your comments, I made the following changes:
1. Renamed SlaveObserver.disconnect() (formerly deactivate()) to shutdown() since it calls what is now shutdownSlave() (was deactivateSlave).
2. Renamed deactivate(Slave) to disconnect(Slave).
Now the only remaining [de]activate terminology associated with slaves are the slaves.activated/deactivated collections.

I now identify the following 3 states for slaves:
A. Connected: Slave exists in slaves.activated, slave.disconnected=false; disconnects when a checkpointing slave hits exited().
B. Disconnected: Slave exists in slaves.activated, slave.disconnected=true; reconnects on reregisterSlave.
C. Shutdown: Slave removed from slaves.activated, pid added to slaves.deactivated; readded to slaves.activated on registerSlave.
I propose that we rename slaves.activated/deactivated to slaves.running/shutdown to avoid confusion with the framework.active state and deactivateFramework message/action.

The "deactivate" terminology for slaves is especially confusing because "disconnecting" a slave is analagous to "deactivating" a framework.
Here are the framework states:
A. Active: Framework exists in frameworks.activated, framework.active=true; goes inactive on exited().
B. Inactive: Framework exists in frameworks.activated, framework.active=false; reactivated on reregister (if before failoverTimeout).
C. Completed: Framework moved to frameworks.completed; never goes back.
I propose that we rename frameworks.activated to frameworks.running, because you shouldn't have an inactive slave in slaves.activated, but you could in slaves.running.

Renaming slaves.activated/deactivated and frameworks.activated can happen in a subsequent review.


- Adam


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19006/#review37684
-----------------------------------------------------------


On March 25, 2014, 3:22 a.m., Adam B wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/19006/
> -----------------------------------------------------------
> 
> (Updated March 25, 2014, 3:22 a.m.)
> 
> 
> Review request for mesos and Vinod Kone.
> 
> 
> Bugs: MESOS-804
>     https://issues.apache.org/jira/browse/MESOS-804
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Moved slave deactivation code out of Master::exited() into new deactivate(Slave).
> 
> 
> Diffs
> -----
> 
>   src/master/master.hpp a8ed5ec 
>   src/master/master.cpp a951a7a 
> 
> Diff: https://reviews.apache.org/r/19006/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Adam B
> 
>


Re: Review Request 19006: Moved slave deactivation code out of Master::exited() into new deactivate(Slave).

Posted by Ben Mahler <be...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19006/#review37684
-----------------------------------------------------------



src/master/master.cpp
<https://reviews.apache.org/r/19006/#comment69326>

    I applaud the effort here to clean this up but it looks like the meaning of the words used for slave states are inconsistent now:
    
    1. The SlaveObserver "disconnects" a slave, which shuts the slave down, and marks it as "deactivated" once removed.
    
    2. We have "deactivate", which marks the slave as "disconnected".
    
    Seems like we should either keep the terminology consistent or choose better terminology if we feel "activated" and "deactivated" were a poor choice in retrospect.


- Ben Mahler


On March 19, 2014, 2:59 a.m., Adam B wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/19006/
> -----------------------------------------------------------
> 
> (Updated March 19, 2014, 2:59 a.m.)
> 
> 
> Review request for mesos and Vinod Kone.
> 
> 
> Bugs: MESOS-804
>     https://issues.apache.org/jira/browse/MESOS-804
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Moved slave deactivation code out of Master::exited() into new deactivate(Slave).
> 
> 
> Diffs
> -----
> 
>   src/master/master.hpp 0c7c520 
>   src/master/master.cpp 6da7766 
> 
> Diff: https://reviews.apache.org/r/19006/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Adam B
> 
>


Re: Review Request 19006: Moved slave disconnection code out of Master::exited() into new disconnect(Slave).

Posted by Mesos ReviewBot <de...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19006/#review38999
-----------------------------------------------------------


Patch looks great!

Reviews applied: [19006]

All tests passed.

- Mesos ReviewBot


On March 27, 2014, 4:41 a.m., Adam B wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/19006/
> -----------------------------------------------------------
> 
> (Updated March 27, 2014, 4:41 a.m.)
> 
> 
> Review request for mesos and Ben Mahler.
> 
> 
> Bugs: MESOS-804
>     https://issues.apache.org/jira/browse/MESOS-804
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Moved code to disconnect a checkpointing slave out of Master::exited() and into a new disconnect(Slave), to be re-used by slave authentication code.
> Also renamed deactivateSlave() to shutdownSlave(), since it actually sends a ShutdownMessage and calls removeSlave(). This avoids confusion with deactivateFramework, which is more analagous to disconnect(Slave).
> 
> 
> Diffs
> -----
> 
>   src/master/master.hpp b6b9983 
>   src/master/master.cpp 5d0ddb0 
> 
> Diff: https://reviews.apache.org/r/19006/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Adam B
> 
>


Re: Review Request 19006: Moved slave disconnection code out of Master::exited() into new disconnect(Slave).

Posted by Benjamin Hindman <be...@berkeley.edu>.

> On April 2, 2014, 1:29 a.m., Benjamin Hindman wrote:
> > Ship It!

This change is independent of the naming in 'Slaves', so let's get it committed and follow up with another review. Thanks Adam.


- Benjamin


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19006/#review39227
-----------------------------------------------------------


On March 27, 2014, 4:41 a.m., Adam B wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/19006/
> -----------------------------------------------------------
> 
> (Updated March 27, 2014, 4:41 a.m.)
> 
> 
> Review request for mesos and Ben Mahler.
> 
> 
> Bugs: MESOS-804
>     https://issues.apache.org/jira/browse/MESOS-804
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Moved code to disconnect a checkpointing slave out of Master::exited() and into a new disconnect(Slave), to be re-used by slave authentication code.
> Also renamed deactivateSlave() to shutdownSlave(), since it actually sends a ShutdownMessage and calls removeSlave(). This avoids confusion with deactivateFramework, which is more analagous to disconnect(Slave).
> 
> 
> Diffs
> -----
> 
>   src/master/master.hpp b6b9983 
>   src/master/master.cpp 5d0ddb0 
> 
> Diff: https://reviews.apache.org/r/19006/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Adam B
> 
>


Re: Review Request 19006: Moved slave disconnection code out of Master::exited() into new disconnect(Slave).

Posted by Benjamin Hindman <be...@berkeley.edu>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19006/#review39227
-----------------------------------------------------------

Ship it!


Ship It!

- Benjamin Hindman


On March 27, 2014, 4:41 a.m., Adam B wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/19006/
> -----------------------------------------------------------
> 
> (Updated March 27, 2014, 4:41 a.m.)
> 
> 
> Review request for mesos and Ben Mahler.
> 
> 
> Bugs: MESOS-804
>     https://issues.apache.org/jira/browse/MESOS-804
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Moved code to disconnect a checkpointing slave out of Master::exited() and into a new disconnect(Slave), to be re-used by slave authentication code.
> Also renamed deactivateSlave() to shutdownSlave(), since it actually sends a ShutdownMessage and calls removeSlave(). This avoids confusion with deactivateFramework, which is more analagous to disconnect(Slave).
> 
> 
> Diffs
> -----
> 
>   src/master/master.hpp b6b9983 
>   src/master/master.cpp 5d0ddb0 
> 
> Diff: https://reviews.apache.org/r/19006/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Adam B
> 
>


Re: Review Request 19006: Moved slave disconnection code out of Master::exited() into new disconnect(Slave).

Posted by Adam B <ad...@mesosphere.io>.

> On April 1, 2014, 4:57 p.m., Benjamin Hindman wrote:
> > src/master/master.hpp, line 390
> > <https://reviews.apache.org/r/19006/diff/4/?file=537795#file537795line390>
> >
> >     What about the use of activated/deactivated here? Can we have 'connected', 'disconnected', and 'removed' where 'removed' will eventually be removed (no pun intended) once the registrar is fully in place?

Filed new JIRA MESOS-1188 to "Rename slaves/frameworks.activated/deactivated". We can discuss naming suggestions there.


- Adam


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19006/#review39219
-----------------------------------------------------------


On March 26, 2014, 9:41 p.m., Adam B wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/19006/
> -----------------------------------------------------------
> 
> (Updated March 26, 2014, 9:41 p.m.)
> 
> 
> Review request for mesos and Ben Mahler.
> 
> 
> Bugs: MESOS-804
>     https://issues.apache.org/jira/browse/MESOS-804
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Moved code to disconnect a checkpointing slave out of Master::exited() and into a new disconnect(Slave), to be re-used by slave authentication code.
> Also renamed deactivateSlave() to shutdownSlave(), since it actually sends a ShutdownMessage and calls removeSlave(). This avoids confusion with deactivateFramework, which is more analagous to disconnect(Slave).
> 
> 
> Diffs
> -----
> 
>   src/master/master.hpp b6b9983 
>   src/master/master.cpp 5d0ddb0 
> 
> Diff: https://reviews.apache.org/r/19006/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Adam B
> 
>


Re: Review Request 19006: Moved slave disconnection code out of Master::exited() into new disconnect(Slave).

Posted by Benjamin Hindman <be...@berkeley.edu>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19006/#review39219
-----------------------------------------------------------



src/master/master.hpp
<https://reviews.apache.org/r/19006/#comment71584>

    What about the use of activated/deactivated here? Can we have 'connected', 'disconnected', and 'removed' where 'removed' will eventually be removed (no pun intended) once the registrar is fully in place?


- Benjamin Hindman


On March 27, 2014, 4:41 a.m., Adam B wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/19006/
> -----------------------------------------------------------
> 
> (Updated March 27, 2014, 4:41 a.m.)
> 
> 
> Review request for mesos and Ben Mahler.
> 
> 
> Bugs: MESOS-804
>     https://issues.apache.org/jira/browse/MESOS-804
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Moved code to disconnect a checkpointing slave out of Master::exited() and into a new disconnect(Slave), to be re-used by slave authentication code.
> Also renamed deactivateSlave() to shutdownSlave(), since it actually sends a ShutdownMessage and calls removeSlave(). This avoids confusion with deactivateFramework, which is more analagous to disconnect(Slave).
> 
> 
> Diffs
> -----
> 
>   src/master/master.hpp b6b9983 
>   src/master/master.cpp 5d0ddb0 
> 
> Diff: https://reviews.apache.org/r/19006/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Adam B
> 
>


Re: Review Request 19006: Moved slave disconnection code out of Master::exited() into new disconnect(Slave).

Posted by Adam B <ad...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19006/
-----------------------------------------------------------

(Updated March 26, 2014, 9:41 p.m.)


Review request for mesos and Ben Mahler.


Changes
-------

Rebased again, to prepare for rebasing dependent review 18381 (slave authentication). Please review when you can, bmahler.


Bugs: MESOS-804
    https://issues.apache.org/jira/browse/MESOS-804


Repository: mesos-git


Description
-------

Moved code to disconnect a checkpointing slave out of Master::exited() and into a new disconnect(Slave), to be re-used by slave authentication code.
Also renamed deactivateSlave() to shutdownSlave(), since it actually sends a ShutdownMessage and calls removeSlave(). This avoids confusion with deactivateFramework, which is more analagous to disconnect(Slave).


Diffs (updated)
-----

  src/master/master.hpp b6b9983 
  src/master/master.cpp 5d0ddb0 

Diff: https://reviews.apache.org/r/19006/diff/


Testing
-------

make check


Thanks,

Adam B


Re: Review Request 19006: Moved slave disconnection code out of Master::exited() into new disconnect(Slave).

Posted by Adam B <ad...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19006/
-----------------------------------------------------------

(Updated March 25, 2014, 12:09 p.m.)


Review request for mesos and Ben Mahler.


Changes
-------

Summary/Description updates


Summary (updated)
-----------------

Moved slave disconnection code out of Master::exited() into new disconnect(Slave).


Bugs: MESOS-804
    https://issues.apache.org/jira/browse/MESOS-804


Repository: mesos-git


Description (updated)
-------

Moved code to disconnect a checkpointing slave out of Master::exited() and into a new disconnect(Slave), to be re-used by slave authentication code.
Also renamed deactivateSlave() to shutdownSlave(), since it actually sends a ShutdownMessage and calls removeSlave(). This avoids confusion with deactivateFramework, which is more analagous to disconnect(Slave).


Diffs
-----

  src/master/master.hpp a8ed5ec 
  src/master/master.cpp a951a7a 

Diff: https://reviews.apache.org/r/19006/diff/


Testing
-------

make check


Thanks,

Adam B


Re: Review Request 19006: Moved slave deactivation code out of Master::exited() into new deactivate(Slave).

Posted by Adam B <ad...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19006/
-----------------------------------------------------------

(Updated March 25, 2014, 6:53 p.m.)


Review request for mesos and Ben Mahler.


Changes
-------

assigning to @bmahler to shepherd since he is planning to rename these methods as part of registrar. -- Vinod


Bugs: MESOS-804
    https://issues.apache.org/jira/browse/MESOS-804


Repository: mesos-git


Description
-------

Moved slave deactivation code out of Master::exited() into new deactivate(Slave).


Diffs
-----

  src/master/master.hpp a8ed5ec 
  src/master/master.cpp a951a7a 

Diff: https://reviews.apache.org/r/19006/diff/


Testing
-------

make check


Thanks,

Adam B


Re: Review Request 19006: Moved slave deactivation code out of Master::exited() into new deactivate(Slave).

Posted by Adam B <ad...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19006/
-----------------------------------------------------------

(Updated March 25, 2014, 3:22 a.m.)


Review request for mesos and Vinod Kone.


Changes
-------

1. Renamed SlaveObserver.disconnect() (formerly deactivate) to shutdown() since it calls what is now shutdownSlave() (was deactivateSlave).
2. Renamed deactivate(Slave) to disconnect(Slave).
Now the only remaining [de]activate terminology associated with slaves are the slaves.activated/deactivated collections.


Bugs: MESOS-804
    https://issues.apache.org/jira/browse/MESOS-804


Repository: mesos-git


Description
-------

Moved slave deactivation code out of Master::exited() into new deactivate(Slave).


Diffs (updated)
-----

  src/master/master.hpp a8ed5ec 
  src/master/master.cpp a951a7a 

Diff: https://reviews.apache.org/r/19006/diff/


Testing
-------

make check


Thanks,

Adam B


Re: Review Request 19006: Moved slave deactivation code out of Master::exited() into new deactivate(Slave).

Posted by Adam B <ad...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19006/
-----------------------------------------------------------

(Updated March 18, 2014, 7:59 p.m.)


Review request for mesos and Vinod Kone.


Changes
-------

Rebased, made changes based on Vinod's feedback.


Bugs: MESOS-804
    https://issues.apache.org/jira/browse/MESOS-804


Repository: mesos-git


Description (updated)
-------

Moved slave deactivation code out of Master::exited() into new deactivate(Slave).


Diffs (updated)
-----

  src/master/master.hpp 0c7c520 
  src/master/master.cpp 6da7766 

Diff: https://reviews.apache.org/r/19006/diff/


Testing
-------

make check


Thanks,

Adam B


Re: Review Request 19006: Moved slave deactivation code out of Master::exited() into new deactivate(Slave).

Posted by Mesos ReviewBot <de...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19006/#review36751
-----------------------------------------------------------


Patch looks great!

Reviews applied: [19006]

All tests passed.

- Mesos ReviewBot


On March 11, 2014, 2:18 a.m., Adam B wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/19006/
> -----------------------------------------------------------
> 
> (Updated March 11, 2014, 2:18 a.m.)
> 
> 
> Review request for mesos and Vinod Kone.
> 
> 
> Bugs: MESOS-804
>     https://issues.apache.org/jira/browse/MESOS-804
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Moved slave deactivation code out of Master::exited() into new deactivate(Slave).
> 
> Still need to reconcile this code with deactivateSlave(SlaveId) which right now just calls removeSlave(Slave).
> From what I can tell, checkpointing slaves can just be deactivated, but non-checkpointing slaves get removed altogether. Correct?
> I welcome thoughts/suggestions on how to clean this up.
> 
> 
> Diffs
> -----
> 
>   src/master/master.hpp 49a3e15 
>   src/master/master.cpp 2a40333 
> 
> Diff: https://reviews.apache.org/r/19006/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Adam B
> 
>


Re: Review Request 19006: Moved slave deactivation code out of Master::exited() into new deactivate(Slave).

Posted by Vinod Kone <vi...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19006/#review37439
-----------------------------------------------------------



src/master/master.cpp
<https://reviews.apache.org/r/19006/#comment68971>

    Lets kill this TODO. It's better to explicitly remove it in exited() rather than doing it here.
    
    As an aside, we should really kill the slave checkpointing flag at this point i.e., slave should always checkpoint.



src/master/master.cpp
<https://reviews.apache.org/r/19006/#comment68973>

    I don't see any "depends on" in this review. So this should not be here?



src/master/master.cpp
<https://reviews.apache.org/r/19006/#comment68975>

    How about renaming this to shutdownSlave to avoid confusion?


- Vinod Kone


On March 11, 2014, 2:18 a.m., Adam B wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/19006/
> -----------------------------------------------------------
> 
> (Updated March 11, 2014, 2:18 a.m.)
> 
> 
> Review request for mesos and Vinod Kone.
> 
> 
> Bugs: MESOS-804
>     https://issues.apache.org/jira/browse/MESOS-804
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Moved slave deactivation code out of Master::exited() into new deactivate(Slave).
> 
> Still need to reconcile this code with deactivateSlave(SlaveId) which right now just calls removeSlave(Slave).
> From what I can tell, checkpointing slaves can just be deactivated, but non-checkpointing slaves get removed altogether. Correct?
> I welcome thoughts/suggestions on how to clean this up.
> 
> 
> Diffs
> -----
> 
>   src/master/master.hpp 49a3e15 
>   src/master/master.cpp 2a40333 
> 
> Diff: https://reviews.apache.org/r/19006/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Adam B
> 
>