Posted to dev@mesos.apache.org by Alexandra Sava <al...@gmail.com> on 2014/06/10 12:13:55 UTC

Re: Review Request 22367: Second phase: Mesos-slave support for "node drain"

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22367/
-----------------------------------------------------------

(Updated June 10, 2014, 10:13 a.m.)


Review request for mesos and Ben Mahler.


Summary (updated)
-----------------

Second phase: Mesos-slave support for "node drain"


Bugs: MESOS-544
    https://issues.apache.org/jira/browse/MESOS-544


Repository: mesos-git


Description
-------

Second phase: Mesos-slave support for "node drain"
* create a method that sends UnregisterRequestMessage to the master
* dispatch that method in the signal handler, which executes when SIGUSR1 is issued

** This review relies on https://reviews.apache.org/r/21379/
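
The two steps in the description can be sketched in isolation. This is a minimal, self-contained illustration and not the code from the patch: DrainableSlave, onSigusr1, installDrainHandler and pollDrain are hypothetical names, and the "unregister" string stands in for the real message. It also swaps in a different technique from the one described: rather than dispatching from the handler itself, the handler only sets a flag and the unregistration runs later from ordinary code, which keeps the handler async-signal-safe.

```cpp
#include <csignal>
#include <string>
#include <vector>

// Hypothetical stand-in for the slave; not the Mesos Slave class.
struct DrainableSlave {
  std::vector<std::string> sentToMaster;  // Messages "sent" to the master.

  // Stand-in for the method that sends the unregister message.
  void unregister() { sentToMaster.push_back("unregister"); }
};

// Written by the signal handler, read by the event loop.
volatile std::sig_atomic_t drainRequested = 0;

// The handler only records the request; no other work happens here.
extern "C" void onSigusr1(int) { drainRequested = 1; }

// Installs the SIGUSR1 handler. Returns false if installation failed.
bool installDrainHandler() {
  return std::signal(SIGUSR1, onSigusr1) != SIG_ERR;
}

// Called from normal (non-signal) context. Sends the unregister message
// if a drain was requested and returns true in that case.
bool pollDrain(DrainableSlave& slave) {
  if (drainRequested == 0) {
    return false;
  }
  drainRequested = 0;
  slave.unregister();
  return true;
}
```

With the handler installed, "kill -USR1 <pid>" (or std::raise(SIGUSR1) in a test) makes the next pollDrain() call send the message exactly once.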


Diffs
-----

  src/slave/slave.hpp 34687e555e6ba07863c45840aa6d07717388cf62 
  src/slave/slave.cpp 643c0882a4bab1b612b3fb6fd1004e09edf5f368 

Diff: https://reviews.apache.org/r/22367/diff/


Testing
-------


Thanks,

Alexandra Sava


Re: Review Request 22367: Second phase: Mesos-slave support for "node drain"

Posted by Mesos ReviewBot <de...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22367/#review46951
-----------------------------------------------------------


Patch looks great!

Reviews applied: [22367]

All tests passed.

- Mesos ReviewBot


On June 28, 2014, 5:22 p.m., Alexandra Sava wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22367/
> -----------------------------------------------------------
> 
> (Updated June 28, 2014, 5:22 p.m.)
> 
> 
> Review request for mesos and Ben Mahler.
> 
> 
> Bugs: MESOS-544
>     https://issues.apache.org/jira/browse/MESOS-544
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Second phase: Mesos-slave support for "node drain"
> * create a method that sends UnregisterRequestMessage to the master
> * dispatch that method in the signal handler, which executes when SIGUSR1 is issued
> 
> ** This review relies on https://reviews.apache.org/r/21379/
> 
> 
> Diffs
> -----
> 
>   src/master/master.hpp 5fef35406c2ce2ad11e030aa7752eb691aab5857 
>   src/master/master.cpp 21b07c7f1f445beac29a7781cf441dd79b1b7fb5 
>   src/slave/slave.cpp 24eb8fd9f1e168e1f0917a8c14bf82f42d2e43ce 
> 
> Diff: https://reviews.apache.org/r/22367/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Alexandra Sava
> 
>


Re: Review Request 22367: Second phase: Mesos-slave support for "node drain"

Posted by Ben Mahler <be...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22367/#review47752
-----------------------------------------------------------

Ship it!


Looks great, I made a few comments below. Since everything is pretty minor I made the updates and will commit this for you!


src/master/master.cpp
<https://reviews.apache.org/r/22367/#comment83955>

    Would this be better as a WARNING, since there is no erroneous condition?
    
    It's also what we use for all of our other ignored messages.



src/master/master.cpp
<https://reviews.apache.org/r/22367/#comment83956>

    Why not just pass 'slave' here since we've already looked it up?



src/slave/slave.cpp
<https://reviews.apache.org/r/22367/#comment83959>

    We might want to still mention that we're shutting down, how about:
    
    "; unregistering and shutting down"



src/tests/slave_recovery_tests.cpp
<https://reviews.apache.org/r/22367/#comment83968>

    Let's store the status so we can verify that it is running:
    
    
      Future<TaskStatus> status;
      EXPECT_CALL(sched, statusUpdate(_, _))
        .WillOnce(FutureArg<1>(&status));
    
      ...
    
      AWAIT_READY(status);
      ASSERT_EQ(TASK_RUNNING, status.get().state());
    
    
    Down below, let's then make sure we get the following:
    
    (1) A TASK_LOST update, which means we no longer need the .WillRepeatedly on the statusUpdate expectation above, and
    (2) A slaveLost callback on the scheduler.



src/tests/slave_recovery_tests.cpp
<https://reviews.apache.org/r/22367/#comment83960>

    We can specify the direction of this message, and FUTURE_PROTOBUF saves us from having to get the type name:
    
    FUTURE_PROTOBUF(UnregisterSlaveMessage(), slave.get(), master.get());


- Ben Mahler


On June 30, 2014, 1:38 p.m., Alexandra Sava wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22367/
> -----------------------------------------------------------
> 
> (Updated June 30, 2014, 1:38 p.m.)
> 
> 
> Review request for mesos and Ben Mahler.
> 
> 
> Bugs: MESOS-544
>     https://issues.apache.org/jira/browse/MESOS-544
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Second phase: Mesos-slave support for "node drain"
> * send UnregisterRequest message to the master
> * test that the message is being delivered to the master
> 
> ** This review relies on https://reviews.apache.org/r/21379/
> 
> 
> Diffs
> -----
> 
>   src/master/master.hpp 5fef35406c2ce2ad11e030aa7752eb691aab5857 
>   src/master/master.cpp 474014b24982449f2f64c2e8d835c25a28cbcbb8 
>   src/slave/slave.cpp f42ab60f29f38bcd8857f72fde9fc77cbfc36dde 
>   src/tests/slave_recovery_tests.cpp 582f52d73eba0e3ab089ec573d9a6c43bff0339e 
> 
> Diff: https://reviews.apache.org/r/22367/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Alexandra Sava
> 
>


Re: Review Request 22367: Second phase: Mesos-slave support for "node drain"

Posted by Mesos ReviewBot <de...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22367/#review46979
-----------------------------------------------------------


Patch looks great!

Reviews applied: [22367]

All tests passed.

- Mesos ReviewBot


On June 30, 2014, 1:38 p.m., Alexandra Sava wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22367/
> -----------------------------------------------------------
> 
> (Updated June 30, 2014, 1:38 p.m.)
> 
> 
> Review request for mesos and Ben Mahler.
> 
> 
> Bugs: MESOS-544
>     https://issues.apache.org/jira/browse/MESOS-544
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Second phase: Mesos-slave support for "node drain"
> * send UnregisterRequest message to the master
> * test that the message is being delivered to the master
> 
> ** This review relies on https://reviews.apache.org/r/21379/
> 
> 
> Diffs
> -----
> 
>   src/master/master.hpp 5fef35406c2ce2ad11e030aa7752eb691aab5857 
>   src/master/master.cpp 474014b24982449f2f64c2e8d835c25a28cbcbb8 
>   src/slave/slave.cpp f42ab60f29f38bcd8857f72fde9fc77cbfc36dde 
>   src/tests/slave_recovery_tests.cpp 582f52d73eba0e3ab089ec573d9a6c43bff0339e 
> 
> Diff: https://reviews.apache.org/r/22367/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Alexandra Sava
> 
>


Re: Review Request 22367: Second phase: Mesos-slave support for "node drain"

Posted by Alexandra Sava <al...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22367/
-----------------------------------------------------------

(Updated June 30, 2014, 1:38 p.m.)


Review request for mesos and Ben Mahler.


Bugs: MESOS-544
    https://issues.apache.org/jira/browse/MESOS-544


Repository: mesos-git


Description (updated)
-------

Second phase: Mesos-slave support for "node drain"
* send UnregisterRequest message to the master
* test that the message is being delivered to the master

** This review relies on https://reviews.apache.org/r/21379/


Diffs (updated)
-----

  src/master/master.hpp 5fef35406c2ce2ad11e030aa7752eb691aab5857 
  src/master/master.cpp 474014b24982449f2f64c2e8d835c25a28cbcbb8 
  src/slave/slave.cpp f42ab60f29f38bcd8857f72fde9fc77cbfc36dde 
  src/tests/slave_recovery_tests.cpp 582f52d73eba0e3ab089ec573d9a6c43bff0339e 

Diff: https://reviews.apache.org/r/22367/diff/


Testing
-------


Thanks,

Alexandra Sava


Re: Review Request 22367: Second phase: Mesos-slave support for "node drain"

Posted by Alexandra Sava <al...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22367/
-----------------------------------------------------------

(Updated June 28, 2014, 5:22 p.m.)


Review request for mesos and Ben Mahler.


Bugs: MESOS-544
    https://issues.apache.org/jira/browse/MESOS-544


Repository: mesos-git


Description
-------

Second phase: Mesos-slave support for "node drain"
* create a method that sends UnregisterRequestMessage to the master
* dispatch that method in the signal handler, which executes when SIGUSR1 is issued

** This review relies on https://reviews.apache.org/r/21379/


Diffs (updated)
-----

  src/master/master.hpp 5fef35406c2ce2ad11e030aa7752eb691aab5857 
  src/master/master.cpp 21b07c7f1f445beac29a7781cf441dd79b1b7fb5 
  src/slave/slave.cpp 24eb8fd9f1e168e1f0917a8c14bf82f42d2e43ce 

Diff: https://reviews.apache.org/r/22367/diff/


Testing
-------


Thanks,

Alexandra Sava


Re: Review Request 22367: Second phase: Mesos-slave support for "node drain"

Posted by Mesos ReviewBot <de...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22367/#review46516
-----------------------------------------------------------


Patch looks great!

Reviews applied: [22367]

All tests passed.

- Mesos ReviewBot


On June 19, 2014, 3:24 p.m., Alexandra Sava wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22367/
> -----------------------------------------------------------
> 
> (Updated June 19, 2014, 3:24 p.m.)
> 
> 
> Review request for mesos and Ben Mahler.
> 
> 
> Bugs: MESOS-544
>     https://issues.apache.org/jira/browse/MESOS-544
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Second phase: Mesos-slave support for "node drain"
> * create a method that sends UnregisterRequestMessage to the master
> * dispatch that method in the signal handler, which executes when SIGUSR1 is issued
> 
> ** This review relies on https://reviews.apache.org/r/21379/
> 
> 
> Diffs
> -----
> 
>   src/master/master.hpp 2844446e2674df6e11f4c2915ed324fcc103532c 
>   src/master/master.cpp 888657dd4bc50085882382908e3c48ccb857c621 
>   src/slave/slave.hpp 605ee4a9cebb3ace4cc6e9387cae1b08b2be8c6c 
>   src/slave/slave.cpp ed3483ff1762d93837f328d1e647c490cf9e14c4 
> 
> Diff: https://reviews.apache.org/r/22367/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Alexandra Sava
> 
>


Re: Review Request 22367: Second phase: Mesos-slave support for "node drain"

Posted by Ben Mahler <be...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22367/#review46247
-----------------------------------------------------------


Looks great!

The one thing I'm wondering is whether we want to expose 'unregister' as a method in the slave. Would we ever want to call 'unregister' and stay running? If not, it may make sense to just always send the unregister message when we self-terminate (when (!from) in 'shutdown').
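
A rough sketch of the alternative raised here, under stated assumptions: SelfTerminatingSlave is a hypothetical type, and an empty 'from' string plays the role of the (!from) check, meaning the slave chose to terminate on its own. It is only meant to show the shape of "always unregister on self-termination", not the actual shutdown() logic.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the slave; names are illustrative only.
struct SelfTerminatingSlave {
  std::vector<std::string> sentToMaster;  // Messages "sent" to the master.
  bool terminated = false;

  // 'from' identifies who asked for the shutdown. An empty value means
  // nobody did: the slave is terminating itself.
  void shutdown(const std::string& from) {
    if (from.empty()) {
      // Self-termination: tell the master before going away.
      sentToMaster.push_back("unregister");
    }
    terminated = true;
  }
};
```

With this shape there is no separate public 'unregister' entry point, so a slave cannot unregister and keep running by accident.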


src/master/master.cpp
<https://reviews.apache.org/r/22367/#comment81558>

    How about storing the slave in the if condition so that we don't have to look it up over and over?
    
    I would suggest the following log message so that we can identify the PID mismatch in the logs:
    
    Slave* slave = getSlave(slaveId);
    
    if (slave != NULL) {
      if (from != slave->pid) {
        LOG(ERROR) << "Ignoring unregister slave message from " << from
                   << " because it is not the slave " << slave->pid;
        return; // Ignore the message: do not remove the slave.
      }
      removeSlave(slave);
    }
    
    Notice no period at the end of the log message. :)
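
For readers without the Mesos tree at hand, the check can be tried out in a standalone form. This is a sketch under assumptions: ToyMaster, SlaveEntry and the string PIDs are hypothetical stand-ins, the log is an in-memory stream rather than glog, and it assumes that a message from a mismatched sender should be ignored rather than acted on.

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical record for a registered slave.
struct SlaveEntry {
  std::string pid;
};

// Hypothetical stand-in for the master; not the Mesos Master class.
struct ToyMaster {
  std::map<std::string, SlaveEntry> slaves;  // Keyed by slave ID.
  std::ostringstream log;                    // Stand-in for LOG(...).

  // Handles an unregister request. Returns true if the slave was removed.
  bool unregisterSlave(const std::string& from, const std::string& slaveId) {
    // Look the slave up once and reuse the result.
    std::map<std::string, SlaveEntry>::iterator it = slaves.find(slaveId);
    if (it == slaves.end()) {
      return false;  // Unknown slave: nothing to remove.
    }

    if (from != it->second.pid) {
      // Log both PIDs so the mismatch can be identified in the logs.
      log << "Ignoring unregister slave message from " << from
          << " because it is not the slave " << it->second.pid << "\n";
      return false;
    }

    slaves.erase(it);
    return true;
  }
};
```

Looking the entry up once keeps the lookup and the PID comparison in one place, and the early return makes it explicit that an ignored message leaves the slave registered.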



src/slave/slave.cpp
<https://reviews.apache.org/r/22367/#comment81559>

    How about:
    
    LOG(INFO) << message << "; sending unregister message to the master";
    
    As is done inside the shutdown() logic when we self-terminate.


- Ben Mahler


On June 19, 2014, 3:24 p.m., Alexandra Sava wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22367/
> -----------------------------------------------------------
> 
> (Updated June 19, 2014, 3:24 p.m.)
> 
> 
> Review request for mesos and Ben Mahler.
> 
> 
> Bugs: MESOS-544
>     https://issues.apache.org/jira/browse/MESOS-544
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Second phase: Mesos-slave support for "node drain"
> * create a method that sends UnregisterRequestMessage to the master
> * dispatch that method in the signal handler, which executes when SIGUSR1 is issued
> 
> ** This review relies on https://reviews.apache.org/r/21379/
> 
> 
> Diffs
> -----
> 
>   src/master/master.hpp 2844446e2674df6e11f4c2915ed324fcc103532c 
>   src/master/master.cpp 888657dd4bc50085882382908e3c48ccb857c621 
>   src/slave/slave.hpp 605ee4a9cebb3ace4cc6e9387cae1b08b2be8c6c 
>   src/slave/slave.cpp ed3483ff1762d93837f328d1e647c490cf9e14c4 
> 
> Diff: https://reviews.apache.org/r/22367/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Alexandra Sava
> 
>


Re: Review Request 22367: Second phase: Mesos-slave support for "node drain"

Posted by Alexandra Sava <al...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22367/
-----------------------------------------------------------

(Updated June 28, 2014, 5:22 p.m.)


Review request for mesos and Ben Mahler.


Bugs: MESOS-544
    https://issues.apache.org/jira/browse/MESOS-544


Repository: mesos-git


Description
-------

Second phase: Mesos-slave support for "node drain"
* create a method that sends UnregisterRequestMessage to the master
* dispatch that method in the signal handler, which executes when SIGUSR1 is issued

** This review relies on https://reviews.apache.org/r/21379/


Diffs
-----

  src/master/master.hpp 5fef35406c2ce2ad11e030aa7752eb691aab5857 
  src/master/master.cpp 21b07c7f1f445beac29a7781cf441dd79b1b7fb5 
  src/slave/slave.cpp 24eb8fd9f1e168e1f0917a8c14bf82f42d2e43ce 

Diff: https://reviews.apache.org/r/22367/diff/


Testing
-------


Thanks,

Alexandra Sava


Re: Review Request 22367: Second phase: Mesos-slave support for "node drain"

Posted by Alexandra Sava <al...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22367/
-----------------------------------------------------------

(Updated June 19, 2014, 3:24 p.m.)


Review request for mesos and Ben Mahler.


Bugs: MESOS-544
    https://issues.apache.org/jira/browse/MESOS-544


Repository: mesos-git


Description
-------

Second phase: Mesos-slave support for "node drain"
* create a method that sends UnregisterRequestMessage to the master
* dispatch that method in the signal handler, which executes when SIGUSR1 is issued

** This review relies on https://reviews.apache.org/r/21379/


Diffs (updated)
-----

  src/master/master.hpp 2844446e2674df6e11f4c2915ed324fcc103532c 
  src/master/master.cpp 888657dd4bc50085882382908e3c48ccb857c621 
  src/slave/slave.hpp 605ee4a9cebb3ace4cc6e9387cae1b08b2be8c6c 
  src/slave/slave.cpp ed3483ff1762d93837f328d1e647c490cf9e14c4 

Diff: https://reviews.apache.org/r/22367/diff/


Testing
-------


Thanks,

Alexandra Sava


Re: Review Request 22367: Second phase: Mesos-slave support for "node drain"

Posted by Mesos ReviewBot <de...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22367/#review45326
-----------------------------------------------------------


Bad patch!

Reviews applied: [22367]

Failed command: git apply --index 22367.patch

Error:
 None

- Mesos ReviewBot


On June 10, 2014, 10:13 a.m., Alexandra Sava wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22367/
> -----------------------------------------------------------
> 
> (Updated June 10, 2014, 10:13 a.m.)
> 
> 
> Review request for mesos and Ben Mahler.
> 
> 
> Bugs: MESOS-544
>     https://issues.apache.org/jira/browse/MESOS-544
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Second phase: Mesos-slave support for "node drain"
> * create a method that sends UnregisterRequestMessage to the master
> * dispatch that method in the signal handler, which executes when SIGUSR1 is issued
> 
> ** This review relies on https://reviews.apache.org/r/21379/
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 34687e555e6ba07863c45840aa6d07717388cf62 
>   src/slave/slave.cpp 643c0882a4bab1b612b3fb6fd1004e09edf5f368 
> 
> Diff: https://reviews.apache.org/r/22367/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Alexandra Sava
> 
>


Re: Review Request 22367: Second phase: Mesos-slave support for "node drain"

Posted by Ben Mahler <be...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22367/#review45525
-----------------------------------------------------------


Could you add a test that ensures the unregister message is sent? You can do this via FUTURE_MESSAGE.

The test could also ensure that the master correctly removes the slave from the registrar.

- Ben Mahler


On June 10, 2014, 10:13 a.m., Alexandra Sava wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22367/
> -----------------------------------------------------------
> 
> (Updated June 10, 2014, 10:13 a.m.)
> 
> 
> Review request for mesos and Ben Mahler.
> 
> 
> Bugs: MESOS-544
>     https://issues.apache.org/jira/browse/MESOS-544
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Second phase: Mesos-slave support for "node drain"
> * create a method that sends UnregisterRequestMessage to the master
> * dispatch that method in the signal handler, which executes when SIGUSR1 is issued
> 
> ** This review relies on https://reviews.apache.org/r/21379/
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 34687e555e6ba07863c45840aa6d07717388cf62 
>   src/slave/slave.cpp 643c0882a4bab1b612b3fb6fd1004e09edf5f368 
> 
> Diff: https://reviews.apache.org/r/22367/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Alexandra Sava
> 
>