You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@mesos.apache.org by Ben Mahler <be...@gmail.com> on 2013/03/28 19:25:08 UTC

Review Request: When sending NoMasterDetectedMessage, we need to clear the master sequence number in order to ensure we still send NewMasterDetectedMessage if the master is the same. Otherwise, we can get stuck in a master-less state.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10161/
-----------------------------------------------------------

Review request for mesos, Benjamin Hindman and Vinod Kone.


Description
-------

This is also along the way to fixing MESOS-305.

This fixes a bug where we were not clearing the sequence number when sending a NoMasterDetectedMessage.
As a result, this removes the need for hiding NoMasterDetectedMessages from contending non-leaders.


Diffs
-----

  src/detector/detector.cpp 7a8355162d543e017505dd58efd2d7bf96f99623 

Diff: https://reviews.apache.org/r/10161/diff/


Testing
-------

make check

I'd like to add a test for this, but I don't see how to induce a timedout() call on the ZooKeeperMasterDetectorProcess, unless I add some plumbing to get a handle to the underlying process. Suggestions?


Thanks,

Ben Mahler


Re: Review Request: When sending NoMasterDetectedMessage, we need to clear the master sequence number in order to ensure we still send NewMasterDetectedMessage if the master is the same. Otherwise, we can get stuck in a master-less state.

Posted by Ben Mahler <be...@gmail.com>.

> On March 29, 2013, 7:39 p.m., Vinod Kone wrote:
> > src/detector/detector.cpp, lines 250-252
> > <https://reviews.apache.org/r/10161/diff/1/?file=275661#file275661line250>
> >
> >     isn't this redundant?

Redundant but explicit. I'll drop them since I'm not sure what our convention is.


- Ben


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10161/#review18529
-----------------------------------------------------------


On March 28, 2013, 6:25 p.m., Ben Mahler wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/10161/
> -----------------------------------------------------------
> 
> (Updated March 28, 2013, 6:25 p.m.)
> 
> 
> Review request for mesos, Benjamin Hindman and Vinod Kone.
> 
> 
> Description
> -------
> 
> This is also along the way to fixing MESOS-305.
> 
> This fixes a bug where we were not clearing the sequence number when sending a NoMasterDetectedMessage.
> As a result, this removes the need for hiding NoMasterDetectedMessages from contending non-leaders.
> 
> 
> Diffs
> -----
> 
>   src/detector/detector.cpp 7a8355162d543e017505dd58efd2d7bf96f99623 
> 
> Diff: https://reviews.apache.org/r/10161/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> I'd like to add a test for this, but I don't see how to induce a timedout() call on the ZooKeeperMasterDetectorProcess, unless I add some plumbing to get a handle to the underlying process. Suggestions?
> 
> 
> Thanks,
> 
> Ben Mahler
> 
>


Re: Review Request: When sending NoMasterDetectedMessage, we need to clear the master sequence number in order to ensure we still send NewMasterDetectedMessage if the master is the same. Otherwise, we can get stuck in a master-less state.

Posted by Vinod Kone <vi...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10161/#review18529
-----------------------------------------------------------


will wait for a test(s) before giving a shipit.


src/detector/detector.cpp
<https://reviews.apache.org/r/10161/#comment38838>

    isn't this redundant?


- Vinod Kone


On March 28, 2013, 6:25 p.m., Ben Mahler wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/10161/
> -----------------------------------------------------------
> 
> (Updated March 28, 2013, 6:25 p.m.)
> 
> 
> Review request for mesos, Benjamin Hindman and Vinod Kone.
> 
> 
> Description
> -------
> 
> This is also along the way to fixing MESOS-305.
> 
> This fixes a bug where we were not clearing the sequence number when sending a NoMasterDetectedMessage.
> As a result, this removes the need for hiding NoMasterDetectedMessages from contending non-leaders.
> 
> 
> Diffs
> -----
> 
>   src/detector/detector.cpp 7a8355162d543e017505dd58efd2d7bf96f99623 
> 
> Diff: https://reviews.apache.org/r/10161/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> I'd like to add a test for this, but I don't see how to induce a timedout() call on the ZooKeeperMasterDetectorProcess, unless I add some plumbing to get a handle to the underlying process. Suggestions?
> 
> 
> Thanks,
> 
> Ben Mahler
> 
>


Re: Review Request: When sending NoMasterDetectedMessage, we need to clear the master sequence number in order to ensure we still send NewMasterDetectedMessage if the master is the same. Otherwise, we can get stuck in a master-less state.

Posted by Ben Mahler <be...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10161/
-----------------------------------------------------------

(Updated April 30, 2013, 1:30 a.m.)


Review request for mesos, Benjamin Hindman and Vinod Kone.


Changes
-------

Rebased off trunk.


Description
-------

This is also along the way to fixing MESOS-305.

This fixes a bug where we were not clearing the sequence number when sending a NoMasterDetectedMessage.
As a result, this removes the need for hiding NoMasterDetectedMessages from contending non-leaders.


Diffs (updated)
-----

  src/detector/detector.hpp ed485bd86d2bff1046a234a5776c1081a4136bc5 
  src/detector/detector.cpp 7a8355162d543e017505dd58efd2d7bf96f99623 
  src/tests/zookeeper_tests.cpp 125b16566d5cd59732fef67d80617724ff71433b 

Diff: https://reviews.apache.org/r/10161/diff/


Testing
-------

make check

I'd like to add a test for this, but I don't see how to induce a timedout() call on the ZooKeeperMasterDetectorProcess, unless I add some plumbing to get a handle to the underlying process. Suggestions?


Thanks,

Ben Mahler


Re: Review Request: When sending NoMasterDetectedMessage, we need to clear the master sequence number in order to ensure we still send NewMasterDetectedMessage if the master is the same. Otherwise, we can get stuck in a master-less state.

Posted by Benjamin Hindman <be...@berkeley.edu>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10161/#review19903
-----------------------------------------------------------

Ship it!



src/detector/detector.cpp
<https://reviews.apache.org/r/10161/#comment41058>

    s/10.0/10/


- Benjamin Hindman


On April 24, 2013, 11:55 p.m., Ben Mahler wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/10161/
> -----------------------------------------------------------
> 
> (Updated April 24, 2013, 11:55 p.m.)
> 
> 
> Review request for mesos, Benjamin Hindman and Vinod Kone.
> 
> 
> Description
> -------
> 
> This is also along the way to fixing MESOS-305.
> 
> This fixes a bug where we were not clearing the sequence number when sending a NoMasterDetectedMessage.
> As a result, this removes the need for hiding NoMasterDetectedMessages from contending non-leaders.
> 
> 
> Diffs
> -----
> 
>   src/detector/detector.hpp ed485bd86d2bff1046a234a5776c1081a4136bc5 
>   src/detector/detector.cpp 7a8355162d543e017505dd58efd2d7bf96f99623 
>   src/tests/zookeeper_tests.cpp 125b16566d5cd59732fef67d80617724ff71433b 
> 
> Diff: https://reviews.apache.org/r/10161/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> I'd like to add a test for this, but I don't see how to induce a timedout() call on the ZooKeeperMasterDetectorProcess, unless I add some plumbing to get a handle to the underlying process. Suggestions?
> 
> 
> Thanks,
> 
> Ben Mahler
> 
>


Re: Review Request: When sending NoMasterDetectedMessage, we need to clear the master sequence number in order to ensure we still send NewMasterDetectedMessage if the master is the same. Otherwise, we can get stuck in a master-less state.

Posted by Ben Mahler <be...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10161/
-----------------------------------------------------------

(Updated April 24, 2013, 11:55 p.m.)


Review request for mesos, Benjamin Hindman and Vinod Kone.


Changes
-------

Rebased.


Description
-------

This is also along the way to fixing MESOS-305.

This fixes a bug where we were not clearing the sequence number when sending a NoMasterDetectedMessage.
As a result, this removes the need for hiding NoMasterDetectedMessages from contending non-leaders.


Diffs (updated)
-----

  src/detector/detector.hpp ed485bd86d2bff1046a234a5776c1081a4136bc5 
  src/detector/detector.cpp 7a8355162d543e017505dd58efd2d7bf96f99623 
  src/tests/zookeeper_tests.cpp 125b16566d5cd59732fef67d80617724ff71433b 

Diff: https://reviews.apache.org/r/10161/diff/


Testing
-------

make check

I'd like to add a test for this, but I don't see how to induce a timedout() call on the ZooKeeperMasterDetectorProcess, unless I add some plumbing to get a handle to the underlying process. Suggestions?


Thanks,

Ben Mahler


Re: Review Request: When sending NoMasterDetectedMessage, we need to clear the master sequence number in order to ensure we still send NewMasterDetectedMessage if the master is the same. Otherwise, we can get stuck in a master-less state.

Posted by Ben Mahler <be...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10161/
-----------------------------------------------------------

(Updated April 19, 2013, 8:54 p.m.)


Review request for mesos, Benjamin Hindman and Vinod Kone.


Changes
-------

Vinod's review.


Description
-------

This is also along the way to fixing MESOS-305.

This fixes a bug where we were not clearing the sequence number when sending a NoMasterDetectedMessage.
As a result, this removes the need for hiding NoMasterDetectedMessages from contending non-leaders.


Diffs (updated)
-----

  src/detector/detector.hpp ed485bd86d2bff1046a234a5776c1081a4136bc5 
  src/detector/detector.cpp 7a8355162d543e017505dd58efd2d7bf96f99623 
  src/tests/zookeeper_tests.cpp 0855f5aee0baef22c4ecbed1b88f14f16bfff532 

Diff: https://reviews.apache.org/r/10161/diff/


Testing
-------

make check

I'd like to add a test for this, but I don't see how to induce a timedout() call on the ZooKeeperMasterDetectorProcess, unless I add some plumbing to get a handle to the underlying process. Suggestions?


Thanks,

Ben Mahler


Re: Review Request: When sending NoMasterDetectedMessage, we need to clear the master sequence number in order to ensure we still send NewMasterDetectedMessage if the master is the same. Otherwise, we can get stuck in a master-less state.

Posted by Vinod Kone <vi...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10161/#review19464
-----------------------------------------------------------

Ship it!


thanks for the test!


src/tests/zookeeper_tests.cpp
<https://reviews.apache.org/r/10161/#comment40204>

    yay!



src/tests/zookeeper_tests.cpp
<https://reviews.apache.org/r/10161/#comment40205>

    s/1.// ?



src/tests/zookeeper_tests.cpp
<https://reviews.apache.org/r/10161/#comment40206>

    ditto



src/tests/zookeeper_tests.cpp
<https://reviews.apache.org/r/10161/#comment40207>

    ditto



src/tests/zookeeper_tests.cpp
<https://reviews.apache.org/r/10161/#comment40210>

    awesome!


- Vinod Kone


On April 19, 2013, 6:53 p.m., Ben Mahler wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/10161/
> -----------------------------------------------------------
> 
> (Updated April 19, 2013, 6:53 p.m.)
> 
> 
> Review request for mesos, Benjamin Hindman and Vinod Kone.
> 
> 
> Description
> -------
> 
> This is also along the way to fixing MESOS-305.
> 
> This fixes a bug where we were not clearing the sequence number when sending a NoMasterDetectedMessage.
> As a result, this removes the need for hiding NoMasterDetectedMessages from contending non-leaders.
> 
> 
> Diffs
> -----
> 
>   src/detector/detector.hpp ed485bd86d2bff1046a234a5776c1081a4136bc5 
>   src/detector/detector.cpp 7a8355162d543e017505dd58efd2d7bf96f99623 
>   src/tests/zookeeper_tests.cpp 0855f5aee0baef22c4ecbed1b88f14f16bfff532 
> 
> Diff: https://reviews.apache.org/r/10161/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> I'd like to add a test for this, but I don't see how to induce a timedout() call on the ZooKeeperMasterDetectorProcess, unless I add some plumbing to get a handle to the underlying process. Suggestions?
> 
> 
> Thanks,
> 
> Ben Mahler
> 
>


Re: Review Request: When sending NoMasterDetectedMessage, we need to clear the master sequence number in order to ensure we still send NewMasterDetectedMessage if the master is the same. Otherwise, we can get stuck in a master-less state.

Posted by Ben Mahler <be...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10161/
-----------------------------------------------------------

(Updated April 19, 2013, 6:53 p.m.)


Review request for mesos, Benjamin Hindman and Vinod Kone.


Changes
-------

Vinod + rebase.


Description
-------

This is also along the way to fixing MESOS-305.

This fixes a bug where we were not clearing the sequence number when sending a NoMasterDetectedMessage.
As a result, this removes the need for hiding NoMasterDetectedMessages from contending non-leaders.


Diffs (updated)
-----

  src/detector/detector.hpp ed485bd86d2bff1046a234a5776c1081a4136bc5 
  src/detector/detector.cpp 7a8355162d543e017505dd58efd2d7bf96f99623 
  src/tests/zookeeper_tests.cpp 0855f5aee0baef22c4ecbed1b88f14f16bfff532 

Diff: https://reviews.apache.org/r/10161/diff/


Testing
-------

make check

I'd like to add a test for this, but I don't see how to induce a timedout() call on the ZooKeeperMasterDetectorProcess, unless I add some plumbing to get a handle to the underlying process. Suggestions?


Thanks,

Ben Mahler


Re: Review Request: When sending NoMasterDetectedMessage, we need to clear the master sequence number in order to ensure we still send NewMasterDetectedMessage if the master is the same. Otherwise, we can get stuck in a master-less state.

Posted by Ben Mahler <be...@gmail.com>.

> On April 16, 2013, 6:25 p.m., Vinod Kone wrote:
> > src/detector/detector.hpp, line 45
> > <https://reviews.apache.org/r/10161/diff/2/?file=281414#file281414line45>
> >
> >     Why did you kill the comment? This has been our convention.


> On April 16, 2013, 6:25 p.m., Vinod Kone wrote:
> > src/tests/zookeeper_tests.cpp, lines 307-310
> > <https://reviews.apache.org/r/10161/diff/2/?file=281416#file281416line307>
> >
> >     I like this test!
> >     
> >     But, wouldn't this test pass even with old code?
> >     
> >     I thought (correct me if I'm wrong) this patch makes sure a non-leading master gets NoMasterDetected message? If yes, a test for that would be great.


> On April 16, 2013, 6:25 p.m., Vinod Kone wrote:
> > src/tests/zookeeper_tests.cpp, line 339
> > <https://reviews.apache.org/r/10161/diff/2/?file=281416#file281416line339>
> >
> >     Is there a reason why you had specifically match to detector.process and not just "_"? if you can use "_" here then you can make process back to private.

This is not a match, this is actually dispatching, I couldn't find a way to induce the reconnection call otherwise.


- Ben


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10161/#review19264
-----------------------------------------------------------


On April 16, 2013, 12:40 a.m., Ben Mahler wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/10161/
> -----------------------------------------------------------
> 
> (Updated April 16, 2013, 12:40 a.m.)
> 
> 
> Review request for mesos, Benjamin Hindman and Vinod Kone.
> 
> 
> Description
> -------
> 
> This is also along the way to fixing MESOS-305.
> 
> This fixes a bug where we were not clearing the sequence number when sending a NoMasterDetectedMessage.
> As a result, this removes the need for hiding NoMasterDetectedMessages from contending non-leaders.
> 
> 
> Diffs
> -----
> 
>   src/detector/detector.hpp ed485bd86d2bff1046a234a5776c1081a4136bc5 
>   src/detector/detector.cpp 7a8355162d543e017505dd58efd2d7bf96f99623 
>   src/tests/zookeeper_tests.cpp 37e1b77a9c728034936eddd451c75a46605885db 
> 
> Diff: https://reviews.apache.org/r/10161/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> I'd like to add a test for this, but I don't see how to induce a timedout() call on the ZooKeeperMasterDetectorProcess, unless I add some plumbing to get a handle to the underlying process. Suggestions?
> 
> 
> Thanks,
> 
> Ben Mahler
> 
>


Re: Review Request: When sending NoMasterDetectedMessage, we need to clear the master sequence number in order to ensure we still send NewMasterDetectedMessage if the master is the same. Otherwise, we can get stuck in a master-less state.

Posted by Vinod Kone <vi...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10161/#review19264
-----------------------------------------------------------



src/detector/detector.hpp
<https://reviews.apache.org/r/10161/#comment39876>

    Why did you kill the comment? This has been our convention.



src/tests/zookeeper_tests.cpp
<https://reviews.apache.org/r/10161/#comment39879>

    I like this test!
    
    But, wouldn't this test pass even with old code?
    
    I thought (correct me if I'm wrong) this patch makes sure a non-leading master gets NoMasterDetected message? If yes, a test for that would be great.



src/tests/zookeeper_tests.cpp
<https://reviews.apache.org/r/10161/#comment39877>

    Is there a reason why you had specifically match to detector.process and not just "_"? if you can use "_" here then you can make process back to private.



src/tests/zookeeper_tests.cpp
<https://reviews.apache.org/r/10161/#comment39878>

    Can you use AWAIT_READY instead since this is getting deprecated?


- Vinod Kone


On April 16, 2013, 12:40 a.m., Ben Mahler wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/10161/
> -----------------------------------------------------------
> 
> (Updated April 16, 2013, 12:40 a.m.)
> 
> 
> Review request for mesos, Benjamin Hindman and Vinod Kone.
> 
> 
> Description
> -------
> 
> This is also along the way to fixing MESOS-305.
> 
> This fixes a bug where we were not clearing the sequence number when sending a NoMasterDetectedMessage.
> As a result, this removes the need for hiding NoMasterDetectedMessages from contending non-leaders.
> 
> 
> Diffs
> -----
> 
>   src/detector/detector.hpp ed485bd86d2bff1046a234a5776c1081a4136bc5 
>   src/detector/detector.cpp 7a8355162d543e017505dd58efd2d7bf96f99623 
>   src/tests/zookeeper_tests.cpp 37e1b77a9c728034936eddd451c75a46605885db 
> 
> Diff: https://reviews.apache.org/r/10161/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> I'd like to add a test for this, but I don't see how to induce a timedout() call on the ZooKeeperMasterDetectorProcess, unless I add some plumbing to get a handle to the underlying process. Suggestions?
> 
> 
> Thanks,
> 
> Ben Mahler
> 
>


Re: Review Request: When sending NoMasterDetectedMessage, we need to clear the master sequence number in order to ensure we still send NewMasterDetectedMessage if the master is the same. Otherwise, we can get stuck in a master-less state.

Posted by Ben Mahler <be...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10161/
-----------------------------------------------------------

(Updated April 16, 2013, 12:40 a.m.)


Review request for mesos, Benjamin Hindman and Vinod Kone.


Changes
-------

Added a test.


Description
-------

This is also along the way to fixing MESOS-305.

This fixes a bug where we were not clearing the sequence number when sending a NoMasterDetectedMessage.
As a result, this removes the need for hiding NoMasterDetectedMessages from contending non-leaders.


Diffs (updated)
-----

  src/detector/detector.hpp ed485bd86d2bff1046a234a5776c1081a4136bc5 
  src/detector/detector.cpp 7a8355162d543e017505dd58efd2d7bf96f99623 
  src/tests/zookeeper_tests.cpp 37e1b77a9c728034936eddd451c75a46605885db 

Diff: https://reviews.apache.org/r/10161/diff/


Testing
-------

make check

I'd like to add a test for this, but I don't see how to induce a timedout() call on the ZooKeeperMasterDetectorProcess, unless I add some plumbing to get a handle to the underlying process. Suggestions?


Thanks,

Ben Mahler