Posted to dev@mesos.apache.org by Ben Mahler <be...@gmail.com> on 2013/04/16 02:58:17 UTC
Re: Review Request: Send NoMasterDetectedMessage to non-contending detectors.
Added a disconnected slave map to the master to track disconnected slaves,
in order to disallow slave re-registration after a network partition.
> On March 29, 2013, 9:34 p.m., Vinod Kone wrote:
> > src/master/http.cpp, line 268
> > <https://reviews.apache.org/r/10172/diff/1/?file=275912#file275912line268>
> >
> > what is the difference between activated and connected slaves?
Sent out another review that fixes this.
> On March 29, 2013, 9:34 p.m., Vinod Kone wrote:
> > src/master/master.hpp, lines 232-233
> > <https://reviews.apache.org/r/10172/diff/1/?file=275913#file275913line232>
> >
> > kill slavePIDs. just use slaves.
> >
> > use hashset<SlaveID> deactivated.
> >
> > kill active or connected. just maintain one variable.
Fixed in the separate review I sent out.
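For anyone following the thread, the suggestion above (one slaves map plus a single set of deactivated slave IDs) can be sketched roughly as follows. This is only an illustration under assumptions, not the code from either review: MasterSketch, its methods, and the string-based SlaveID are hypothetical stand-ins, and std::unordered_set is used in place of the hashset<SlaveID> mentioned in the comment.

```cpp
#include <string>
#include <unordered_map>
#include <unordered_set>

// Hypothetical stand-ins for illustration; not the actual Mesos types.
using SlaveID = std::string;

struct Slave {
  std::string pid;  // Address of the slave process.
};

// Sketch of a master that keeps one map of active slaves and one set of
// deactivated slave IDs, and refuses to re-admit a deactivated slave.
class MasterSketch {
public:
  // Admit a brand-new slave unless its ID was previously deactivated.
  bool registerSlave(const SlaveID& id, const std::string& pid) {
    return admit(id, pid);
  }

  // Called when the master decides a slave is gone (e.g. it was
  // partitioned away). Only slaves that are currently active are moved
  // into the deactivated set.
  void deactivate(const SlaveID& id) {
    if (slaves.erase(id) > 0) {
      deactivated.insert(id);
    }
  }

  // A slave that was deactivated is not allowed back in under its old ID.
  bool reregisterSlave(const SlaveID& id, const std::string& pid) {
    return admit(id, pid);
  }

  bool isActive(const SlaveID& id) const {
    return slaves.count(id) > 0;
  }

  bool isDeactivated(const SlaveID& id) const {
    return deactivated.count(id) > 0;
  }

private:
  bool admit(const SlaveID& id, const std::string& pid) {
    if (deactivated.count(id) > 0) {
      return false;  // Rejected: this ID was deactivated earlier.
    }
    slaves[id] = Slave{pid};
    return true;
  }

  std::unordered_map<SlaveID, Slave> slaves;  // Active slaves only.
  std::unordered_set<SlaveID> deactivated;    // IDs that may not return.
};
```

With a single set there is no separate "active" versus "connected" flag to keep in sync: a slave ID is either in the active map, in the deactivated set, or unknown to the master.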
> On March 29, 2013, 9:34 p.m., Vinod Kone wrote:
> > src/tests/master_detector_tests.cpp, line 92
> > <https://reviews.apache.org/r/10172/diff/1/?file=275916#file275916line92>
> >
> > why write stuff to the work directory?
> >
> > i thought you added a "sandbox" in MesosTest for this stuff?
After discussing this earlier, we removed the sandbox, as the slave work directory is effectively a sandbox already.
- Ben
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10172/#review18532
-----------------------------------------------------------
On March 29, 2013, 1:38 a.m., Ben Mahler wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/10172/
> -----------------------------------------------------------
>
> (Updated March 29, 2013, 1:38 a.m.)
>
>
> Review request for mesos, Benjamin Hindman and Vinod Kone.
>
>
> Description
> -------
>
> See above. This is a fix of MESOS-305.
>
> This also fixes MESOS-362.
>
>
> This addresses bugs MESOS-305 and MESOS-362.
> https://issues.apache.org/jira/browse/MESOS-305
> https://issues.apache.org/jira/browse/MESOS-362
>
>
> Diffs
> -----
>
> src/detector/detector.cpp 7a8355162d543e017505dd58efd2d7bf96f99623
> src/master/http.cpp 71b04f01f45ee73d9c246f469e1368223903abed
> src/master/master.hpp 9776a7cb8448e41e5d52288e3c637737cee15a08
> src/master/master.cpp 5b0e8c03c516f9fc8bb729c21e876bdde89baf9c
> src/tests/fault_tolerance_tests.cpp 9d3f8b1bfb58d459b1719d2ba1dbb2e93858fc92
> src/tests/master_detector_tests.cpp fe3b91fb375e0b09f8f2de3e69e736cd5f5b94ba
>
> Diff: https://reviews.apache.org/r/10172/diff/
>
>
> Testing
> -------
>
> make check
>
> Added tests for the partitioned slave re-registration.
> ./bin/mesos-tests.sh --gtest_filter="FaultToleranceTest.PartitionedSlaveReregistration" --verbose --gtest_break_on_failure --gtest_repeat=3000
>
> Ran into MESOS-406, but otherwise no issues.
>
> Will be adding ZK master detector tests shortly to test that the NoMasterDetectedMessages are being sent.
>
>
> Thanks,
>
> Ben Mahler
>
>