Posted to reviews@mesos.apache.org by Klaus Ma <kl...@cguru.net> on 2015/09/05 04:46:09 UTC

Re: Review Request 38003: MESOS-3351 (duplicated slave id in master after master failover)

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/38003/
-----------------------------------------------------------

(Updated Sept. 5, 2015, 2:46 a.m.)


Review request for mesos and Vinod Kone.


Changes
-------

Address Vinod's comments


Summary (updated)
-----------------

MESOS-3351 (duplicated slave id in master after master failover)


Bugs: MESOS-3351
    https://issues.apache.org/jira/browse/MESOS-3351


Repository: mesos


Description (updated)
-------

__Phenomenon:__
Under certain race conditions, the slave was shut down after a master failover.

__Root Cause:__
The slave was shut down because of a duplicated SlaveID. In the master, the SlaveID is generated as masterInfo.id + "-S" + nextSlaveId. On master failover, nextSlaveId is reset to 0, and masterInfo.id (generated from date + ip + port + pid) may be unchanged, which leads to a duplicated SlaveID.
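
To make the collision concrete, here is a minimal sketch of the ID scheme described above. It is not the actual Mesos code: `MasterSketch`, `legacyMasterId`, and the "-" separators in the master ID are hypothetical names and formatting chosen only for illustration.

```cpp
#include <string>
#include <utility>

// Hypothetical stand-in for the master's state; not the real Mesos classes.
struct MasterSketch {
  explicit MasterSketch(std::string id_) : id(std::move(id_)) {}

  // Builds a slave ID as masterInfo.id + "-S" + nextSlaveId.
  std::string newSlaveId() {
    return id + "-S" + std::to_string(nextSlaveId++);
  }

  std::string id;       // Plays the role of masterInfo.id.
  int nextSlaveId = 0;  // Starts from 0 again in every new master.
};

// Old-style master ID built from date + ip + port + pid.
// The separator and field formats here are assumptions.
std::string legacyMasterId(
    const std::string& date,
    const std::string& ip,
    int port,
    int pid) {
  return date + "-" + ip + "-" + std::to_string(port) + "-" +
         std::to_string(pid);
}
```

If a master before failover and a master after failover are both built from the same four inputs, the first call to `newSlaveId()` on each returns the same string, which is the duplicate the root cause describes.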

__Solution/Fix:__
Generate masterInfo.id from a UUID instead of "date + ip + port + pid".
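
As a rough illustration of the idea (not the code in this patch), the sketch below builds a random version 4 UUID string using only the C++ standard library. `randomUuid` is a hypothetical helper; the actual change may rely on an existing UUID library instead.

```cpp
#include <cstdint>
#include <cstdio>
#include <random>
#include <string>

// Hypothetical helper: returns a random (version 4) UUID string such as
// "xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx". Not thread-safe as written.
std::string randomUuid() {
  static std::random_device device;
  static std::mt19937_64 engine(device());
  std::uniform_int_distribution<std::uint64_t> dist;

  std::uint64_t hi = dist(engine);
  std::uint64_t lo = dist(engine);

  // Set the version (4) and variant (binary 10) bits as in RFC 4122.
  hi = (hi & 0xFFFFFFFFFFFF0FFFULL) | 0x0000000000004000ULL;
  lo = (lo & 0x3FFFFFFFFFFFFFFFULL) | 0x8000000000000000ULL;

  char buffer[37];  // 36 characters plus the terminating null.
  std::snprintf(
      buffer,
      sizeof(buffer),
      "%08x-%04x-%04x-%04x-%012llx",
      static_cast<unsigned>(hi >> 32),
      static_cast<unsigned>((hi >> 16) & 0xFFFF),
      static_cast<unsigned>(hi & 0xFFFF),
      static_cast<unsigned>(lo >> 48),
      static_cast<unsigned long long>(lo & 0xFFFFFFFFFFFFULL));
  return std::string(buffer);
}
```

Because the ID no longer depends on the date, address, port, or pid, a restarted master gets a different masterInfo.id with overwhelming probability, so SlaveIDs built from it should not collide even though nextSlaveId starts from 0 again.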


Diffs (updated)
-----

  src/master/master.cpp 5589eca 
  src/tests/master_tests.cpp 8a6b98b 

Diff: https://reviews.apache.org/r/38003/diff/


Testing
-------

make
make check


Thanks,

Klaus Ma


Re: Review Request 38003: MESOS-3351 (duplicated slave id in master after master failover)

Posted by Mesos ReviewBot <re...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/38003/#review97850
-----------------------------------------------------------


Patch looks great!

Reviews applied: [38003]

All tests passed.

- Mesos ReviewBot

