Posted to issues@mesos.apache.org by "Till Toenshoff (JIRA)" <ji...@apache.org> on 2015/02/14 19:29:11 UTC
[jira] [Commented] (MESOS-2354) Under certain circumstances master
assigns the same ID to different slaves.
[ https://issues.apache.org/jira/browse/MESOS-2354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14321617#comment-14321617 ]
Till Toenshoff commented on MESOS-2354:
---------------------------------------
[~arojas] Have you found out yet why the master is creating duplicate slave IDs?
> Under certain circumstances master assigns the same ID to different slaves.
> ---------------------------------------------------------------------------
>
> Key: MESOS-2354
> URL: https://issues.apache.org/jira/browse/MESOS-2354
> Project: Mesos
> Issue Type: Bug
> Components: master
> Affects Versions: 0.20.1
> Reporter: Alexander Rojas
>
> If two slaves are started in quick succession, the master sometimes assigns both of them the same ID. The following test reproduces this (add it to {{master_tests.cpp}}):
> {code}
> TEST_F(MasterTest, SlavesWithTheSameID)
> {
>   // Start up the master.
>   Try<PID<Master>> master = StartMaster();
>   ASSERT_SOME(master);
>
>   // Start a couple of slaves. Their only use is for them to register
>   // to the master.
>   Future<SlaveRegisteredMessage> slave1RegisteredMessage =
>     FUTURE_PROTOBUF(SlaveRegisteredMessage(), master.get(), _);
>   StartSlave();
>   AWAIT_READY(slave1RegisteredMessage);
>
>   Future<SlaveRegisteredMessage> slave2RegisteredMessage =
>     FUTURE_PROTOBUF(SlaveRegisteredMessage(), master.get(), _);
>   StartSlave();
>   AWAIT_READY(slave2RegisteredMessage);
>
>   ASSERT_FALSE(
>       slave1RegisteredMessage.get().slave_id() ==
>       slave2RegisteredMessage.get().slave_id());
>
>   Shutdown();
> }
> {code}
> The test needs to be run multiple times before it eventually fails, e.g.:
> {noformat}./bin/mesos-tests.sh --gtest_filter="MasterTest.SlavesWithTheSameID" --gtest_repeat=1000 --gtest_break_on_failure{noformat}
> At some point, the output will be:
> {noformat}
> ../../src/tests/master_tests.cpp:1618: Failure
> Value of: slave1RegisteredMessage.get().slave_id() == slave2RegisteredMessage.get().slave_id()
> Actual: true
> Expected: false
> {noformat}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)