Posted to issues@mesos.apache.org by "Alexander Rojas (JIRA)" <ji...@apache.org> on 2015/02/17 15:54:11 UTC

[jira] [Closed] (MESOS-2354) Under certain circumstances master assigns the same ID to different slaves.

     [ https://issues.apache.org/jira/browse/MESOS-2354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexander Rojas closed MESOS-2354.
----------------------------------
    Resolution: Not a Problem

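The issue was the line {{FUTURE_PROTOBUF(SlaveRegisteredMessage(), master.get(), _);}} in the test. Because {{_}} matches any recipient, the second future could be satisfied by a message meant for the first slave: very sporadically, the first slave would try to register with the master again before the second slave had registered for the first time, and the master's reply to that retry satisfied the second future. Both futures then carried the first slave's ID, so the master never actually assigned the same ID to two different slaves.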

To fix it, I changed only the test code, so that each future matches one specific slave:

{code}
TEST_F(MasterTest, SlavesEndpointTwoSlaves)
{
  // Start up the master.
  Try<PID<Master>> master = StartMaster();
  ASSERT_SOME(master);

  // Start a couple of slaves. Their only purpose is to register
  // with the master.
  Try<PID<Slave>> slave1 = StartSlave();
  ASSERT_SOME(slave1);
  Future<SlaveRegisteredMessage> slave1RegisteredMessage =
    FUTURE_PROTOBUF(SlaveRegisteredMessage(), master.get(), slave1.get());

  Try<PID<Slave>> slave2 = StartSlave();
  ASSERT_SOME(slave2);
  Future<SlaveRegisteredMessage> slave2RegisteredMessage =
    FUTURE_PROTOBUF(SlaveRegisteredMessage(), master.get(), slave2.get());

  // Wait for the slaves to be registered.
  AWAIT_READY(slave1RegisteredMessage);
  AWAIT_READY(slave2RegisteredMessage);

  Shutdown();
}
{code}
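
For anyone who wants to see the race without building Mesos, here is a rough standalone sketch of it. It is not Mesos or libprocess code: {{Message}}, {{Expectation}}, {{deliver}} and {{sameIdObserved}} are hypothetical names made up for the illustration, the slave names and IDs are invented, and the matching rule is a simplified model of a one-shot expectation where an empty recipient plays the role of {{_}}.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a SlaveRegisteredMessage sent by the master.
struct Message {
  std::string to;       // recipient slave
  std::string slaveId;  // ID the master assigned to that slave
};

// Simplified model of a one-shot future on a message. It captures the first
// message whose recipient matches. An empty `to` plays the role of `_`.
struct Expectation {
  std::string to;
  bool satisfied = false;
  Message captured;

  // Returns true if this expectation consumed the message.
  bool offer(const Message& m) {
    if (satisfied) return false;
    if (!to.empty() && to != m.to) return false;
    captured = m;
    satisfied = true;
    return true;
  }
};

// Delivers the messages in order. Each message satisfies at most one
// pending expectation (the first one that matches).
void deliver(const std::vector<Message>& wire,
             std::vector<Expectation>& expectations) {
  for (const Message& m : wire) {
    for (Expectation& e : expectations) {
      if (e.offer(m)) break;
    }
  }
}

// Replays the sporadic ordering: slave1 is registered, slave1 retries and is
// answered again with the same ID, and only then slave2 is registered.
// Returns true when both expectations end up holding the same slave ID.
bool sameIdObserved(const std::string& to1, const std::string& to2) {
  const std::vector<Message> wire = {
    {"slave1", "S0"},  // first registration of slave1
    {"slave1", "S0"},  // reply to slave1's retry, same ID
    {"slave2", "S1"},  // first registration of slave2
  };

  std::vector<Expectation> expectations(2);
  expectations[0].to = to1;
  expectations[1].to = to2;

  deliver(wire, expectations);

  assert(expectations[0].satisfied && expectations[1].satisfied);
  return expectations[0].captured.slaveId == expectations[1].captured.slaveId;
}
```

In this model, {{sameIdObserved("", "")}} corresponds to the original test with two {{_}} matchers and returns true, while {{sameIdObserved("slave1", "slave2")}} corresponds to the fixed test and returns false, even though the message ordering is identical in both cases.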

> Under certain circumstances master assigns the same ID to different slaves.
> ---------------------------------------------------------------------------
>
>                 Key: MESOS-2354
>                 URL: https://issues.apache.org/jira/browse/MESOS-2354
>             Project: Mesos
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.20.1
>            Reporter: Alexander Rojas
>
> If two slaves are created in quick succession, the master sometimes assigns both slaves the same ID. The following test reproduces this (add it to {{master_tests.cpp}}):
> {code}
> TEST_F(MasterTest, SlavesWithTheSameID)
> {
>   // Start up the master.
>   Try<PID<Master>> master = StartMaster();
>   ASSERT_SOME(master);
>   // Start a couple of slaves. Their only use is for them to register
>   // to the master.
>   Future<SlaveRegisteredMessage> slave1RegisteredMessage =
>     FUTURE_PROTOBUF(SlaveRegisteredMessage(), master.get(), _);
>   StartSlave();
>   AWAIT_READY(slave1RegisteredMessage);
>   Future<SlaveRegisteredMessage> slave2RegisteredMessage =
>     FUTURE_PROTOBUF(SlaveRegisteredMessage(), master.get(), _);
>   StartSlave();
>   AWAIT_READY(slave2RegisteredMessage);
>   ASSERT_FALSE(
>       slave1RegisteredMessage.get().slave_id() ==
>         slave2RegisteredMessage.get().slave_id());
>   Shutdown();
> }
> {code}
> The test needs to be run many times before it fails, e.g.:
> {noformat}./bin/mesos-tests.sh --gtest_filter="MasterTest.SlavesWithTheSameID" --gtest_repeat=1000 --gtest_break_on_failure{noformat}
> At some point, the output will be:
> {noformat}
> ../../src/tests/master_tests.cpp:1618: Failure
> Value of: slave1RegisteredMessage.get().slave_id() == slave2RegisteredMessage.get().slave_id()
>   Actual: true
> Expected: false
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)