Posted to issues@mesos.apache.org by "Alexander Rojas (JIRA)" <ji...@apache.org> on 2015/02/14 08:36:11 UTC
[jira] [Updated] (MESOS-2354) Under certain circumstances master assigns the same ID to different slaves.

     [ https://issues.apache.org/jira/browse/MESOS-2354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexander Rojas updated MESOS-2354:
-----------------------------------
    Description: 
If two slaves are started in quick succession, the master sometimes assigns both of them the same ID. The following test, which can be added to {{master_tests.cpp}}, reproduces the problem:

{code}
TEST_F(MasterTest, SlavesWithTheSameID)
{
  // Start up the master.
  Try<PID<Master>> master = StartMaster();
  ASSERT_SOME(master);

  // Start a couple of slaves. Their only use is for them to register
  // to the master.
  Future<SlaveRegisteredMessage> slave1RegisteredMessage =
    FUTURE_PROTOBUF(SlaveRegisteredMessage(), master.get(), _);
  StartSlave();
  AWAIT_READY(slave1RegisteredMessage);

  Future<SlaveRegisteredMessage> slave2RegisteredMessage =
    FUTURE_PROTOBUF(SlaveRegisteredMessage(), master.get(), _);
  StartSlave();
  AWAIT_READY(slave2RegisteredMessage);

  ASSERT_FALSE(
      slave1RegisteredMessage.get().slave_id() ==
        slave2RegisteredMessage.get().slave_id());

  Shutdown();
}
{code}

The test must be run many times before it fails, e.g.:
{noformat}./bin/mesos-tests.sh --gtest_filter="MasterTest.SlavesWithTheSameID" --gtest_repeat=1000 --gtest_break_on_failure{noformat}

At some point, the output will be:

{noformat}
../../src/tests/master_tests.cpp:1618: Failure
Value of: slave1RegisteredMessage.get().slave_id() == slave2RegisteredMessage.get().slave_id()
  Actual: true
Expected: false
{noformat}
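
The assertion in the test compares only two IDs. If more than two slaves were registered, the same uniqueness check could be applied to the whole collection. The following is a minimal, self-contained sketch under stated assumptions: it is not Mesos code, {{allIdsDistinct}} is a hypothetical helper, and the IDs are modeled as plain strings rather than {{SlaveID}} protobufs.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper (not part of Mesos): returns true only if
// every ID in `ids` occurs exactly once. IDs are modeled as plain
// strings for illustration.
bool allIdsDistinct(const std::vector<std::string>& ids)
{
  std::unordered_set<std::string> seen;
  for (const std::string& id : ids) {
    // insert() returns a pair whose `second` member is false when
    // the element was already present, i.e. the ID is a duplicate.
    if (!seen.insert(id).second) {
      return false;
    }
  }
  return true;
}
```

Inside a gtest case, the result could be checked with {{ASSERT_TRUE(allIdsDistinct(ids))}} once the IDs from each {{SlaveRegisteredMessage}} have been collected.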


> Under certain circumstances master assigns the same ID to different slaves.
> ---------------------------------------------------------------------------
>
>                 Key: MESOS-2354
>                 URL: https://issues.apache.org/jira/browse/MESOS-2354
>             Project: Mesos
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.20.1
>            Reporter: Alexander Rojas
>
> If two slaves are started in quick succession, the master sometimes assigns both of them the same ID. The following test, which can be added to {{master_tests.cpp}}, reproduces the problem:
> {code}
> TEST_F(MasterTest, SlavesWithTheSameID)
> {
>   // Start up the master.
>   Try<PID<Master>> master = StartMaster();
>   ASSERT_SOME(master);
>   // Start a couple of slaves. Their only use is for them to register
>   // to the master.
>   Future<SlaveRegisteredMessage> slave1RegisteredMessage =
>     FUTURE_PROTOBUF(SlaveRegisteredMessage(), master.get(), _);
>   StartSlave();
>   AWAIT_READY(slave1RegisteredMessage);
>   Future<SlaveRegisteredMessage> slave2RegisteredMessage =
>     FUTURE_PROTOBUF(SlaveRegisteredMessage(), master.get(), _);
>   StartSlave();
>   AWAIT_READY(slave2RegisteredMessage);
>   ASSERT_FALSE(
>       slave1RegisteredMessage.get().slave_id() ==
>         slave2RegisteredMessage.get().slave_id());
>   Shutdown();
> }
> {code}
> The test must be run many times before it fails, e.g.:
> {noformat}./bin/mesos-tests.sh --gtest_filter="MasterTest.SlavesWithTheSameID" --gtest_repeat=1000 --gtest_break_on_failure{noformat}
> At some point, the output will be:
> {noformat}
> ../../src/tests/master_tests.cpp:1618: Failure
> Value of: slave1RegisteredMessage.get().slave_id() == slave2RegisteredMessage.get().slave_id()
>   Actual: true
> Expected: false
> {noformat}


