Posted to dev@mesos.apache.org by Yifan Gu <yi...@mesosphere.io> on 2014/06/11 21:18:30 UTC

Re: Review Request 22472: Fixed SlaveTest.TerminatingSlaveDoesNotRegister

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22472/
-----------------------------------------------------------

(Updated June 11, 2014, 7:18 p.m.)


Review request for mesos, Ben Mahler, Dominic Hamon, and Vinod Kone.


Bugs: MESOS-1460
    https://issues.apache.org/jira/browse/MESOS-1460


Repository: mesos-git


Description
-------

Ignored subsequent status updates.
Muted warnings by catching mock calls.


Diffs
-----

  src/tests/slave_tests.cpp 2c8f183 

Diff: https://reviews.apache.org/r/22472/diff/


Testing
-------

make check


Thanks,

Yifan Gu


Re: Review Request 22472: Fixed SlaveTest.TerminatingSlaveDoesNotReregister

Posted by Yifan Gu <yi...@mesosphere.io>.

> On June 12, 2014, 6:09 p.m., Ben Mahler wrote:
> > I think the subject is a bit off, should say "Reregister", not "Register", right?
> > 
> > Did you run this with repetition to see if it is flaky still?
> > 
> > $ ./bin/mesos-tests.sh --gtest_filter="SlaveTest.TerminatingSlaveDoesNotReregister" --gtest_repeat=-1 --gtest_break_on_failure --verbose
> 
> Yifan Gu wrote:
>     Thanks for the cool advice. I ran
>     $ ./bin/mesos-tests.sh --gtest_filter="SlaveTest.TerminatingSlaveDoesNotReregister" --gtest_repeat=-1 --gtest_break_on_failure --verbose
>     
>     On the 13454th iteration it hit a new error; it looks like the master failed to start.
>     
>     
> 
> Ben Mahler wrote:
>     Thanks Yifan, that looks like an orthogonal issue (strange that the master took more than 10 seconds to realize it was elected).
>     
>     Will get this committed for you.

Thanks Ben!


- Yifan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22472/#review45516
-----------------------------------------------------------




Re: Review Request 22472: Fixed SlaveTest.TerminatingSlaveDoesNotReregister

Posted by Yifan Gu <yi...@mesosphere.io>.

> On June 12, 2014, 6:09 p.m., Ben Mahler wrote:
> > I think the subject is a bit off, should say "Reregister", not "Register", right?
> > 
> > Did you run this with repetition to see if it is flaky still?
> > 
> > $ ./bin/mesos-tests.sh --gtest_filter="SlaveTest.TerminatingSlaveDoesNotReregister" --gtest_repeat=-1 --gtest_break_on_failure --verbose

Thanks for the cool advice. I ran
$ ./bin/mesos-tests.sh --gtest_filter="SlaveTest.TerminatingSlaveDoesNotReregister" --gtest_repeat=-1 --gtest_break_on_failure --verbose

On the 13454th iteration it hit a new error; it looks like the master failed to start.


Repeating all tests (iteration 13454) . . .

Note: Google Test filter = SlaveTest.TerminatingSlaveDoesNotReregister-CpuIsolatorTest/1.UserCpuUsage:CpuIsolatorTest/1.SystemCpuUsage:LimitedCpuIsolatorTest.ROOT_CGROUPS_Cfs:LimitedCpuIsolatorTest.ROOT_CGROUPS_Cfs_Big_Quota:MemIsolatorTest/0.MemUsage:MemIsolatorTest/1.MemUsage:SlaveTest.ROOT_RunTaskWithCommandInfoWithoutUser:SlaveTest.DISABLED_ROOT_RunTaskWithCommandInfoWithUser:ContainerizerTest.ROOT_CGROUPS_BalloonFramework:CgroupsAnyHierarchyTest.ROOT_CGROUPS_Enabled:CgroupsAnyHierarchyTest.ROOT_CGROUPS_Subsystems:CgroupsAnyHierarchyTest.ROOT_CGROUPS_Mounted:CgroupsAnyHierarchyTest.ROOT_CGROUPS_Get:CgroupsAnyHierarchyTest.ROOT_CGROUPS_NestedCgroups:CgroupsAnyHierarchyTest.ROOT_CGROUPS_Tasks:CgroupsAnyHierarchyTest.ROOT_CGROUPS_Read:CgroupsAnyHierarchyTest.ROOT_CGROUPS_Write:CgroupsAnyHierarchyTest.ROOT_CGROUPS_Cfs_Big_Quota:CgroupsAnyHierarchyWithCpuMemoryTest.ROOT_CGROUPS_Busy:CgroupsAnyHierarchyWithCpuMemoryTest.ROOT_CGROUPS_SubsystemsHierarchy:CgroupsAnyHierarchyWithCpuMemoryTest.ROOT_CGROUPS_MountedSubsystems:CgroupsAnyHierarchyWithCpuMemoryTest.ROOT_CGROUPS_CreateRemove:CgroupsAnyHierarchyWithCpuMemoryTest.ROOT_CGROUPS_Listen:CgroupsNoHierarchyTest.ROOT_CGROUPS_NOHIERARCHY_MountUnmountHierarchy:CgroupsAnyHierarchyWithCpuAcctMemoryTest.ROOT_CGROUPS_Stat:CgroupsAnyHierarchyWithFreezerTest.ROOT_CGROUPS_Freeze:CgroupsAnyHierarchyWithFreezerTest.ROOT_CGROUPS_Kill:CgroupsAnyHierarchyWithFreezerTest.ROOT_CGROUPS_Destroy:CgroupsAnyHierarchyWithFreezerTest.ROOT_CGROUPS_AssignThreads:SlaveCount/Registrar_BENCHMARK_Test.performance/0:SlaveCount/Registrar_BENCHMARK_Test.performance/1:SlaveCount/Registrar_BENCHMARK_Test.performance/2:SlaveCount/Registrar_BENCHMARK_Test.performance/3:
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from SlaveTest
[ RUN      ] SlaveTest.TerminatingSlaveDoesNotReregister
Using temporary directory '/tmp/SlaveTest_TerminatingSlaveDoesNotReregister_O9kh4V'
I0612 19:03:17.706805  2910 leveldb.cpp:176] Opened db in 15.704031ms
I0612 19:03:17.712888  2910 leveldb.cpp:183] Compacted db in 6.057101ms
I0612 19:03:17.712910  2910 leveldb.cpp:198] Created db iterator in 2075ns
I0612 19:03:17.712920  2910 leveldb.cpp:204] Seeked to beginning of db in 365ns
I0612 19:03:17.712929  2910 leveldb.cpp:273] Iterated through 0 keys in the db in 96ns
I0612 19:03:17.712939  2910 replica.cpp:741] Replica recovered with log positions 0 -> 0 with 1 holes and 0 unlearned
I0612 19:03:17.713034  2933 recover.cpp:425] Starting replica recovery
I0612 19:03:17.713165  2925 recover.cpp:451] Replica is in EMPTY status
I0612 19:03:17.713366  2925 replica.cpp:638] Replica in EMPTY status received a broadcasted recover request
I0612 19:03:17.713471  2924 master.cpp:280] Master 20140612-190317-3823062160-44846-2910 (chimney.mesosphere.io) started on 144.76.223.227:44846
I0612 19:03:17.713497  2924 master.cpp:317] Master only allowing authenticated frameworks to register
I0612 19:03:17.713507  2924 master.cpp:322] Master only allowing authenticated slaves to register
I0612 19:03:17.713515  2924 credentials.hpp:35] Loading credentials for authentication from '/tmp/SlaveTest_TerminatingSlaveDoesNotReregister_O9kh4V/credentials'
I0612 19:03:17.713517  2933 recover.cpp:188] Received a recover response from a replica in EMPTY status
I0612 19:03:17.713564  2924 master.cpp:348] Authorization enabled
I0612 19:03:17.713625  2928 recover.cpp:542] Updating replica status to STARTING
I0612 19:03:17.713819  2933 master.cpp:961] The newly elected leader is master@144.76.223.227:44846 with id 20140612-190317-3823062160-44846-2910
I0612 19:03:17.719408  2934 leveldb.cpp:306] Persisting metadata (8 bytes) to leveldb took 5.73482ms
I0612 19:03:32.107343  2933 master.cpp:974] Elected as the leading master!
I0612 19:03:32.107364  2934 replica.cpp:320] Persisted replica status to STARTING
F0612 19:03:27.714102  2910 cluster.hpp:427] Failed to wait for _recover
*** Check failure stack trace: ***
I0612 19:03:32.107374  2933 master.cpp:792] Recovering from registrar
I0612 19:03:32.107522  2934 recover.cpp:451] Replica is in STARTING status
I0612 19:03:32.107746  2929 registrar.cpp:313] Recovering registrar
I0612 19:03:32.108326  2925 replica.cpp:638] Replica in STARTING status received a broadcasted recover request
I0612 19:03:32.108497  2931 recover.cpp:188] Received a recover response from a replica in STARTING status
I0612 19:03:32.108778  2929 recover.cpp:542] Updating replica status to VOTING
    @     0x7f4c0cc3dc3d  google::LogMessage::Fail()
    @     0x7f4c0cc3fa7d  google::LogMessage::SendToLog()
    @     0x7f4c0cc3d82c  google::LogMessage::Flush()
    @     0x7f4c0cc40379  google::LogMessageFatal::~LogMessageFatal()
    @           0x73b9db  mesos::internal::tests::Cluster::Masters::start()
    @           0x736885  mesos::internal::tests::MesosTest::StartMaster()
    @           0x826fbf  SlaveTest_TerminatingSlaveDoesNotReregister_Test::TestBody()
    @           0x8cfbb3  testing::internal::HandleExceptionsInMethodIfSupported<>()
    @           0x8c8e87  testing::Test::Run()
    @           0x8c8f2e  testing::TestInfo::Run()
    @           0x8c9035  testing::TestCase::Run()
    @           0x8c92d8  testing::internal::UnitTestImpl::RunAllTests()
I0612 19:03:32.117660  2932 leveldb.cpp:306] Persisting metadata (8 bytes) to leveldb took 8.736907ms
I0612 19:03:32.117678  2932 replica.cpp:320] Persisted replica status to VOTING
I0612 19:03:32.117710  2931 recover.cpp:556] Successfully joined the Paxos group
    @           0x8c9577  testing::UnitTest::Run()
I0612 19:03:32.117769  2931 recover.cpp:440] Recover process terminated
    @           0x48b01d  main
I0612 19:03:32.117884  2928 log.cpp:656] Attempting to start the writer
    @     0x7f4c0af73de5  (unknown)
I0612 19:03:32.118140  2929 replica.cpp:474] Replica received implicit promise request with proposal 1
    @           0x498144  (unknown)
Aborted


- Yifan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22472/#review45516
-----------------------------------------------------------




Re: Review Request 22472: Fixed SlaveTest.TerminatingSlaveDoesNotReregister

Posted by Ben Mahler <be...@gmail.com>.

> On June 12, 2014, 6:09 p.m., Ben Mahler wrote:
> > I think the subject is a bit off, should say "Reregister", not "Register", right?
> > 
> > Did you run this with repetition to see if it is flaky still?
> > 
> > $ ./bin/mesos-tests.sh --gtest_filter="SlaveTest.TerminatingSlaveDoesNotReregister" --gtest_repeat=-1 --gtest_break_on_failure --verbose
> 
> Yifan Gu wrote:
>     Thanks for the cool advice. I ran
>     $ ./bin/mesos-tests.sh --gtest_filter="SlaveTest.TerminatingSlaveDoesNotReregister" --gtest_repeat=-1 --gtest_break_on_failure --verbose
>     
>     On the 13454th iteration it hit a new error; it looks like the master failed to start.
>     
>     
>

Thanks Yifan, that looks like an orthogonal issue (strange that the master took more than 10 seconds to realize it was elected).

Will get this committed for you.


- Ben


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22472/#review45516
-----------------------------------------------------------




Re: Review Request 22472: Fixed SlaveTest.TerminatingSlaveDoesNotRegister

Posted by Ben Mahler <be...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22472/#review45516
-----------------------------------------------------------


I think the subject is a bit off, should say "Reregister", not "Register", right?

Did you run this with repetition to see if it is flaky still?

$ ./bin/mesos-tests.sh --gtest_filter="SlaveTest.TerminatingSlaveDoesNotReregister" --gtest_repeat=-1 --gtest_break_on_failure --verbose

- Ben Mahler




Re: Review Request 22472: Fixed SlaveTest.TerminatingSlaveDoesNotReregister

Posted by Benjamin Mahler <be...@gmail.com>.
Committed, please subscribe to commits@mesos.apache.org so you can get the
commit emails, if you haven't already!


On Thu, Jun 12, 2014 at 12:24 PM, Ben Mahler <be...@gmail.com>
wrote:

>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22472/#review45524
> -----------------------------------------------------------
>
> Ship it!
>
>
> Ship It!
>
> - Ben Mahler
>
>

Re: Review Request 22472: Fixed SlaveTest.TerminatingSlaveDoesNotReregister

Posted by Ben Mahler <be...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22472/#review45524
-----------------------------------------------------------

Ship it!


Ship It!

- Ben Mahler




Re: Review Request 22472: Fixed SlaveTest.TerminatingSlaveDoesNotReregister

Posted by Yifan Gu <yi...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22472/
-----------------------------------------------------------

(Updated June 12, 2014, 7:15 p.m.)


Review request for mesos, Ben Mahler, Dominic Hamon, and Vinod Kone.


Summary (updated)
-----------------

Fixed SlaveTest.TerminatingSlaveDoesNotReregister


Bugs: MESOS-1460
    https://issues.apache.org/jira/browse/MESOS-1460


Repository: mesos-git


Description
-------

Ignored subsequent status updates.
Muted warnings by catching mock calls.


Diffs
-----

  src/tests/slave_tests.cpp 2c8f183 

Diff: https://reviews.apache.org/r/22472/diff/


Testing
-------

make check


Thanks,

Yifan Gu


Re: Review Request 22472: Fixed SlaveTest.TerminatingSlaveDoesNotRegister

Posted by Mesos ReviewBot <de...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22472/#review45462
-----------------------------------------------------------


Patch looks great!

Reviews applied: [22472]

All tests passed.

- Mesos ReviewBot

