You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@mesos.apache.org by "Kapil Arya (JIRA)" <ji...@apache.org> on 2014/12/22 23:15:14 UTC

[jira] [Updated] (MESOS-1890) CoordinatorTest.*LearnedFill and RecoverTest.RacingCatchup tests failing

     [ https://issues.apache.org/jira/browse/MESOS-1890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kapil Arya updated MESOS-1890:
------------------------------
    Description: 
make check fails for me for the following tests:
[  FAILED  ] CoordinatorTest.MultipleAppendsNotLearnedFill
[  FAILED  ] CoordinatorTest.TruncateNotLearnedFill
[  FAILED  ] CoordinatorTest.TruncateLearnedFill
[  FAILED  ] RecoverTest.RacingCatchup
============

Apart from that, all the Coordinator tests take quite a while to run. Here is an excerpt:

[----------] 21 tests from CoordinatorTest
[ RUN      ] CoordinatorTest.Elect
[       OK ] CoordinatorTest.Elect (3142 ms)
[ RUN      ] CoordinatorTest.ElectWithClockPaused
[       OK ] CoordinatorTest.ElectWithClockPaused (3198 ms)
[ RUN      ] CoordinatorTest.AppendRead
[       OK ] CoordinatorTest.AppendRead (3933 ms)
[ RUN      ] CoordinatorTest.AppendReadError
[       OK ] CoordinatorTest.AppendReadError (6246 ms)
[ RUN      ] CoordinatorTest.AppendDiscarded
[       OK ] CoordinatorTest.AppendDiscarded (3204 ms)
[ RUN      ] CoordinatorTest.ElectNoQuorum
[       OK ] CoordinatorTest.ElectNoQuorum (735 ms)
[ RUN      ] CoordinatorTest.AppendNoQuorum
[       OK ] CoordinatorTest.AppendNoQuorum (1993 ms)
[ RUN      ] CoordinatorTest.Failover
[       OK ] CoordinatorTest.Failover (7065 ms)
[ RUN      ] CoordinatorTest.Demoted
[       OK ] CoordinatorTest.Demoted (6925 ms)
[ RUN      ] CoordinatorTest.Fill
[       OK ] CoordinatorTest.Fill (8687 ms)
[ RUN      ] CoordinatorTest.NotLearnedFill
[       OK ] CoordinatorTest.NotLearnedFill (8385 ms)
[ RUN      ] CoordinatorTest.MultipleAppends
[       OK ] CoordinatorTest.MultipleAppends (15361 ms)
[ RUN      ] CoordinatorTest.MultipleAppendsNotLearnedFill
../../src/tests/log_tests.cpp:1258: Failure
Failed to wait 10secs for electing
[  FAILED  ] CoordinatorTest.MultipleAppendsNotLearnedFill (28107 ms)
[ RUN      ] CoordinatorTest.Truncate
[       OK ] CoordinatorTest.Truncate (14017 ms)
[ RUN      ] CoordinatorTest.TruncateNotLearnedFill
../../src/tests/log_tests.cpp:1399: Failure
Failed to wait 10secs for electing
[  FAILED  ] CoordinatorTest.TruncateNotLearnedFill (28857 ms)
[ RUN      ] CoordinatorTest.TruncateLearnedFill
../../src/tests/log_tests.cpp:1484: Failure
Failed to wait 10secs for electing
[  FAILED  ] CoordinatorTest.TruncateLearnedFill (26668 ms)
[ RUN      ] CoordinatorTest.RacingElect
[       OK ] CoordinatorTest.RacingElect (0 ms)
[ RUN      ] CoordinatorTest.FillNoQuorum
[       OK ] CoordinatorTest.FillNoQuorum (0 ms)
[ RUN      ] CoordinatorTest.FillInconsistent
[       OK ] CoordinatorTest.FillInconsistent (0 ms)
[ RUN      ] CoordinatorTest.LearnedOnOneReplica_NotLearnedOnAnother
[       OK ] CoordinatorTest.LearnedOnOneReplica_NotL



pick
earnedOnAnother (0 ms)
[ RUN      ] CoordinatorTest.LearnedOnOneReplica_NotLearnedOnAnother_AnotherFailsAndRecovers
[       OK ] CoordinatorTest.LearnedOnOneReplica_NotLearnedOnAnother_AnotherFailsAndRecovers (0 ms)
[----------] 21 tests from CoordinatorTest (166524 ms total)

[----------] 4 tests from RecoverTest
[ RUN      ] RecoverTest.RacingCatchup
../../src/tests/log_tests.cpp:1571: Failure
Failed to wait 10secs for recovering4
[  FAILED  ] RecoverTest.RacingCatchup (44916 ms)
[ RUN      ] RecoverTest.CatchupRetry
[       OK ] RecoverTest.CatchupRetry (25811 ms)
[ RUN      ] RecoverTest.AutoInitialization
[       OK ] RecoverTest.AutoInitialization (4573 ms)
[ RUN      ] RecoverTest.AutoInitializationRetry
[       OK ] RecoverTest.AutoInitializationRetry (4950 ms)
[----------] 4 tests from RecoverTest (80250 ms total)

[----------] 2 tests from LogTest
[ RUN      ] LogTest.WriteRead
[       OK ] LogTest.WriteRead (3238 ms)
[ RUN      ] LogTest.Position
[       OK ] LogTest.Position (3204 ms)
[----------] 2 tests from LogTest (6442 ms total)

=========================

Here is the tail log of the failed tests:


I1009 11:42:21.737470  4476 replica.cpp:508] Replica received write request for position 6
I1009 11:42:21.737491  4476 leveldb.cpp:438] Reading position from leveldb took 10414ns
I1009 11:42:21.737517  4478 replica.cpp:508] Replica received write request for position 6
I1009 11:42:21.737531  4478 leveldb.cpp:438] Reading position from leveldb took 6549ns
../../src/tests/log_tests.cpp:1258: Failure
Failed to wait 10secs for electing
I1009 11:42:22.349469  4476 leveldb.cpp:343] Persisting action (17 bytes) to leveldb took 611.945625ms
I1009 11:42:22.349550  4476 replica.cpp:676] Persisted action at 6
[  FAILED  ] CoordinatorTest.MultipleAppendsNotLearnedFill (21352 ms)



I1009 11:42:56.001920  4476 replica.cpp:676] Persisted action at 5
I1009 11:42:56.002207  4478 replica.cpp:508] Replica received write request for position 5
I1009 11:42:56.002207  4481 replica.cpp:508] Replica received write request for position 5
I1009 11:42:56.002229  4478 leveldb.cpp:438] Reading position from leveldb took 9757ns
I1009 11:42:56.002248  4481 leveldb.cpp:438] Reading position from leveldb took 19658ns
../../src/tests/log_tests.cpp:1399: Failure
Failed to wait 10secs for electing
I1009 11:42:56.643134  4481 leveldb.cpp:343] Persisting action (17 bytes) to leveldb took 640.870146ms
I1009 11:42:56.643167  4481 replica.cpp:676] Persisted action at 5
I1009 11:42:56.688416  4478 leveldb.cpp:343] Persisting action (17 bytes) to leveldb took 686.162358ms
I1009 11:42:56.688446  4478 replica.cpp:676] Persisted action at 5
[  FAILED  ] CoordinatorTest.TruncateNotLearnedFill (34512 ms)


I1009 11:43:23.917522  4480 replica.cpp:676] Persisted action at 7
I1009 11:43:23.917670  4482 replica.cpp:655] Replica received learned notice for position 7
I1009 11:43:23.942389  4481 leveldb.cpp:343] Persisting action (8 bytes) to leveldb took 81.544002ms
I1009 11:43:23.942421  4481 replica.cpp:676] Persisted action at 7
I1009 11:43:23.942453  4481 replica.cpp:655] Replica received learned notice for position 7
../../src/tests/log_tests.cpp:1484: Failure
Failed to wait 10secs for electing
I1009 11:43:25.179002  4482 leveldb.cpp:343] Persisting action (19 bytes) to leveldb took 1.261297165secs
I1009 11:43:25.179033  4482 replica.cpp:676] Persisted action at 7
I1009 11:43:25.179038  4482 replica.I1009 11:43:23.917522  4480 replica.cpp:676] Persisted action at 7
I1009 11:43:23.917670  4482 replica.cpp:655] Replica received learned notice for position 7
I1009 11:43:23.942389  4481 leveldb.cpp:343] Persisting action (8 bytes) to leveldb took 81.544002ms
I1009 11:43:23.942421  4481 replica.cpp:676] Persisted action at 7
I1009 11:43:23.942453  4481 replica.cpp:655] Replica received learned notice for position 7
../../src/tests/log_tests.cpp:1484: Failure
Failed to wait 10secs for electing
I1009 11:43:25.179002  4482 leveldb.cpp:343] Persisting action (19 bytes) to leveldb took 1.261297165secs
I1009 11:43:25.179033  4482 replica.cpp:676] Persisted action at 7
I1009 11:43:25.179038  4482 replica.cpp:661] Replica learned APPEND action at position 7
[  FAILED  ] CoordinatorTest.TruncateLearnedFill (28317 ms)cpp:661] Replica learned APPEND action at position 7
[  FAILED  ] CoordinatorTest.TruncateLearnedFill (28317 ms)



I1009 11:47:02.996986  4763 replica.cpp:676] Persisted action at 3
I1009 11:47:02.997020  4763 replica.cpp:655] Replica received learned notice for position 3
../../src/tests/log_tests.cpp:1571: Failure
Failed to wait 10secs for recovering4
I1009 11:47:03.584002  4769 leveldb.cpp:343] Persisting action (19 bytes) to leveldb took 1.806120123secs
I1009 11:47:03.584002  4766 leveldb.cpp:343] Persisting action (19 bytes) to leveldb took 1.806115441secs
I1009 11:47:03.584002  4764 leveldb.cpp:343] Persisting action (19 bytes) to leveldb took 1.806241459secs
I1009 11:47:03.584058  4766 replica.cpp:676] Persisted action at 3
I1009 11:47:03.584046  4769 replica.cpp:676] Persisted action at 3
I1009 11:47:03.584064  4764 replica.cpp:676] Persisted action at 3
I1009 11:47:03.584064  4766 replica.cpp:661] Replica learned APPEND action at position 3
I1009 11:47:03.584065  4769 replica.cpp:661] Replica learned APPEND action at position 3
I1009 11:47:03.584079  4769 replica.cpp:655] Replica received learned notice for position 3
I1009 11:47:03.584094  4766 replica.cpp:655] Replica received learned notice for position 3
I1009 11:47:03.600181  4768 leveldb.cpp:343] Persisting action (19 bytes) to leveldb took 619.632972ms
I1009 11:47:03.600199  4768 replica.cpp:676] Persisted action at 3
I1009 11:47:03.600213  4768 replica.cpp:655] Replica received learned notice for position 3
I1009 11:47:04.218026  4766 leveldb.cpp:343] Persisting action (19 bytes) to leveldb took 633.908336ms
I1009 11:47:04.218060  4766 replica.cpp:676] Persisted action at 3
I1009 11:47:04.218062  4766 replica.cpp:661] Replica learned APPEND action at position 3
I1009 11:47:04.218289  4766 replica.cpp:369] Replica ignoring promise request as it is in RECOVERING status
I1009 11:47:04.802306  4763 leveldb.cpp:343] Persisting action (19 bytes) to leveldb took 1.805269938secs
I1009 11:47:04.802347  4763 replica.cpp:676] Persisted action at 3
I1009 11:47:04.802352  4763 replica.cpp:661] Replica learned APPEND action at position 3
I1009 11:47:05.430233  4769 leveldb.cpp:343] Persisting action (19 bytes) to leveldb took 1.846132648secs
I1009 11:47:05.430233  4768 leveldb.cpp:343] Persisting action (19 bytes) to leveldb took 1.830000622secs
I1009 11:47:05.430287  4768 replica.cpp:676] Persisted action at 3
I1009 11:47:05.430268  4769 replica.cpp:676] Persisted action at 3
I1009 11:47:05.430294  4768 replica.cpp:661] Replica learned APPEND action at position 3
I1009 11:47:05.430299  4769 replica.cpp:661] Replica learned APPEND action at position 3
I1009 11:47:05.430336  4769 replica.cpp:369] Replica ignoring promise request as it is in RECOVERING status
I1009 11:47:05.430642  4763 replica.cpp:369] Replica ignoring promise request as it is in RECOVERING status
I1009 11:47:05.430666  4764 replica.cpp:369] Replica ignoring promise request as it is in RECOVERING status
[  FAILED  ] RecoverTest.RacingCatchup (34048 ms)

  was:
make check fails for me for the following tests:
[  FAILED  ] CoordinatorTest.MultipleAppendsNotLearnedFill
[  FAILED  ] CoordinatorTest.TruncateNotLearnedFill
[  FAILED  ] CoordinatorTest.TruncateLearnedFill
[  FAILED  ] RecoverTest.RacingCatchup
============

Apart from that, all the Coordinator tests take quite a while to run. Here is an excerpt:

[----------] 21 tests from CoordinatorTest
[ RUN      ] CoordinatorTest.Elect
[       OK ] CoordinatorTest.Elect (3142 ms)
[ RUN      ] CoordinatorTest.ElectWithClockPaused
[       OK ] CoordinatorTest.ElectWithClockPaused (3198 ms)
[ RUN      ] CoordinatorTest.AppendRead
[       OK ] CoordinatorTest.AppendRead (3933 ms)
[ RUN      ] CoordinatorTest.AppendReadError
[       OK ] CoordinatorTest.AppendReadError (6246 ms)
[ RUN      ] CoordinatorTest.AppendDiscarded
[       OK ] CoordinatorTest.AppendDiscarded (3204 ms)
[ RUN      ] CoordinatorTest.ElectNoQuorum
[       OK ] CoordinatorTest.ElectNoQuorum (735 ms)
[ RUN      ] CoordinatorTest.AppendNoQuorum
[       OK ] CoordinatorTest.AppendNoQuorum (1993 ms)
[ RUN      ] CoordinatorTest.Failover
[       OK ] CoordinatorTest.Failover (7065 ms)
[ RUN      ] CoordinatorTest.Demoted
[       OK ] CoordinatorTest.Demoted (6925 ms)
[ RUN      ] CoordinatorTest.Fill
[       OK ] CoordinatorTest.Fill (8687 ms)
[ RUN      ] CoordinatorTest.NotLearnedFill
[       OK ] CoordinatorTest.NotLearnedFill (8385 ms)
[ RUN      ] CoordinatorTest.MultipleAppends
[       OK ] CoordinatorTest.MultipleAppends (15361 ms)
[ RUN      ] CoordinatorTest.MultipleAppendsNotLearnedFill
../../src/tests/log_tests.cpp:1258: Failure
Failed to wait 10secs for electing
[  FAILED  ] CoordinatorTest.MultipleAppendsNotLearnedFill (28107 ms)
[ RUN      ] CoordinatorTest.Truncate
[       OK ] CoordinatorTest.Truncate (14017 ms)
[ RUN      ] CoordinatorTest.TruncateNotLearnedFill
../../src/tests/log_tests.cpp:1399: Failure
Failed to wait 10secs for electing
[  FAILED  ] CoordinatorTest.TruncateNotLearnedFill (28857 ms)
[ RUN      ] CoordinatorTest.TruncateLearnedFill
../../src/tests/log_tests.cpp:1484: Failure
Failed to wait 10secs for electing
[  FAILED  ] CoordinatorTest.TruncateLearnedFill (26668 ms)
[ RUN      ] CoordinatorTest.RacingElect
[       OK ] CoordinatorTest.RacingElect (0 ms)
[ RUN      ] CoordinatorTest.FillNoQuorum
[       OK ] CoordinatorTest.FillNoQuorum (0 ms)
[ RUN      ] CoordinatorTest.FillInconsistent
[       OK ] CoordinatorTest.FillInconsistent (0 ms)
[ RUN      ] CoordinatorTest.LearnedOnOneReplica_NotLearnedOnAnother
[       OK ] CoordinatorTest.LearnedOnOneReplica_NotLearnedOnAnother (0 ms)
[ RUN      ] CoordinatorTest.LearnedOnOneReplica_NotLearnedOnAnother_AnotherFailsAndRecovers
[       OK ] CoordinatorTest.LearnedOnOneReplica_NotLearnedOnAnother_AnotherFailsAndRecovers (0 ms)
[----------] 21 tests from CoordinatorTest (166524 ms total)

[----------] 4 tests from RecoverTest
[ RUN      ] RecoverTest.RacingCatchup
../../src/tests/log_tests.cpp:1571: Failure
Failed to wait 10secs for recovering4
[  FAILED  ] RecoverTest.RacingCatchup (44916 ms)
[ RUN      ] RecoverTest.CatchupRetry
[       OK ] RecoverTest.CatchupRetry (25811 ms)
[ RUN      ] RecoverTest.AutoInitialization
[       OK ] RecoverTest.AutoInitialization (4573 ms)
[ RUN      ] RecoverTest.AutoInitializationRetry
[       OK ] RecoverTest.AutoInitializationRetry (4950 ms)
[----------] 4 tests from RecoverTest (80250 ms total)

[----------] 2 tests from LogTest
[ RUN      ] LogTest.WriteRead
[       OK ] LogTest.WriteRead (3238 ms)
[ RUN      ] LogTest.Position
[       OK ] LogTest.Position (3204 ms)
[----------] 2 tests from LogTest (6442 ms total)

=========================

Here is the tail log of the failed tests:


I1009 11:42:21.737470  4476 replica.cpp:508] Replica received write request for position 6
I1009 11:42:21.737491  4476 leveldb.cpp:438] Reading position from leveldb took 10414ns
I1009 11:42:21.737517  4478 replica.cpp:508] Replica received write request for position 6
I1009 11:42:21.737531  4478 leveldb.cpp:438] Reading position from leveldb took 6549ns
../../src/tests/log_tests.cpp:1258: Failure
Failed to wait 10secs for electing
I1009 11:42:22.349469  4476 leveldb.cpp:343] Persisting action (17 bytes) to leveldb took 611.945625ms
I1009 11:42:22.349550  4476 replica.cpp:676] Persisted action at 6
[  FAILED  ] CoordinatorTest.MultipleAppendsNotLearnedFill (21352 ms)



I1009 11:42:56.001920  4476 replica.cpp:676] Persisted action at 5
I1009 11:42:56.002207  4478 replica.cpp:508] Replica received write request for position 5
I1009 11:42:56.002207  4481 replica.cpp:508] Replica received write request for position 5
I1009 11:42:56.002229  4478 leveldb.cpp:438] Reading position from leveldb took 9757ns
I1009 11:42:56.002248  4481 leveldb.cpp:438] Reading position from leveldb took 19658ns
../../src/tests/log_tests.cpp:1399: Failure
Failed to wait 10secs for electing
I1009 11:42:56.643134  4481 leveldb.cpp:343] Persisting action (17 bytes) to leveldb took 640.870146ms
I1009 11:42:56.643167  4481 replica.cpp:676] Persisted action at 5
I1009 11:42:56.688416  4478 leveldb.cpp:343] Persisting action (17 bytes) to leveldb took 686.162358ms
I1009 11:42:56.688446  4478 replica.cpp:676] Persisted action at 5
[  FAILED  ] CoordinatorTest.TruncateNotLearnedFill (34512 ms)


I1009 11:43:23.917522  4480 replica.cpp:676] Persisted action at 7
I1009 11:43:23.917670  4482 replica.cpp:655] Replica received learned notice for position 7
I1009 11:43:23.942389  4481 leveldb.cpp:343] Persisting action (8 bytes) to leveldb took 81.544002ms
I1009 11:43:23.942421  4481 replica.cpp:676] Persisted action at 7
I1009 11:43:23.942453  4481 replica.cpp:655] Replica received learned notice for position 7
../../src/tests/log_tests.cpp:1484: Failure
Failed to wait 10secs for electing
I1009 11:43:25.179002  4482 leveldb.cpp:343] Persisting action (19 bytes) to leveldb took 1.261297165secs
I1009 11:43:25.179033  4482 replica.cpp:676] Persisted action at 7
I1009 11:43:25.179038  4482 replica.I1009 11:43:23.917522  4480 replica.cpp:676] Persisted action at 7
I1009 11:43:23.917670  4482 replica.cpp:655] Replica received learned notice for position 7
I1009 11:43:23.942389  4481 leveldb.cpp:343] Persisting action (8 bytes) to leveldb took 81.544002ms
I1009 11:43:23.942421  4481 replica.cpp:676] Persisted action at 7
I1009 11:43:23.942453  4481 replica.cpp:655] Replica received learned notice for position 7
../../src/tests/log_tests.cpp:1484: Failure
Failed to wait 10secs for electing
I1009 11:43:25.179002  4482 leveldb.cpp:343] Persisting action (19 bytes) to leveldb took 1.261297165secs
I1009 11:43:25.179033  4482 replica.cpp:676] Persisted action at 7
I1009 11:43:25.179038  4482 replica.cpp:661] Replica learned APPEND action at position 7
[  FAILED  ] CoordinatorTest.TruncateLearnedFill (28317 ms)cpp:661] Replica learned APPEND action at position 7
[  FAILED  ] CoordinatorTest.TruncateLearnedFill (28317 ms)



I1009 11:47:02.996986  4763 replica.cpp:676] Persisted action at 3
I1009 11:47:02.997020  4763 replica.cpp:655] Replica received learned notice for position 3
../../src/tests/log_tests.cpp:1571: Failure
Failed to wait 10secs for recovering4
I1009 11:47:03.584002  4769 leveldb.cpp:343] Persisting action (19 bytes) to leveldb took 1.806120123secs
I1009 11:47:03.584002  4766 leveldb.cpp:343] Persisting action (19 bytes) to leveldb took 1.806115441secs
I1009 11:47:03.584002  4764 leveldb.cpp:343] Persisting action (19 bytes) to leveldb took 1.806241459secs
I1009 11:47:03.584058  4766 replica.cpp:676] Persisted action at 3
I1009 11:47:03.584046  4769 replica.cpp:676] Persisted action at 3
I1009 11:47:03.584064  4764 replica.cpp:676] Persisted action at 3
I1009 11:47:03.584064  4766 replica.cpp:661] Replica learned APPEND action at position 3
I1009 11:47:03.584065  4769 replica.cpp:661] Replica learned APPEND action at position 3
I1009 11:47:03.584079  4769 replica.cpp:655] Replica received learned notice for position 3
I1009 11:47:03.584094  4766 replica.cpp:655] Replica received learned notice for position 3
I1009 11:47:03.600181  4768 leveldb.cpp:343] Persisting action (19 bytes) to leveldb took 619.632972ms
I1009 11:47:03.600199  4768 replica.cpp:676] Persisted action at 3
I1009 11:47:03.600213  4768 replica.cpp:655] Replica received learned notice for position 3
I1009 11:47:04.218026  4766 leveldb.cpp:343] Persisting action (19 bytes) to leveldb took 633.908336ms
I1009 11:47:04.218060  4766 replica.cpp:676] Persisted action at 3
I1009 11:47:04.218062  4766 replica.cpp:661] Replica learned APPEND action at position 3
I1009 11:47:04.218289  4766 replica.cpp:369] Replica ignoring promise request as it is in RECOVERING status
I1009 11:47:04.802306  4763 leveldb.cpp:343] Persisting action (19 bytes) to leveldb took 1.805269938secs
I1009 11:47:04.802347  4763 replica.cpp:676] Persisted action at 3
I1009 11:47:04.802352  4763 replica.cpp:661] Replica learned APPEND action at position 3
I1009 11:47:05.430233  4769 leveldb.cpp:343] Persisting action (19 bytes) to leveldb took 1.846132648secs
I1009 11:47:05.430233  4768 leveldb.cpp:343] Persisting action (19 bytes) to leveldb took 1.830000622secs
I1009 11:47:05.430287  4768 replica.cpp:676] Persisted action at 3
I1009 11:47:05.430268  4769 replica.cpp:676] Persisted action at 3
I1009 11:47:05.430294  4768 replica.cpp:661] Replica learned APPEND action at position 3
I1009 11:47:05.430299  4769 replica.cpp:661] Replica learned APPEND action at position 3
I1009 11:47:05.430336  4769 replica.cpp:369] Replica ignoring promise request as it is in RECOVERING status
I1009 11:47:05.430642  4763 replica.cpp:369] Replica ignoring promise request as it is in RECOVERING status
I1009 11:47:05.430666  4764 replica.cpp:369] Replica ignoring promise request as it is in RECOVERING status
[  FAILED  ] RecoverTest.RacingCatchup (34048 ms)


> CoordinatorTest.*LearnedFill and RecoverTest.RacingCatchup tests failing
> ------------------------------------------------------------------------
>
>                 Key: MESOS-1890
>                 URL: https://issues.apache.org/jira/browse/MESOS-1890
>             Project: Mesos
>          Issue Type: Bug
>          Components: test
>         Environment: Mesos HEAD on OpenSUSE Factory running on Macbook Pro with the following:
> $:...mesos/mesos-master-clean/build> uname -a
> Linux Kapil-MBP.localdomain 3.16.2-1.gdcee397-desktop #1 SMP PREEMPT Sun Sep 7 04:40:17 UTC 2014 (dcee397) x86_64 x86_64 x86_64 GNU/Linux
> $:...mesos/mesos-master-clean/build> gcc --version
> gcc (SUSE Linux) 4.9.0
> Copyright (C) 2014 Free Software Foundation, Inc.
> This is free software; see the source for copying conditions.  There is NO
> warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
> $:...mesos/mesos-master-clean/build> /lib64/libc.so.6 --version
> GNU C Library (GNU libc) stable release version 2.19 (git 9a869d822025), by Roland McGrath et al.
> Copyright (C) 2014 Free Software Foundation, Inc.
> This is free software; see the source for copying conditions.
> There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
> PARTICULAR PURPOSE.
> Configured for x86_64-suse-linux.
> Compiled by GNU CC version 4.8.3 20140627 [gcc-4_8-branch revision 212064].
> Compiled on a Linux 3.16.0 system.
> Available extensions:
>         crypt add-on version 2.1 by Michael Glad and others
>         GNU Libidn by Simon Josefsson
>         Native POSIX Threads Library by Ulrich Drepper et al
>         BIND-8.2.3-T5B
> libc ABIs: UNIQUE IFUNC
> For bug reporting instructions, please see:
> <http://bugs.opensuse.org>.
>            Reporter: Kapil Arya
>         Attachments: mesos-failed-tests.txt
>
>
> make check fails for me for the following tests:
> [  FAILED  ] CoordinatorTest.MultipleAppendsNotLearnedFill
> [  FAILED  ] CoordinatorTest.TruncateNotLearnedFill
> [  FAILED  ] CoordinatorTest.TruncateLearnedFill
> [  FAILED  ] RecoverTest.RacingCatchup
> ============
> Apart from that, all the Coordinator tests take quite a while to run. Here is an excerpt:
> [----------] 21 tests from CoordinatorTest
> [ RUN      ] CoordinatorTest.Elect
> [       OK ] CoordinatorTest.Elect (3142 ms)
> [ RUN      ] CoordinatorTest.ElectWithClockPaused
> [       OK ] CoordinatorTest.ElectWithClockPaused (3198 ms)
> [ RUN      ] CoordinatorTest.AppendRead
> [       OK ] CoordinatorTest.AppendRead (3933 ms)
> [ RUN      ] CoordinatorTest.AppendReadError
> [       OK ] CoordinatorTest.AppendReadError (6246 ms)
> [ RUN      ] CoordinatorTest.AppendDiscarded
> [       OK ] CoordinatorTest.AppendDiscarded (3204 ms)
> [ RUN      ] CoordinatorTest.ElectNoQuorum
> [       OK ] CoordinatorTest.ElectNoQuorum (735 ms)
> [ RUN      ] CoordinatorTest.AppendNoQuorum
> [       OK ] CoordinatorTest.AppendNoQuorum (1993 ms)
> [ RUN      ] CoordinatorTest.Failover
> [       OK ] CoordinatorTest.Failover (7065 ms)
> [ RUN      ] CoordinatorTest.Demoted
> [       OK ] CoordinatorTest.Demoted (6925 ms)
> [ RUN      ] CoordinatorTest.Fill
> [       OK ] CoordinatorTest.Fill (8687 ms)
> [ RUN      ] CoordinatorTest.NotLearnedFill
> [       OK ] CoordinatorTest.NotLearnedFill (8385 ms)
> [ RUN      ] CoordinatorTest.MultipleAppends
> [       OK ] CoordinatorTest.MultipleAppends (15361 ms)
> [ RUN      ] CoordinatorTest.MultipleAppendsNotLearnedFill
> ../../src/tests/log_tests.cpp:1258: Failure
> Failed to wait 10secs for electing
> [  FAILED  ] CoordinatorTest.MultipleAppendsNotLearnedFill (28107 ms)
> [ RUN      ] CoordinatorTest.Truncate
> [       OK ] CoordinatorTest.Truncate (14017 ms)
> [ RUN      ] CoordinatorTest.TruncateNotLearnedFill
> ../../src/tests/log_tests.cpp:1399: Failure
> Failed to wait 10secs for electing
> [  FAILED  ] CoordinatorTest.TruncateNotLearnedFill (28857 ms)
> [ RUN      ] CoordinatorTest.TruncateLearnedFill
> ../../src/tests/log_tests.cpp:1484: Failure
> Failed to wait 10secs for electing
> [  FAILED  ] CoordinatorTest.TruncateLearnedFill (26668 ms)
> [ RUN      ] CoordinatorTest.RacingElect
> [       OK ] CoordinatorTest.RacingElect (0 ms)
> [ RUN      ] CoordinatorTest.FillNoQuorum
> [       OK ] CoordinatorTest.FillNoQuorum (0 ms)
> [ RUN      ] CoordinatorTest.FillInconsistent
> [       OK ] CoordinatorTest.FillInconsistent (0 ms)
> [ RUN      ] CoordinatorTest.LearnedOnOneReplica_NotLearnedOnAnother
> [       OK ] CoordinatorTest.LearnedOnOneReplica_NotL
> pick
> earnedOnAnother (0 ms)
> [ RUN      ] CoordinatorTest.LearnedOnOneReplica_NotLearnedOnAnother_AnotherFailsAndRecovers
> [       OK ] CoordinatorTest.LearnedOnOneReplica_NotLearnedOnAnother_AnotherFailsAndRecovers (0 ms)
> [----------] 21 tests from CoordinatorTest (166524 ms total)
> [----------] 4 tests from RecoverTest
> [ RUN      ] RecoverTest.RacingCatchup
> ../../src/tests/log_tests.cpp:1571: Failure
> Failed to wait 10secs for recovering4
> [  FAILED  ] RecoverTest.RacingCatchup (44916 ms)
> [ RUN      ] RecoverTest.CatchupRetry
> [       OK ] RecoverTest.CatchupRetry (25811 ms)
> [ RUN      ] RecoverTest.AutoInitialization
> [       OK ] RecoverTest.AutoInitialization (4573 ms)
> [ RUN      ] RecoverTest.AutoInitializationRetry
> [       OK ] RecoverTest.AutoInitializationRetry (4950 ms)
> [----------] 4 tests from RecoverTest (80250 ms total)
> [----------] 2 tests from LogTest
> [ RUN      ] LogTest.WriteRead
> [       OK ] LogTest.WriteRead (3238 ms)
> [ RUN      ] LogTest.Position
> [       OK ] LogTest.Position (3204 ms)
> [----------] 2 tests from LogTest (6442 ms total)
> =========================
> Here is the tail log of the failed tests:
> I1009 11:42:21.737470  4476 replica.cpp:508] Replica received write request for position 6
> I1009 11:42:21.737491  4476 leveldb.cpp:438] Reading position from leveldb took 10414ns
> I1009 11:42:21.737517  4478 replica.cpp:508] Replica received write request for position 6
> I1009 11:42:21.737531  4478 leveldb.cpp:438] Reading position from leveldb took 6549ns
> ../../src/tests/log_tests.cpp:1258: Failure
> Failed to wait 10secs for electing
> I1009 11:42:22.349469  4476 leveldb.cpp:343] Persisting action (17 bytes) to leveldb took 611.945625ms
> I1009 11:42:22.349550  4476 replica.cpp:676] Persisted action at 6
> [  FAILED  ] CoordinatorTest.MultipleAppendsNotLearnedFill (21352 ms)
> I1009 11:42:56.001920  4476 replica.cpp:676] Persisted action at 5
> I1009 11:42:56.002207  4478 replica.cpp:508] Replica received write request for position 5
> I1009 11:42:56.002207  4481 replica.cpp:508] Replica received write request for position 5
> I1009 11:42:56.002229  4478 leveldb.cpp:438] Reading position from leveldb took 9757ns
> I1009 11:42:56.002248  4481 leveldb.cpp:438] Reading position from leveldb took 19658ns
> ../../src/tests/log_tests.cpp:1399: Failure
> Failed to wait 10secs for electing
> I1009 11:42:56.643134  4481 leveldb.cpp:343] Persisting action (17 bytes) to leveldb took 640.870146ms
> I1009 11:42:56.643167  4481 replica.cpp:676] Persisted action at 5
> I1009 11:42:56.688416  4478 leveldb.cpp:343] Persisting action (17 bytes) to leveldb took 686.162358ms
> I1009 11:42:56.688446  4478 replica.cpp:676] Persisted action at 5
> [  FAILED  ] CoordinatorTest.TruncateNotLearnedFill (34512 ms)
> I1009 11:43:23.917522  4480 replica.cpp:676] Persisted action at 7
> I1009 11:43:23.917670  4482 replica.cpp:655] Replica received learned notice for position 7
> I1009 11:43:23.942389  4481 leveldb.cpp:343] Persisting action (8 bytes) to leveldb took 81.544002ms
> I1009 11:43:23.942421  4481 replica.cpp:676] Persisted action at 7
> I1009 11:43:23.942453  4481 replica.cpp:655] Replica received learned notice for position 7
> ../../src/tests/log_tests.cpp:1484: Failure
> Failed to wait 10secs for electing
> I1009 11:43:25.179002  4482 leveldb.cpp:343] Persisting action (19 bytes) to leveldb took 1.261297165secs
> I1009 11:43:25.179033  4482 replica.cpp:676] Persisted action at 7
> I1009 11:43:25.179038  4482 replica.I1009 11:43:23.917522  4480 replica.cpp:676] Persisted action at 7
> I1009 11:43:23.917670  4482 replica.cpp:655] Replica received learned notice for position 7
> I1009 11:43:23.942389  4481 leveldb.cpp:343] Persisting action (8 bytes) to leveldb took 81.544002ms
> I1009 11:43:23.942421  4481 replica.cpp:676] Persisted action at 7
> I1009 11:43:23.942453  4481 replica.cpp:655] Replica received learned notice for position 7
> ../../src/tests/log_tests.cpp:1484: Failure
> Failed to wait 10secs for electing
> I1009 11:43:25.179002  4482 leveldb.cpp:343] Persisting action (19 bytes) to leveldb took 1.261297165secs
> I1009 11:43:25.179033  4482 replica.cpp:676] Persisted action at 7
> I1009 11:43:25.179038  4482 replica.cpp:661] Replica learned APPEND action at position 7
> [  FAILED  ] CoordinatorTest.TruncateLearnedFill (28317 ms)cpp:661] Replica learned APPEND action at position 7
> [  FAILED  ] CoordinatorTest.TruncateLearnedFill (28317 ms)
> I1009 11:47:02.996986  4763 replica.cpp:676] Persisted action at 3
> I1009 11:47:02.997020  4763 replica.cpp:655] Replica received learned notice for position 3
> ../../src/tests/log_tests.cpp:1571: Failure
> Failed to wait 10secs for recovering4
> I1009 11:47:03.584002  4769 leveldb.cpp:343] Persisting action (19 bytes) to leveldb took 1.806120123secs
> I1009 11:47:03.584002  4766 leveldb.cpp:343] Persisting action (19 bytes) to leveldb took 1.806115441secs
> I1009 11:47:03.584002  4764 leveldb.cpp:343] Persisting action (19 bytes) to leveldb took 1.806241459secs
> I1009 11:47:03.584058  4766 replica.cpp:676] Persisted action at 3
> I1009 11:47:03.584046  4769 replica.cpp:676] Persisted action at 3
> I1009 11:47:03.584064  4764 replica.cpp:676] Persisted action at 3
> I1009 11:47:03.584064  4766 replica.cpp:661] Replica learned APPEND action at position 3
> I1009 11:47:03.584065  4769 replica.cpp:661] Replica learned APPEND action at position 3
> I1009 11:47:03.584079  4769 replica.cpp:655] Replica received learned notice for position 3
> I1009 11:47:03.584094  4766 replica.cpp:655] Replica received learned notice for position 3
> I1009 11:47:03.600181  4768 leveldb.cpp:343] Persisting action (19 bytes) to leveldb took 619.632972ms
> I1009 11:47:03.600199  4768 replica.cpp:676] Persisted action at 3
> I1009 11:47:03.600213  4768 replica.cpp:655] Replica received learned notice for position 3
> I1009 11:47:04.218026  4766 leveldb.cpp:343] Persisting action (19 bytes) to leveldb took 633.908336ms
> I1009 11:47:04.218060  4766 replica.cpp:676] Persisted action at 3
> I1009 11:47:04.218062  4766 replica.cpp:661] Replica learned APPEND action at position 3
> I1009 11:47:04.218289  4766 replica.cpp:369] Replica ignoring promise request as it is in RECOVERING status
> I1009 11:47:04.802306  4763 leveldb.cpp:343] Persisting action (19 bytes) to leveldb took 1.805269938secs
> I1009 11:47:04.802347  4763 replica.cpp:676] Persisted action at 3
> I1009 11:47:04.802352  4763 replica.cpp:661] Replica learned APPEND action at position 3
> I1009 11:47:05.430233  4769 leveldb.cpp:343] Persisting action (19 bytes) to leveldb took 1.846132648secs
> I1009 11:47:05.430233  4768 leveldb.cpp:343] Persisting action (19 bytes) to leveldb took 1.830000622secs
> I1009 11:47:05.430287  4768 replica.cpp:676] Persisted action at 3
> I1009 11:47:05.430268  4769 replica.cpp:676] Persisted action at 3
> I1009 11:47:05.430294  4768 replica.cpp:661] Replica learned APPEND action at position 3
> I1009 11:47:05.430299  4769 replica.cpp:661] Replica learned APPEND action at position 3
> I1009 11:47:05.430336  4769 replica.cpp:369] Replica ignoring promise request as it is in RECOVERING status
> I1009 11:47:05.430642  4763 replica.cpp:369] Replica ignoring promise request as it is in RECOVERING status
> I1009 11:47:05.430666  4764 replica.cpp:369] Replica ignoring promise request as it is in RECOVERING status
> [  FAILED  ] RecoverTest.RacingCatchup (34048 ms)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)