Posted to issues@mesos.apache.org by "Meng Zhu (JIRA)" <ji...@apache.org> on 2019/01/03 05:07:00 UTC
[jira] [Commented] (MESOS-5048)
MesosContainerizerSlaveRecoveryTest.ResourceStatistics is flaky
[ https://issues.apache.org/jira/browse/MESOS-5048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16732670#comment-16732670 ]
Meng Zhu commented on MESOS-5048:
---------------------------------
Observed this test being flaky on our CI, so I am re-opening the ticket. Uploaded `ResourceStatistics-badrun4.txt`.
Looks like a different race from last time.
There is a race between executor shutdown (triggered because the executor never receives any tasks) and the test querying resource statistics. If the executor
is shut down before the statistics query, the test fails.
{noformat}
W0102 12:41:15.709300 27356 slave.cpp:5182] Shutting down reregistering executor '853030e8-dbf8-4493-8f02-557e061ad79a' of framework a4181b2d-03c1-4e32-ad0d-b2dd91245ca1-0000 at executor(1)@172.16.10.156:38324 because it has no tasks to run and has never been sent a task
{noformat}
> MesosContainerizerSlaveRecoveryTest.ResourceStatistics is flaky
> ---------------------------------------------------------------
>
> Key: MESOS-5048
> URL: https://issues.apache.org/jira/browse/MESOS-5048
> Project: Mesos
> Issue Type: Bug
> Components: test
> Affects Versions: 0.28.0
> Environment: Ubuntu 15.04, Ubuntu 16.04
> Reporter: Jian Qiu
> Assignee: Meng Zhu
> Priority: Major
> Labels: flaky-test
> Fix For: 1.8.0
>
> Attachments: ResourceStatistics-badrun2.txt, ResourceStatistics-badrun3.txt, ResourceStatistics-badrun4.txt
>
>
> ./mesos-tests.sh --gtest_filter=MesosContainerizerSlaveRecoveryTest.ResourceStatistics --gtest_repeat=100 --gtest_break_on_failure
> This was found in rb and reproduced on my local machine. There are two types of failures. However, the failure does not appear when enabling verbose...
> {code}
> ../../src/tests/environment.cpp:790: Failure
> Failed
> Tests completed with child processes remaining:
> -+- 1446 /mesos/mesos-0.29.0/_build/src/.libs/lt-mesos-tests
> \-+- 9171 sh -c /mesos/mesos-0.29.0/_build/src/mesos-executor
> \--- 9185 /mesos/mesos-0.29.0/_build/src/.libs/lt-mesos-executor
> {code}
> And
> {code}
> I0328 15:42:36.982471 5687 exec.cpp:150] Version: 0.29.0
> I0328 15:42:37.008765 5708 exec.cpp:225] Executor registered on slave 731fb93b-26fe-4c7c-a543-fc76f106a62e-S0
> Registered executor on mesos
> ../../src/tests/slave_recovery_tests.cpp:3506: Failure
> Value of: containers.get().size()
> Actual: 0
> Expected: 1u
> Which is: 1
> {code}