Posted to issues@mesos.apache.org by "Meng Zhu (JIRA)" <ji...@apache.org> on 2019/02/14 01:34:00 UTC
[jira] [Commented] (MESOS-5048)
MesosContainerizerSlaveRecoveryTest.ResourceStatistics is flaky
[ https://issues.apache.org/jira/browse/MESOS-5048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16767776#comment-16767776 ]
Meng Zhu commented on MESOS-5048:
---------------------------------
commit 1875b5380de926ce1759715227883418e2fb9717
Author: Meng Zhu <mz...@mesosphere.io>
Date: Wed Jan 2 20:51:25 2019 -0800
Fixed test `MesosContainerizerSlaveRecoveryTest.ResourceStatistics`.
`MesosContainerizerSlaveRecoveryTest.ResourceStatistics` is flaky
due to a race between executor shutdown (caused by the executor never
receiving any tasks) and the test querying resource statistics. If the
executor is shut down before the statistics query, the test fails.
This patch fixes the test by explicitly waiting for the task to
be delivered and for the task status to transition to `TASK_RUNNING`
before restarting the agent. This way, the executor is not shut down
after the agent restarts, so there is no race.
Review: https://reviews.apache.org/r/69656
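The ordering described in the commit message can be illustrated with a small toy model. This is only a sketch and not the actual patch: `ToyAgent`, `restartWithoutWaiting` and `restartAfterTaskRunning` are hypothetical names, no Mesos APIs are used, and the real race is replaced by its worst-case outcome (the executor has not received a task by the time the agent restarts) so that the example is deterministic.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical toy model of the agent/executor interaction.
// None of these names exist in Mesos; they are for illustration only.
struct ToyAgent
{
  bool taskRunning = false;   // Has the task reached the running state?
  bool executorAlive = true;  // Is the executor still around?

  // Models the task being delivered and reaching TASK_RUNNING.
  void runTask() { taskRunning = true; }

  // Models an agent restart. Assumption (worst case of the race):
  // an executor without a running task shuts down.
  void restart()
  {
    if (!taskRunning) {
      executorAlive = false;
    }
  }

  // Number of containers a statistics query would report.
  std::size_t containers() const { return executorAlive ? 1u : 0u; }
};

// Old ordering: restart the agent without waiting for the task.
std::size_t restartWithoutWaiting()
{
  ToyAgent agent;
  agent.restart();
  return agent.containers();
}

// Fixed ordering: wait until the task is running, then restart.
std::size_t restartAfterTaskRunning()
{
  ToyAgent agent;
  agent.runTask();
  assert(agent.taskRunning);  // The "explicit wait" in this toy model.
  agent.restart();
  return agent.containers();
}
```

In this model, `restartWithoutWaiting()` reports zero containers, which corresponds to the `containers.get().size()` failure quoted in the issue, while `restartAfterTaskRunning()` reports one.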
> MesosContainerizerSlaveRecoveryTest.ResourceStatistics is flaky
> ---------------------------------------------------------------
>
> Key: MESOS-5048
> URL: https://issues.apache.org/jira/browse/MESOS-5048
> Project: Mesos
> Issue Type: Bug
> Components: test
> Affects Versions: 0.28.0
> Environment: Ubuntu 15.04, Ubuntu 16.04
> Reporter: Jian Qiu
> Assignee: Meng Zhu
> Priority: Major
> Labels: flaky-test
> Fix For: 1.8.0
>
> Attachments: ResourceStatistics-badrun2.txt, ResourceStatistics-badrun3.txt, ResourceStatistics-badrun4.txt
>
>
> ./mesos-tests.sh --gtest_filter=MesosContainerizerSlaveRecoveryTest.ResourceStatistics --gtest_repeat=100 --gtest_break_on_failure
> This was found in rb and reproduced on my local machine. There are two types of failures. However, neither failure appears when enabling verbose...
> {code}
> ../../src/tests/environment.cpp:790: Failure
> Failed
> Tests completed with child processes remaining:
> -+- 1446 /mesos/mesos-0.29.0/_build/src/.libs/lt-mesos-tests
> \-+- 9171 sh -c /mesos/mesos-0.29.0/_build/src/mesos-executor
> \--- 9185 /mesos/mesos-0.29.0/_build/src/.libs/lt-mesos-executor
> {code}
> And
> {code}
> I0328 15:42:36.982471 5687 exec.cpp:150] Version: 0.29.0
> I0328 15:42:37.008765 5708 exec.cpp:225] Executor registered on slave 731fb93b-26fe-4c7c-a543-fc76f106a62e-S0
> Registered executor on mesos
> ../../src/tests/slave_recovery_tests.cpp:3506: Failure
> Value of: containers.get().size()
> Actual: 0
> Expected: 1u
> Which is: 1
> {code}