You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@mesos.apache.org by Meng Zhu <mz...@mesosphere.io> on 2019/01/03 05:16:09 UTC

Review Request 69656: Fixed test `MesosContainerizerSlaveRecoveryTest.ResourceStatistics`.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69656/
-----------------------------------------------------------

Review request for mesos, Joseph Wu and Till Toenshoff.


Bugs: MESOS-5048
    https://issues.apache.org/jira/browse/MESOS-5048


Repository: mesos


Description
-------

`MesosContainerizerSlaveRecoveryTest.ResourceStatistics` is flaky
due to a race between executor shutdown (due to never getting any
tasks) and the test querying resource statistics. If the executor
is shutdown before the statistics query, the test will fail.

This patch fixes the test by explicitly waiting for the task to
be delivered and task status transition to `TASK_RUNNING` before
restarting the agent. This way, the executor will not be shutdown
after agent restart. Hence there will be no race.


Diffs
-----

  src/tests/slave_recovery_tests.cpp 0eb47e2bdf6a46fc21b59bb85b4b89181087ccd3 


Diff: https://reviews.apache.org/r/69656/diff/1/


Testing
-------

ran `MesosContainerizerSlaveRecoveryTest.ResourceStatistics` continuously without failure.


Thanks,

Meng Zhu


Re: Review Request 69656: Fixed test `MesosContainerizerSlaveRecoveryTest.ResourceStatistics`.

Posted by Mesos Reviewbot <re...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69656/#review212468
-----------------------------------------------------------



Patch looks great!

Reviews applied: [69656]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' CONFIGURATION='--verbose --disable-libtool-wrappers --disable-parallel-test-execution' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; ./support/docker-build.sh

- Mesos Reviewbot


On Jan. 3, 2019, 5:16 a.m., Meng Zhu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69656/
> -----------------------------------------------------------
> 
> (Updated Jan. 3, 2019, 5:16 a.m.)
> 
> 
> Review request for mesos, Joseph Wu and Till Toenshoff.
> 
> 
> Bugs: MESOS-5048
>     https://issues.apache.org/jira/browse/MESOS-5048
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> `MesosContainerizerSlaveRecoveryTest.ResourceStatistics` is flaky
> due to a race between executor shutdown (due to never getting any
> tasks) and the test querying resource statistics. If the executor
> is shutdown before the statistics query, the test will fail.
> 
> This patch fixes the test by explicitly waiting for the task to
> be delivered and task status transition to `TASK_RUNNING` before
> restarting the agent. This way, the executor will not be shutdown
> after agent restart. Hence there will be no race.
> 
> 
> Diffs
> -----
> 
>   src/tests/slave_recovery_tests.cpp 0eb47e2bdf6a46fc21b59bb85b4b89181087ccd3 
> 
> 
> Diff: https://reviews.apache.org/r/69656/diff/1/
> 
> 
> Testing
> -------
> 
> ran `MesosContainerizerSlaveRecoveryTest.ResourceStatistics` continuously without failure.
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>


Re: Review Request 69656: Fixed test `MesosContainerizerSlaveRecoveryTest.ResourceStatistics`.

Posted by Mesos Reviewbot Windows <re...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69656/#review211965
-----------------------------------------------------------



PASS: Mesos patch 69656 was successfully built and tested.

Reviews applied: `['69656']`

All the build artifacts available at: http://dcos-win.westus2.cloudapp.azure.com/artifacts/mesos-reviewbot-testing/2773/mesos-review-69656

- Mesos Reviewbot Windows


On Jan. 3, 2019, 5:16 a.m., Meng Zhu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69656/
> -----------------------------------------------------------
> 
> (Updated Jan. 3, 2019, 5:16 a.m.)
> 
> 
> Review request for mesos, Joseph Wu and Till Toenshoff.
> 
> 
> Bugs: MESOS-5048
>     https://issues.apache.org/jira/browse/MESOS-5048
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> `MesosContainerizerSlaveRecoveryTest.ResourceStatistics` is flaky
> due to a race between executor shutdown (due to never getting any
> tasks) and the test querying resource statistics. If the executor
> is shutdown before the statistics query, the test will fail.
> 
> This patch fixes the test by explicitly waiting for the task to
> be delivered and task status transition to `TASK_RUNNING` before
> restarting the agent. This way, the executor will not be shutdown
> after agent restart. Hence there will be no race.
> 
> 
> Diffs
> -----
> 
>   src/tests/slave_recovery_tests.cpp 0eb47e2bdf6a46fc21b59bb85b4b89181087ccd3 
> 
> 
> Diff: https://reviews.apache.org/r/69656/diff/1/
> 
> 
> Testing
> -------
> 
> ran `MesosContainerizerSlaveRecoveryTest.ResourceStatistics` continuously without failure.
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>


Re: Review Request 69656: Fixed test `MesosContainerizerSlaveRecoveryTest.ResourceStatistics`.

Posted by Meng Zhu <mz...@mesosphere.io>.

> On Jan. 14, 2019, 11:41 a.m., Joseph Wu wrote:
> > src/tests/slave_recovery_tests.cpp
> > Lines 5152-5153 (original), 5153-5154 (patched)
> > <https://reviews.apache.org/r/69656/diff/1/?file=2117263#file2117263line5153>
> >
> >     Not terribly impactful for the test, but since you are capturing the TASK_STARTING update, you could check the status before the TASK_RUNNING one.
> >     ```
> >       AWAIT_READY(statusStarting);
> >       EXPECT_EQ(TASK_STARTING, statusStarting->state());
> >     ```

Done.


- Meng


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69656/#review211962
-----------------------------------------------------------


On Jan. 2, 2019, 9:16 p.m., Meng Zhu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69656/
> -----------------------------------------------------------
> 
> (Updated Jan. 2, 2019, 9:16 p.m.)
> 
> 
> Review request for mesos, Joseph Wu and Till Toenshoff.
> 
> 
> Bugs: MESOS-5048
>     https://issues.apache.org/jira/browse/MESOS-5048
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> `MesosContainerizerSlaveRecoveryTest.ResourceStatistics` is flaky
> due to a race between executor shutdown (due to never getting any
> tasks) and the test querying resource statistics. If the executor
> is shutdown before the statistics query, the test will fail.
> 
> This patch fixes the test by explicitly waiting for the task to
> be delivered and task status transition to `TASK_RUNNING` before
> restarting the agent. This way, the executor will not be shutdown
> after agent restart. Hence there will be no race.
> 
> 
> Diffs
> -----
> 
>   src/tests/slave_recovery_tests.cpp 0eb47e2bdf6a46fc21b59bb85b4b89181087ccd3 
> 
> 
> Diff: https://reviews.apache.org/r/69656/diff/1/
> 
> 
> Testing
> -------
> 
> ran `MesosContainerizerSlaveRecoveryTest.ResourceStatistics` continuously without failure.
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>


Re: Review Request 69656: Fixed test `MesosContainerizerSlaveRecoveryTest.ResourceStatistics`.

Posted by Joseph Wu <jo...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69656/#review211962
-----------------------------------------------------------


Fix it, then Ship it!





src/tests/slave_recovery_tests.cpp
Lines 5152-5153 (original), 5153-5154 (patched)
<https://reviews.apache.org/r/69656/#comment297542>

    Not terribly impactful for the test, but since you are capturing the TASK_STARTING update, you could check the status before the TASK_RUNNING one.
    ```
      AWAIT_READY(statusStarting);
      EXPECT_EQ(TASK_STARTING, statusStarting->state());
    ```


- Joseph Wu


On Jan. 2, 2019, 9:16 p.m., Meng Zhu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69656/
> -----------------------------------------------------------
> 
> (Updated Jan. 2, 2019, 9:16 p.m.)
> 
> 
> Review request for mesos, Joseph Wu and Till Toenshoff.
> 
> 
> Bugs: MESOS-5048
>     https://issues.apache.org/jira/browse/MESOS-5048
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> `MesosContainerizerSlaveRecoveryTest.ResourceStatistics` is flaky
> due to a race between executor shutdown (due to never getting any
> tasks) and the test querying resource statistics. If the executor
> is shutdown before the statistics query, the test will fail.
> 
> This patch fixes the test by explicitly waiting for the task to
> be delivered and task status transition to `TASK_RUNNING` before
> restarting the agent. This way, the executor will not be shutdown
> after agent restart. Hence there will be no race.
> 
> 
> Diffs
> -----
> 
>   src/tests/slave_recovery_tests.cpp 0eb47e2bdf6a46fc21b59bb85b4b89181087ccd3 
> 
> 
> Diff: https://reviews.apache.org/r/69656/diff/1/
> 
> 
> Testing
> -------
> 
> ran `MesosContainerizerSlaveRecoveryTest.ResourceStatistics` continuously without failure.
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>