Posted to reviews@mesos.apache.org by Jan Schlicht <ja...@mesosphere.io> on 2015/12/11 12:15:10 UTC

Review Request 40966: Corrected termination of Docker containers.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/40966/
-----------------------------------------------------------

Review request for mesos, Greg Mann, haosdent huang, Jojy Varghese, and Till Toenshoff.


Bugs: MESOS-4025
    https://issues.apache.org/jira/browse/MESOS-4025


Repository: mesos


Description
-------

Test cases have to wait until a container has been terminated by the
DockerContainerizer. Otherwise leftover artifacts (e.g. locked cgroups)
can affect later test cases (see MESOS-4025, where cgroups couldn't be
removed).
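
The ordering described above (stop, wait for termination, only then move on)
is not specific to Mesos. As a rough shell analogy, with a background process
standing in for a container: `stop_and_wait` is a hypothetical helper made up
for this sketch, not something from the patch or from Mesos.

```shell
#!/bin/sh
# Illustration only: a background process stands in for a container.
# stop_and_wait is a hypothetical helper, not part of Mesos or the patch.
stop_and_wait() {
  pid="$1"
  # Ask the process to terminate; ignore the error if it is already gone.
  kill -TERM "$pid" 2>/dev/null || true
  # Block until the process has actually exited before returning.
  wait "$pid" 2>/dev/null || true
}

# Scratch directory used by the stand-in process.
workdir=$(mktemp -d)
( cd "$workdir" && exec sleep 60 ) &
child=$!

stop_and_wait "$child"
# Clean up only after the process is known to have terminated.
rmdir "$workdir"
```

Skipping the `wait` step would let the cleanup race with a process that is
still shutting down, which is the kind of race the description refers to.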


Diffs
-----

  src/tests/health_check_tests.cpp b1454b085b36bb7c4d8ef012c764cd8466b4fb02 

Diff: https://reviews.apache.org/r/40966/diff/


Testing
-------

make check
sudo ./bin/mesos-tests.sh --gtest_repeat=50 --gtest_filter="HealthCheckTest.ROOT_DOCKER_*:SlaveRecoveryTest*GCExecutor"


Thanks,

Jan Schlicht


Re: Review Request 40966: Corrected termination of Docker containers.

Posted by Mesos ReviewBot <re...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/40966/#review110120
-----------------------------------------------------------


Patch looks great!

Reviews applied: [40966]

Passed command: export OS=ubuntu:14.04;export CONFIGURATION="--verbose";export COMPILER=gcc; ./support/docker_build.sh

- Mesos ReviewBot


Re: Review Request 40966: Corrected termination of Docker containers.

Posted by Greg Mann <gr...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/40966/#review110317
-----------------------------------------------------------


Hi Jan! I tried your patch and still got some failures on my Ubuntu 14.04 VM. I did the following to build & test, both before and after your patch:

`./bootstrap`
`cd build && ../configure`
`sudo GTEST_FILTER="" make -j6 check`
`sudo bin/mesos-tests.sh`
`sudo GTEST_FILTER="SlaveRecoveryTest*" bin/mesos-tests.sh`

Before your patch, I saw a couple of the `SlaveRecoveryTest`s fail after `sudo bin/mesos-tests.sh`, then they all failed when they were re-run in the final command. After your patch, all of the `SlaveRecoveryTest`s passed during `sudo bin/mesos-tests.sh`, but then they all still failed during the final step. Looks like the same error I was seeing before:

[ RUN      ] SlaveRecoveryTest/0.MasterFailover
../../src/tests/mesos.cpp:906: Failure
(cgroups::destroy(hierarchy, cgroup)).failure(): Failed to remove cgroup '/sys/fs/cgroup/perf_event/mesos_test': Device or resource busy
-----------------------------------------------------------
We're very sorry but we can't seem to destroy existing
cgroups that we likely created as part of an earlier
invocation of the tests. Please manually destroy the cgroup
at '/sys/fs/cgroup/perf_event/mesos_test' by first
manually killing all the processes found in the file at '/sys/fs/cgroup/perf_event/mesos_test/tasks'
-----------------------------------------------------------
../../src/tests/mesos.cpp:940: Failure
(cgroups::destroy(hierarchy, cgroup)).failure(): Failed to remove cgroup '/sys/fs/cgroup/perf_event/mesos_test': Device or resource busy
[  FAILED  ] SlaveRecoveryTest/0.MasterFailover, where TypeParam = mesos::internal::slave::MesosContainerizer (18 ms)
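
The manual cleanup that the failure message asks for could be scripted
roughly as follows. This is a sketch under assumptions: `kill_cgroup_tasks`
is a hypothetical helper, not part of Mesos, and it assumes a cgroup
directory that exposes a `tasks` file with one ID per line, as in the path
shown in the message.

```shell
#!/bin/sh
# Hypothetical helper, not part of Mesos: kill every process listed in a
# cgroup's tasks file, as the failure message suggests doing by hand.
kill_cgroup_tasks() {
  tasks="$1/tasks"
  if [ ! -r "$tasks" ]; then
    echo "cannot read $tasks" >&2
    return 1
  fi
  while read -r pid; do
    if [ -n "$pid" ]; then
      # The process may have exited already; that is not an error here.
      kill -KILL "$pid" 2>/dev/null || true
    fi
  done < "$tasks"
  return 0
}
```

On the VM from the log this might be run as root with
`kill_cgroup_tasks /sys/fs/cgroup/perf_event/mesos_test`, followed by `rmdir`
on the same path once the tasks file is empty.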

- Greg Mann


Re: Review Request 40966: Corrected termination of Docker containers.

Posted by haosdent huang <ha...@gmail.com>.

> On Dec. 12, 2015, 8:36 a.m., haosdent huang wrote:
> > Ship It!

Thank you very much. I verified this on Ubuntu 14.04 with your test command. Before applying your patch I could reproduce the problem; after applying it, the tests pass.


- haosdent


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/40966/#review110076
-----------------------------------------------------------


Re: Review Request 40966: Corrected termination of Docker containers.

Posted by haosdent huang <ha...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/40966/#review110076
-----------------------------------------------------------

Ship It!

- haosdent huang

