Posted to reviews@mesos.apache.org by Gaston Kleiman <ga...@mesosphere.io> on 2018/02/07 20:05:25 UTC
Review Request 65552: Added a regression test for MESOS-8468.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65552/
-----------------------------------------------------------
Review request for mesos, Anand Mazumdar, Greg Mann, Qian Zhang, and Vinod Kone.
Bugs: MESOS-8468
https://issues.apache.org/jira/browse/MESOS-8468
Repository: mesos
Description
-------
Added a regression test for MESOS-8468.
Diffs
-----
src/tests/default_executor_tests.cpp cc97e0d1fea7f4d0bc544d850593d8d91921b552
Diff: https://reviews.apache.org/r/65552/diff/1/
Testing
-------
`GLOG_v=1 sudo bin/mesos-tests.sh --gtest_filter='*ROOT_LaunchGroupFailure*' --verbose --gtest_repeat=650 --gtest_break_on_failure` on GNU/Linux
Thanks,
Gaston Kleiman
Re: Review Request 65552: Added a regression test for MESOS-8468.
Posted by Mesos Reviewbot Windows <re...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65552/#review197034
-----------------------------------------------------------
FAIL: Some of the unit tests failed. Please check the relevant logs.
Reviews applied: `['65548', '65549', '65550', '65551', '65552']`
Failed command: `Start-MesosCITesting`
All the build artifacts available at: http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/65552
Relevant logs:
- [mesos-tests-stdout.log](http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/65552/logs/mesos-tests-stdout.log):
```
[ OK ] Endpoint/SlaveEndpointTest.NoAuthorizer/2 (102 ms)
[----------] 9 tests from Endpoint/SlaveEndpointTest (1004 ms total)
[----------] 2 tests from ContainerizerType/DefaultContainerDNSFlagTest
[ RUN ] ContainerizerType/DefaultContainerDNSFlagTest.ValidateFlag/0
[ OK ] ContainerizerType/DefaultContainerDNSFlagTest.ValidateFlag/0 (32 ms)
[ RUN ] ContainerizerType/DefaultContainerDNSFlagTest.ValidateFlag/1
[ OK ] ContainerizerType/DefaultContainerDNSFlagTest.ValidateFlag/1 (37 ms)
[----------] 2 tests from ContainerizerType/DefaultContainerDNSFlagTest (70 ms total)
[----------] 1 test from IsolationFlag/CpuIsolatorTest
[ RUN ] IsolationFlag/CpuIsolatorTest.ROOT_UserCpuUsage/0
[ OK ] IsolationFlag/CpuIsolatorTest.ROOT_UserCpuUsage/0 (2284 ms)
[----------] 1 test from IsolationFlag/CpuIsolatorTest (2307 ms total)
[----------] 1 test from IsolationFlag/MemoryIsolatorTest
[ RUN ] IsolationFlag/MemoryIsolatorTest.ROOT_MemUsage/0
[ OK ] IsolationFlag/MemoryIsolatorTest.ROOT_MemUsage/0 (2250 ms)
[----------] 1 test from IsolationFlag/MemoryIsolatorTest (2274 ms total)
[----------] Global test environment tear-down
[==========] 852 tests from 85 test cases ran. (309053 ms total)
[ PASSED ] 851 tests.
[ FAILED ] 1 test, listed below:
[ FAILED ] MesosContainerizer/DefaultExecutorTest.ROOT_LaunchGroupFailure/0, where GetParam() = "mesos"
1 FAILED TEST
YOU HAVE 213 DISABLED TESTS
```
- [mesos-tests-stderr.log](http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/65552/logs/mesos-tests-stderr.log):
```
I0207 20:57:04.589033 5876 executor.cpp:171] Received SUBSCRIBED event
I0207 20:57:04.593070 5876 executor.cpp:175] Subscribed executor on build-srv-04.zq4gs31qjdiunm1ryi1452nvnh.dx.internal.cloudapp.net
I0207 20:57:04.594066 5876 executor.cpp:171] Received LAUNCH event
I0207 20:57:04.598068 5876 executor.cpp:638] Starting task c6317f1b-f8c3-4e2e-b15e-923ea21bc48f
I0207 20:57:04.670063 5876 executor.cpp:478] Running 'D:\DCOS\mesos\src\mesos-containerizer.exe launch <POSSIBLY-SENSITIVE-DATA>'
I0207 20:57:05.175050 5876 executor.cpp:651] Forked command at 2288
I0207 20:57:05.204072 6708 exec.cpp:445] Executor asked to shutdown
I0207 20:57:05.205040 5876 executor.cpp:171] Received SHUTDOWN event
I0207 20:57:05.205040 5876 executor.cpp:748] Shutting down
I0207 20:57:05.205040 5876 executor.cpp:863] Sending SIGTERM to process tree at pid 247 2068 master.cpp:3239] Deactivating framework b9f2a006-df38-46e0-b8be-85efde5447c3-0000 (default) at scheduler-bf610ad9-c067-4614-b82b-f632e1568bc6@10.3.1.5:59751
I0207 20:57:05.202073 3272 hierarchical.cpp:405] Deactivated framework b9f2a006-df38-46e0-b8be-85efde5447c3-0000
I0207 20:57:05.202073 2068 master.cpp:10204] Updating the state of task c6317f1b-f8c3-4e2e-b15e-923ea21bc48f of framework b9f2a006-df38-46e0-b8be-85efde5447c3-0000 (latest state: TASK_KILLED, status update state: TASK_KILLED)
I0207 20:57:05.202073 7332 slave.cpp:3479] Shutting down framework b9f2a006-df38-46e0-b8be-85efde5447c3-0000
I0207 20:57:05.202073 7332 slave.cpp:6178] Shutting down executor 'c6317f1b-f8c3-4e2e-b15e-923ea21bc48f' of framework b9f2a006-df38-46e0-b8be-85efde5447c3-0000 at executor(1)@10.3.1.5:59772
I0207 20:57:05.203073 7332 slave.cpp:931] Agent terminating
W0207 20:57:05.204072 7332 slave.cpp:3475] Ignoring shutdown framework b9f2a006-df38-46e0-b8be-85efde5447c3-0000 because it is terminating
I0207 20:57:05.205040 2068 master.cpp:10303] Removing task c6317f1b-f8c3-4e2e-b15e-923ea21bc48f with resources cpus(allocated: *):4; mem(allocated: *):2048; disk(allocated: *):1024; ports(allocated: *):[31000-32000] of framework b9f2a006-df38-46e0-b8be-85efde5447c3-0000 on agent b9f2a006-df38-46e0-b8be-85efde5447c3-S0 at slave(330)@10.3.1.5:59751 (build-srv-04.zq4gs31qjdiunm1ryi1452nvnh.dx.internal.cloudapp.net)
I0207 20:57:05.206046 8624 containerizer.cpp:2338] Destroying container 99c11c94-aea8-4a79-959b-c6b208d02c59 in RUNNING state
I0207 20:57:05.206046 8624 containerizer.cpp:2952] Transitioning the state of container 99c11c94-aea8-4a79-959b-c6b208d02c59 from RUNNING to DESTROYING
I0207 20:57:05.207041 8624 launcher.cpp:156] Asked to destroy container 99c11c94-aea8-4a79-959b-c6b208d02c59
I0207 20:57:05.208039 2068 master.cpp:1307] Agent b9f2a006-df38-46e0-b8be-85efde5447c3-S0 at slave(330)@10.3.1.5:59751 (build-srv-04.zq4gs31qjdiunm1ryi1452nvnh.dx.internal.cloudapp.net) disconnected
I0207 20:57:05.208039 2068 master.cpp:3276] Disconnecting agent b9f2a006-df38-46e0-b8be-85efde5447c3-S0 at slave(330)@10.3.1.5:59751 (build-srv-04.zq4gs31qjdiunm1ryi1452nvnh.dx.internal.cloudapp.net)
I0207 20:57:05.208039 3272 hierarchical.cpp:344] Removed framework b9f2a006-df38-46e0-b8be-85efde5447c3-0000
I0207 20:57:05.208039 2068 master.cpp:3295] Deactivating agent b9f2a006-df38-46e0-b8be-85efde5447c3-S0 at slave(330)@10.3.1.5:59751 (build-srv-04.zq4gs31qjdiunm1ryi1452nvnh.dx.internal.cloudapp.net)
I0207 20:57:05.209039 4508 hierarchical.cpp:766] Agent b9f2a006-df38-46e0-b8be-85efde5447c3-S0 deactivated
I0207 20:57:05.237257 11016 containerizer.cpp:2791] Container 99c11c94-aea8-4a79-959b-c6b208d02c59 has exited
I0207 20:57:05.267297 6468 master.cpp:1149] Master terminating
I0207 20:57:05.269306 3272 hierarchical.cpp:609] Removed agent b9f2a006-df38-46e0-b8be-85efde5447c3-S0
I0207 20:57:05.762310 10416 process.cpp:929] Stopped the socket accept loop
```
- Mesos Reviewbot Windows
Re: Review Request 65552: Added a regression test for MESOS-8468.
Posted by Qian Zhang <zh...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65552/#review197509
-----------------------------------------------------------
Ship it!
- Qian Zhang
On Feb. 13, 2018, 7:24 a.m., Gaston Kleiman wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65552/
> -----------------------------------------------------------
>
> (Updated Feb. 13, 2018, 7:24 a.m.)
>
>
> Review request for mesos, Anand Mazumdar, Greg Mann, Qian Zhang, and Vinod Kone.
>
>
> Bugs: MESOS-8468
> https://issues.apache.org/jira/browse/MESOS-8468
>
>
> Repository: mesos
>
>
> Description
> -------
>
> Added a regression test for MESOS-8468.
>
>
> Diffs
> -----
>
> src/tests/default_executor_tests.cpp cc97e0d1fea7f4d0bc544d850593d8d91921b552
>
>
> Diff: https://reviews.apache.org/r/65552/diff/4/
>
>
> Testing
> -------
>
> `GLOG_v=1 sudo bin/mesos-tests.sh --gtest_filter='*ROOT_LaunchGroupFailure*' --verbose --gtest_repeat=650 --gtest_break_on_failure` on GNU/Linux
>
>
> Thanks,
>
> Gaston Kleiman
>
>
Re: Review Request 65552: Added a regression test for MESOS-8468.
Posted by Gaston Kleiman <ga...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65552/
-----------------------------------------------------------
(Updated Feb. 12, 2018, 3:24 p.m.)
Review request for mesos, Anand Mazumdar, Greg Mann, Qian Zhang, and Vinod Kone.
Changes
-------
Swapped tasks in task groups in order to prevent a potential race.
Bugs: MESOS-8468
https://issues.apache.org/jira/browse/MESOS-8468
Repository: mesos
Description
-------
Added a regression test for MESOS-8468.
Diffs (updated)
-----
src/tests/default_executor_tests.cpp cc97e0d1fea7f4d0bc544d850593d8d91921b552
Diff: https://reviews.apache.org/r/65552/diff/3/
Changes: https://reviews.apache.org/r/65552/diff/2-3/
Testing
-------
`GLOG_v=1 sudo bin/mesos-tests.sh --gtest_filter='*ROOT_LaunchGroupFailure*' --verbose --gtest_repeat=650 --gtest_break_on_failure` on GNU/Linux
Thanks,
Gaston Kleiman
Re: Review Request 65552: Added a regression test for MESOS-8468.
Posted by Gaston Kleiman <ga...@mesosphere.io>.
> On Feb. 12, 2018, 12:35 p.m., Joseph Wu wrote:
> > src/tests/default_executor_tests.cpp
> > Lines 3450-3461 (patched)
> > <https://reviews.apache.org/r/65552/diff/2/?file=1954220#file1954220line3450>
> >
> > Is it possible for the following race to occur?
> >
> > * Executor launches task group 1 (expected to fail/kill)
> > * Executor performs the launch/kill.
> > * Executor commits suicide because it is no longer running any tasks.
> > * The agent sends the second task group to the now-dead executor.
Yeah, that sounds possible. I changed the test so that it now does the following:
1. Executor launches `taskGroup1` with a task that sleeps for a very long time and isn't expected to stop until killed.
2. Executor launches `taskGroup2` with a sleep task and one that should fail to launch.
3. Executor should kill the sleep task in `taskGroup2`.
4. Executor reports all tasks in `taskGroup2` as killed/failed.
5. Scheduler asks to kill the sole task in `taskGroup1`.
6. Executor should kill the task in `taskGroup1` and terminate.
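
As a rough illustration of why the order of the task groups matters, here is a toy C++ model of the race. It is only a sketch and does not use any Mesos APIs: `ToyExecutor`, `Task`, `launchGroup`, and the two helper functions are hypothetical names made up for this example. It assumes the executor terminates as soon as it is running no tasks, and that a task group with a task that fails to launch leaves none of its tasks running.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Toy model only: NOT the Mesos default executor. All names are hypothetical.
struct Task {
  std::string id;
  bool failsToLaunch;
};

using TaskGroup = std::vector<Task>;

class ToyExecutor {
public:
  // Returns false if the group arrives after the executor has terminated.
  bool launchGroup(const TaskGroup& group) {
    if (terminated) {
      return false;
    }

    bool anyFailed = false;
    for (const Task& task : group) {
      anyFailed = anyFailed || task.failsToLaunch;
    }

    // Assumption: if any task fails to launch, the rest of the group is
    // killed, so nothing from this group stays running.
    if (!anyFailed) {
      for (const Task& task : group) {
        running.insert(task.id);
      }
    }

    maybeTerminate();
    return true;
  }

  void kill(const std::string& id) {
    if (terminated) {
      return;
    }
    running.erase(id);
    maybeTerminate();
  }

  bool isTerminated() const { return terminated; }

private:
  // Assumption: the executor exits once it is running no tasks.
  void maybeTerminate() {
    if (running.empty()) {
      terminated = true;
    }
  }

  std::set<std::string> running;
  bool terminated = false;
};

// Failing group first: the executor exits before the second group arrives.
bool originalOrderDeliversBothGroups() {
  ToyExecutor executor;
  executor.launchGroup(TaskGroup{Task{"sleep", false}, Task{"bad", true}});
  return executor.launchGroup(TaskGroup{Task{"long-sleep", false}});
}

// Long-running group first: the executor survives the failing group and
// terminates only after the long-running task is killed.
bool swappedOrderDeliversBothGroups() {
  ToyExecutor executor;
  bool first = executor.launchGroup(TaskGroup{Task{"long-sleep", false}});
  bool second = executor.launchGroup(
      TaskGroup{Task{"sleep", false}, Task{"bad", true}});
  bool aliveAfterFailure = !executor.isTerminated();
  executor.kill("long-sleep");
  return first && second && aliveAfterFailure && executor.isTerminated();
}
```

In this model `originalOrderDeliversBothGroups()` returns false, because the second group reaches an executor that has already exited, while `swappedOrderDeliversBothGroups()` returns true. The real test relies on the agent and executor behaving this way; the sketch only captures the ordering argument.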
- Gaston
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65552/#review197312
-----------------------------------------------------------
Re: Review Request 65552: Added a regression test for MESOS-8468.
Posted by Joseph Wu <jo...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65552/#review197312
-----------------------------------------------------------
src/tests/default_executor_tests.cpp
Lines 3276 (patched)
<https://reviews.apache.org/r/65552/#comment277452>
s/shoud/should/
src/tests/default_executor_tests.cpp
Lines 3450-3461 (patched)
<https://reviews.apache.org/r/65552/#comment277454>
Is it possible for the following race to occur?
* Executor launches task group 1 (expected to fail/kill)
* Executor performs the launch/kill.
* Executor commits suicide because it is no longer running any tasks.
* The agent sends the second task group to the now-dead executor.
- Joseph Wu
On Feb. 7, 2018, 12:05 p.m., Gaston Kleiman wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65552/
> -----------------------------------------------------------
>
> (Updated Feb. 7, 2018, 12:05 p.m.)
>
>
> Review request for mesos, Anand Mazumdar, Greg Mann, Qian Zhang, and Vinod Kone.
>
>
> Bugs: MESOS-8468
> https://issues.apache.org/jira/browse/MESOS-8468
>
>
> Repository: mesos
>
>
> Description
> -------
>
> Added a regression test for MESOS-8468.
>
>
> Diffs
> -----
>
> src/tests/default_executor_tests.cpp cc97e0d1fea7f4d0bc544d850593d8d91921b552
>
>
> Diff: https://reviews.apache.org/r/65552/diff/2/
>
>
> Testing
> -------
>
> `GLOG_v=1 sudo bin/mesos-tests.sh --gtest_filter='*ROOT_LaunchGroupFailure*' --verbose --gtest_repeat=650 --gtest_break_on_failure` on GNU/Linux
>
>
> Thanks,
>
> Gaston Kleiman
>
>