You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@mesos.apache.org by "Zhitao Li (JIRA)" <ji...@apache.org> on 2017/05/15 16:48:04 UTC

[jira] [Commented] (MESOS-7381) Flaky tests in NestedMesosContainerizerTest

    [ https://issues.apache.org/jira/browse/MESOS-7381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16010887#comment-16010887 ] 

Zhitao Li commented on MESOS-7381:
----------------------------------

I suspect the problem is that many tests which relies on Unified containerizer's docker store is using the default `/tmp/mesos/store/docker` rather than the test workdir.

> Flaky tests in NestedMesosContainerizerTest
> -------------------------------------------
>
>                 Key: MESOS-7381
>                 URL: https://issues.apache.org/jira/browse/MESOS-7381
>             Project: Mesos
>          Issue Type: Bug
>          Components: test
>    Affects Versions: 1.1.1, 1.2.0
>            Reporter: Zhitao Li
>            Priority: Blocker
>
> There are about 12-13 tests in that class which are very flaky in my environment (Debian/jessie running 4.4.38 kernel).
> It seems like the primary cause is that root container "failed" and caused `reap` to be called on itself, which cascalade and cause both containers to be actively killed by containerizer instead of terminate by themselves.
> This happens on master branch at commit 6c1e20c0f2777d9bb831be3ff43c885b253af7bb
> Some log:
> {panel}
> [ RUN      ] NestedMesosContainerizerTest.ROOT_CGROUPS_LaunchNested
> I0412 16:06:29.698456 16110 containerizer.cpp:221] Using isolation: cgroups/cpu,filesystem/linux,namespaces/pid,network/cni,volume/image
> I0412 16:06:29.703445 16110 linux_launcher.cpp:150] Using /sys/fs/cgroup/freezer as the freezer hierarchy for the Linux launcher
> I0412 16:06:29.704658 16110 provisioner.cpp:249] Using default backend 'overlay'
> I0412 16:06:29.714778 16125 containerizer.cpp:608] Recovering containerizer
> I0412 16:06:29.718374 16125 provisioner.cpp:410] Provisioner recovery complete
> I0412 16:06:29.719195 16127 containerizer.cpp:1001] Starting container 7cd8794a-4c4f-43e2-8824-2459dc57753d for executor 'executor' of framework 
> I0412 16:06:29.721225 16128 cgroups.cpp:410] Creating cgroup at '/sys/fs/cgroup/cpu,cpuacct/mesos_test_723562d9-904b-4108-aaab-973184996ab8/7cd8794a-4c4f-43e2-8824-2459dc57753d' for container 7cd8794a-4c4f-43e2-8824-2459dc57753d
> I0412 16:06:29.723470 16128 cpu.cpp:101] Updated 'cpu.shares' to 1024 (cpus 1) for container 7cd8794a-4c4f-43e2-8824-2459dc57753d
> I0412 16:06:29.727627 16126 containerizer.cpp:1499] Launching 'mesos-containerizer' with flags '--help="false" --launch_info="{"clone_namespaces":[131072,536870912],"command":{"shell":true,"value":"sleep 1000"},"environment":{"variables":[{"name":"MESOS_SANDBOX","type":"VALUE","value":"\/tmp\/NestedMesosContainerizerTest_ROOT_CGROUPS_LaunchNested_AtLcOb"}]},"pre_exec_commands":[{"arguments":["mesos-containerizer","mount","--help=false","--operation=make-rslave","--path=\/"],"shell":false,"value":"\/home\/uber\/mesos\/build\/src\/mesos-containerizer"},{"shell":true,"value":"mount -n -t proc proc \/proc -o nosuid,noexec,nodev"}],"working_directory":"\/tmp\/NestedMesosContainerizerTest_ROOT_CGROUPS_LaunchNested_AtLcOb"}" --pipe_read="7" --pipe_write="8" --runtime_directory="/tmp/NestedMesosContainerizerTest_ROOT_CGROUPS_LaunchNested_CfIMuj/containers/7cd8794a-4c4f-43e2-8824-2459dc57753d" --unshare_namespace_mnt="false"'
> I0412 16:06:29.728123 16131 linux_launcher.cpp:429] Launching container 7cd8794a-4c4f-43e2-8824-2459dc57753d and cloning with namespaces CLONE_NEWNS | CLONE_NEWPID
> I0412 16:06:29.750948 16126 containerizer.cpp:1598] Checkpointing container's forked pid 16164 to '/tmp/NestedMesosContainerizerTest_ROOT_CGROUPS_LaunchNested_SX8Ner/meta/slaves/frameworks/executors/executor/runs/7cd8794a-4c4f-43e2-8824-2459dc57753d/pids/forked.pid'
> I0412 16:06:29.755005 16130 fetcher.cpp:353] Starting to fetch URIs for container: 7cd8794a-4c4f-43e2-8824-2459dc57753d, directory: /tmp/NestedMesosContainerizerTest_ROOT_CGROUPS_LaunchNested_AtLcOb
> I0412 16:06:29.758663 16124 containerizer.cpp:1766] Starting nested container 7cd8794a-4c4f-43e2-8824-2459dc57753d.fc7f3523-e348-43f3-b809-3be02e35315c
> I0412 16:06:29.762702 16126 containerizer.cpp:1499] Launching 'mesos-containerizer' with flags '--help="false" --launch_info="{"clone_namespaces":[131072,536870912],"command":{"shell":true,"value":"exit 42"},"enter_namespaces":[536870912],"environment":{"variables":[{"name":"MESOS_SANDBOX","type":"VALUE","value":"\/tmp\/NestedMesosContainerizerTest_ROOT_CGROUPS_LaunchNested_AtLcOb\/containers\/fc7f3523-e348-43f3-b809-3be02e35315c"}]},"pre_exec_commands":[{"arguments":["mesos-containerizer","mount","--help=false","--operation=make-rslave","--path=\/"],"shell":false,"value":"\/home\/uber\/mesos\/build\/src\/mesos-containerizer"},{"shell":true,"value":"mount -n -t proc proc \/proc -o nosuid,noexec,nodev"}],"working_directory":"\/tmp\/NestedMesosContainerizerTest_ROOT_CGROUPS_LaunchNested_AtLcOb\/containers\/fc7f3523-e348-43f3-b809-3be02e35315c"}" --pipe_read="7" --pipe_write="8" --runtime_directory="/tmp/NestedMesosContainerizerTest_ROOT_CGROUPS_LaunchNested_CfIMuj/containers/7cd8794a-4c4f-43e2-8824-2459dc57753d/containers/fc7f3523-e348-43f3-b809-3be02e35315c" --unshare_namespace_mnt="false"'
> I0412 16:06:29.763293 16124 linux_launcher.cpp:429] Launching nested container 7cd8794a-4c4f-43e2-8824-2459dc57753d.fc7f3523-e348-43f3-b809-3be02e35315c and cloning with namespaces CLONE_NEWNS | CLONE_NEWPID
> I0412 16:06:29.771055 16131 fetcher.cpp:353] Starting to fetch URIs for container: 7cd8794a-4c4f-43e2-8824-2459dc57753d.fc7f3523-e348-43f3-b809-3be02e35315c, directory: /tmp/NestedMesosContainerizerTest_ROOT_CGROUPS_LaunchNested_AtLcOb/containers/fc7f3523-e348-43f3-b809-3be02e35315c
> I0412 16:06:29.888044 16130 containerizer.cpp:2483] Container 7cd8794a-4c4f-43e2-8824-2459dc57753d has exited
> I0412 16:06:29.888092 16130 containerizer.cpp:2077] Destroying container 7cd8794a-4c4f-43e2-8824-2459dc57753d in RUNNING state
> I0412 16:06:29.888116 16130 containerizer.cpp:2077] Destroying container 7cd8794a-4c4f-43e2-8824-2459dc57753d.fc7f3523-e348-43f3-b809-3be02e35315c in RUNNING state
> I0412 16:06:29.888913 16130 containerizer.cpp:2483] Container 7cd8794a-4c4f-43e2-8824-2459dc57753d.fc7f3523-e348-43f3-b809-3be02e35315c has exited
> I0412 16:06:29.889135 16129 linux_launcher.cpp:505] Asked to destroy container 7cd8794a-4c4f-43e2-8824-2459dc57753d.fc7f3523-e348-43f3-b809-3be02e35315c
> I0412 16:06:29.889752 16129 linux_launcher.cpp:548] Using freezer to destroy cgroup mesos_test_723562d9-904b-4108-aaab-973184996ab8/7cd8794a-4c4f-43e2-8824-2459dc57753d/mesos/fc7f3523-e348-43f3-b809-3be02e35315c
> I0412 16:06:29.891119 16124 cgroups.cpp:2692] Freezing cgroup /sys/fs/cgroup/freezer/mesos_test_723562d9-904b-4108-aaab-973184996ab8/7cd8794a-4c4f-43e2-8824-2459dc57753d/mesos/fc7f3523-e348-43f3-b809-3be02e35315c
> I0412 16:06:29.892491 16130 cgroups.cpp:1405] Successfully froze cgroup /sys/fs/cgroup/freezer/mesos_test_723562d9-904b-4108-aaab-973184996ab8/7cd8794a-4c4f-43e2-8824-2459dc57753d/mesos/fc7f3523-e348-43f3-b809-3be02e35315c after 1.32096ms
> I0412 16:06:29.894062 16127 cgroups.cpp:2710] Thawing cgroup /sys/fs/cgroup/freezer/mesos_test_723562d9-904b-4108-aaab-973184996ab8/7cd8794a-4c4f-43e2-8824-2459dc57753d/mesos/fc7f3523-e348-43f3-b809-3be02e35315c
> I0412 16:06:29.895290 16127 cgroups.cpp:1434] Successfully thawed cgroup /sys/fs/cgroup/freezer/mesos_test_723562d9-904b-4108-aaab-973184996ab8/7cd8794a-4c4f-43e2-8824-2459dc57753d/mesos/fc7f3523-e348-43f3-b809-3be02e35315c after 1.184768ms
> I0412 16:06:29.900456 16130 provisioner.cpp:484] Ignoring destroy request for unknown container 7cd8794a-4c4f-43e2-8824-2459dc57753d.fc7f3523-e348-43f3-b809-3be02e35315c
> I0412 16:06:29.900617 16128 containerizer.cpp:2356] Checkpointing termination state to nested container's runtime directory '/tmp/NestedMesosContainerizerTest_ROOT_CGROUPS_LaunchNested_CfIMuj/containers/7cd8794a-4c4f-43e2-8824-2459dc57753d/containers/fc7f3523-e348-43f3-b809-3be02e35315c/termination'
> ../../src/tests/containerizer/nested_mesos_containerizer_tests.cpp:230: Failure
> Expecting WIFEXITED(wait.get()->status()) but  WIFSIGNALED(wait.get()->status()) is true and WTERMSIG(wait.get()->status()) is Killed
> I0412 16:06:29.901587 16125 linux_launcher.cpp:505] Asked to destroy container 7cd8794a-4c4f-43e2-8824-2459dc57753d
> I0412 16:06:29.902186 16125 linux_launcher.cpp:548] Using freezer to destroy cgroup mesos_test_723562d9-904b-4108-aaab-973184996ab8/7cd8794a-4c4f-43e2-8824-2459dc57753d
> I0412 16:06:29.903249 16125 cgroups.cpp:2692] Freezing cgroup /sys/fs/cgroup/freezer/mesos_test_723562d9-904b-4108-aaab-973184996ab8/7cd8794a-4c4f-43e2-8824-2459dc57753d/mesos
> I0412 16:06:29.903316 16124 cgroups.cpp:2692] Freezing cgroup /sys/fs/cgroup/freezer/mesos_test_723562d9-904b-4108-aaab-973184996ab8/7cd8794a-4c4f-43e2-8824-2459dc57753d
> I0412 16:06:30.006909 16130 cgroups.cpp:1405] Successfully froze cgroup /sys/fs/cgroup/freezer/mesos_test_723562d9-904b-4108-aaab-973184996ab8/7cd8794a-4c4f-43e2-8824-2459dc57753d after 103.539968ms
> I0412 16:06:30.007313 16124 cgroups.cpp:1405] Successfully froze cgroup /sys/fs/cgroup/freezer/mesos_test_723562d9-904b-4108-aaab-973184996ab8/7cd8794a-4c4f-43e2-8824-2459dc57753d/mesos after 104.015872ms
> I0412 16:06:30.008755 16128 cgroups.cpp:2710] Thawing cgroup /sys/fs/cgroup/freezer/mesos_test_723562d9-904b-4108-aaab-973184996ab8/7cd8794a-4c4f-43e2-8824-2459dc57753d/mesos
> I0412 16:06:30.011144 16125 cgroups.cpp:2710] Thawing cgroup /sys/fs/cgroup/freezer/mesos_test_723562d9-904b-4108-aaab-973184996ab8/7cd8794a-4c4f-43e2-8824-2459dc57753d
> I0412 16:06:30.012442 16125 cgroups.cpp:1434] Successfully thawed cgroup /sys/fs/cgroup/freezer/mesos_test_723562d9-904b-4108-aaab-973184996ab8/7cd8794a-4c4f-43e2-8824-2459dc57753d after 1.254912ms
> I0412 16:06:30.111548 16131 cgroups.cpp:1434] Successfully thawed cgroup /sys/fs/cgroup/freezer/mesos_test_723562d9-904b-4108-aaab-973184996ab8/7cd8794a-4c4f-43e2-8824-2459dc57753d/mesos after 102.735872ms
> I0412 16:06:30.118083 16126 provisioner.cpp:484] Ignoring destroy request for unknown container 7cd8794a-4c4f-43e2-8824-2459dc57753d
> I0412 16:06:30.142882 16130 cgroups.cpp:2692] Freezing cgroup /sys/fs/cgroup/freezer/mesos_test_723562d9-904b-4108-aaab-973184996ab8
> I0412 16:06:30.144379 16126 cgroups.cpp:1405] Successfully froze cgroup /sys/fs/cgroup/freezer/mesos_test_723562d9-904b-4108-aaab-973184996ab8 after 1.382144ms
> I0412 16:06:30.145800 16131 cgroups.cpp:2710] Thawing cgroup /sys/fs/cgroup/freezer/mesos_test_723562d9-904b-4108-aaab-973184996ab8
> I0412 16:06:30.147094 16131 cgroups.cpp:1434] Successfully thawed cgroup /sys/fs/cgroup/freezer/mesos_test_723562d9-904b-4108-aaab-973184996ab8 after 1.203968ms
> [  FAILED  ] NestedMesosContainerizerTest.ROOT_CGROUPS_LaunchNested (500 ms)
> {panel}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)