Posted to issues@mesos.apache.org by "haosdent (JIRA)" <ji...@apache.org> on 2016/04/01 12:04:25 UTC

[jira] [Commented] (MESOS-5075) Remain processes when running perf event related test cases

    [ https://issues.apache.org/jira/browse/MESOS-5075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15221487#comment-15221487 ] 

haosdent commented on MESOS-5075:
---------------------------------

{code}
    if (perf.isSome() && perf->status().isPending()) {
      kill(perf->pid(), SIGTERM);
    }
{code}
The problem here is that we don't reap the child after {{kill}}, so the killed child process stays a zombie.
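
To illustrate the kill-then-reap pattern, here is a minimal sketch in plain POSIX. It is not Mesos or libprocess code: {{spawnSleeper}} and {{terminateAndReap}} are hypothetical helpers made up for this example, and the {{sleep 60}} child only stands in for the perf process. The point is that the zombie entry goes away only once the parent calls {{waitpid}} after {{kill}}.

```cpp
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <cerrno>

// Hypothetical helper (not part of Mesos): start a child that just sleeps,
// standing in for the long-running perf process.
pid_t spawnSleeper()
{
  pid_t pid = fork();
  if (pid == 0) {
    execlp("sleep", "sleep", "60", static_cast<char*>(nullptr));
    _exit(127); // Only reached if exec fails.
  }
  return pid; // -1 if fork failed.
}

// Hypothetical helper: send SIGTERM, then block in waitpid() until the
// child has been reaped. Without the waitpid() call the terminated child
// would remain a zombie until the parent exits.
// Returns the raw wait status, or -1 on error.
int terminateAndReap(pid_t pid)
{
  if (kill(pid, SIGTERM) != 0) {
    return -1;
  }

  int status = 0;
  pid_t result;
  do {
    result = waitpid(pid, &status, 0); // Retry if interrupted by a signal.
  } while (result == -1 && errno == EINTR);

  return result == pid ? status : -1;
}
```

A caller would run {{terminateAndReap(pid)}} where the snippet above currently only calls {{kill}}. How this maps onto the future returned by {{perf->status()}} is left open here, since that depends on how the subprocess is already being reaped.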

> Remain processes when running perf event related test cases
> -----------------------------------------------------------
>
>                 Key: MESOS-5075
>                 URL: https://issues.apache.org/jira/browse/MESOS-5075
>             Project: Mesos
>          Issue Type: Bug
>            Reporter: haosdent
>            Assignee: haosdent
>              Labels: isolation, perf
>
> Currently, when running a single perf event related test case, I always see:
> {code}
> [----------] Global test environment tear-down
> ../../src/tests/environment.cpp:790: Failure
> Failed
> Tests completed with child processes remaining:
> -+- 22886 /home/haosdent/mesos/build/src/.libs/mesos-tests --gtest_filter=CgroupsIsolatorTest.ROOT_CGROUPS_PerfEventSubsystemSample --verbose
>  \-+- 22963 /home/haosdent/mesos/build/src/.libs/mesos-tests --gtest_filter=CgroupsIsolatorTest.ROOT_CGROUPS_PerfEventSubsystemSample --verbose
>    \-+- 22965 perf stat --all-cpus --field-separator , --log-fd 1 --event cycles --cgroup mesos/5f02f820-cc63-471b-98b9-37bbc4fde674 --event task-clock --cgroup mesos/5f02f820-cc63-471b-98b9-37bbc4fde674 -- sleep 0.25
>      \--- 22966 sleep 0.25
> [==========] 1 test from 1 test case ran. (3165 ms total)
> {code}
> In {{PerfEventIsolatorTest.ROOT_CGROUPS_Sample}}, we add a sleep.
> {code}
> sleep(2);
> {code}
> This avoids the remaining processes in most cases, but a better approach is to discard and kill the perf sample process before exiting.
> As discussed in [r43284 | https://reviews.apache.org/r/43284/], discard didn't work either; only waiting for the process to exit did. So we need to investigate why discard didn't work and fix it.


