Posted to issues@mesos.apache.org by "Zhitao Li (JIRA)" <ji...@apache.org> on 2017/09/06 16:01:00 UTC
[jira] [Commented] (MESOS-5893) mesos-executor should adopt and reap orphan child processes
[ https://issues.apache.org/jira/browse/MESOS-5893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16155588#comment-16155588 ]
Zhitao Li commented on MESOS-5893:
----------------------------------
Is this problem still there, [~jieyu]?
> mesos-executor should adopt and reap orphan child processes
> -----------------------------------------------------------
>
> Key: MESOS-5893
> URL: https://issues.apache.org/jira/browse/MESOS-5893
> Project: Mesos
> Issue Type: Bug
> Components: containerization
> Affects Versions: 1.1.0
> Environment: mesos compiled from git master ( 1.1.0 )
> {{../configure --enable-ssl --enable-libevent --prefix=/usr --enable-optimize --enable-silent-rules --enable-xfs-disk-isolator}}
> isolators : {{namespaces/pid,cgroups/cpu,cgroups/mem,filesystem/linux,docker/runtime,network/cni,docker/volume}}
> Reporter: Stéphane Cottin
> Labels: containerizer
>
> The Mesos containerizer does not properly handle the death of child processes.
> Discovered using marathon-lb: each topology update forks another haproxy. The old haproxy process should die cleanly after its last client connection is terminated, but instead it turns into a zombie.
> {noformat}
> 7716 ? Ssl 0:00 | \_ mesos-executor --launcher_dir=/usr/libexec/mesos --sandbox_directory=/mnt/mesos/sandbox --user=root --working_directory=/marathon-lb --rootfs=/mnt/mesos/provisioner/containers/3b381d5c-7490-4dcd-ab4b-81051226075a/backends/overlay/rootfses/a4beacac-2d7e-445b-80c8-a9b4e480c491
> 7813 ? Ss 0:00 | | \_ sh -c /marathon-lb/run sse --marathon https://marathon:8443 --auth-credentials user:pass --group 'external' --ssl-certs /certs --max-serv-port-ip-per-task 20050
> 7823 ? S 0:00 | | | \_ /bin/bash /marathon-lb/run sse --marathon https://marathon:8443 --auth-credentials user:pass --group external --ssl-certs /certs --max-serv-port-ip-per-task 20050
> 7827 ? S 0:00 | | | \_ /usr/bin/runsv /marathon-lb/service/haproxy
> 7829 ? S 0:00 | | | | \_ /bin/bash ./run
> 8879 ? S 0:00 | | | | \_ sleep 0.5
> 7828 ? Sl 0:00 | | | \_ python3 /marathon-lb/marathon_lb.py --syslog-socket /dev/null --haproxy-config /marathon-lb/haproxy.cfg --ssl-certs /certs --command sv reload /marathon-lb/service/haproxy --sse --marathon https://marathon:8443 --auth-credentials user:pass --group external --max-serv-port-ip-per-task 20050
> 7906 ? Zs 0:00 | | \_ [haproxy] <defunct>
> 8628 ? Zs 0:00 | | \_ [haproxy] <defunct>
> 8722 ? Ss 0:00 | | \_ haproxy -p /tmp/haproxy.pid -f /marathon-lb/haproxy.cfg -D -sf 144 52
> {noformat}
> Update: mesos-executor should register itself as a child subreaper (PR_SET_CHILD_SUBREAPER, see http://man7.org/linux/man-pages/man2/prctl.2.html ) and propagate signals.
> code sample: https://github.com/krallin/tini/blob/master/src/tini.c
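
For illustration only, the subreaper-and-reap idea in the update above could be sketched as follows. This is a minimal Python sketch, not the mesos-executor implementation (which is C++) and not a port of tini. The helper names become_subreaper, reap_children and reap_until are hypothetical. The constant 36 is the value of PR_SET_CHILD_SUBREAPER in <linux/prctl.h>. Signal propagation is not covered here.

```python
import ctypes
import os
import time

PR_SET_CHILD_SUBREAPER = 36  # value from <linux/prctl.h> (Linux 3.4+)


def become_subreaper():
    """Ask the kernel to reparent orphaned descendants to this process.

    Returns True on success, False if prctl is unavailable or fails
    (for example on non-Linux systems or on older kernels).
    """
    try:
        libc = ctypes.CDLL(None, use_errno=True)
        rc = libc.prctl(
            ctypes.c_int(PR_SET_CHILD_SUBREAPER),
            ctypes.c_ulong(1),
            ctypes.c_ulong(0),
            ctypes.c_ulong(0),
            ctypes.c_ulong(0),
        )
        return rc == 0
    except (OSError, AttributeError, TypeError):
        return False


def reap_children():
    """Collect every child that has already exited, without blocking.

    Returns a list of (pid, raw_status) pairs. Once a process is a
    subreaper, adopted orphans show up here like direct children.
    """
    reaped = []
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # no children at all
        if pid == 0:
            break  # children exist, but none has exited yet
        reaped.append((pid, status))
    return reaped


def reap_until(target_pid, timeout=5.0):
    """Poll reap_children() until target_pid is collected.

    Returns the raw wait status, or None if the timeout expires.
    Any other exited children seen while polling are reaped as well.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        for pid, status in reap_children():
            if pid == target_pid:
                return status
        time.sleep(0.01)
    return None
```

A long-running supervisor would more likely call reap_children() from a SIGCHLD handler or its event loop than poll on a timer; the polling helper is only there to keep the sketch easy to exercise.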