Posted to issues@trafficcontrol.apache.org by "Dylan Volz (JIRA)" <ji...@apache.org> on 2017/08/09 16:01:00 UTC
[jira] [Comment Edited] (TC-374) `systemctl stop traffic_ops` does not kill all processes
[ https://issues.apache.org/jira/browse/TC-374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120083#comment-16120083 ]
Dylan Volz edited comment on TC-374 at 8/9/17 4:00 PM:
-------------------------------------------------------
This means the processes were orphaned and adopted by init (PID 1), as expected, but were then never waited on (reaped). I happened to have just read about a similar issue that Docker can have: https://blog.phusion.nl/2015/01/20/docker-and-the-pid-1-zombie-reaping-problem/
It seems that hypnotoad (which wraps Mojo::Server::Prefork) should be handling this, so it will require some digging. I think some tooling using http://mojolicious.org/perldoc/Mojo/Server/Prefork#reap and http://mojolicious.org/perldoc/Mojo/Server/Prefork#spawn may help illuminate what is going on.
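As a starting point for that digging, here is a minimal sketch of a filter that lists workers already reparented to PID 1. The function name `orphaned_workers` is hypothetical, and it assumes the usual `ps -ef` column layout (UID, PID, PPID, ...) and that the workers show up with `script/cdn` in their command line, as in the issue description.

```shell
#!/bin/sh
# Hypothetical helper (not part of traffic_ops): reads `ps -ef`-style lines
# on stdin and prints the PID of every script/cdn process whose parent is
# PID 1, i.e. a worker that has been orphaned and adopted by init.
orphaned_workers() {
    # Assumed `ps -ef` columns: UID PID PPID C STIME TTY TIME CMD
    awk '$3 == 1 && /script\/cdn/ { print $2 }'
}

# Usage against the live process table:
#   ps -ef | orphaned_workers
```

The PIDs it prints could then be compared with the one recorded in `/var/run/traffic_ops.pid` to see which workers a stop script keyed on that file would miss.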
was (Author: dylan_volz):
This means the processes were orphaned and adopted by init (PID 1), as expected, but were then never waited on (reaped). I happened to have just read about a similar issue that Docker can have, described in detail here: https://blog.phusion.nl/2015/01/20/docker-and-the-pid-1-zombie-reaping-problem/ Perhaps a similar solution can be applied in our startup script.
> `systemctl stop traffic_ops` does not kill all processes
> --------------------------------------------------------
>
> Key: TC-374
> URL: https://issues.apache.org/jira/browse/TC-374
> Project: Traffic Control
> Issue Type: Bug
> Components: Traffic Ops
> Affects Versions: 2.1.0
> Reporter: Dan Kirkwood
> Labels: systemctl
> Fix For: 2.1.0
>
>
> Not sure why it gets into this state, but traffic_ops processes sometimes end up with init (PID 1) as the parent process for many of the script/cdn workers. The init.d script doesn't stop them because they don't match the PID in `/var/run/traffic_ops.pid`.
> This needs to be more robust.
> Example: on my test VM, if I run this, I see multiple parent processes:
> {quote}
> ps -ef | grep script/cdn | grep -v root | awk '{print $3}' | sort -u
> 1
> 22713
> 29306
> {quote}
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)