Posted to issues@trafficcontrol.apache.org by "Dylan Volz (JIRA)" <ji...@apache.org> on 2017/08/09 16:01:00 UTC

[jira] [Comment Edited] (TC-374) `systemctl stop traffic_ops` does not kill all processes

    [ https://issues.apache.org/jira/browse/TC-374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120083#comment-16120083 ] 

Dylan Volz edited comment on TC-374 at 8/9/17 4:00 PM:
-------------------------------------------------------

This means the processes were orphaned and adopted by init (PID 1), as expected, but then never waited on (reaped). I happened to have just read about a similar issue that Docker can have: https://blog.phusion.nl/2015/01/20/docker-and-the-pid-1-zombie-reaping-problem/.

It seems that hypnotoad (which wraps Mojo::Server::Prefork) should be handling this, so it will require some digging. I think some tooling built on the reap event (http://mojolicious.org/perldoc/Mojo/Server/Prefork#reap) and the spawn event (http://mojolicious.org/perldoc/Mojo/Server/Prefork#spawn) may help illuminate what is going on.
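
For anyone who wants to see the orphaning itself outside of Traffic Ops, here is a minimal sketch, not taken from the Traffic Ops scripts. The helper names spawn_orphan and parent_of are made up for illustration, and the sketch assumes either a Linux /proc filesystem or a ps that supports -o and -p. It starts a short-lived parent that backgrounds a child and exits, then prints who the child's parent is afterwards.

```shell
#!/bin/sh
# Hypothetical helper: start a short-lived parent shell that launches a
# background child and then exits. Prints "<parent_pid> <child_pid>".
spawn_orphan() {
    sh -c 'sleep 5 >/dev/null 2>&1 & echo "$$ $!"'
}

# Hypothetical helper: print the current parent PID of a process.
# Uses /proc when available (assumes Linux), otherwise falls back to ps.
parent_of() {
    if [ -r "/proc/$1/status" ]; then
        awk '/^PPid:/ { print $2 }' "/proc/$1/status"
    else
        ps -o ppid= -p "$1" | tr -d ' '
    fi
}

pids="$(spawn_orphan)"
old_parent="${pids% *}"   # PID of the parent shell, which has already exited
child="${pids#* }"        # PID of the background sleep it left behind

new_parent="$(parent_of "$child")"
echo "child $child: parent was $old_parent, is now $new_parent"

# Clean up the demo child.
kill "$child" 2>/dev/null || true
```

On a plain host the new parent is normally 1. Inside a container, or under a process that has registered itself as a subreaper, it may be some other PID, so treat the exact value as environment dependent.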


was (Author: dylan_volz):
This means the processes were orphaned, and adopted by root (as expected) but then never waited on (reaped). I happened to have just read about a similar issue docker can have here: https://blog.phusion.nl/2015/01/20/docker-and-the-pid-1-zombie-reaping-problem/ that describes it in detail and perhaps a similar solution can be applied in our start up script.

> `systemctl stop traffic_ops` does not kill all processes
> --------------------------------------------------------
>
>                 Key: TC-374
>                 URL: https://issues.apache.org/jira/browse/TC-374
>             Project: Traffic Control
>          Issue Type: Bug
>          Components: Traffic Ops
>    Affects Versions: 2.1.0
>            Reporter: Dan Kirkwood
>              Labels: systemctl
>             Fix For: 2.1.0
>
>
> Not sure why it happens, but traffic_ops processes sometimes get into a state where the parent process of many of the script/cdn workers is init (PID 1). The init.d script doesn't stop them because they don't match the pid in `/var/run/traffic_ops.pid`.
> This needs to be more robust.
> Example: on my test VM, if I run this, I see multiple parent processes:
> {quote}
>  ps -ef | grep script/cdn | grep -v root | awk '{print $3}' | sort -u
>  1
>  22713
>  29306
> {quote}
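
The check described above can be narrowed to just the stray workers. This is a rough sketch, not part of the existing init.d script. The function name orphaned_workers is made up, and the sketch assumes the usual `ps -ef` column order (UID, PID, PPID, ...) and that the pid file holds the PID of the hypnotoad manager, which may itself legitimately have parent 1 once daemonized.

```shell
#!/bin/sh
# Hypothetical helper: read `ps -ef`-style lines on stdin and print the PIDs
# of script/cdn processes whose parent is PID 1, skipping the manager PID
# given as the first argument (assumed to come from the pid file).
orphaned_workers() {
    manager_pid="$1"
    awk -v mgr="$manager_pid" '
        # $2 = PID, $3 = PPID in `ps -ef` output
        /script\/cdn/ && $3 == 1 && $2 != mgr { print $2 }
    '
}
```

It could be used as `ps -ef | orphaned_workers "$(cat /var/run/traffic_ops.pid)"`. Whether the stop action should then signal those PIDs is a separate decision; this only lists them.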


