Posted to dev@uima.apache.org by "Jerry Cwiklik (JIRA)" <de...@uima.apache.org> on 2014/02/12 18:22:19 UTC

[jira] [Created] (UIMA-3612) DUCC Agent should detect defunct processes

Jerry Cwiklik created UIMA-3612:
-----------------------------------

             Summary: DUCC Agent should detect defunct processes 
                 Key: UIMA-3612
                 URL: https://issues.apache.org/jira/browse/UIMA-3612
             Project: UIMA
          Issue Type: Bug
          Components: DUCC
    Affects Versions: 1.0-Ducc
            Reporter: Jerry Cwiklik
            Assignee: Jerry Cwiklik


The Agent's rogue process detector should be changed to detect a process that is defunct. It has been observed that a process dumps core and then lingers as defunct. Because the process still exists, the agent is happy and keeps reporting it as Running.

Trying to kill it via kill -9 doesn't help. It looks like the defunct process must be cleaned up by root.

Modify the code to change the state of such a process from Running to Defunct (?).
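
One possible way to detect the defunct state, sketched here under the assumption that the Agent runs on Linux and may read /proc/<pid>/stat, where the field after the parenthesized command name is the process state and 'Z' means zombie. The names DefunctCheck, parseState and isDefunct are hypothetical and are not taken from the DUCC code base.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of DUCC: checks whether a pid is a zombie on Linux.
final class DefunctCheck {

    /** Extracts the one-character state field from the contents of /proc/<pid>/stat. */
    static char parseState(String statLine) {
        // The command name is wrapped in parentheses and may itself contain
        // spaces or ')', so locate the state relative to the LAST ')'.
        int close = statLine.lastIndexOf(')');
        if (close < 0 || close + 2 >= statLine.length()) {
            throw new IllegalArgumentException("unexpected stat format: " + statLine);
        }
        return statLine.charAt(close + 2);
    }

    /** Returns true if the kernel reports the process as a zombie (state 'Z'). */
    static boolean isDefunct(long pid) throws IOException {
        byte[] raw = Files.readAllBytes(Paths.get("/proc", Long.toString(pid), "stat"));
        return parseState(new String(raw, StandardCharsets.UTF_8)) == 'Z';
    }
}
```

A detector could call isDefunct(pid) for each process it currently reports as Running and change the reported state when the call returns true. If the process has already been reaped, /proc/<pid>/stat no longer exists and the read throws an IOException, which the caller would need to handle.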




--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

Re: [jira] [Created] (UIMA-3612) DUCC Agent should detect defunct processes

Posted by Jim Challenger <ch...@gmail.com>.
On this problem: the underlying issue is that the process became a 
zombie and wasn't cleaned up.  As best as can be seen, the waitpid() call 
behind Java's Process class never returned.  This looks like a possible 
kernel bug in SLES SP2, as we see bug warnings in the system log that 
correlate exactly with the process being zombified.

There's really nothing we can do to clean up the zombie.  I killed and 
restarted the parent Agent and almost 18 hours later the zombie is still 
an undead child of init.

Probably the only thing to do is for Agent to detect the situation and 
report the process is gone (it really is).

The Process.waitFor() call is just a Java wait() on a monitor (that is 
normally notified when the waitpid() system call returns).  It should be 
possible to interrupt it so the wait thread doesn't leak. Hard to know 
what might and might not work because the situation can't, in general, 
be replicated in test.
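
A minimal sketch of the interruptible wait, assuming the waiter runs in its own thread and that something else (for example a defunct check) decides when to give up. InterruptibleWaiter, watch, onExit and onAbandon are hypothetical names, not DUCC code, and whether this helps in the real failure is untested for the reason given above.

```java
// Hypothetical sketch, not DUCC code: wait for a child in a thread that can be released.
final class InterruptibleWaiter {

    /**
     * Starts a daemon thread that blocks in process.waitFor().
     * onExit runs if the wait returns normally; onAbandon runs if the
     * thread is interrupted while waiting.
     */
    static Thread watch(final Process process, final Runnable onExit, final Runnable onAbandon) {
        Thread waiter = new Thread(new Runnable() {
            public void run() {
                try {
                    process.waitFor();   // documented to throw InterruptedException
                    onExit.run();
                } catch (InterruptedException e) {
                    // Released by the caller: the child will never be reaped here.
                    onAbandon.run();
                }
            }
        }, "process-waiter");
        waiter.setDaemon(true);
        waiter.start();
        return waiter;
    }
}
```

The caller keeps the returned Thread and calls interrupt() on it once it decides the child is gone, so the waiter ends instead of leaking.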

Jim