Posted to dev@nutch.apache.org by "Paul Baclace (JIRA)" <ji...@apache.org> on 2005/12/26 23:09:30 UTC

[jira] Updated: (NUTCH-151) CommandRunner can hang after the main thread exec is finished and has inefficient busy loop

     [ http://issues.apache.org/jira/browse/NUTCH-151?page=all ]

Paul Baclace updated NUTCH-151:
-------------------------------

    Attachment: CommandRunner.java

Minimal required changes to fix bug NUTCH-151:
1. Make the pipe I/O threads daemon threads.
2. Have the main thread always interrupt() the pipe I/O threads when finishing up, not just when a timeout occurs.
3. Sleep before calling Process.exitValue() to test whether the process has finished.
4. Increase the sleep time to 1000 ms.
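
For illustration only, here is a minimal sketch of how the four changes above could fit together. It is not the attached CommandRunner.java: the class name PollingRunnerSketch, the helper startPipe(), and the constant POLL_MILLIS are hypothetical, and the sketch uses current Java syntax (lambdas, ProcessBuilder) rather than the JDK level Nutch targeted at the time.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of changes 1-4; this is NOT the attached CommandRunner.java.
class PollingRunnerSketch {
    static final long POLL_MILLIS = 1000; // change 4: poll once per second

    // Copies in to out on a daemon thread, so the thread cannot keep the JVM alive (change 1).
    static Thread startPipe(String name, InputStream in, OutputStream out) {
        Thread t = new Thread(() -> {
            byte[] buf = new byte[4096];
            try {
                for (int n; (n = in.read(buf)) != -1; ) {
                    out.write(buf, 0, n);
                }
                out.flush();
            } catch (IOException e) {
                // Stream closed or read interrupted: let the thread end.
            }
        }, name);
        t.setDaemon(true);
        t.start();
        return t;
    }

    // Returns the exit code, or -1 if the timeout elapsed (or the caller was interrupted) first.
    static int exec(List<String> command, long timeoutMillis) {
        Process p;
        try {
            p = new ProcessBuilder(command).start();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Thread out = startPipe("STDOUT", p.getInputStream(), System.out);
        Thread err = startPipe("STDERR", p.getErrorStream(), System.err);
        long deadline = System.currentTimeMillis() + timeoutMillis;
        try {
            while (System.currentTimeMillis() < deadline) {
                // Change 3: sleep first, so the exception below can never skip the sleep.
                Thread.sleep(POLL_MILLIS);
                try {
                    return p.exitValue();
                } catch (IllegalThreadStateException stillRunning) {
                    // Process has not finished yet; poll again.
                }
            }
            p.destroy(); // timed out
            return -1;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            p.destroy();
            return -1;
        } finally {
            out.interrupt(); // change 2: always interrupt the pipe threads,
            err.interrupt(); // not only when a timeout occurs
        }
    }

    public static void main(String[] args) {
        // Assumption: the running JVM's own launcher is used as a harmless test command.
        String java = System.getProperty("java.home") + "/bin/java";
        System.out.println("exit code: " + exec(Arrays.asList(java, "-version"), 10_000));
    }
}
```

Note that interrupt() may not unblock a thread sitting in a blocking stream read() on every JVM, which is why the daemon flag matters as a backstop: even a pipe thread stuck in read() cannot prevent the JVM from exiting.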

Obvious cleanup hitchhiking along:
5. Remove unused _kaput.
6. Add comments indicating the changes to make in order to use JDK 1.5 instead of the EDU.oswego.cs.dl.util.concurrent package.
7. Change void evaluate() to be a convenience method that calls int exec(), which returns the exit code (or -1 if timed out).

An alternative to the busy loop is to use Process.waitFor() and have a separate alarm thread interrupt the main thread to effect a timeout. The main thread can then interrupt() the I/O pipe threads, and they will receive an InterruptedIOException. If necessary, the main thread can also close the streams the I/O pipe threads are reading from in order to force them out of read(). (Oddly, the Javadoc for Thread.interrupt() does not mention InterruptedIOException.)
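
A rough sketch of that alternative follows, under stated assumptions: the class name WaitForRunnerSketch and the helpers startPipe() and closeQuietly() are hypothetical, the code is written in current Java rather than the JDK of the time, and it is not taken from the attachment. It also cannot tell an alarm interrupt apart from an external interrupt of the waiting thread; both are treated as a timeout.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the waitFor()-plus-alarm idea; all names are illustrative.
class WaitForRunnerSketch {

    // Copies in to out on a daemon thread until EOF or an I/O error.
    static Thread startPipe(String name, InputStream in, OutputStream out) {
        Thread t = new Thread(() -> {
            byte[] buf = new byte[4096];
            try {
                for (int n; (n = in.read(buf)) != -1; ) {
                    out.write(buf, 0, n);
                }
                out.flush();
            } catch (IOException e) {
                // Stream closed or read interrupted: let the thread end.
            }
        }, name);
        t.setDaemon(true);
        t.start();
        return t;
    }

    static void closeQuietly(Closeable c) {
        try {
            c.close();
        } catch (IOException ignored) {
            // Nothing useful to do if close fails.
        }
    }

    // Returns the exit code, or -1 if the wait was interrupted before the process finished.
    static int exec(List<String> command, long timeoutMillis) {
        Process p;
        try {
            p = new ProcessBuilder(command).start();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Thread out = startPipe("STDOUT", p.getInputStream(), System.out);
        Thread err = startPipe("STDERR", p.getErrorStream(), System.err);
        final Thread waiter = Thread.currentThread();
        Thread alarm = new Thread(() -> {
            try {
                Thread.sleep(timeoutMillis);
                waiter.interrupt(); // timeout: break the waiter out of waitFor()
            } catch (InterruptedException cancelled) {
                // The process finished in time; nothing to do.
            }
        }, "ALARM");
        alarm.setDaemon(true);
        alarm.start();
        int exit = -1;
        try {
            exit = p.waitFor(); // blocks without polling
            alarm.interrupt();  // cancel the alarm
            out.join(1000);     // give the pipe threads a moment to drain output
            err.join(1000);
        } catch (InterruptedException timedOut) {
            p.destroy();        // harmless if the process has already exited
        } finally {
            // A late alarm could still set the waiter's interrupt flag in a small
            // window after this point; a production version would guard against that.
            alarm.interrupt();
            out.interrupt();
            err.interrupt();
            closeQuietly(p.getInputStream()); // force the pipe threads out of read()
            closeQuietly(p.getErrorStream());
        }
        return exit;
    }

    public static void main(String[] args) {
        // Assumption: the running JVM's own launcher is used as a harmless test command.
        String java = System.getProperty("java.home") + "/bin/java";
        System.out.println("exit code: " + exec(Arrays.asList(java, "-version"), 10_000));
    }
}
```

On Java 8 and later, Process.waitFor(long, TimeUnit) provides a timed wait directly, so the alarm thread is only needed on older JDKs.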



> CommandRunner can hang after the main thread exec is finished and has inefficient busy loop
> -------------------------------------------------------------------------------------------
>
>          Key: NUTCH-151
>          URL: http://issues.apache.org/jira/browse/NUTCH-151
>      Project: Nutch
>         Type: Bug
>   Components: indexer
>     Versions: 0.8-dev
>  Environment: all
>     Reporter: Paul Baclace
>  Attachments: CommandRunner.java
>
> I encountered a case where the JVM of a Tasktracker child did not exit after the main thread returned; a thread dump showed only the threads named STDOUT and STDERR from CommandRunner as non-daemon threads, and both were doing a read().
> CommandRunner usually works correctly when the subprocess is expected to finish before the timeout or when no timeout is used. By _usually_, I mean in the absence of external thread interrupts. The busy loop that waits for the process to finish has a sleep that is skipped when an exception is thrown; this causes the waiting main thread to compete with the subprocess in a tight loop and effectively reduces the available CPU by 50%.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira