Posted to mapreduce-dev@hadoop.apache.org by "Tina Shan (Jira)" <ji...@apache.org> on 2021/03/09 07:01:00 UTC

[jira] [Created] (MAPREDUCE-7327) Job.waitForCompletion can sleep for up to 596 hours when jobclient.completion.poll.interval is misconfigured, causing the job to hang

Tina Shan created MAPREDUCE-7327:
------------------------------------

             Summary: Job.waitForCompletion can sleep for up to 596 hours when jobclient.completion.poll.interval is misconfigured, causing the job to hang
                 Key: MAPREDUCE-7327
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7327
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: client
    Affects Versions: 3.3.0
            Reporter: Tina Shan


The polling loop in Job.waitForCompletion sleeps for a configurable interval between completion checks, and there is little sanity checking on this value. When jobclient.completion.poll.interval is misconfigured to INT_MAX (2147483647 ms), a single sleep can last up to about 596 hours. The thread is stuck for that whole period and does not return to the user even if the job completes in the meantime. We suggest adding a cap value or a warning message.

{code:java}
public boolean waitForCompletion(boolean verbose
                                   ) throws IOException, InterruptedException,
                                            ClassNotFoundException {
...
    while (!isComplete()) {
        try {
            // Sleeps for the full configured interval before re-checking completion
            Thread.sleep(completionPollIntervalMillis);
        } catch (InterruptedException ie) {
        }
    }
...
}
{code}
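
The suggested cap and warning could look roughly like the sketch below. It is only an illustration, not a patch against Hadoop: the class name PollIntervalCap, the method sanitizePollInterval, the 60-second ceiling and the 5-second fallback are all assumptions made for this example, and the issue itself does not propose specific values. The main method also reproduces the 596-hour figure from INT_MAX milliseconds.

```java
public class PollIntervalCap {
    // Hypothetical upper bound for the poll interval; not a value from the issue.
    static final int MAX_POLL_INTERVAL_MILLIS = 60_000;
    // Hypothetical fallback used when the configured value is not positive.
    static final int FALLBACK_POLL_INTERVAL_MILLIS = 5_000;

    /** Returns a poll interval clamped to [1, MAX_POLL_INTERVAL_MILLIS], warning on changes. */
    static int sanitizePollInterval(int configuredMillis) {
        if (configuredMillis < 1) {
            System.err.println("WARN: poll interval " + configuredMillis
                + " ms is not positive; using " + FALLBACK_POLL_INTERVAL_MILLIS + " ms");
            return FALLBACK_POLL_INTERVAL_MILLIS;
        }
        if (configuredMillis > MAX_POLL_INTERVAL_MILLIS) {
            System.err.println("WARN: poll interval " + configuredMillis
                + " ms exceeds cap; using " + MAX_POLL_INTERVAL_MILLIS + " ms");
            return MAX_POLL_INTERVAL_MILLIS;
        }
        return configuredMillis;
    }

    /** Converts a millisecond interval to whole hours (integer division). */
    static long wholeHours(int millis) {
        return millis / 3_600_000L;
    }

    public static void main(String[] args) {
        // INT_MAX milliseconds is where the 596-hour figure comes from.
        System.out.println("Uncapped sleep (hours): " + wholeHours(Integer.MAX_VALUE));
        System.out.println("Capped interval (ms): " + sanitizePollInterval(Integer.MAX_VALUE));
        System.out.println("Normal interval (ms): " + sanitizePollInterval(5_000));
    }
}
```

With a guard of this kind applied before the loop, a misconfigured interval would delay the return from waitForCompletion by at most the cap after the job finishes, rather than by the full configured sleep.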



