Posted to common-dev@hadoop.apache.org by "Amar Kamat (JIRA)" <ji...@apache.org> on 2008/05/02 06:30:55 UTC

[jira] Commented: (HADOOP-3289) Hadoop should have a way to know when JobTracker is really ready to accept jobs.

    [ https://issues.apache.org/jira/browse/HADOOP-3289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12593718#action_12593718 ] 

Amar Kamat commented on HADOOP-3289:
------------------------------------

Yes. If the JT can be queried (via JobClient), then a client can rely on the JT state information exposed through the cluster status.
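
As a rough illustration of how a script could use such a state query instead of sleeping blindly, here is a minimal sketch. It assumes a caller-supplied function get_state() that wraps whatever cluster-status query ends up being exposed. That function, the state names "INITIALIZING" and "RUNNING", and the timeout values are placeholders chosen for the example, not part of any existing Hadoop interface.

```python
import time
from typing import Callable

# Hypothetical state names; the real values depend on what the
# cluster status ends up reporting for the JT.
INITIALIZING = "INITIALIZING"
RUNNING = "RUNNING"


def wait_until_ready(get_state: Callable[[], str],
                     timeout_s: float = 60.0,
                     poll_interval_s: float = 1.0,
                     sleep: Callable[[float], None] = time.sleep,
                     clock: Callable[[], float] = time.monotonic) -> bool:
    """Poll get_state() until it reports RUNNING or the timeout expires.

    Returns True if the JT reported RUNNING, False if the deadline
    passed first. get_state is a stand-in for the real status query.
    """
    deadline = clock() + timeout_s
    while True:
        if get_state() == RUNNING:
            return True
        if clock() >= deadline:
            return False
        sleep(poll_interval_s)
```

A script would call wait_until_ready(...) once before submitting a job and give up with a clear error if it returns False, rather than parsing exception strings from the command output.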

> Hadoop should have a way to know when JobTracker is really ready to accept jobs.
> --------------------------------------------------------------------------------
>
>                 Key: HADOOP-3289
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3289
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: Vinod Kumar Vavilapalli
>
> Hadoop throws an org.apache.hadoop.mapred.JobTracker$IllegalStateException when we try to submit jobs while the JT is still initializing and cannot accept jobs yet. This can happen for various reasons: the job was submitted too early, the JT is waiting for a response from the NN, which might be in safemode (HADOOP-2213), or the JT failed to clean up the mapred system directory (HADOOP-3276). This causes problems in HoD or any other user scripts that submit jobs automatically.
> To deal with such problems, we need a way either to find out the state of the JobTracker, so that it can be checked before launching any job, or to determine whether the JT can accept jobs right now. Currently there is no API or command-line interface in Hadoop to check this. Job submission doesn't even return a specific error code when an IllegalStateException occurs. So the only reliable way for HoD/scripts to detect such exceptions is to search for exception strings in the output of these commands, which is kind of nasty.
> So it would be good if Hadoop provided an API, error code, or command-line utility to check whether the JT is really ready to accept jobs. Otherwise HoD/user scripts will have to resort to the (unreliable) approach of sleeping for arbitrary amounts of time and retrying without knowing the actual reason.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.