Posted to common-dev@hadoop.apache.org by "Vinod Kumar Vavilapalli (JIRA)" <ji...@apache.org> on 2008/04/21 08:11:21 UTC

[jira] Created: (HADOOP-3289) Hadoop should have a way to know when JobTracker is really ready to accept jobs.

Hadoop should have a way to know when JobTracker is really ready to accept jobs.
--------------------------------------------------------------------------------

                 Key: HADOOP-3289
                 URL: https://issues.apache.org/jira/browse/HADOOP-3289
             Project: Hadoop Core
          Issue Type: New Feature
            Reporter: Vinod Kumar Vavilapalli


Hadoop throws an org.apache.hadoop.mapred.JobTracker$IllegalStateException when we try to submit jobs while the JT is still initializing and cannot accept jobs yet. This can happen for various reasons: the job was submitted too early, the JT is waiting for a response from the NN, which might be in safemode (HADOOP-2213), or the JT failed to clean up the mapred system directory (HADOOP-3276). This causes problems for HoD and any other user scripts that submit jobs automatically.

To deal with such problems, we need a way either to find out the state of the JobTracker, so that it can be checked before launching any job, or to determine whether the JT can accept jobs right now. Currently Hadoop has no API or command-line interface to check this. Job submission doesn't even return a specific error code when it fails with an IllegalStateException. So the only reliable way for HoD/scripts to detect such exceptions is to search for exception strings in the output of these commands, which is kind of nasty.

So it would be good if Hadoop could provide an API, error code, or command-line utility to check whether the JT is really ready to accept jobs. Otherwise HoD/user scripts are left with the (unreliable) approach of sleeping for arbitrary amounts of time and retrying without knowing the actual reason for the failure.
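
To make the string-matching workaround described above concrete, here is a minimal sketch of what a wrapper script ends up doing today. It is only an illustration: the class name, the Outcome values, and the sample output lines are made up, and only the exception name is taken from the description.

```java
import java.util.Arrays;
import java.util.List;

/** Classifies captured job-submission output by scanning it for an exception name. */
final class SubmitOutputScanner {
    // Exception name quoted in the issue description.
    static final String NOT_READY_MARKER =
        "org.apache.hadoop.mapred.JobTracker$IllegalStateException";

    // Hypothetical outcomes; with no distinct error code, everything else is OTHER.
    enum Outcome { JT_NOT_READY, OTHER }

    /** Returns JT_NOT_READY if any output line mentions the marker exception. */
    static Outcome classify(List<String> outputLines) {
        for (String line : outputLines) {
            if (line.contains(NOT_READY_MARKER)) {
                return Outcome.JT_NOT_READY;
            }
        }
        return Outcome.OTHER;
    }

    public static void main(String[] args) {
        // Made-up sample output, not a real Hadoop log.
        List<String> sample = Arrays.asList(
            "made-up line before the failure",
            "made-up prefix " + NOT_READY_MARKER + " made-up suffix");
        System.out.println(classify(sample));
    }
}
```

The fragility is visible in the sketch: any change to the exception's name or to how the command prints it silently turns a "not ready" result into OTHER, which is why a dedicated API or exit code would be preferable.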

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3289) Hadoop should have a way to know when JobTracker is really ready to accept jobs.

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12593718#action_12593718 ] 

Amar Kamat commented on HADOOP-3289:
------------------------------------

Yes. If you can query the JT (via JobClient), then you can rely on the JT's state information provided via the cluster status.
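
A rough sketch of how a client might poll that state before submitting. To keep it self-contained, the State enum below is a local stand-in for the JT's initializing/running states, and the probe is a plain Supplier. In a real client the probe would presumably wrap a JobClient cluster-status call, which is an assumption to verify against the Hadoop version in use. A probe that throws (for example because the JT cannot be reached yet) is simply treated as "not ready".

```java
import java.util.function.Supplier;

final class JobTrackerReadiness {
    /** Local stand-in for the two states discussed in this thread. */
    enum State { INITIALIZING, RUNNING }

    /**
     * Polls the probe until it reports RUNNING or the attempts run out.
     * Returns true only if RUNNING was observed.
     */
    static boolean awaitRunning(Supplier<State> probe, int maxAttempts, long sleepMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                if (probe.get() == State.RUNNING) {
                    return true;
                }
            } catch (RuntimeException e) {
                // JT unreachable or still starting up: fall through and retry.
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt
                    return false;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Fake probe: reports INITIALIZING twice, then RUNNING.
        int[] calls = {0};
        Supplier<State> fake =
            () -> ++calls[0] < 3 ? State.INITIALIZING : State.RUNNING;
        System.out.println(awaitRunning(fake, 5, 10L));
    }
}
```

Unlike a blind sleep-and-retry, the loop stops as soon as the state says RUNNING, and the caller can tell "never became ready" apart from other submission failures.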



[jira] Commented: (HADOOP-3289) Hadoop should have a way to know when JobTracker is really ready to accept jobs.

Posted by "Pete Wyckoff (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12593398#action_12593398 ] 

Pete Wyckoff commented on HADOOP-3289:
--------------------------------------

Hi Amar,

The cluster status contains http://hadoop.apache.org/core/docs/current/api/org/apache/hadoop/mapred/JobTracker.State.html

Does this not accurately reflect whether the JT is running or initializing?

thanks, pete




[jira] Issue Comment Edited: (HADOOP-3289) Hadoop should have a way to know when JobTracker is really ready to accept jobs.

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12607169#action_12607169 ] 

amar_kamat edited comment on HADOOP-3289 at 6/23/08 2:27 AM:
-------------------------------------------------------------

Will be solved as a part of HADOOP-3618

      was (Author: amar_kamat):
    Will be solved as a part of HADOOP-3168
  


[jira] Updated: (HADOOP-3289) Hadoop should have a way to know when JobTracker is really ready to accept jobs.

Posted by "Mukund Madhugiri (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mukund Madhugiri updated HADOOP-3289:
-------------------------------------

    Fix Version/s:     (was: 0.18.0)



[jira] Updated: (HADOOP-3289) Hadoop should have a way to know when JobTracker is really ready to accept jobs.

Posted by "Nigel Daley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nigel Daley updated HADOOP-3289:
--------------------------------

    Fix Version/s: 0.18.0



[jira] Commented: (HADOOP-3289) Hadoop should have a way to know when JobTracker is really ready to accept jobs.

Posted by "Owen O'Malley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12591054#action_12591054 ] 

Owen O'Malley commented on HADOOP-3289:
---------------------------------------

This should probably look like the safemode stuff with get and wait operations.
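
One possible shape for such a pair of operations, sketched in-process with a latch. This is not how Hadoop implements safemode or the JobTracker; ReadinessGate and its method names are hypothetical and only illustrate the difference between a non-blocking "get" and a blocking "wait".

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

final class ReadinessGate {
    private final CountDownLatch ready = new CountDownLatch(1);

    /** Called once by the service when its initialization completes. */
    void markReady() {
        ready.countDown();
    }

    /** "get": non-blocking check of the current state. */
    boolean isReady() {
        return ready.getCount() == 0;
    }

    /** "wait": blocks until ready or until the timeout elapses; returns whether ready. */
    boolean awaitReady(long timeout, TimeUnit unit) {
        try {
            return ready.await(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt
            return false;
        }
    }

    public static void main(String[] args) {
        ReadinessGate gate = new ReadinessGate();
        System.out.println(gate.isReady());
        new Thread(gate::markReady).start(); // simulate initialization finishing
        System.out.println(gate.awaitReady(5, TimeUnit.SECONDS));
    }
}
```

A script would use the "get" form for a quick check and the "wait" form, with a timeout, when it wants to block until jobs can be submitted.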



[jira] Commented: (HADOOP-3289) Hadoop should have a way to know when JobTracker is really ready to accept jobs.

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12592748#action_12592748 ] 

Amar Kamat commented on HADOOP-3289:
------------------------------------

Look at the JobTracker logs to see if it says {{RUNNING}}. 



[jira] Resolved: (HADOOP-3289) Hadoop should have a way to know when JobTracker is really ready to accept jobs.

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amar Kamat resolved HADOOP-3289.
--------------------------------

    Resolution: Duplicate

Will be solved as a part of HADOOP-3168
