Posted to common-dev@hadoop.apache.org by "Amar Kamat (JIRA)" <ji...@apache.org> on 2009/03/04 12:59:56 UTC

[jira] Created: (HADOOP-5392) JobTracker crashes during recovery if job files are garbled

JobTracker crashes during recovery if job files are garbled
-----------------------------------------------------------

                 Key: HADOOP-5392
                 URL: https://issues.apache.org/jira/browse/HADOOP-5392
             Project: Hadoop Core
          Issue Type: Bug
          Components: mapred
            Reporter: Amar Kamat
            Priority: Blocker


The JobTracker crashed in the recovery stage on a job whose job.xml was 0 bytes long. Ideally, the JobTracker should skip such a job and recover as many of the remaining jobs as possible.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-5392) JobTracker crashes during recovery if job files are garbled

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amar Kamat updated HADOOP-5392:
-------------------------------

    Attachment: HADOOP-5392-v2.7.patch

Attaching a patch incorporating Devaraj's comments. Ant test passes on my box.



[jira] Updated: (HADOOP-5392) JobTracker crashes during recovery if job files are garbled

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-5392:
--------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

I just committed this to the 0.19, 0.20 branches and trunk. Thanks, Amar!



[jira] Updated: (HADOOP-5392) JobTracker crashes during recovery if job files are garbled

Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hemanth Yamijala updated HADOOP-5392:
-------------------------------------

    Fix Version/s: 0.20.0



[jira] Updated: (HADOOP-5392) JobTracker crashes during recovery if job files are garbled

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amar Kamat updated HADOOP-5392:
-------------------------------

    Status: Patch Available  (was: Open)

Submitting. Patch applies to both trunk and 0.20.



[jira] Commented: (HADOOP-5392) JobTracker crashes during recovery if job files are garbled

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12679100#action_12679100 ] 

Amar Kamat commented on HADOOP-5392:
------------------------------------

HADOOP-4638 added this piece of code:
{code}
synchronized (trackerToJobsToCleanup) {
  Set<JobID> jobs = trackerToJobsToCleanup.get(trackerName);
  jobs.add(taskId.getJobID());
}
{code}
Here jobs can be null when trackerToJobsToCleanup has no entry for trackerName, so a null check is needed before calling add. This issue was detected during HADOOP-5392 testing.
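
The null check suggested above could look like the following. This is a minimal sketch, not the change made in any of the attached patches: CleanupRegistry, addJobForCleanup, and jobsFor are hypothetical names, and plain String values stand in for the tracker name and Hadoop's JobID so that the example compiles on its own.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the part of the JobTracker that owns
// trackerToJobsToCleanup; String replaces Hadoop's JobID type.
class CleanupRegistry {
    private final Map<String, Set<String>> trackerToJobsToCleanup =
        new HashMap<String, Set<String>>();

    // Records a job for cleanup on the given tracker, creating the
    // per-tracker set on first use instead of dereferencing null.
    void addJobForCleanup(String trackerName, String jobId) {
        synchronized (trackerToJobsToCleanup) {
            Set<String> jobs = trackerToJobsToCleanup.get(trackerName);
            if (jobs == null) {
                jobs = new HashSet<String>();
                trackerToJobsToCleanup.put(trackerName, jobs);
            }
            jobs.add(jobId);
        }
    }

    // Returns a copy of the jobs recorded for a tracker; empty if none.
    Set<String> jobsFor(String trackerName) {
        synchronized (trackerToJobsToCleanup) {
            Set<String> jobs = trackerToJobsToCleanup.get(trackerName);
            if (jobs == null) {
                return new HashSet<String>();
            }
            return new HashSet<String>(jobs);
        }
    }
}
```

With this shape, the first addJobForCleanup call for a tracker that has no entry yet creates the set rather than throwing a NullPointerException.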




[jira] Commented: (HADOOP-5392) JobTracker crashes during recovery if job files are garbled

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12679557#action_12679557 ] 

Amareshwari Sriramadasu commented on HADOOP-5392:
-------------------------------------------------

"ant test" passed on my local machine.



[jira] Updated: (HADOOP-5392) JobTracker crashes during recovery if job files are garbled

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-5392:
--------------------------------

    Affects Version/s: 0.19.0
        Fix Version/s:     (was: 0.20.0)
                       0.19.2
         Hadoop Flags: [Reviewed]



[jira] Commented: (HADOOP-5392) JobTracker crashes during recovery if job files are garbled

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12679163#action_12679163 ] 

Amareshwari Sriramadasu commented on HADOOP-5392:
-------------------------------------------------

One comment:
 You can move everything in the first while loop into a single try-catch block.

Otherwise, the patch looks fine.
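
To illustrate the comment above, here is a minimal sketch of a recovery loop whose whole body sits in one try-catch block, so that a garbled job file is skipped and the loop moves on to the next job. It is not the code from the patch: RecoverySketch, loadJob, and recoverAll are hypothetical names, local files stand in for the job files, and the empty-file check is only an assumed way of detecting a 0 byte job.xml.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class RecoverySketch {
    // Hypothetical loader: rejects an empty job file, otherwise
    // returns the file name as a stand-in for the recovered job.
    static String loadJob(Path jobFile) throws IOException {
        if (Files.size(jobFile) == 0) {
            throw new IOException("Empty job file: " + jobFile);
        }
        return jobFile.getFileName().toString();
    }

    // Tries every job file. Everything done per job is inside a single
    // try-catch, so one bad job cannot abort recovery of the others.
    static List<String> recoverAll(List<Path> jobFiles) {
        List<String> recovered = new ArrayList<String>();
        Iterator<Path> it = jobFiles.iterator();
        while (it.hasNext()) {
            Path jobFile = it.next();
            try {
                String job = loadJob(jobFile);
                recovered.add(job);
            } catch (Exception e) {
                // Log and skip this job; continue with the next one.
                System.err.println("Skipping " + jobFile + ": " + e.getMessage());
            }
        }
        return recovered;
    }
}
```

Catching Exception rather than only IOException is a deliberate choice in this sketch: a garbled file may also surface as a runtime exception from a parser, and the goal is to keep recovering the remaining jobs either way.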



[jira] Commented: (HADOOP-5392) JobTracker crashes during recovery if job files are garbled

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12680382#action_12680382 ] 

Devaraj Das commented on HADOOP-5392:
-------------------------------------

+1 on the patch. Please ensure that "ant test" goes through with the patch.
BTW while reviewing this patch, I noticed that the JobHistory calls use the user's jobconf to create/read history file paths on both hadoop.job.history.location and hadoop.job.history.user.location. This should be fixed (in a separate jira) to use the JobTracker's conf for the history files on hadoop.job.history.location.
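
The separation proposed above can be sketched as follows. This is only an illustration under stated assumptions: HistoryPathSketch and its methods are hypothetical names, and java.util.Properties stands in for the JobTracker's conf and the user's jobconf. Only the two property keys come from the comment itself.

```java
import java.util.Properties;

class HistoryPathSketch {
    static final String SYSTEM_KEY = "hadoop.job.history.location";
    static final String USER_KEY = "hadoop.job.history.user.location";

    // The system-wide history location is resolved from the JobTracker's
    // configuration only. The user's job configuration is passed in just
    // to make explicit that it is ignored for this key.
    static String systemHistoryDir(Properties jobTrackerConf, Properties userJobConf) {
        return jobTrackerConf.getProperty(SYSTEM_KEY);
    }

    // The per-user history location still comes from the user's job
    // configuration.
    static String userHistoryDir(Properties userJobConf) {
        return userJobConf.getProperty(USER_KEY);
    }
}
```

With this split, a value for hadoop.job.history.location set in a submitted job's configuration has no effect on where the JobTracker writes or reads its own history files.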



[jira] Commented: (HADOOP-5392) JobTracker crashes during recovery if job files are garbled

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12681789#action_12681789 ] 

Hudson commented on HADOOP-5392:
--------------------------------

Integrated in Hadoop-trunk #778 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/778/])
    



[jira] Updated: (HADOOP-5392) JobTracker crashes during recovery if job files are garbled

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amar Kamat updated HADOOP-5392:
-------------------------------

    Attachment: HADOOP-5392-v2.1.patch

Attaching a patch that fixes the issue. Added a test case to validate the fix. Result of test-patch:
{code}
[exec] +1 overall.
[exec]
[exec]     +1 @author.  The patch does not contain any @author tags.
[exec]
[exec]     +1 tests included.  The patch appears to include 3 new or modified tests.
[exec]
[exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
[exec]
[exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
[exec]
[exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
[exec]
[exec]     +1 Eclipse classpath. The patch retains Eclipse classpath integrity.
[exec]
[exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
{code}



[jira] Commented: (HADOOP-5392) JobTracker crashes during recovery if job files are garbled

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12680141#action_12680141 ] 

Hadoop QA commented on HADOOP-5392:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12401600/HADOOP-5392-v2.3.patch
  against trunk revision 751463.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-minerva.apache.org/34/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-minerva.apache.org/34/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-minerva.apache.org/34/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-minerva.apache.org/34/console

This message is automatically generated.



[jira] Updated: (HADOOP-5392) JobTracker crashes during recovery if job files are garbled

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amar Kamat updated HADOOP-5392:
-------------------------------

    Attachment: HADOOP-5392-v2.6.patch

Attaching a patch incorporating Devaraj's offline comments. Result of test-patch:
{code}
[exec] +1 overall.
[exec]
[exec]     +1 @author.  The patch does not contain any @author tags.
[exec]
[exec]     +1 tests included.  The patch appears to include 6 new or modified tests.
[exec]
[exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
[exec]
[exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
[exec]
[exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
[exec]
[exec]     +1 Eclipse classpath. The patch retains Eclipse classpath integrity.
[exec]
[exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
{code}



[jira] Updated: (HADOOP-5392) JobTracker crashes during recovery if job files are garbled

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amar Kamat updated HADOOP-5392:
-------------------------------

    Attachment: HADOOP-5392-v2.3.patch

Attaching a patch incorporating Amareshwari's comments. Result of test-patch:
{code}
[exec] +1 overall.
[exec]
[exec]     +1 @author.  The patch does not contain any @author tags.
[exec]
[exec]     +1 tests included.  The patch appears to include 3 new or modified tests.
[exec]
[exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
[exec]
[exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
[exec]
[exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
[exec]
[exec]     +1 Eclipse classpath. The patch retains Eclipse classpath integrity.
[exec]
[exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
{code}



[jira] Assigned: (HADOOP-5392) JobTracker crashes during recovery if job files are garbled

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amar Kamat reassigned HADOOP-5392:
----------------------------------

    Assignee: Amar Kamat
