Posted to common-dev@hadoop.apache.org by "Amar Kamat (JIRA)" <ji...@apache.org> on 2009/03/04 12:59:56 UTC
[jira] Created: (HADOOP-5392) JobTracker crashes during recovery if
job files are garbled
JobTracker crashes during recovery if job files are garbled
-----------------------------------------------------------
Key: HADOOP-5392
URL: https://issues.apache.org/jira/browse/HADOOP-5392
Project: Hadoop Core
Issue Type: Bug
Components: mapred
Reporter: Amar Kamat
Priority: Blocker
The JobTracker crashed during the recovery stage for a job with a 0-byte job.xml. Ideally, the JobTracker should try to recover as many jobs as possible.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-5392) JobTracker crashes during recovery if
job files are garbled
Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-5392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Amar Kamat updated HADOOP-5392:
-------------------------------
Attachment: HADOOP-5392-v2.7.patch
Attaching a patch incorporating Devaraj's comments. Ant test passes on my box.
> JobTracker crashes during recovery if job files are garbled
> -----------------------------------------------------------
>
> Key: HADOOP-5392
> URL: https://issues.apache.org/jira/browse/HADOOP-5392
> Project: Hadoop Core
> Issue Type: Bug
> Components: mapred
> Reporter: Amar Kamat
> Assignee: Amar Kamat
> Priority: Blocker
> Fix For: 0.20.0
>
> Attachments: HADOOP-5392-v2.1.patch, HADOOP-5392-v2.3.patch, HADOOP-5392-v2.6.patch, HADOOP-5392-v2.7.patch
>
>
> The JobTracker crashed during the recovery stage for a job with a 0-byte job.xml. Ideally, the JobTracker should try to recover as many jobs as possible.
[jira] Updated: (HADOOP-5392) JobTracker crashes during recovery if
job files are garbled
Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-5392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Devaraj Das updated HADOOP-5392:
--------------------------------
Resolution: Fixed
Status: Resolved (was: Patch Available)
I just committed this to the 0.19, 0.20 branches and trunk. Thanks, Amar!
[jira] Updated: (HADOOP-5392) JobTracker crashes during recovery if
job files are garbled
Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-5392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hemanth Yamijala updated HADOOP-5392:
-------------------------------------
Fix Version/s: 0.20.0
[jira] Updated: (HADOOP-5392) JobTracker crashes during recovery if
job files are garbled
Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-5392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Amar Kamat updated HADOOP-5392:
-------------------------------
Status: Patch Available (was: Open)
Submitting. Patch applies to both trunk and 0.20.
[jira] Commented: (HADOOP-5392) JobTracker crashes during recovery
if job files are garbled
Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-5392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12679100#action_12679100 ]
Amar Kamat commented on HADOOP-5392:
------------------------------------
HADOOP-4638 added this piece of code:
{code}
synchronized (trackerToJobsToCleanup) {
  Set<JobID> jobs = trackerToJobsToCleanup.get(trackerName);
  jobs.add(taskId.getJobID());
}
{code}
Here, jobs can be null, so a null check is needed before calling jobs.add(). This issue was detected while testing HADOOP-5392.
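A minimal sketch of the null check described above, assuming trackerToJobsToCleanup maps a tracker name to a set of job IDs. The class name CleanupTracker, the methods addJobForCleanup and jobsFor, and the use of String in place of JobID are hypothetical and for illustration only; this is not the code from the attached patch.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the relevant JobTracker state; String replaces JobID.
class CleanupTracker {
    private final Map<String, Set<String>> trackerToJobsToCleanup =
        new HashMap<String, Set<String>>();

    // Records that jobId needs cleanup on trackerName.
    // The set is created on first use, so jobs is never null when add() is called.
    void addJobForCleanup(String trackerName, String jobId) {
        synchronized (trackerToJobsToCleanup) {
            Set<String> jobs = trackerToJobsToCleanup.get(trackerName);
            if (jobs == null) {
                jobs = new HashSet<String>();
                trackerToJobsToCleanup.put(trackerName, jobs);
            }
            jobs.add(jobId);
        }
    }

    // Returns a copy of the jobs pending cleanup on trackerName (empty if none).
    Set<String> jobsFor(String trackerName) {
        synchronized (trackerToJobsToCleanup) {
            Set<String> jobs = trackerToJobsToCleanup.get(trackerName);
            if (jobs == null) {
                return new HashSet<String>();
            }
            return new HashSet<String>(jobs);
        }
    }
}
```

Whether the real fix should create the missing set, as in this sketch, or skip the add when the tracker is unknown depends on the JobTracker's intended behavior, which this thread does not state.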
[jira] Commented: (HADOOP-5392) JobTracker crashes during recovery
if job files are garbled
Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-5392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12679557#action_12679557 ]
Amareshwari Sriramadasu commented on HADOOP-5392:
-------------------------------------------------
ant test passed on my local machine.
[jira] Updated: (HADOOP-5392) JobTracker crashes during recovery if
job files are garbled
Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-5392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Devaraj Das updated HADOOP-5392:
--------------------------------
Affects Version/s: 0.19.0
Fix Version/s: 0.19.2 (was: 0.20.0)
Hadoop Flags: [Reviewed]
[jira] Commented: (HADOOP-5392) JobTracker crashes during recovery
if job files are garbled
Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-5392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12679163#action_12679163 ]
Amareshwari Sriramadasu commented on HADOOP-5392:
-------------------------------------------------
One comment:
You can move everything in the first while loop into a single try-catch block.
Otherwise, patch looks fine.
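The general idea of wrapping the whole loop body in one try-catch, so that a single bad job does not abort recovery of the rest, can be sketched as follows. This is an illustration under stated assumptions, not the patch itself: RecoverySketch, loadJob, recoverAll, and the map from job ID to job.xml contents are all hypothetical names.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical model of per-job recovery; not the JobTracker's actual code.
class RecoverySketch {
    // Stand-in for reading job.xml: rejects a missing or 0-byte file.
    static void loadJob(String jobId, byte[] jobXml) throws IOException {
        if (jobXml == null || jobXml.length == 0) {
            throw new IOException("Empty job.xml for " + jobId);
        }
    }

    // Attempts every job in turn and returns the IDs that were recovered.
    // The whole loop body sits in one try-catch, so a failure skips only that job.
    static List<String> recoverAll(Map<String, byte[]> jobFiles) {
        List<String> recovered = new ArrayList<String>();
        Iterator<Map.Entry<String, byte[]>> it = jobFiles.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, byte[]> entry = it.next();
            try {
                loadJob(entry.getKey(), entry.getValue());
                recovered.add(entry.getKey());
            } catch (Exception e) {
                // Broad catch is an assumption of this sketch: log and move on.
                System.err.println("Skipping " + entry.getKey() + ": " + e.getMessage());
            }
        }
        return recovered;
    }
}
```

With three jobs where the second has a 0-byte job.xml, recoverAll would return the first and third job IDs and log the second as skipped.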
[jira] Commented: (HADOOP-5392) JobTracker crashes during recovery
if job files are garbled
Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-5392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12680382#action_12680382 ]
Devaraj Das commented on HADOOP-5392:
-------------------------------------
+1 on the patch. Please ensure that "ant test" goes through with the patch.
BTW, while reviewing this patch, I noticed that the JobHistory calls use the user's jobconf to create/read history file paths for both hadoop.job.history.location and hadoop.job.history.user.location. This should be fixed (in a separate jira) so that the JobTracker's conf is used for the history files under hadoop.job.history.location.
[jira] Commented: (HADOOP-5392) JobTracker crashes during recovery
if job files are garbled
Posted by "Hudson (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-5392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12681789#action_12681789 ]
Hudson commented on HADOOP-5392:
--------------------------------
Integrated in Hadoop-trunk #778 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/778/])
[jira] Updated: (HADOOP-5392) JobTracker crashes during recovery if
job files are garbled
Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-5392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Amar Kamat updated HADOOP-5392:
-------------------------------
Attachment: HADOOP-5392-v2.1.patch
Attaching a patch that fixes the issue. Added a test case to validate the fix. Result of test-patch:
{code}
[exec] +1 overall.
[exec]
[exec] +1 @author. The patch does not contain any @author tags.
[exec]
[exec] +1 tests included. The patch appears to include 3 new or modified tests.
[exec]
[exec] +1 javadoc. The javadoc tool did not generate any warning messages.
[exec]
[exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
[exec]
[exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
[exec]
[exec] +1 Eclipse classpath. The patch retains Eclipse classpath integrity.
[exec]
[exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.
{code}
[jira] Commented: (HADOOP-5392) JobTracker crashes during recovery
if job files are garbled
Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-5392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12680141#action_12680141 ]
Hadoop QA commented on HADOOP-5392:
-----------------------------------
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12401600/HADOOP-5392-v2.3.patch
against trunk revision 751463.
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 Eclipse classpath. The patch retains Eclipse classpath integrity.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.
Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-minerva.apache.org/34/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-minerva.apache.org/34/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-minerva.apache.org/34/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-minerva.apache.org/34/console
This message is automatically generated.
[jira] Updated: (HADOOP-5392) JobTracker crashes during recovery if
job files are garbled
Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-5392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Amar Kamat updated HADOOP-5392:
-------------------------------
Attachment: HADOOP-5392-v2.6.patch
Attaching a patch incorporating Devaraj's offline comments. Result of test-patch:
{code}
[exec] +1 overall.
[exec]
[exec] +1 @author. The patch does not contain any @author tags.
[exec]
[exec] +1 tests included. The patch appears to include 6 new or modified tests.
[exec]
[exec] +1 javadoc. The javadoc tool did not generate any warning messages.
[exec]
[exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
[exec]
[exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
[exec]
[exec] +1 Eclipse classpath. The patch retains Eclipse classpath integrity.
[exec]
[exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.
{code}
[jira] Updated: (HADOOP-5392) JobTracker crashes during recovery if
job files are garbled
Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-5392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Amar Kamat updated HADOOP-5392:
-------------------------------
Attachment: HADOOP-5392-v2.3.patch
Attaching a patch incorporating Amareshwari's comments. Result of test-patch:
{code}
[exec] +1 overall.
[exec]
[exec] +1 @author. The patch does not contain any @author tags.
[exec]
[exec] +1 tests included. The patch appears to include 3 new or modified tests.
[exec]
[exec] +1 javadoc. The javadoc tool did not generate any warning messages.
[exec]
[exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
[exec]
[exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
[exec]
[exec] +1 Eclipse classpath. The patch retains Eclipse classpath integrity.
[exec]
[exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.
{code}
[jira] Assigned: (HADOOP-5392) JobTracker crashes during recovery
if job files are garbled
Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-5392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Amar Kamat reassigned HADOOP-5392:
----------------------------------
Assignee: Amar Kamat