Posted to common-dev@hadoop.apache.org by "Amar Kamat (JIRA)" <ji...@apache.org> on 2009/04/01 13:37:13 UTC

[jira] Updated: (HADOOP-5394) JobTracker might schedule 2 attempts of the same task with the same attempt id across restarts

     [ https://issues.apache.org/jira/browse/HADOOP-5394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amar Kamat updated HADOOP-5394:
-------------------------------

    Attachment: HADOOP-5394-v1.9.1.patch

Attaching a patch incorporating Devaraj's offline comments. Result of test-patch:
{code}
     [exec] +1 overall.
     [exec] 
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec] 
     [exec]     +1 tests included.  The patch appears to include 18 new or modified tests.
     [exec] 
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
     [exec] 
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
     [exec] 
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
     [exec] 
     [exec]     +1 Eclipse classpath. The patch retains Eclipse classpath integrity.
     [exec] 
     [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
{code}

Ant test passed on my box.

> JobTracker might schedule 2 attempts of the same task with the same attempt id across restarts
> ----------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-5394
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5394
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Amar Kamat
>            Assignee: Amar Kamat
>            Priority: Critical
>         Attachments: HADOOP-5394-v1.2.patch, HADOOP-5394-v1.5.patch, HADOOP-5394-v1.9.1.patch
>
>
> This can happen when the jobtracker is restarted more than once. On each restart, the jobtracker relies on the jobhistory file to determine the next restart count. If the new restart count is not flushed to the file, then on the next restart the jobtracker may derive the same restart count again and schedule a new attempt with an id that already exists. This can cause problems with the side-effect files and can also leave the jobtracker in an inconsistent state.
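
To make the failure mode in the description concrete, here is a minimal sketch of persisting a restart count durably before it is used to build attempt ids. It is an illustration only, not the JobTracker code or the attached patch: the class RestartCountSketch, its methods, the single-file storage, and the "_r<count>_" id layout are all hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical illustration; not the JobTracker's actual code or the patch. */
final class RestartCountSketch {

    /** Reads the last persisted restart count; 0 if the file is absent or empty. */
    static int readRestartCount(Path file) {
        try {
            if (!Files.exists(file)) {
                return 0;
            }
            String s = new String(Files.readAllBytes(file), StandardCharsets.UTF_8).trim();
            return s.isEmpty() ? 0 : Integer.parseInt(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Increments the restart count and forces it to disk before returning it. */
    static int beginRestart(Path file) {
        int next = readRestartCount(file) + 1;
        byte[] bytes = Integer.toString(next).getBytes(StandardCharsets.UTF_8);
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            ByteBuffer buf = ByteBuffer.wrap(bytes);
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
            // Without this force, a crash could lose the new count and the
            // next restart would hand out the same count (and ids) again.
            ch.force(true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return next;
    }

    /** Builds an attempt id that embeds the restart count (made-up layout). */
    static String attemptId(String taskId, int restartCount, int attemptNumber) {
        return taskId + "_r" + restartCount + "_" + attemptNumber;
    }
}
```

In this sketch, two consecutive calls to beginRestart on the same file return different counts, so attemptId for the same task and attempt number differs across restarts. If the second restart instead saw a stale count, both restarts would produce the same id, which is the collision the issue describes.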
