
Posted to common-dev@hadoop.apache.org by "Amar Kamat (JIRA)" <ji...@apache.org> on 2008/09/24 13:16:44 UTC

[jira] Created: (HADOOP-4261) Jobs failing in the init stage will never cleanup

Jobs failing in the init stage will never cleanup
-------------------------------------------------

                 Key: HADOOP-4261
                 URL: https://issues.apache.org/jira/browse/HADOOP-4261
             Project: Hadoop Core
          Issue Type: Bug
          Components: mapred
            Reporter: Amar Kamat
            Priority: Blocker


Pre HADOOP-3150, if the job failed in the init stage, {{job.kill()}} was called. This made sure that the job was cleaned up, namely:
- status set to KILLED/FAILED
- job files deleted from the system dir
- job history files closed
- jobtracker made aware of this through {{jobTracker.finalizeJob()}}
- data structures cleaned up via {{JobInProgress.garbageCollect()}}

Now if the job fails in the init stage, {{job.fail()}} is called, which doesn't do the cleanup. HADOOP-3150 introduces cleanup tasks, which are launched once the job completes, i.e. is killed/failed/succeeded. The jobtracker will never consider this job for scheduling, as the job will remain in the {{PREP}} state forever.
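
The failure mode described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: {{ToyJob}} and all of its methods are hypothetical names invented for illustration and do not mirror the real {{JobInProgress}} code, and the rule that the scheduler skips jobs in {{PREP}} is taken from the description rather than from the source.

```java
/** Toy model only: class and method names are hypothetical, not Hadoop's JobInProgress API. */
class ToyJob {
    enum RunState { PREP, RUNNING, FAILED, KILLED }

    private RunState runState = RunState.PREP;
    private boolean cleanupPending = false;
    private boolean cleanedUp = false;

    /** Old path as described in the report: kill and clean up in one step. */
    void killAndCleanUp() {
        runState = RunState.KILLED;
        // Stands in for deleting job files, closing history, finalizeJob() and garbageCollect().
        cleanedUp = true;
    }

    /** New path as described: only request cleanup; a cleanup task must do the work later. */
    void failAndDeferCleanup() {
        cleanupPending = true;
    }

    /** Assumption: the scheduler ignores jobs that are still in PREP. */
    boolean isConsideredForScheduling() {
        return runState != RunState.PREP;
    }

    /** One scheduler pass: the cleanup task runs only if the job is considered at all. */
    void schedulerPass() {
        if (cleanupPending && isConsideredForScheduling()) {
            runState = RunState.FAILED;
            cleanedUp = true;
            cleanupPending = false;
        }
    }

    RunState getRunState() { return runState; }

    boolean isCleanedUp() { return cleanedUp; }
}
```

In this model, a job that takes the {{failAndDeferCleanup()}} path while still in {{PREP}} is never cleaned up, no matter how many scheduler passes run, whereas {{killAndCleanUp()}} releases everything immediately.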

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HADOOP-4261) Jobs failing in the init stage will never cleanup

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-4261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12635641#action_12635641 ] 

Amareshwari Sriramadasu commented on HADOOP-4261:
-------------------------------------------------

Test results for the failed tests on Hudson:
{noformat}
    [junit] Running org.apache.hadoop.mapred.TestJobQueueInformation
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 69.908 sec
    [junit] Running org.apache.hadoop.mapred.TestJobTrackerRestartWithLostTracker
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 265.076 sec
{noformat}

> Jobs failing in the init stage will never cleanup
> -------------------------------------------------
>
>                 Key: HADOOP-4261
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4261
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Amar Kamat
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.19.0
>
>         Attachments: patch-4261.txt, patch-4261.txt
>
>
> Pre HADOOP-3150, if the job failed in the init stage, {{job.kill()}} was called. This made sure that the job was cleaned up, namely:
> - status set to KILLED/FAILED
> - job files deleted from the system dir
> - job history files closed
> - jobtracker made aware of this through {{jobTracker.finalizeJob()}}
> - data structures cleaned up via {{JobInProgress.garbageCollect()}}
> Now if the job fails in the init stage, {{job.fail()}} is called, which doesn't do the cleanup. HADOOP-3150 introduces cleanup tasks, which are launched once the job completes, i.e. is killed/failed/succeeded. The jobtracker will never consider this job for scheduling, as the job will remain in the {{PREP}} state forever.


[jira] Updated: (HADOOP-4261) Jobs failing in the init stage will never cleanup

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-4261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-4261:
--------------------------------------------

    Attachment: patch-4261.txt

Modified TestJobTrackerRestart to test setup task reports.
Modified ObtainSetup/CleanupTask to use addRunningTaskToTIP().
Changed canLaunchSetupTask's implementation as suggested by Amar.

bq. It would make more sense to call the job run-state as SETUP when logInited() is invoked
This would not be consistent with JobStatus. Adding SETUP as a JobStatus is not necessary.

bq. I dont see any code that deals with setup task in job-recovery i.e recovery-manager
No specific change is required. The change in the checkAndInit method will do.




[jira] Commented: (HADOOP-4261) Jobs failing in the init stage will never cleanup

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-4261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12635317#action_12635317 ] 

Amar Kamat commented on HADOOP-4261:
------------------------------------

A few comments w.r.t. job recovery:
1) Upon restart, the task-completion-events/task-reports for the setup tasks should also match.
2) It would make more sense to set the job run-state to {{SETUP}} when {{logInited()}} is invoked. While recovering, check if the SETUP state is reached before calling {{init()}}.
3) Check if {{JobInProgress.obtainSetupTask()}} can reuse {{JobInProgress.addRunningTaskToTIP()}}.
4) I think {{JobInProgress.canLaunchSetupTask()}} can also be written as
{code}
private synchronized boolean canLaunchSetupTask() {
    // check if the job is in PREP, initialized and not setup
    return status.getRunState() == JobStatus.PREP && tasksInited.get() && !launchedSetup;
}
{code}
5) I don't see any code that deals with the setup task in job recovery, i.e. the recovery-manager. Just make sure that the effect of scheduling setup tasks before restart is the same as the effect of replaying them from history. I assume that when the JIP is given a task-attempt update, it figures out whether the task is a setup task or not. Ideally, the way setup is launched from the recovery-manager should mimic the way it's invoked from the real (live) jobtracker.


[jira] Commented: (HADOOP-4261) Jobs failing in the init stage will never cleanup

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-4261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12636534#action_12636534 ] 

Amareshwari Sriramadasu commented on HADOOP-4261:
-------------------------------------------------

The test failure is not related to the patch. I don't see the failure on my machine.


[jira] Updated: (HADOOP-4261) Jobs failing in the init stage will never cleanup

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-4261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-4261:
--------------------------------------------

    Status: Open  (was: Patch Available)

Passing taskType as a String should not break anything. Cancelling the patch to incorporate Amar's comments.


[jira] Updated: (HADOOP-4261) Jobs failing in the init stage will never cleanup

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-4261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-4261:
--------------------------------------------

    Status: Patch Available  (was: Open)


[jira] Updated: (HADOOP-4261) Jobs failing in the init stage will never cleanup

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-4261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-4261:
--------------------------------------------

    Status: Patch Available  (was: Open)


[jira] Updated: (HADOOP-4261) Jobs failing in the init stage will never cleanup

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-4261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-4261:
--------------------------------------------

    Status: Patch Available  (was: Open)

test-patch result:
{noformat}
     [exec] +1 overall.
     [exec]
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec]
     [exec]     +1 tests included.  The patch appears to include 9 new or modified tests.
     [exec]
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
     [exec]
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
     [exec]
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
     [exec]
     [exec]     +1 Eclipse classpath. The patch retains Eclipse classpath integrity.
     [exec]
{noformat}
All core and contrib tests passed on my machine. Ran the Sort benchmark on the cluster.


[jira] Updated: (HADOOP-4261) Jobs failing in the init stage will never cleanup

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-4261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-4261:
--------------------------------------------

    Attachment: patch-4261.txt

Uploading a patch incorporating the above fix and review comments from Sharad and Amar.


[jira] Updated: (HADOOP-4261) Jobs failing in the init stage will never cleanup

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-4261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-4261:
--------------------------------------------

    Status: Open  (was: Patch Available)


[jira] Updated: (HADOOP-4261) Jobs failing in the init stage will never cleanup

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-4261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-4261:
--------------------------------------------

    Release Note: A setup task (for doing OutputCommitter.setupJob()) will be launched, when the job is in PREP state and after initializing tasks. Once the setup task completes, the job will be moved to RUNNING state.
    Hadoop Flags: [Incompatible change, Reviewed]  (was: [Reviewed])
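
The lifecycle in the release note can be sketched as a small state machine. This is a minimal illustration, assuming a simplified job with only two run states; {{SetupLifecycle}} and its method names other than {{canLaunchSetupTask()}} are hypothetical and are not taken from the patch.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Illustrative only: a simplified stand-in for the job's setup-task bookkeeping. */
class SetupLifecycle {
    enum RunState { PREP, RUNNING }

    private RunState runState = RunState.PREP;
    private final AtomicBoolean tasksInited = new AtomicBoolean(false);
    private boolean launchedSetup = false;

    /** Marks task initialization as finished. */
    void initTasks() {
        tasksInited.set(true);
    }

    /** Same shape as the predicate suggested in the review: PREP, initialized, setup not yet launched. */
    synchronized boolean canLaunchSetupTask() {
        return runState == RunState.PREP && tasksInited.get() && !launchedSetup;
    }

    /** Launches the setup task at most once; returns whether it was launched by this call. */
    synchronized boolean launchSetupTask() {
        if (!canLaunchSetupTask()) {
            return false;
        }
        launchedSetup = true;
        return true;
    }

    /** Moves the job to RUNNING once a launched setup task reports completion. */
    synchronized void setupTaskCompleted() {
        if (launchedSetup && runState == RunState.PREP) {
            runState = RunState.RUNNING;
        }
    }

    synchronized RunState getRunState() {
        return runState;
    }
}
```

Keeping the launch check and the {{launchedSetup}} update under the same lock means two concurrent callers cannot both launch a setup task for the same job.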


[jira] Updated: (HADOOP-4261) Jobs failing in the init stage will never cleanup

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-4261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-4261:
--------------------------------------------

    Attachment: patch-4261.txt

Patch after incorporating review comments.


[jira] Commented: (HADOOP-4261) Jobs failing in the init stage will never cleanup

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-4261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12637035#action_12637035 ] 

Hadoop QA commented on HADOOP-4261:
-----------------------------------

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12391515/patch-4261.txt
  against trunk revision 701476.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 9 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3434/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3434/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3434/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3434/console

This message is automatically generated.


[jira] Commented: (HADOOP-4261) Jobs failing in the init stage will never cleanup

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-4261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12635638#action_12635638 ] 

Amareshwari Sriramadasu commented on HADOOP-4261:
-------------------------------------------------

All core and contrib tests passed on my machine.


[jira] Commented: (HADOOP-4261) Jobs failing in the init stage will never cleanup

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-4261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12634111#action_12634111 ] 

Amareshwari Sriramadasu commented on HADOOP-4261:
-------------------------------------------------

One solution I can see is that JobClient should not do setupJob; setupJob should instead be done as a separate task (InitTask), similar to the cleanup task. Then, after 'job initialization' completes, setup will be done by the InitTask. If initialization fails, no setup or cleanup task will be launched.
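
The proposed ordering could look roughly like the sketch below. It is only an illustration of the idea, assuming initialization is a single step that either succeeds or throws; {{InitTaskSketch}}, {{Initializer}} and the task labels are hypothetical names, not part of Hadoop.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: models which tasks get launched depending on whether init succeeds. */
class InitTaskSketch {
    /** Hypothetical stand-in for job initialization, which may fail. */
    interface Initializer {
        void init() throws Exception;
    }

    /** Returns the tasks launched, in order, for a job using the given initializer. */
    static List<String> runJob(Initializer initializer) {
        List<String> launched = new ArrayList<>();
        try {
            initializer.init();
        } catch (Exception e) {
            // Initialization failed: nothing was set up, so neither setup nor cleanup is launched.
            return launched;
        }
        launched.add("setup");      // the separate setup task proposed in the comment
        launched.add("map/reduce"); // the job's regular tasks
        launched.add("cleanup");    // the cleanup task introduced by HADOOP-3150
        return launched;
    }
}
```

With this ordering, a job whose initialization throws produces an empty task list, so there is nothing left waiting on a cleanup task that will never be scheduled.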


[jira] Commented: (HADOOP-4261) Jobs failing in the init stage will never cleanup

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-4261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12635321#action_12635321 ] 

Hadoop QA commented on HADOOP-4261:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12391115/patch-4261.txt
  against trunk revision 699676.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 6 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

    -1 core tests.  The patch failed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3390/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3390/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3390/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3390/console

This message is automatically generated.

> Jobs failing in the init stage will never cleanup
> -------------------------------------------------
>
>                 Key: HADOOP-4261
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4261
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Amar Kamat
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.19.0
>
>         Attachments: patch-4261.txt
>
>

[jira] Commented: (HADOOP-4261) Jobs failing in the init stage will never cleanup

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-4261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12636566#action_12636566 ] 

Amar Kamat commented on HADOOP-4261:
------------------------------------

+1, looks fine to me from the recovery point of view. One general comment:
- Can we have an enum in {{JobHistory}} (say {{TaskType}}) and pass {{TaskType}} as a parameter instead of {{isCleanup}}, {{isSetup}} etc.? Can you check if it's possible? The reason is that in the future we can simply grow the enum and make the appropriate calls.

> Jobs failing in the init stage will never cleanup
> -------------------------------------------------
>
>                 Key: HADOOP-4261
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4261
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Amar Kamat
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.19.0
>
>         Attachments: patch-4261.txt, patch-4261.txt, patch-4261.txt
>
>

[jira] Commented: (HADOOP-4261) Jobs failing in the init stage will never cleanup

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-4261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12636572#action_12636572 ] 

Amareshwari Sriramadasu commented on HADOOP-4261:
-------------------------------------------------

bq. Can we have a enum in JobHistory (say TaskType) and pass TaskType as a parameter instead of isCleanup, isSetup etc.
I feel this should be done as part of HADOOP-4122, because the code is currently consistent with the existing APIs. Having a separate enum for TaskType will break the existing checks such as Task.get(Keys.TASK_TYPE).equals(Values.MAP.name()), because they would have to be modified for the new enum.  *I also strongly feel HADOOP-4122 should be done in 0.19*, because refactoring the code breaks many things. So, the sooner the better.

> Jobs failing in the init stage will never cleanup
> -------------------------------------------------
>
>                 Key: HADOOP-4261
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4261
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Amar Kamat
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.19.0
>
>         Attachments: patch-4261.txt, patch-4261.txt, patch-4261.txt
>
>

[jira] Commented: (HADOOP-4261) Jobs failing in the init stage will never cleanup

Posted by "Hudson (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-4261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12637479#action_12637479 ] 

Hudson commented on HADOOP-4261:
--------------------------------

Integrated in Hadoop-trunk #626 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/626/])
    Adds a setup task for jobs. This is required so that we don't set up jobs that haven't been initialized yet (since init could lead to job failure). Only after the init has successfully happened do we launch the setupJob task. Contributed by Amareshwari Sriramadasu.


> Jobs failing in the init stage will never cleanup
> -------------------------------------------------
>
>                 Key: HADOOP-4261
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4261
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Amar Kamat
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.19.0
>
>         Attachments: patch-4261.txt, patch-4261.txt, patch-4261.txt, patch-4261.txt
>
>

[jira] Updated: (HADOOP-4261) Jobs failing in the init stage will never cleanup

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-4261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-4261:
--------------------------------

      Resolution: Fixed
    Hadoop Flags: [Reviewed]
          Status: Resolved  (was: Patch Available)

I just committed this. Thanks, Amareshwari!

> Jobs failing in the init stage will never cleanup
> -------------------------------------------------
>
>                 Key: HADOOP-4261
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4261
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Amar Kamat
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.19.0
>
>         Attachments: patch-4261.txt, patch-4261.txt, patch-4261.txt, patch-4261.txt
>
>

[jira] Updated: (HADOOP-4261) Jobs failing in the init stage will never cleanup

Posted by "Robert Chansler (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-4261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Chansler updated HADOOP-4261:
------------------------------------

    Release Note:   (was: A setup task (for doing OutputCommitter.setupJob()) will be launched, when the job is in PREP state and after initializing tasks. Once the setup task completes, the job will be moved to RUNNING state.)
    Hadoop Flags: [Reviewed]  (was: [Reviewed, Incompatible change])

A setup task (which runs OutputCommitter.setupJob()) is launched while the job is in the PREP state, after its tasks have been initialized. Once the setup task completes, the job moves to the RUNNING state.

> Jobs failing in the init stage will never cleanup
> -------------------------------------------------
>
>                 Key: HADOOP-4261
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4261
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Amar Kamat
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.19.0
>
>         Attachments: patch-4261.txt, patch-4261.txt, patch-4261.txt, patch-4261.txt
>
>

[jira] Commented: (HADOOP-4261) Jobs failing in the init stage will never cleanup

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-4261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12636583#action_12636583 ] 

Amar Kamat commented on HADOOP-4261:
------------------------------------

bq. Having separate enum for TaskType will break the existing checks such as Task.get(Keys.TASK_TYPE).equals(Values.MAP.name())
What I meant was to pass an enum instead of booleans. What goes into the job history remains the same, and hence what gets retrieved also remains the same. Something like
{code}
// before
public static void logStarted(TaskAttemptID taskAttemptId, long startTime, String trackerName, int httpPort, boolean isCleanup, boolean isSetup) {}
// after
public static void logStarted(TaskAttemptID taskAttemptId, long startTime, String trackerName, int httpPort, TaskType type) {}
{code}
and 
{code}
// before
isCleanup ? Values.CLEANUP.name() : isSetup ? Values.SETUP.name() : Values.MAP.name(),
// after
type.name()
{code}

But you could also do something like
{code}
// alternative: pass the task type as a String
public static void logStarted(TaskAttemptID taskAttemptId, long startTime, String trackerName, int httpPort, String taskType) {}
{code}
and pass the appropriate task type from JIP (JobInProgress).

But if this turns out to be a big change then we can do it in HADOOP-4122 as Amareshwari suggested.
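
A small self-contained sketch of the enum suggestion follows. It is illustrative only: {{TaskTypeSketch}}, {{TaskType}} and the two {{typeName}} helpers are hypothetical stand-ins, not the real {{JobHistory}} API. It only shows that swapping the two booleans for an enum can leave the string written to the history unchanged, so string-based checks on the logged task type would keep working.

```java
public class TaskTypeSketch {
    // Hypothetical enum standing in for the proposed TaskType.
    // Its constant names deliberately match the strings logged today.
    enum TaskType { MAP, REDUCE, SETUP, CLEANUP }

    // "before": the logged name is derived from two boolean flags.
    static String typeName(boolean isCleanup, boolean isSetup) {
        return isCleanup ? "CLEANUP" : isSetup ? "SETUP" : "MAP";
    }

    // "after": the enum is passed in and its name is logged directly.
    static String typeName(TaskType type) {
        return type.name();
    }

    public static void main(String[] args) {
        // Both variants produce the same logged string for each kind of task.
        System.out.println(typeName(true, false).equals(typeName(TaskType.CLEANUP)));
        System.out.println(typeName(false, true).equals(typeName(TaskType.SETUP)));
        System.out.println(typeName(false, false).equals(typeName(TaskType.MAP)));
    }
}
```

Adding a new kind of task then means adding one enum constant rather than another boolean parameter on every logging method.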

> Jobs failing in the init stage will never cleanup
> -------------------------------------------------
>
>                 Key: HADOOP-4261
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4261
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Amar Kamat
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.19.0
>
>         Attachments: patch-4261.txt, patch-4261.txt, patch-4261.txt
>
>

[jira] Updated: (HADOOP-4261) Jobs failing in the init stage will never cleanup

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-4261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-4261:
--------------------------------------------

    Attachment: patch-4261.txt

Attaching a patch that runs committer.setupJob() as a separate task, similar to the cleanup task.

* With the patch, JobInProgress.initTasks() initializes all the TIPs (including the setup and cleanup TIPs). The setup task is launched while the job is in the PREP state, after the tasks have been initialized. Once the setup task completes, the job is moved to the RUNNING state. 
* JobTracker first iterates over the jobs looking for cleanup tasks, then setup tasks. If there are no cleanup or setup tasks, it calls taskScheduler.assignTasks() for maps and reduces.
* The setup task is logged in JobHistory with SETUP as its TASK_TYPE. 
* The Job web UI and History web UI are modified to show the setup task. The UI change is similar to the cleanup task UI. 
* Updated Javadoc and documentation.
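
The ordering described in the bullets above can be sketched roughly as follows. This is a simplified, self-contained model and not the code from the patch: {{SetupFlowSketch}}, {{Job}}, {{State}} and {{nextTask}} are hypothetical names, and the handling of a failed init (the job simply moves to FAILED) is an assumption made for illustration. Only the PREP and RUNNING states and the cleanup, then setup, then map/reduce ordering come from the description.

```java
import java.util.ArrayList;
import java.util.List;

public class SetupFlowSketch {
    // Hypothetical, simplified job states.
    enum State { PREP, RUNNING, FAILED }

    static class Job {
        final String id;
        State state = State.PREP;
        boolean tasksInitialized = false; // true once init has succeeded
        boolean setupLaunched = false;
        boolean cleanupPending = false;

        Job(String id) { this.id = id; }

        // Models job init. Assumption: on failure the job fails outright,
        // so no setup task is ever launched for it.
        void initTasks(boolean succeed) {
            if (succeed) {
                tasksInitialized = true;
            } else {
                state = State.FAILED;
            }
        }

        // Called when the setup task reports completion: PREP -> RUNNING.
        void setupCompleted() {
            if (state == State.PREP && setupLaunched) {
                state = State.RUNNING;
            }
        }
    }

    // Picks the next task: cleanup tasks first, then setup tasks,
    // and only then ordinary map/reduce work.
    static String nextTask(List<Job> jobs) {
        for (Job j : jobs) {
            if (j.cleanupPending) {
                j.cleanupPending = false;
                return "CLEANUP:" + j.id;
            }
        }
        for (Job j : jobs) {
            if (j.state == State.PREP && j.tasksInitialized && !j.setupLaunched) {
                j.setupLaunched = true;
                return "SETUP:" + j.id;
            }
        }
        for (Job j : jobs) {
            if (j.state == State.RUNNING) {
                return "MAP_OR_REDUCE:" + j.id;
            }
        }
        return "NONE";
    }

    public static void main(String[] args) {
        List<Job> jobs = new ArrayList<>();
        Job ok = new Job("job_1");
        Job bad = new Job("job_2");
        jobs.add(ok);
        jobs.add(bad);
        ok.initTasks(true);
        bad.initTasks(false);                // init fails: never set up

        System.out.println(nextTask(jobs));  // SETUP:job_1
        ok.setupCompleted();
        System.out.println(nextTask(jobs));  // MAP_OR_REDUCE:job_1
    }
}
```

In this model a job whose init failed never reaches the setup loop, which is the property the issue is after: setup is attempted only for jobs that were initialized successfully.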



> Jobs failing in the init stage will never cleanup
> -------------------------------------------------
>
>                 Key: HADOOP-4261
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4261
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Amar Kamat
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.19.0
>
>         Attachments: patch-4261.txt
>
>

[jira] Commented: (HADOOP-4261) Jobs failing in the init stage will never cleanup

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-4261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12635546#action_12635546 ] 

Hadoop QA commented on HADOOP-4261:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12391129/patch-4261.txt
  against trunk revision 700163.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 9 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

    -1 core tests.  The patch failed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3393/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3393/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3393/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3393/console

This message is automatically generated.

> Jobs failing in the init stage will never cleanup
> -------------------------------------------------
>
>                 Key: HADOOP-4261
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4261
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Amar Kamat
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.19.0
>
>         Attachments: patch-4261.txt, patch-4261.txt
>
>

[jira] Updated: (HADOOP-4261) Jobs failing in the init stage will never cleanup

Posted by "Nigel Daley (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-4261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nigel Daley updated HADOOP-4261:
--------------------------------

    Fix Version/s: 0.19.0

> Jobs failing in the init stage will never cleanup
> -------------------------------------------------
>
>                 Key: HADOOP-4261
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4261
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Amar Kamat
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.19.0
>
>

[jira] Updated: (HADOOP-4261) Jobs failing in the init stage will never cleanup

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-4261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-4261:
--------------------------------------------

    Status: Open  (was: Patch Available)

Found that the check for tip.isSetupTask() was missed alongside the isMapTask check in JobTracker.

> Jobs failing in the init stage will never cleanup
> -------------------------------------------------
>
>                 Key: HADOOP-4261
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4261
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Amar Kamat
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.19.0
>
>         Attachments: patch-4261.txt, patch-4261.txt
>
>

[jira] Commented: (HADOOP-4261) Jobs failing in the init stage will never cleanup

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-4261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12636284#action_12636284 ] 

Hadoop QA commented on HADOOP-4261:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12391267/patch-4261.txt
  against trunk revision 700923.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 9 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

    -1 core tests.  The patch failed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3419/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3419/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3419/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3419/console

This message is automatically generated.

> Jobs failing in the init stage will never cleanup
> -------------------------------------------------
>
>                 Key: HADOOP-4261
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4261
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Amar Kamat
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.19.0
>
>         Attachments: patch-4261.txt, patch-4261.txt, patch-4261.txt
>
>

[jira] Updated: (HADOOP-4261) Jobs failing in the init stage will never cleanup

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-4261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-4261:
--------------------------------------------

    Status: Patch Available  (was: Open)

> Jobs failing in the init stage will never cleanup
> -------------------------------------------------
>
>                 Key: HADOOP-4261
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4261
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Amar Kamat
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.19.0
>
>         Attachments: patch-4261.txt, patch-4261.txt
>
>

[jira] Assigned: (HADOOP-4261) Jobs failing in the init stage will never cleanup

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-4261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu reassigned HADOOP-4261:
-----------------------------------------------

    Assignee: Amareshwari Sriramadasu

> Jobs failing in the init stage will never cleanup
> -------------------------------------------------
>
>                 Key: HADOOP-4261
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4261
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Amar Kamat
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>
> Pre HADOOP-3150, if the job fails in the init stage, {{job.kill()}} was called. This used to make sure that the job was cleaned up w.r.t 
> - staus set to KILLED/FAILED
> - job files from the system dir are deleted
> - closing of job history files
> - making jobtracker aware of this through {{jobTracker.finalizeJob()}}
> - cleaning up the data structures via {{JobInProgress.garbageCollect()}}
> Now if the job fails in the init stage, {{job.fail()}} is called which doesnt do the cleanup. HADOOP-3150 introduces cleanup tasks which are launched once the job completes i.e killed/failed/succeeded.  Jobtracker will never consider this job for scheduling as the job will be in the {{PREP}} state forever.


[jira] Issue Comment Edited: (HADOOP-4261) Jobs failing in the init stage will never cleanup

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-4261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12636534#action_12636534 ] 

amareshwari edited comment on HADOOP-4261 at 10/2/08 10:39 PM:
---------------------------------------------------------------------------

Test failure is not related to the patch. I don't see the failure on my machine.

      was (Author: amareshwari):
    Test failure is not related to the patch. I don't see the see the failure on my machine.
  
> Jobs failing in the init stage will never cleanup
> -------------------------------------------------
>
>                 Key: HADOOP-4261
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4261
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Amar Kamat
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.19.0
>
>         Attachments: patch-4261.txt, patch-4261.txt, patch-4261.txt
>
>
> Pre HADOOP-3150, if the job fails in the init stage, {{job.kill()}} was called. This used to make sure that the job was cleaned up with respect to:
> - status set to KILLED/FAILED
> - job files from the system dir are deleted
> - closing of job history files
> - making jobtracker aware of this through {{jobTracker.finalizeJob()}}
> - cleaning up the data structures via {{JobInProgress.garbageCollect()}}
> Now if the job fails in the init stage, {{job.fail()}} is called, which does not do the cleanup. HADOOP-3150 introduces cleanup tasks which are launched once the job completes, i.e. killed/failed/succeeded. Jobtracker will never consider this job for scheduling, because the job will remain in the {{PREP}} state forever.
