Posted to common-dev@hadoop.apache.org by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org> on 2008/10/07 10:47:44 UTC

[jira] Created: (HADOOP-4361) Corner cases in killJob from command line

Corner cases in killJob from command line
-----------------------------------------

                 Key: HADOOP-4361
                 URL: https://issues.apache.org/jira/browse/HADOOP-4361
             Project: Hadoop Core
          Issue Type: Bug
          Components: mapred
    Affects Versions: 0.19.0
            Reporter: Amareshwari Sriramadasu
            Assignee: Amareshwari Sriramadasu
            Priority: Blocker
             Fix For: 0.19.0


If a job occupies all the slots (maps and reduces) and is killed from the command line, we have to make sure the cleanup task is eventually launched. Currently no slot becomes free until one of the attempts completes or fails.
If the job is killed from the command line during job setup, the setup task has to be killed before the cleanup task is launched.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4361) Corner cases in killJob from command line

Posted by "Vinod K V (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12637985#action_12637985 ] 

Vinod K V commented on HADOOP-4361:
-----------------------------------

Elsewhere, in HADOOP-4035, I needed to check whether a job is being killed, to avoid duplicate kills. For this I had already introduced a boolean, killInProgress, which is set in JobInProgress.terminate(). We'll need to sync up depending on which JIRA goes in first.
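
A minimal sketch of such a guard, assuming a simplified stand-in for JobInProgress. The class KillGuardJob and the counter killsIssued are hypothetical and exist only for illustration; only the killInProgress flag and terminate() are named above, and the real method signature may differ.

```java
// Hypothetical stand-in for JobInProgress; only the duplicate-kill guard is modeled.
class KillGuardJob {
    private boolean killInProgress = false; // flag name taken from the comment above
    private int killsIssued = 0;            // illustration only: counts kills actually started

    /** Returns true if this call started the kill, false if a kill was already under way. */
    synchronized boolean terminate() {
        if (killInProgress) {
            return false; // duplicate kill request: nothing more to do
        }
        killInProgress = true;
        killsIssued++;
        return true;
    }

    synchronized boolean isKillInProgress() {
        return killInProgress;
    }

    synchronized int getKillsIssued() {
        return killsIssued;
    }
}
```

With this shape, a second kill request for the same job becomes a no-op, so whichever JIRA lands first could own the flag and the other could reuse it.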


[jira] Commented: (HADOOP-4361) Corner cases in killJob from command line

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12639854#action_12639854 ] 

Hudson commented on HADOOP-4361:
--------------------------------

Integrated in Hadoop-trunk #634 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/634/])
    


[jira] Commented: (HADOOP-4361) Corner cases in killJob from command line

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12639389#action_12639389 ] 

Amareshwari Sriramadasu commented on HADOOP-4361:
-------------------------------------------------

Ran the Sort benchmark on a big cluster.


[jira] Updated: (HADOOP-4361) Corner cases in killJob from command line

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-4361:
--------------------------------------------

    Status: Patch Available  (was: Open)

test-patch result:
{noformat}
     [exec]
     [exec] -1 overall.
     [exec]
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec]
     [exec]     -1 tests included.  The patch doesn't appear to include any new or modified tests.
     [exec]                         Please justify why no tests are needed for this patch.
     [exec]
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
     [exec]
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
     [exec]
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
     [exec]
     [exec]     +1 Eclipse classpath. The patch retains Eclipse classpath integrity.
     [exec]
{noformat}

All the core and contrib tests passed on my machine except TestMiniMRDebugScript (due to HADOOP-4410).

Did not include any unit test because it is difficult to simulate the corner cases. Did manual testing for the following:
1. Killed the job when all slots were full; made sure cleanup is launched and the job is killed.
2. Killed the job during setup.
3. Killed the job during cleanup.
4. Kill-task action for the setup task during setup.
5. Kill-task action for the cleanup task during cleanup.


[jira] Commented: (HADOOP-4361) Corner cases in killJob from command line

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12637420#action_12637420 ] 

Amareshwari Sriramadasu commented on HADOOP-4361:
-------------------------------------------------

During cleanup (both while the JT is waiting for the cleanup task to be launched and while the cleanup is running), the JobStatus should be "CLEANUP". This makes the tasks get a KillTaskAction immediately and frees up slots. Thoughts?


[jira] Updated: (HADOOP-4361) Corner cases in killJob from command line

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-4361:
--------------------------------

      Resolution: Fixed
    Hadoop Flags: [Reviewed]
          Status: Resolved  (was: Patch Available)

I just committed this. Thanks, Amareshwari!


[jira] Commented: (HADOOP-4361) Corner cases in killJob from command line

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12637757#action_12637757 ] 

Amareshwari Sriramadasu commented on HADOOP-4361:
-------------------------------------------------

On second thought, adding a new JobStatus looks like a big change; instead of a state change, we can have a boolean that is set while the job is cleaning up.
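
A rough sketch of the boolean idea, under stated assumptions: CleanupFlagJob, its cleaningUp field, and attemptsToKill() are hypothetical names for illustration and are not Hadoop classes or methods. It only models the intended effect, namely that once the flag is set, every running attempt other than the cleanup attempt is marked for a kill so that its slot is freed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model, not Hadoop's JobInProgress: a job carrying a "cleaning up" flag.
class CleanupFlagJob {
    private boolean cleaningUp = false; // set instead of introducing a new JobStatus

    /** A kill request flips the flag; the job state itself is left unchanged. */
    void kill() {
        cleaningUp = true;
    }

    boolean isCleaningUp() {
        return cleaningUp;
    }

    /**
     * Given the attempt ids a tracker reports as running, return the ones it
     * should be told to kill. The cleanup attempt itself is never killed here.
     */
    List<String> attemptsToKill(List<String> running, String cleanupAttemptId) {
        List<String> toKill = new ArrayList<>();
        if (!cleaningUp) {
            return toKill; // job not being cleaned up: leave every attempt alone
        }
        for (String id : running) {
            if (!id.equals(cleanupAttemptId)) {
                toKill.add(id);
            }
        }
        return toKill;
    }
}
```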


[jira] Updated: (HADOOP-4361) Corner cases in killJob from command line

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-4361:
--------------------------------------------

    Attachment: patch-4361.txt

Attaching patch with the following changes:

1. Changed TaskInProgress.shouldClose(attemptId) to return true if the TIP has failed. This makes sure the cleanup task gets a free slot.
2. The guard in obtainNewMap/ReduceTask on tasksInited is changed to a guard on the JobState being RUNNING.
3. Kill setup TIPs when there is a kill from the command line.
4. Fixed setup task failures when
     i) there is a kill-task for the setup task from the command line;
     ii) the tracker on which the setup task is running is lost.
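
Changes 1 and 2 can be sketched roughly as follows. This is not the patch itself: SketchTip and SketchJob are hypothetical, heavily simplified stand-ins for TaskInProgress and JobInProgress, the pre-patch close conditions are reduced to a single "complete" flag, and the State enum and task ids are assumptions made for illustration.

```java
// Hypothetical, simplified stand-in for TaskInProgress.
class SketchTip {
    private boolean failed = false;
    private boolean complete = false;

    void kill() {
        failed = true;
    }

    void markComplete() {
        complete = true;
    }

    /**
     * Change 1: an attempt of a failed (killed) TIP should be closed, which
     * frees its slot. attemptId is unused in this sketch; real per-attempt
     * bookkeeping is omitted.
     */
    boolean shouldClose(String attemptId) {
        return failed || complete;
    }
}

// Hypothetical, simplified stand-in for JobInProgress.
class SketchJob {
    enum State { PREP, RUNNING, KILLED }

    private State state = State.PREP;
    private int nextMap = 0;

    void setState(State newState) {
        state = newState;
    }

    /**
     * Change 2: hand out a new map task only while the job is RUNNING,
     * rather than checking a tasksInited flag. Returns null otherwise.
     */
    String obtainNewMapTask() {
        if (state != State.RUNNING) {
            return null;
        }
        return "map_" + (nextMap++);
    }
}
```

Guarding on the job state means a job that has been killed stops receiving new map or reduce tasks, while the slots of its failed TIPs are released for the cleanup task.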
