Posted to mapreduce-issues@hadoop.apache.org by "Ramkumar Vadali (JIRA)" <ji...@apache.org> on 2010/12/08 21:49:01 UTC

[jira] Created: (MAPREDUCE-2214) TaskTracker should release slot if task is not launched

TaskTracker should release slot if task is not launched
-------------------------------------------------------

                 Key: MAPREDUCE-2214
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2214
             Project: Hadoop Map/Reduce
          Issue Type: Bug
    Affects Versions: 0.20.1
            Reporter: Ramkumar Vadali
            Assignee: Ramkumar Vadali


TaskTracker.TaskInProgress.launchTask() does not launch a task if it is not in an expected state. However, when the task is not launched, its slot is not released. We have observed this in production: the task was already in the SUCCEEDED state by the time launchTask() got to it, and the slot was never released. It is not clear how the task got into that state, but it is better to handle the case.
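
The behaviour being asked for can be sketched roughly as follows. This is an illustration only, not the attached patch: SlotPool, TaskAttempt, TaskState and the single launchable state are hypothetical, simplified stand-ins for the real TaskTracker classes, and the slot is assumed to have been reserved before launch() is called.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo driver; none of these classes are real Hadoop classes.
class SlotReleaseSketch {
    public static void main(String[] args) {
        SlotPool pool = new SlotPool(2);
        pool.acquire(); // slot reserved for the attempt before launch
        TaskAttempt attempt = new TaskAttempt(TaskState.SUCCEEDED, pool);
        boolean launched = attempt.launch();
        // prints "launched=false freeSlots=2"
        System.out.println("launched=" + launched + " freeSlots=" + pool.freeSlots());
    }
}

// Simplified set of states; the real tracker has more.
enum TaskState { UNASSIGNED, RUNNING, SUCCEEDED, KILLED }

// Counts free slots on one (hypothetical) tracker.
class SlotPool {
    private final AtomicInteger free;

    SlotPool(int slots) { free = new AtomicInteger(slots); }

    // Reserve one slot; returns false when none are free.
    boolean acquire() {
        while (true) {
            int n = free.get();
            if (n == 0) return false;
            if (free.compareAndSet(n, n - 1)) return true;
        }
    }

    void release() { free.incrementAndGet(); }

    int freeSlots() { return free.get(); }
}

class TaskAttempt {
    private TaskState state;
    private final SlotPool pool;

    TaskAttempt(TaskState state, SlotPool pool) {
        this.state = state;
        this.pool = pool;
    }

    // Returns true if the attempt was launched.
    synchronized boolean launch() {
        if (state != TaskState.UNASSIGNED) {
            // Not in a launchable state: hand the reserved slot back
            // instead of silently keeping it.
            pool.release();
            return false;
        }
        state = TaskState.RUNNING;
        return true;
    }
}
```

The point of the sketch is only that the "not launched" branch must have a matching release; where the real code reserves and frees slots is not shown here.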

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (MAPREDUCE-2214) TaskTracker should release slot if task is not launched

Posted by "Dick King (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12969485#action_12969485 ] 

Dick King commented on MAPREDUCE-2214:
--------------------------------------

Speculative execution is a legitimate way a task can become {{SUCCEEDED}} while an attempt on that task is waiting to get launched.



[jira] Updated: (MAPREDUCE-2214) TaskTracker should release slot if task is not launched

Posted by "Ramkumar Vadali (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-2214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ramkumar Vadali updated MAPREDUCE-2214:
---------------------------------------

    Status: Patch Available  (was: Open)



[jira] Commented: (MAPREDUCE-2214) TaskTracker should release slot if task is not launched

Posted by "Ramkumar Vadali (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12970534#action_12970534 ] 

Ramkumar Vadali commented on MAPREDUCE-2214:
--------------------------------------------

TEST RESULTS

ant test-patch complains about missing unit tests, but it's difficult to come up with a unit test for this.
{code}
     [exec] -1 overall.
     [exec]
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec]
     [exec]     -1 tests included.  The patch doesn't appear to include any new or modified tests.
     [exec]                         Please justify why no new tests are needed for this patch.
     [exec]                         Also please list what manual steps were performed to verify this patch.
     [exec]
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
     [exec]
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
     [exec]
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.
     [exec]
     [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
     [exec]
     [exec]     +1 system test framework.  The patch passed system test framework compile.
     [exec]
     [exec]
     [exec]
     [exec]
     [exec] ======================================================================
     [exec] ======================================================================
     [exec]     Finished build.
     [exec] ======================================================================
     [exec] ======================================================================
     [exec]
     [exec]
{code}

ant test: there was only one test failure, and that test also fails in a clean checkout.
{code}
    [junit] Test org.apache.hadoop.mapred.TestControlledMapReduceJob FAILED (timeout)
{code}



[jira] [Updated] (MAPREDUCE-2214) TaskTracker should release slot if task is not launched

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-2214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated MAPREDUCE-2214:
-------------------------------------

    Status: Open  (was: Patch Available)

Sorry to come in late; the patch has gone stale. Can you please rebase? Thanks.

Given that this is not an issue with MRv2, should we still commit this? I'm happy to, but I'm not sure it's useful. Thanks.


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] Commented: (MAPREDUCE-2214) TaskTracker should release slot if task is not launched

Posted by "Joydeep Sen Sarma (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12969572#action_12969572 ] 

Joydeep Sen Sarma commented on MAPREDUCE-2214:
----------------------------------------------

I think what happened in our case was something like this:
# The task was requested to be killed.
# The TaskTracker (TT) performed the kill action and reported back to the JobTracker (JT).
# But the task reported back as done, at which point the TT promptly moved it into the SUCCEEDED state.
# Meanwhile the JT scheduled a cleanup, and the cleanup failed to launch without returning the slot.

The criss-crossing of #2 and #3 was what was unexpected, I think (something the code doesn't anticipate).

We don't hit this problem with speculation because we never request speculation when the task is about to complete: there is a check on the task's remaining time, and if it is less than N minutes we don't speculate. There's a JIRA for this, but I don't remember which.
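
The remaining-time check described above could look roughly like this. It is a sketch under assumptions: SpeculationGuard and its threshold are hypothetical names, the 5-minute value is only an example for N, and how the remaining time is estimated is left out.

```java
import java.time.Duration;

// Hypothetical demo driver; not part of Hadoop.
class SpeculationGuardSketch {
    public static void main(String[] args) {
        SpeculationGuard guard = new SpeculationGuard(Duration.ofMinutes(5));
        System.out.println(guard.shouldSpeculate(Duration.ofMinutes(30))); // prints "true"
        System.out.println(guard.shouldSpeculate(Duration.ofMinutes(2)));  // prints "false"
    }
}

// Decides whether a speculative attempt is worth requesting.
class SpeculationGuard {
    // The "N minutes" threshold; the actual value is an assumption.
    private final Duration minRemaining;

    SpeculationGuard(Duration minRemaining) {
        this.minRemaining = minRemaining;
    }

    // Speculate only when the task is not about to complete, i.e. its
    // estimated remaining time is at least the threshold.
    boolean shouldSpeculate(Duration estimatedRemaining) {
        return estimatedRemaining.compareTo(minRemaining) >= 0;
    }
}
```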



[jira] Updated: (MAPREDUCE-2214) TaskTracker should release slot if task is not launched

Posted by "Ramkumar Vadali (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-2214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ramkumar Vadali updated MAPREDUCE-2214:
---------------------------------------

    Attachment: MAPREDUCE-2214.patch



[jira] Commented: (MAPREDUCE-2214) TaskTracker should release slot if task is not launched

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12999860#comment-12999860 ] 

Hadoop QA commented on MAPREDUCE-2214:
--------------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12466033/MAPREDUCE-2214.patch
  against trunk revision 1074251.

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    -1 contrib tests.  The patch failed contrib unit tests.

    +1 system test framework.  The patch passed system test framework compile.

Test results: https://hudson.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/48//testReport/
Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/48//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://hudson.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/48//console

This message is automatically generated.
