Posted to common-dev@hadoop.apache.org by "Vivek Ratan (JIRA)" <ji...@apache.org> on 2008/10/27 07:48:44 UTC

[jira] Created: (HADOOP-4523) Enhance how memory-intensive user tasks are handled

Enhance how memory-intensive user tasks are handled
---------------------------------------------------

                 Key: HADOOP-4523
                 URL: https://issues.apache.org/jira/browse/HADOOP-4523
             Project: Hadoop Core
          Issue Type: Improvement
            Reporter: Vivek Ratan


HADOOP-3581 monitors each Hadoop task to see if its memory usage (which includes usage of any tasks spawned by it and so on) is within a per-task limit. If the task's memory usage goes over its limit, the task is killed. This, by itself, is not enough to prevent badly behaving jobs from bringing down nodes. What is also needed is the ability to make sure that the sum total of VM usage of all Hadoop tasks does not exceed a certain limit.
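The two-level check described above can be sketched in miniature as follows. This is only an illustration of the idea, not code from the patch; the class and method names here are hypothetical, and the real TaskMemoryManagerThread works against live process trees rather than plain numbers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the two-level memory check: first enforce each
// task's own limit (the HADOOP-3581 behaviour), then the node-wide total.
class MemoryCheckSketch {
    // Returns the IDs of tasks whose usage exceeds their individual limit.
    // Each map value is {usage, limit}; a limit <= 0 means "no limit set".
    static List<String> overIndividualLimit(Map<String, long[]> tasks) {
        List<String> toKill = new ArrayList<>();
        for (Map.Entry<String, long[]> e : tasks.entrySet()) {
            long usage = e.getValue()[0], limit = e.getValue()[1];
            if (limit > 0 && usage > limit) {
                toKill.add(e.getKey());
            }
        }
        return toKill;
    }

    // True if the sum of all tasks' usage exceeds the node-wide limit
    // (the tasks.maxmemory value discussed in this issue).
    static boolean overNodeLimit(Map<String, long[]> tasks, long nodeLimit) {
        long total = 0;
        for (long[] v : tasks.values()) {
            total += v[0];
        }
        return total > nodeLimit;
    }
}
```

The point of the second check is exactly the one the description makes: every task can be individually within its limit while the node as a whole is over-committed.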

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4523) Enhance how memory-intensive user tasks are handled

Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12646201#action_12646201 ] 

Hemanth Yamijala commented on HADOOP-4523:
------------------------------------------

The latest patch kills only the most recently started task when the sum of all tasks' memory usage goes beyond the configured limit. Killing a single task may or may not bring the usage back within the limit. Should we instead pick enough tasks to kill to guarantee that it does?
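The alternative being suggested, killing as many tasks as needed rather than only one, can be sketched roughly as follows. This is an illustrative toy (hypothetical names, plain arrays standing in for real task state), with array order standing in for start order, oldest first:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: keep picking victims, most recently started first,
// until the projected total usage drops back under the node-wide limit.
class KillEnoughSketch {
    // usages is ordered oldest-first; returns the indices of tasks to
    // kill, newest-first, stopping as soon as the total is under nodeLimit.
    static List<Integer> pickVictims(long[] usages, long nodeLimit) {
        long total = 0;
        for (long u : usages) {
            total += u;
        }
        List<Integer> victims = new ArrayList<>();
        for (int i = usages.length - 1; i >= 0 && total > nodeLimit; i--) {
            victims.add(i);
            total -= usages[i];
        }
        return victims;
    }
}
```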

> Enhance how memory-intensive user tasks are handled
> ---------------------------------------------------
>
>                 Key: HADOOP-4523
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4523
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Vinod K V
>         Attachments: HADOOP-4523-200811-05.txt, HADOOP-4523-200811-06.txt
>
>
> HADOOP-3581 monitors each Hadoop task to see if its memory usage (which includes usage of any tasks spawned by it and so on) is within a per-task limit. If the task's memory usage goes over its limit, the task is killed. This, by itself, is not enough to prevent badly behaving jobs from bringing down nodes. What is also needed is the ability to make sure that the sum total of VM usage of all Hadoop tasks does not exceed a certain limit.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4523) Enhance how memory-intensive user tasks are handled

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12646333#action_12646333 ] 

Hadoop QA commented on HADOOP-4523:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12393629/HADOOP-4523-20081110.txt
  against trunk revision 712615.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    -1 javadoc.  The javadoc tool appears to have generated 1 warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3569/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3569/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3569/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3569/console

This message is automatically generated.

> Enhance how memory-intensive user tasks are handled
> ---------------------------------------------------
>
>                 Key: HADOOP-4523
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4523
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Vinod K V
>         Attachments: HADOOP-4523-200811-05.txt, HADOOP-4523-200811-06.txt, HADOOP-4523-20081110.txt
>
>
> HADOOP-3581 monitors each Hadoop task to see if its memory usage (which includes usage of any tasks spawned by it and so on) is within a per-task limit. If the task's memory usage goes over its limit, the task is killed. This, by itself, is not enough to prevent badly behaving jobs from bringing down nodes. What is also needed is the ability to make sure that the sum total of VM usage of all Hadoop tasks does not exceed a certain limit.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4523) Enhance how memory-intensive user tasks are handled

Posted by "Vivek Ratan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12644220#action_12644220 ] 

Vivek Ratan commented on HADOOP-4523:
-------------------------------------

I'm proposing a couple of improvements: 
# The TT currently monitors each task (and its descendants) to see if that task's memory usage goes over a per-task limit. The TT should additionally make sure that the sum of memory used by all tasks does not go over a per-node limit (_tasks.maxmemory_). This situation is unlikely to happen if schedulers consider memory judiciously when scheduling, but not all schedulers may. In addition, when deciding which tasks to kill, the TT should pick the tasks that started last. 
# I'm also proposing a way to specify _tasks.maxmemory_. That particular discussion is, however, going on in HADOOP-4035, so we can continue it there. 
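As a rough illustration, a per-node limit of this kind would presumably be expressed as an ordinary Hadoop configuration property. The exact name and semantics were still being settled in HADOOP-4035 at this point, so the snippet below is only an example (the property name echoes the mapred.tasktracker.tasks.maxmemory name mentioned elsewhere in this thread, and the value is arbitrary):

```xml
<!-- Illustrative only: the final property name and semantics were
     under discussion in HADOOP-4035. -->
<property>
  <name>mapred.tasktracker.tasks.maxmemory</name>
  <!-- Upper bound, in bytes, on the total virtual memory used by
       all tasks (and their descendants) on this TaskTracker. -->
  <value>2147483648</value>
</property>
```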


> Enhance how memory-intensive user tasks are handled
> ---------------------------------------------------
>
>                 Key: HADOOP-4523
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4523
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Vinod K V
>
> HADOOP-3581 monitors each Hadoop task to see if its memory usage (which includes usage of any tasks spawned by it and so on) is within a per-task limit. If the task's memory usage goes over its limit, the task is killed. This, by itself, is not enough to prevent badly behaving jobs from bringing down nodes. What is also needed is the ability to make sure that the sum total of VM usage of all Hadoop tasks does not exceed a certain limit.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4523) Enhance how memory-intensive user tasks are handled

Posted by "Vinod K V (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod K V updated HADOOP-4523:
------------------------------

    Attachment: HADOOP-4523-20081118.txt

New patch refactoring the newly added code into a killTasksWithLeastProgress method.

> Enhance how memory-intensive user tasks are handled
> ---------------------------------------------------
>
>                 Key: HADOOP-4523
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4523
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Vinod K V
>         Attachments: HADOOP-4523-200811-05.txt, HADOOP-4523-200811-06.txt, HADOOP-4523-20081110.txt, HADOOP-4523-20081113.txt, HADOOP-4523-20081118.txt
>
>
> HADOOP-3581 monitors each Hadoop task to see if its memory usage (which includes usage of any tasks spawned by it and so on) is within a per-task limit. If the task's memory usage goes over its limit, the task is killed. This, by itself, is not enough to prevent badly behaving jobs from bringing down nodes. What is also needed is the ability to make sure that the sum total of VM usage of all Hadoop tasks does not exceed a certain limit.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4523) Enhance how memory-intensive user tasks are handled

Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12646855#action_12646855 ] 

Hemanth Yamijala commented on HADOOP-4523:
------------------------------------------

- Consider a case where a task has just started and so is sitting in the tasksToBeAdded list, but doesn't yet have a ProcessTreeInfo. If, by chance, this task is the first reduce task with the least progress, it would be returned by the findTaskToKill method. However, since it is not found in processTreeInfoMap, it is not added to the tasksToKill list, and its memory is not subtracted from the total. Hence subsequent calls to findTaskToKill will keep returning this task, and the code would be stuck in a loop.

- Code related to killing a task is duplicated between killing tasks that are over their individual limits and killing tasks because the cumulative limit is still exceeded. Can we refactor this into common code?

- Can we improve the diagnostic message logged when we kill a task? Something like: "Killing task '<tid>', which has the least progress, as the cumulative memory usage of tasks exceeds the virtual memory limit '<limit>' on the task tracker."

- Add javadoc for the tasksToExclude parameter of findTaskToKill, mentioning that passing null includes all tasks.

- I think we need the following tests:
-- When memory management is disabled and tasks run over their limits, nothing is killed (backwards compatibility).
-- When some tasks are individually over limit and the tasks are also cumulatively over limit (with some tasks not specifying memory limits).

- To simulate a task with the least progress, can we have some tasks with a very large sleep duration and some with a very small one, or something like that?

- Suggest a few shorter names for the tests. Essentially we test with jobs which exceed limits individually, cumulatively, and a mix of both. So something like
-- testJobWithinLimits
-- testJobExceedingLimits
-- testJobsCumulativelyExceedingLimits
-- testMixedSetOfJobsExceedingLimits

- There are a few references to WordCount in the comments. Are they still valid?

- The testTasksWithinIndividualLimitsButTotalUsageBeyondTTLimits test (or testJobsCumulativelyExceedingLimits) does not seem deterministic; indeed, it failed on my machine. How can we be sure that at least one task overflows? One way could be to have a TT with 2 map and 2 reduce slots, submit a job with 2 map tasks and 2 reduces, have the tasks ask for enough memory that the sum of 2 tasks exceeds the TT limit, and give the TT a very low memory limit. Then we can get the task reports and verify that a couple of tasks were killed. 
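The loop hazard described in the first point above can be shown in miniature. This is a hypothetical reduction, not the patch's code (the real findTaskToKill works on TaskTracker state and ProcessTreeInfo objects): the fix is that a candidate must be added to the exclusion set even when it is not yet tracked, otherwise the finder returns it forever.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Miniature illustration of the review comment: a victim-finder that
// skips excluded tasks, and a driver that must exclude every candidate
// it sees (tracked or not) to guarantee termination.
class ExclusionSketch {
    static String findTaskToKill(List<String> running, Set<String> exclude) {
        for (String t : running) {
            if (!exclude.contains(t)) {
                return t;
            }
        }
        return null; // no more candidates
    }

    // Returns the tasks actually killed; tasks without tracking state
    // are skipped but still excluded, so the loop always terminates.
    static List<String> killAll(List<String> running, Set<String> tracked) {
        Set<String> exclude = new HashSet<>();
        List<String> killed = new ArrayList<>();
        String t;
        while ((t = findTaskToKill(running, exclude)) != null) {
            exclude.add(t); // crucial: exclude even untracked candidates
            if (tracked.contains(t)) {
                killed.add(t);
            }
        }
        return killed;
    }
}
```

Without the `exclude.add(t)` line covering the untracked case, the `while` loop above never advances past the untracked task, which is exactly the stuck-in-a-loop scenario described.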



> Enhance how memory-intensive user tasks are handled
> ---------------------------------------------------
>
>                 Key: HADOOP-4523
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4523
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Vinod K V
>         Attachments: HADOOP-4523-200811-05.txt, HADOOP-4523-200811-06.txt, HADOOP-4523-20081110.txt
>
>
> HADOOP-3581 monitors each Hadoop task to see if its memory usage (which includes usage of any tasks spawned by it and so on) is within a per-task limit. If the task's memory usage goes over its limit, the task is killed. This, by itself, is not enough to prevent badly behaving jobs from bringing down nodes. What is also needed is the ability to make sure that the sum total of VM usage of all Hadoop tasks does not exceed a certain limit.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4523) Enhance how memory-intensive user tasks are handled

Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hemanth Yamijala updated HADOOP-4523:
-------------------------------------

    Status: Patch Available  (was: Open)

> Enhance how memory-intensive user tasks are handled
> ---------------------------------------------------
>
>                 Key: HADOOP-4523
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4523
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Vinod K V
>         Attachments: HADOOP-4523-200811-05.txt, HADOOP-4523-200811-06.txt, HADOOP-4523-20081110.txt, HADOOP-4523-20081113.txt, HADOOP-4523-20081118.txt
>
>
> HADOOP-3581 monitors each Hadoop task to see if its memory usage (which includes usage of any tasks spawned by it and so on) is within a per-task limit. If the task's memory usage goes over its limit, the task is killed. This, by itself, is not enough to prevent badly behaving jobs from bringing down nodes. What is also needed is the ability to make sure that the sum total of VM usage of all Hadoop tasks does not exceed a certain limit.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4523) Enhance how memory-intensive user tasks are handled

Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hemanth Yamijala updated HADOOP-4523:
-------------------------------------

    Status: Open  (was: Patch Available)

Going to run it through hudson once more.

> Enhance how memory-intensive user tasks are handled
> ---------------------------------------------------
>
>                 Key: HADOOP-4523
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4523
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Vinod K V
>         Attachments: HADOOP-4523-200811-05.txt, HADOOP-4523-200811-06.txt, HADOOP-4523-20081110.txt, HADOOP-4523-20081113.txt, HADOOP-4523-20081118.txt
>
>
> HADOOP-3581 monitors each Hadoop task to see if its memory usage (which includes usage of any tasks spawned by it and so on) is within a per-task limit. If the task's memory usage goes over its limit, the task is killed. This, by itself, is not enough to prevent badly behaving jobs from bringing down nodes. What is also needed is the ability to make sure that the sum total of VM usage of all Hadoop tasks does not exceed a certain limit.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4523) Enhance how memory-intensive user tasks are handled

Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hemanth Yamijala updated HADOOP-4523:
-------------------------------------

       Resolution: Fixed
    Fix Version/s: 0.20.0
     Hadoop Flags: [Reviewed]
           Status: Resolved  (was: Patch Available)

I just committed this. Thanks, Vinod !

> Enhance how memory-intensive user tasks are handled
> ---------------------------------------------------
>
>                 Key: HADOOP-4523
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4523
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Vinod K V
>             Fix For: 0.20.0
>
>         Attachments: HADOOP-4523-200811-05.txt, HADOOP-4523-200811-06.txt, HADOOP-4523-20081110.txt, HADOOP-4523-20081113.txt, HADOOP-4523-20081118.txt
>
>
> HADOOP-3581 monitors each Hadoop task to see if its memory usage (which includes usage of any tasks spawned by it and so on) is within a per-task limit. If the task's memory usage goes over its limit, the task is killed. This, by itself, is not enough to prevent badly behaving jobs from bringing down nodes. What is also needed is the ability to make sure that the sum total of VM usage of all Hadoop tasks does not exceed a certain limit.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4523) Enhance how memory-intensive user tasks are handled

Posted by "Vinod K V (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod K V updated HADOOP-4523:
------------------------------

    Attachment: HADOOP-4523-20081113.txt

Attaching another patch with the above review comments.

Didn't write a separate testMixedSetExceedingLimits; it seemed to me that it doesn't add any value, since it is already indirectly covered by the two independent tests that verify the tasks' limits and the TT limits. 


> Enhance how memory-intensive user tasks are handled
> ---------------------------------------------------
>
>                 Key: HADOOP-4523
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4523
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Vinod K V
>         Attachments: HADOOP-4523-200811-05.txt, HADOOP-4523-200811-06.txt, HADOOP-4523-20081110.txt, HADOOP-4523-20081113.txt
>
>
> HADOOP-3581 monitors each Hadoop task to see if its memory usage (which includes usage of any tasks spawned by it and so on) is within a per-task limit. If the task's memory usage goes over its limit, the task is killed. This, by itself, is not enough to prevent badly behaving jobs from bringing down nodes. What is also needed is the ability to make sure that the sum total of VM usage of all Hadoop tasks does not exceed a certain limit.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Assigned: (HADOOP-4523) Enhance how memory-intensive user tasks are handled

Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hemanth Yamijala reassigned HADOOP-4523:
----------------------------------------

    Assignee: Vinod K V

> Enhance how memory-intensive user tasks are handled
> ---------------------------------------------------
>
>                 Key: HADOOP-4523
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4523
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Vinod K V
>
> HADOOP-3581 monitors each Hadoop task to see if its memory usage (which includes usage of any tasks spawned by it and so on) is within a per-task limit. If the task's memory usage goes over its limit, the task is killed. This, by itself, is not enough to prevent badly behaving jobs from bringing down nodes. What is also needed is the ability to make sure that the sum total of VM usage of all Hadoop tasks does not exceed a certain limit.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4523) Enhance how memory-intensive user tasks are handled

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12649663#action_12649663 ] 

Hadoop QA commented on HADOOP-4523:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12394152/HADOOP-4523-20081118.txt
  against trunk revision 719431.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    -1 patch.  The patch command could not apply the patch.

Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3618/console

This message is automatically generated.

> Enhance how memory-intensive user tasks are handled
> ---------------------------------------------------
>
>                 Key: HADOOP-4523
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4523
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Vinod K V
>             Fix For: 0.20.0
>
>         Attachments: HADOOP-4523-200811-05.txt, HADOOP-4523-200811-06.txt, HADOOP-4523-20081110.txt, HADOOP-4523-20081113.txt, HADOOP-4523-20081118.txt
>
>
> HADOOP-3581 monitors each Hadoop task to see if its memory usage (which includes usage of any tasks spawned by it and so on) is within a per-task limit. If the task's memory usage goes over its limit, the task is killed. This, by itself, is not enough to prevent badly behaving jobs from bringing down nodes. What is also needed is the ability to make sure that the sum total of VM usage of all Hadoop tasks does not exceed a certain limit.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4523) Enhance how memory-intensive user tasks are handled

Posted by "Owen O'Malley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12644088#action_12644088 ] 

Owen O'Malley commented on HADOOP-4523:
---------------------------------------

This jira isn't very clear. What are you proposing changing? Is it to make the mapred.tasktracker.tasks.maxmemory pluggable? If so, I'd propose making an interface like:

{code}
abstract class MemoryPlugin {
  abstract long getVirtualMemorySize(Configuration conf);
}
{code}

and you configure an implementation of it. (mapred.server.memory.plugin ?)
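Loading such a plugin would presumably follow Hadoop's usual reflection-based configuration pattern. Since the interface and the mapred.server.memory.plugin property were only a proposal at this point, the sketch below is stand-alone: Configuration here is a toy stand-in for Hadoop's class, and ConfiguredMemoryPlugin is a made-up example implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Toy stand-in for org.apache.hadoop.conf.Configuration.
class Configuration {
    private final Map<String, String> props = new HashMap<>();
    void set(String k, String v) { props.put(k, v); }
    String get(String k) { return props.get(k); }
}

// The interface proposed in the comment above.
abstract class MemoryPlugin {
    abstract long getVirtualMemorySize(Configuration conf);
}

// Hypothetical example implementation: just reads a configured value.
class ConfiguredMemoryPlugin extends MemoryPlugin {
    long getVirtualMemorySize(Configuration conf) {
        return Long.parseLong(conf.get("mapred.tasktracker.tasks.maxmemory"));
    }
}

class PluginLoader {
    // Instantiate the class named by mapred.server.memory.plugin,
    // mirroring Hadoop's reflection-based plugin loading.
    static MemoryPlugin load(Configuration conf) {
        try {
            String cls = conf.get("mapred.server.memory.plugin");
            return (MemoryPlugin) Class.forName(cls)
                    .getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException("cannot load memory plugin", e);
        }
    }
}
```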


> Enhance how memory-intensive user tasks are handled
> ---------------------------------------------------
>
>                 Key: HADOOP-4523
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4523
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Vinod K V
>
> HADOOP-3581 monitors each Hadoop task to see if its memory usage (which includes usage of any tasks spawned by it and so on) is within a per-task limit. If the task's memory usage goes over its limit, the task is killed. This, by itself, is not enough to prevent badly behaving jobs from bringing down nodes. What is also needed is the ability to make sure that the sum total of VM usage of all Hadoop tasks does not exceed a certain limit.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4523) Enhance how memory-intensive user tasks are handled

Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12648509#action_12648509 ] 

Hemanth Yamijala commented on HADOOP-4523:
------------------------------------------

Vinod, while going over this patch with Devaraj for a quick check, we thought it would be nice to split the run method in TaskMemoryManagerThread into a couple of smaller methods, just to ease readability. The rest of the changes are fine. Can you please submit a new patch with this minor change?

> Enhance how memory-intensive user tasks are handled
> ---------------------------------------------------
>
>                 Key: HADOOP-4523
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4523
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Vinod K V
>         Attachments: HADOOP-4523-200811-05.txt, HADOOP-4523-200811-06.txt, HADOOP-4523-20081110.txt, HADOOP-4523-20081113.txt
>
>
> HADOOP-3581 monitors each Hadoop task to see if its memory usage (which includes usage of any tasks spawned by it and so on) is within a per-task limit. If the task's memory usage goes over its limit, the task is killed. This, by itself, is not enough to prevent badly behaving jobs from bringing down nodes. What is also needed is the ability to make sure that the sum total of VM usage of all Hadoop tasks does not exceed a certain limit.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4523) Enhance how memory-intensive user tasks are handled

Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12648188#action_12648188 ] 

Hemanth Yamijala commented on HADOOP-4523:
------------------------------------------

Code looks good to me. +1

> Enhance how memory-intensive user tasks are handled
> ---------------------------------------------------
>
>                 Key: HADOOP-4523
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4523
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Vinod K V
>         Attachments: HADOOP-4523-200811-05.txt, HADOOP-4523-200811-06.txt, HADOOP-4523-20081110.txt, HADOOP-4523-20081113.txt
>
>
> HADOOP-3581 monitors each Hadoop task to see if its memory usage (which includes usage of any tasks spawned by it and so on) is within a per-task limit. If the task's memory usage goes over its limit, the task is killed. This, by itself, is not enough to prevent badly behaving jobs from bringing down nodes. What is also needed is the ability to make sure that the sum total of VM usage of all Hadoop tasks does not exceed a certain limit.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4523) Enhance how memory-intensive user tasks are handled

Posted by "Vinod K V (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12648544#action_12648544 ] 

Vinod K V commented on HADOOP-4523:
-----------------------------------


`ant test-patch` results:

{code}
     [exec] +1 overall.  

     [exec]     +1 @author.  The patch does not contain any @author tags.

     [exec]     +1 tests included.  The patch appears to include 3 new or modified tests.

     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.

     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
{code}


> Enhance how memory-intensive user tasks are handled
> ---------------------------------------------------
>
>                 Key: HADOOP-4523
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4523
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Vinod K V
>         Attachments: HADOOP-4523-200811-05.txt, HADOOP-4523-200811-06.txt, HADOOP-4523-20081110.txt, HADOOP-4523-20081113.txt, HADOOP-4523-20081118.txt
>
>
> HADOOP-3581 monitors each Hadoop task to see if its memory usage (which includes usage of any tasks spawned by it and so on) is within a per-task limit. If the task's memory usage goes over its limit, the task is killed. This, by itself, is not enough to prevent badly behaving jobs from bringing down nodes. What is also needed is the ability to make sure that the sum total of VM usage of all Hadoop tasks does not exceed a certain limit.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4523) Enhance how memory-intensive user tasks are handled

Posted by "Vinod K V (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod K V updated HADOOP-4523:
------------------------------

    Attachment: HADOOP-4523-200811-06.txt

Messed up the approach. Here's another patch that gets it right. It
 - monitors and kills any tasks that cross their individual limits.
 - kills the task with the least progress (via tasktracker.findTaskToKill()) if, even after the first step, the total memory usage across all tasks goes over the total usage allowed.
 - includes the tests testTasksWithNoIndividualLimitsButTotalUsageWithinTTLimits, testTasksWithinIndividualLimitsAndTotalUsageWithinTTLimits, testTasksBeyondIndividualLimitsAndTotalUsageWithinTTLimits and testTasksWithinIndividualLimitsButTotalUsageBeyondTTLimits.

> Enhance how memory-intensive user tasks are handled
> ---------------------------------------------------
>
>                 Key: HADOOP-4523
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4523
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Vinod K V
>         Attachments: HADOOP-4523-200811-05.txt, HADOOP-4523-200811-06.txt
>
>
> HADOOP-3581 monitors each Hadoop task to see if its memory usage (which includes usage of any tasks spawned by it and so on) is within a per-task limit. If the task's memory usage goes over its limit, the task is killed. This, by itself, is not enough to prevent badly behaving jobs from bringing down nodes. What is also needed is the ability to make sure that the sum total of VM usage of all Hadoop tasks does not exceed a certain limit.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4523) Enhance how memory-intensive user tasks are handled

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12649183#action_12649183 ] 

Hudson commented on HADOOP-4523:
--------------------------------

Integrated in Hadoop-trunk #665 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/665/])
    . Prevent too many tasks scheduled on a node from bringing it down by monitoring for cumulative memory usage across tasks. Contributed by Vinod Kumar Vavilapalli


> Enhance how memory-intensive user tasks are handled
> ---------------------------------------------------
>
>                 Key: HADOOP-4523
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4523
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Vinod K V
>             Fix For: 0.20.0
>
>         Attachments: HADOOP-4523-200811-05.txt, HADOOP-4523-200811-06.txt, HADOOP-4523-20081110.txt, HADOOP-4523-20081113.txt, HADOOP-4523-20081118.txt
>
>
> HADOOP-3581 monitors each Hadoop task to see if its memory usage (which includes usage of any tasks spawned by it and so on) is within a per-task limit. If the task's memory usage goes over its limit, the task is killed. This, by itself, is not enough to prevent badly behaving jobs from bringing down nodes. What is also needed is the ability to make sure that the sum total of VM usage of all Hadoop tasks does not exceed a certain limit.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4523) Enhance how memory-intensive user tasks are handled

Posted by "Vinod K V (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod K V updated HADOOP-4523:
------------------------------

    Attachment: HADOOP-4523-20081110.txt

bq. The latest patch kills only the last task that started if the sum total of all tasks' memory usage goes beyond the configured limit. Picking up only one task may or may not bring down the usage to within the configured limits.
Attaching a new patch to address this. TaskMemoryManagerThread now calls {{TaskTracker.findTaskToKill()}} repeatedly to find enough least-progress tasks to bring the total memory usage of all tasks below the TT's limit, and then kills them. Modified the signature of {{TaskTracker.findTaskToKill()}} to {{TaskTracker.findTaskToKill(List<TaskAttemptID> tasksToExclude)}} so that tasks already marked for killing can be excluded.
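The repeated findTaskToKill() loop with an exclusion list could be sketched roughly as follows (TaskKillSelector and the Map-based model are illustrative, not the actual TaskTracker code):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of picking enough least-progress tasks, with an
// exclusion list, so that projected total memory usage drops below the limit.
public class TaskKillSelector {

    private final Map<String, Double> progressById; // progress per live task id
    private final Map<String, Long> memById;        // memory usage per live task id

    public TaskKillSelector(Map<String, Double> progressById, Map<String, Long> memById) {
        this.progressById = progressById;
        this.memById = memById;
    }

    /** Analogue of findTaskToKill(tasksToExclude): least-progress task not yet chosen. */
    String findTaskToKill(List<String> exclude) {
        String best = null;
        for (Map.Entry<String, Double> e : progressById.entrySet()) {
            if (exclude.contains(e.getKey())) continue;
            if (best == null || e.getValue() < progressById.get(best)) best = e.getKey();
        }
        return best;
    }

    /** Pick enough victims that projected total memory usage falls below ttLimit. */
    public List<String> tasksToKill(long ttLimit) {
        long total = memById.values().stream().mapToLong(Long::longValue).sum();
        List<String> victims = new ArrayList<>();
        while (total > ttLimit) {
            String victim = findTaskToKill(victims);
            if (victim == null) break;    // nothing left to kill
            victims.add(victim);
            total -= memById.get(victim); // project usage after this kill
        }
        return victims;
    }
}
```

Passing the accumulating victims list as the exclusion argument is what keeps the loop from selecting the same task twice.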

> Enhance how memory-intensive user tasks are handled
> ---------------------------------------------------
>
>                 Key: HADOOP-4523
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4523
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Vinod K V
>         Attachments: HADOOP-4523-200811-05.txt, HADOOP-4523-200811-06.txt, HADOOP-4523-20081110.txt
>
>
> HADOOP-3581 monitors each Hadoop task to see if its memory usage (which includes usage of any tasks spawned by it and so on) is within a per-task limit. If the task's memory usage goes over its limit, the task is killed. This, by itself, is not enough to prevent badly behaving jobs from bringing down nodes. What is also needed is the ability to make sure that the sum total of VM usage of all Hadoop tasks does not exceed a certain limit.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4523) Enhance how memory-intensive user tasks are handled

Posted by "Vivek Ratan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12642871#action_12642871 ] 

Vivek Ratan commented on HADOOP-4523:
-------------------------------------

HADOOP-3759 provides a configuration value, _mapred.tasktracker.tasks.maxmemory_, which specifies the total VM on a machine available to tasks spawned by the TT. Along with HADOOP-4439, it provides a cluster-wide default for the maximum VM allowed per task, _mapred.task.default.maxmemory_; this value can be overridden by individual jobs. HADOOP-3581 implements a monitoring mechanism that kills tasks if they go over their _maxmemory_ value. Keeping all this in mind, here's a proposal for what we additionally need to do: 

If _tasks.maxmemory_ is set, the TT monitors the total memory usage of all tasks it has spawned. If this value goes over _tasks.maxmemory_, the TT needs to kill one or more tasks. It first looks for tasks whose individual memory usage is over their _default.maxmemory_ value and kills them all (while you may ideally want to kill just enough that total memory usage comes back down, it's not obvious which of these violators to pick, so it's probably simpler to kill all of them). If no such task is found, or if killing them still leaves us over the memory limit, we need to pick other tasks to kill. There are many ways to do this; probably the easiest is to kill the tasks that started most recently. 
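The proposed two-step selection policy could be sketched like this (ProposedKillPolicy, RunningTask and selectVictims are illustrative names for the proposal, not existing Hadoop code):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Sketch of the proposal: first kill every task over its individual limit;
// if total usage is still over tasks.maxmemory, kill the most recently
// started tasks until it is not.
public class ProposedKillPolicy {

    public static class RunningTask {
        final String id;
        final long memUsed;
        final long individualLimit; // <= 0 means no per-task limit set
        final long startTimeMs;
        public RunningTask(String id, long memUsed, long individualLimit, long startTimeMs) {
            this.id = id; this.memUsed = memUsed;
            this.individualLimit = individualLimit; this.startTimeMs = startTimeMs;
        }
    }

    public static List<String> selectVictims(List<RunningTask> tasks, long tasksMaxMemory) {
        long total = 0;
        for (RunningTask t : tasks) total += t.memUsed;
        if (total <= tasksMaxMemory) return List.of(); // within limit, nothing to do

        List<String> victims = new ArrayList<>();
        List<RunningTask> survivors = new ArrayList<>();
        // Step 1: kill all individual violators (simpler than choosing among them).
        for (RunningTask t : tasks) {
            if (t.individualLimit > 0 && t.memUsed > t.individualLimit) {
                victims.add(t.id);
                total -= t.memUsed;
            } else {
                survivors.add(t);
            }
        }
        // Step 2: still over? kill the most recently started tasks first.
        survivors.sort(Comparator.comparingLong((RunningTask t) -> t.startTimeMs).reversed());
        for (RunningTask t : survivors) {
            if (total <= tasksMaxMemory) break;
            victims.add(t.id);
            total -= t.memUsed;
        }
        return victims;
    }
}
```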

Tasks that are killed because they went over their memory limit should be treated as failed, since they violated their contract. Tasks that are killed because the sum total of memory usage was over a limit should be treated as killed, since it's not really their fault. 

Another improvement is to let _mapred.tasktracker.tasks.maxmemory_ be set by an external script, which lets Ops control what this value should be. A slightly less desirable option, as indicated in some offline discussions with Alan W, is to set this value to be an absolute number ("hadoop may use X amount") or an offset of the total amount of memory on the machine ("hadoop may use all but 4g"). 
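The two configuration styles mentioned (an absolute cap versus an offset from total machine memory) differ only in how the effective limit is derived. A small sketch, where resolveLimit and its parameters are hypothetical and not an actual Hadoop configuration API:

```java
// Hypothetical resolution of the TT-wide memory limit from either an
// absolute cap ("hadoop may use X") or a reserved offset ("all but Xg").
public class MaxMemoryResolver {

    /**
     * @param machineTotal   total memory on the machine, in bytes
     * @param absoluteLimit  absolute cap in bytes, or -1 if unset
     * @param reservedOffset bytes reserved for non-Hadoop use, or -1 if unset
     */
    public static long resolveLimit(long machineTotal, long absoluteLimit, long reservedOffset) {
        if (absoluteLimit > 0) {
            // Absolute style: never allow more than the machine actually has.
            return Math.min(absoluteLimit, machineTotal);
        }
        if (reservedOffset >= 0) {
            // Offset style: "all but reservedOffset" of the machine's memory.
            return Math.max(0, machineTotal - reservedOffset);
        }
        return machineTotal; // nothing configured: everything is available
    }
}
```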

> Enhance how memory-intensive user tasks are handled
> ---------------------------------------------------
>
>                 Key: HADOOP-4523
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4523
>             Project: Hadoop Core
>          Issue Type: Improvement
>            Reporter: Vivek Ratan
>
> HADOOP-3581 monitors each Hadoop task to see if its memory usage (which includes usage of any tasks spawned by it and so on) is within a per-task limit. If the task's memory usage goes over its limit, the task is killed. This, by itself, is not enough to prevent badly behaving jobs from bringing down nodes. What is also needed is the ability to make sure that the sum total of VM usage of all Hadoop tasks does not exceed a certain limit.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4523) Enhance how memory-intensive user tasks are handled

Posted by "Hemanth Yamijala (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hemanth Yamijala updated HADOOP-4523:
-------------------------------------

          Component/s: mapred
    Affects Version/s: 0.19.0

> Enhance how memory-intensive user tasks are handled
> ---------------------------------------------------
>
>                 Key: HADOOP-4523
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4523
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Vinod K V
>
> HADOOP-3581 monitors each Hadoop task to see if its memory usage (which includes usage of any tasks spawned by it and so on) is within a per-task limit. If the task's memory usage goes over its limit, the task is killed. This, by itself, is not enough to prevent badly behaving jobs from bringing down nodes. What is also needed is the ability to make sure that the sum total of VM usage of all Hadoop tasks does not exceed a certain limit.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4523) Enhance how memory-intensive user tasks are handled

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12647536#action_12647536 ] 

Hadoop QA commented on HADOOP-4523:
-----------------------------------

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12393859/HADOOP-4523-20081113.txt
  against trunk revision 713893.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3590/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3590/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3590/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3590/console

This message is automatically generated.

> Enhance how memory-intensive user tasks are handled
> ---------------------------------------------------
>
>                 Key: HADOOP-4523
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4523
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Vinod K V
>         Attachments: HADOOP-4523-200811-05.txt, HADOOP-4523-200811-06.txt, HADOOP-4523-20081110.txt, HADOOP-4523-20081113.txt
>
>
> HADOOP-3581 monitors each Hadoop task to see if its memory usage (which includes usage of any tasks spawned by it and so on) is within a per-task limit. If the task's memory usage goes over its limit, the task is killed. This, by itself, is not enough to prevent badly behaving jobs from bringing down nodes. What is also needed is the ability to make sure that the sum total of VM usage of all Hadoop tasks does not exceed a certain limit.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Issue Comment Edited: (HADOOP-4523) Enhance how memory-intensive user tasks are handled

Posted by "Vivek Ratan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12642871#action_12642871 ] 

vivekr edited comment on HADOOP-4523 at 10/28/08 9:29 PM:
---------------------------------------------------------------

HADOOP-3759 provides a configuration value, _mapred.tasktracker.tasks.maxmemory_, which specifies the total VM on a machine available to tasks spawned by the TT. Along with HADOOP-4439, it provides a cluster-wide default for the maximum VM allowed per task, _mapred.task.default.maxmemory_; this value can be overridden by individual jobs. HADOOP-3581 implements a monitoring mechanism that kills tasks if they go over their _maxmemory_ value. Keeping all this in mind, here's a proposal for what we additionally need to do: 

If _tasks.maxmemory_ is set, the TT monitors the total memory usage of all tasks it has spawned. If this value goes over _tasks.maxmemory_, the TT needs to kill one or more tasks. It first looks for tasks whose individual memory usage is over their _default.maxmemory_ value and kills them all (while you may ideally want to kill just enough that total memory usage comes back down, it's not obvious which of these violators to pick, so it's probably simpler to kill all of them). If no such task is found, or if killing them still leaves us over the memory limit, we need to pick other tasks to kill. There are many ways to do this; probably the easiest is to kill the tasks that started most recently. 

Tasks that are killed because they went over their memory limit should be treated as failed, since they violated their contract. Tasks that are killed because the sum total of memory usage was over a limit should be treated as killed, since it's not really their fault. 

Another improvement is to let _mapred.tasktracker.tasks.maxmemory_ be set by an external script, which lets Ops control what this value should be. A slightly less desirable option, as indicated in some offline discussions with Allen W, is to set this value to be an absolute number ("hadoop may use X amount") or an offset of the total amount of memory on the machine ("hadoop may use all but  4g"). 

      was (Author: vivekr):
    HADOOP-3759 provides a configuration value, _mapred.tasktracker.tasks.maxmemory_, which specifies the total VM on a machine available to tasks spawned by the TT. Along with HADOOP-4439, it provides a cluster-wide default for the maximum VM associated per task, _mapred.task.default.maxmemory_. This value can be overridden by individual jobs. HADOOP-3581 implements a monitoring mechanism that kill tasks if they go over their _maxmemory_ value. Keeping all this in mind, here's a proposal for what we need to additionally do: 

If _tasks.maxmemory_ is set, the TT monitors the total memory usage of all tasks spawned by the TT. If this value goes over _tasks.maxmemory_, the TT needs to kill one or more tasks. It first looks for tasks whose individual memory is over their _default.maxmemory_ value. These are killed (while you may ideally want to kill just enough that your total memory usage comes down, it's not obvious which of these violators you choose to kill, so it's probably simpler to kill all). If no such task is found, or if killing one or more of these tasks still takes us over the memory limit, we need to pick other tasks to kill. There are many ways to do this. Probably the easiest is to kill tasks that ran most recently. 

Tasks that are killed because they went over their memory limit should be treated as failed, since they violated their contract. Tasks that are killed because the sum total of memory usage was over a limit should be treated as killed, since it's not really their fault. 

Another improvement is to let _mapred.tasktracker.tasks.maxmemory_ be set by an external script, which lets Ops control what this value should be. A slightly less desirable option, as indicated in some offline discussions with Alan W, is to set this value to be an absolute number ("hadoop may use X amount") or an offset of the total amount of memory on the machine ("hadoop may use all but  4g"). 
  
> Enhance how memory-intensive user tasks are handled
> ---------------------------------------------------
>
>                 Key: HADOOP-4523
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4523
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Vinod K V
>
> HADOOP-3581 monitors each Hadoop task to see if its memory usage (which includes usage of any tasks spawned by it and so on) is within a per-task limit. If the task's memory usage goes over its limit, the task is killed. This, by itself, is not enough to prevent badly behaving jobs from bringing down nodes. What is also needed is the ability to make sure that the sum total of VM usage of all Hadoop tasks does not exceed a certain limit.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4523) Enhance how memory-intensive user tasks are handled

Posted by "Vivek Ratan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12646203#action_12646203 ] 

Vivek Ratan commented on HADOOP-4523:
-------------------------------------

bq. Should we really be picking up enough tasks to kill ?

Yes. Kill one or more so you go below the limit, as mentioned in the summary in HADOOP-4035. 

> Enhance how memory-intensive user tasks are handled
> ---------------------------------------------------
>
>                 Key: HADOOP-4523
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4523
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Vinod K V
>         Attachments: HADOOP-4523-200811-05.txt, HADOOP-4523-200811-06.txt
>
>
> HADOOP-3581 monitors each Hadoop task to see if its memory usage (which includes usage of any tasks spawned by it and so on) is within a per-task limit. If the task's memory usage goes over its limit, the task is killed. This, by itself, is not enough to prevent badly behaving jobs from bringing down nodes. What is also needed is the ability to make sure that the sum total of VM usage of all Hadoop tasks does not exceed a certain limit.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4523) Enhance how memory-intensive user tasks are handled

Posted by "Vinod K V (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12648598#action_12648598 ] 

Vinod K V commented on HADOOP-4523:
-----------------------------------

Hudson's patch queue is very long, so I ran `ant test` on my machine; it completed successfully.

> Enhance how memory-intensive user tasks are handled
> ---------------------------------------------------
>
>                 Key: HADOOP-4523
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4523
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Vinod K V
>         Attachments: HADOOP-4523-200811-05.txt, HADOOP-4523-200811-06.txt, HADOOP-4523-20081110.txt, HADOOP-4523-20081113.txt, HADOOP-4523-20081118.txt
>
>
> HADOOP-3581 monitors each Hadoop task to see if its memory usage (which includes usage of any tasks spawned by it and so on) is within a per-task limit. If the task's memory usage goes over its limit, the task is killed. This, by itself, is not enough to prevent badly behaving jobs from bringing down nodes. What is also needed is the ability to make sure that the sum total of VM usage of all Hadoop tasks does not exceed a certain limit.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4523) Enhance how memory-intensive user tasks are handled

Posted by "Vinod K V (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod K V updated HADOOP-4523:
------------------------------

    Status: Patch Available  (was: Open)

> Enhance how memory-intensive user tasks are handled
> ---------------------------------------------------
>
>                 Key: HADOOP-4523
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4523
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Vinod K V
>         Attachments: HADOOP-4523-200811-05.txt, HADOOP-4523-200811-06.txt, HADOOP-4523-20081110.txt
>
>
> HADOOP-3581 monitors each Hadoop task to see if its memory usage (which includes usage of any tasks spawned by it and so on) is within a per-task limit. If the task's memory usage goes over its limit, the task is killed. This, by itself, is not enough to prevent badly behaving jobs from bringing down nodes. What is also needed is the ability to make sure that the sum total of VM usage of all Hadoop tasks does not exceed a certain limit.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4523) Enhance how memory-intensive user tasks are handled

Posted by "Vinod K V (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod K V updated HADOOP-4523:
------------------------------

    Status: Open  (was: Patch Available)

> Enhance how memory-intensive user tasks are handled
> ---------------------------------------------------
>
>                 Key: HADOOP-4523
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4523
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Vinod K V
>         Attachments: HADOOP-4523-200811-05.txt, HADOOP-4523-200811-06.txt, HADOOP-4523-20081110.txt
>
>
> HADOOP-3581 monitors each Hadoop task to see if its memory usage (which includes usage of any tasks spawned by it and so on) is within a per-task limit. If the task's memory usage goes over its limit, the task is killed. This, by itself, is not enough to prevent badly behaving jobs from bringing down nodes. What is also needed is the ability to make sure that the sum total of VM usage of all Hadoop tasks does not exceed a certain limit.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4523) Enhance how memory-intensive user tasks are handled

Posted by "Vinod K V (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod K V updated HADOOP-4523:
------------------------------

    Attachment: HADOOP-4523-200811-05.txt

Attaching a patch. This
 - makes TaskMemoryManagerThread observe the total memory usage across all tasks. If total usage crosses the overall limit, the TT first tries to kill any tasks that cross their individual limits. If it cannot find such tasks, it kills the task with the least progress, found via TaskTracker.findTaskToKill(), which is already used in the case of overflowing disk. This method first tries to find the reduce task with the least progress; otherwise it returns the map task with the least progress.
 - marks tasks killed for transgressing their individual limits as failed; tasks killed otherwise are marked as killed.
 - includes testTasksWithinTTLimits, testTaskBeyondIndividualLimitsAndTotalUsageBeyondTTLimits and testTaskBeyondIndividualLimitsButTotalUsageWithinTTLimits. Couldn't write a test for killing the task with the least progress; simulating that situation proved very difficult.
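The failed-versus-killed distinction in the second bullet is small but matters, since failed attempts count against the job while killed attempts are retried freely. A minimal illustrative sketch (KillOutcome and outcomeFor are hypothetical names, not patch code):

```java
// Hypothetical sketch: a task over its own limit broke its contract and is
// marked FAILED; a task killed only to bring total usage down is marked KILLED.
public class KillOutcome {

    public enum Outcome { FAILED, KILLED }

    public static Outcome outcomeFor(boolean exceededIndividualLimit) {
        return exceededIndividualLimit ? Outcome.FAILED : Outcome.KILLED;
    }
}
```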

> Enhance how memory-intensive user tasks are handled
> ---------------------------------------------------
>
>                 Key: HADOOP-4523
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4523
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Vinod K V
>         Attachments: HADOOP-4523-200811-05.txt
>
>
> HADOOP-3581 monitors each Hadoop task to see if its memory usage (which includes usage of any tasks spawned by it and so on) is within a per-task limit. If the task's memory usage goes over its limit, the task is killed. This, by itself, is not enough to prevent badly behaving jobs from bringing down nodes. What is also needed is the ability to make sure that the sum total of VM usage of all Hadoop tasks does not exceed a certain limit.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4523) Enhance how memory-intensive user tasks are handled

Posted by "Vinod K V (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod K V updated HADOOP-4523:
------------------------------

    Status: Patch Available  (was: Open)

> Enhance how memory-intensive user tasks are handled
> ---------------------------------------------------
>
>                 Key: HADOOP-4523
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4523
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Vinod K V
>         Attachments: HADOOP-4523-200811-05.txt, HADOOP-4523-200811-06.txt, HADOOP-4523-20081110.txt, HADOOP-4523-20081113.txt
>
>
> HADOOP-3581 monitors each Hadoop task to see if its memory usage (which includes usage of any tasks spawned by it and so on) is within a per-task limit. If the task's memory usage goes over its limit, the task is killed. This, by itself, is not enough to prevent badly behaving jobs from bringing down nodes. What is also needed is the ability to make sure that the sum total of VM usage of all Hadoop tasks does not exceed a certain limit.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.