Posted to mapreduce-issues@hadoop.apache.org by "Hadoop QA (JIRA)" <ji...@apache.org> on 2012/10/26 17:43:12 UTC

[jira] [Commented] (MAPREDUCE-4752) Reduce MR AM memory usage through String Interning

    [ https://issues.apache.org/jira/browse/MAPREDUCE-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13485009#comment-13485009 ] 

Hadoop QA commented on MAPREDUCE-4752:
--------------------------------------

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12550973/MR-4752-trunk.txt
  against trunk revision .

    {color:green}+1 @author{color}.  The patch does not contain any @author tags.

    {color:red}-1 tests included{color}.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    {color:green}+1 javac{color}.  The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 javadoc{color}.  The javadoc tool did not generate any warning messages.

    {color:green}+1 eclipse:eclipse{color}.  The patch built with eclipse:eclipse.

    {color:green}+1 findbugs{color}.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    {color:green}+1 release audit{color}.  The applied patch does not increase the total number of release audit warnings.

    {color:red}-1 core tests{color}.  The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core:

                  org.apache.hadoop.mapreduce.v2.app.job.impl.TestTaskAttempt

    {color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2969//testReport/
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2969//console

This message is automatically generated.
                
> Reduce MR AM memory usage through String Interning
> --------------------------------------------------
>
>                 Key: MAPREDUCE-4752
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4752
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: mrv2
>            Reporter: Robert Joseph Evans
>            Assignee: Robert Joseph Evans
>         Attachments: MR-4752-branch-0.23.txt, MR-4752-trunk.txt
>
>
> There are a lot of strings that are duplicates of one another in the AM.  These come from all of the PB events that come across the wire and also from tasks heart-beating in through the umbilical.  There are even several duplicates from Configuration.  By "interning" all of these strings on the heap, I have been able to reduce the resting memory usage of the AM to about 5KB per task attempt, with about half of this coming from counters.  This results in a 5MB heap for a typical 1000 task job, or a 500MB heap for a 100,000 task attempt job.  I think I could cut the size of the counters in half by completely rewriting how counters work in the AM and History Server, but I don't think it is worth it at this point.
> I am still investigating what the memory usage of the AM is like when running very large jobs, and I will probably have a follow-up JIRA for reducing that memory usage as well.
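
For readers unfamiliar with the technique, the following is a minimal sketch of heap-based string interning and of the heap arithmetic quoted in the description. It is not taken from the attached patch: the class name InternSketch, the pool, and both helper methods are hypothetical, and the estimate assumes decimal units (1KB = 1000 bytes) to match the round figures above.

```java
import java.util.concurrent.ConcurrentHashMap;

public class InternSketch {
    // Hypothetical pool: maps each distinct string value to one canonical instance
    // that lives on the ordinary Java heap.
    private static final ConcurrentHashMap<String, String> POOL = new ConcurrentHashMap<>();

    // Returns the canonical instance for s, registering s itself if it is the first
    // string with that value. Equal strings therefore share a single object.
    static String intern(String s) {
        if (s == null) {
            return null;
        }
        String previous = POOL.putIfAbsent(s, s);
        return previous == null ? s : previous;
    }

    // Resting heap estimate: task attempts multiplied by bytes retained per attempt.
    static long estimateHeapBytes(long taskAttempts, long bytesPerAttempt) {
        return taskAttempts * bytesPerAttempt;
    }

    public static void main(String[] args) {
        // Two equal strings that arrive as separate objects, as they would when
        // deserialized from separate events. The value itself is made up.
        String a = new String("example-host-01");
        String b = new String("example-host-01");

        System.out.println(a == b);                 // false: two distinct objects
        System.out.println(intern(a) == intern(b)); // true: one shared instance

        // About 5KB per task attempt, assumed here to be 5000 bytes.
        System.out.println(estimateHeapBytes(1_000, 5_000));   // 5000000 (about 5MB)
        System.out.println(estimateHeapBytes(100_000, 5_000)); // 500000000 (about 500MB)
    }
}
```

This pool holds strong references, so entries are never released. A long-running process would likely want a weak-reference interner instead, so that strings no longer referenced elsewhere can be garbage collected.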

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira