Posted to mapreduce-issues@hadoop.apache.org by "Robert Joseph Evans (JIRA)" <ji...@apache.org> on 2012/10/31 16:13:12 UTC

[jira] [Created] (MAPREDUCE-4760) Make a version of Counters that is composit for the job and stores the counter values in arrays.

Robert Joseph Evans created MAPREDUCE-4760:
----------------------------------------------

             Summary: Make a version of Counters that is composit for the job and stores the counter values in arrays.
                 Key: MAPREDUCE-4760
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4760
             Project: Hadoop Map/Reduce
          Issue Type: Improvement
          Components: mrv2
    Affects Versions: 0.23.4, 2.0.2-alpha
            Reporter: Robert Joseph Evans
            Priority: Minor


String interning reduced the size of counters a lot.  After that change and the fix for a memory leak in the IPC server, a job with 20,000 map tasks and 3,000 reducers takes about 200MB to store the state of all of its tasks.  A memory dump of the AM shows that each task attempt holds a pointer to a Counters object of about 2KB to 3KB.  That means Counters account for about 56MB of the 200MB of state.  This job had only about 40 task counters.  Each counter stores a long value, so if we stored them in a long[] instead, the counters should take up only about 7MB (roughly 23,000 tasks * 40 counters * 8 bytes).
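
The arithmetic behind the 56MB and 7MB figures can be sketched as follows. This is only a back-of-the-envelope check, not code from Hadoop: the class and method names are hypothetical, it assumes one attempt per task (23,000 in total), it takes 2,500 bytes as a midpoint of the 2KB to 3KB range, and it ignores array headers and object references.

```java
/** Hypothetical helper for checking the memory estimates in this issue. */
final class CounterMemoryEstimate {
    static final int BYTES_PER_LONG = 8;

    /** Bytes used if every task attempt keeps its own full Counters object. */
    static long objectBytes(long taskAttempts, long bytesPerCountersObject) {
        return taskAttempts * bytesPerCountersObject;
    }

    /** Bytes used if every task attempt keeps only a long[] of counter values. */
    static long arrayBytes(long taskAttempts, int countersPerTask) {
        return taskAttempts * countersPerTask * BYTES_PER_LONG;
    }

    public static void main(String[] args) {
        long attempts = 20_000 + 3_000; // assumption: one attempt per task
        // 23,000 * 2,500 bytes = 57,500,000 bytes, in line with the ~56MB above
        System.out.println(objectBytes(attempts, 2_500));
        // 23,000 * 40 * 8 bytes = 7,360,000 bytes, in line with the ~7MB above
        System.out.println(arrayBytes(attempts, 40));
    }
}
```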

Also, assuming that some of the counters appear only in map tasks or only in reduce tasks, we should be able to have one CompositCounters for map tasks and one for reduce tasks, which would reduce the size even further.
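
One way the idea could look is sketched below. It is a minimal illustration, not the proposed patch or any existing Hadoop API: CounterSchema and ArrayCounters are hypothetical names, and the counter names in the usage note are only examples. The assumption is that one schema object is shared by all tasks of a type (one for maps, one for reduces), so each task attempt holds just a reference to it and a long[].

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical: shared per task type, maps each counter name to an array slot. */
final class CounterSchema {
    private final Map<String, Integer> slots = new LinkedHashMap<>();

    /** Returns the slot for name, assigning the next free slot on first use. */
    synchronized int slotFor(String name) {
        Integer slot = slots.get(name);
        if (slot == null) {
            slot = slots.size();
            slots.put(name, slot);
        }
        return slot;
    }

    /** Returns the slot for name, or -1 if no task has used this counter yet. */
    synchronized int lookup(String name) {
        Integer slot = slots.get(name);
        return slot == null ? -1 : slot;
    }

    synchronized int size() {
        return slots.size();
    }
}

/** Hypothetical: per task attempt, holds only the shared schema and a long[]. */
final class ArrayCounters {
    private final CounterSchema schema;
    private long[] values = new long[0];

    ArrayCounters(CounterSchema schema) {
        this.schema = schema;
    }

    void increment(String name, long delta) {
        int slot = schema.slotFor(name);
        if (slot >= values.length) {
            // Grow to the current schema size; new slots start at zero.
            values = Arrays.copyOf(values, schema.size());
        }
        values[slot] += delta;
    }

    long get(String name) {
        int slot = schema.lookup(name);
        return (slot >= 0 && slot < values.length) ? values[slot] : 0L;
    }
}
```

For example, two map task attempts created with the same CounterSchema and incremented under names such as "MAP_INPUT_RECORDS" or "SPILLED_RECORDS" would store the name strings once, in the schema, while each attempt stores only its own values. Keeping separate schemas for map and reduce tasks means neither array needs slots for counters that only the other task type reports.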

NOTE: without this change I would expect to be able to run a 100,000 task job in the default 1024MB AM heap (875MB/200MB * 23000), having reserved 150MB for IPC buffers and event data.  With this change we could expect to run about 130,000 tasks (875MB/150MB * 23000).
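
The capacity estimate is a linear scaling of the measured job, which can be sketched as below. The class and method names are hypothetical, and the sketch assumes that task state grows linearly with task count and that the measured job had 23,000 tasks (20,000 maps plus 3,000 reducers).

```java
/** Hypothetical helper: scales a measured job linearly to the usable AM heap. */
final class AmCapacityEstimate {
    /** Tasks that fit = usable heap / state of the measured job * tasks in that job. */
    static long tasksThatFit(long usableHeapMb, long measuredStateMb, long measuredTasks) {
        return usableHeapMb * measuredTasks / measuredStateMb;
    }

    public static void main(String[] args) {
        long measuredTasks = 20_000 + 3_000;
        // Without the change: 875MB / 200MB * 23,000 = 100,625 tasks
        System.out.println(tasksThatFit(875, 200, measuredTasks));
        // With the change (about 150MB of state): 875MB / 150MB * 23,000 = 134,166 tasks
        System.out.println(tasksThatFit(875, 150, measuredTasks));
    }
}
```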

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (MAPREDUCE-4760) Make a version of Counters that is composite for the job and stores the counter values in arrays

Posted by "Aaron T. Myers (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-4760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aaron T. Myers updated MAPREDUCE-4760:
--------------------------------------

    Summary: Make a version of Counters that is composite for the job and stores the counter values in arrays  (was: Make a version of Counters that is composit for the job and stores the counter values in arrays.)
    
> Make a version of Counters that is composite for the job and stores the counter values in arrays
> ------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-4760
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4760
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: mrv2
>    Affects Versions: 2.0.2-alpha, 0.23.4
>            Reporter: Robert Joseph Evans
>            Priority: Minor
>
> String interning reduced the size of counters a lot.  After that change and the fix for a memory leak in the IPC server, a job with 20,000 map tasks and 3,000 reducers takes about 200MB to store the state of all of its tasks.  A memory dump of the AM shows that each task attempt holds a pointer to a Counters object of about 2KB to 3KB.  That means Counters account for about 56MB of the 200MB of state.  This job had only about 40 task counters.  Each counter stores a long value, so if we stored them in a long[] instead, the counters should take up only about 7MB (roughly 23,000 tasks * 40 counters * 8 bytes).
> Also, assuming that some of the counters appear only in map tasks or only in reduce tasks, we should be able to have one CompositCounters for map tasks and one for reduce tasks, which would reduce the size even further.
> NOTE: without this change I would expect to be able to run a 100,000 task job in the default 1024MB AM heap (875MB/200MB * 23000), having reserved 150MB for IPC buffers and event data.  With this change we could expect to run about 130,000 tasks (875MB/150MB * 23000).
