Posted to mapreduce-issues@hadoop.apache.org by "Sandy Ryza (JIRA)" <ji...@apache.org> on 2013/08/15 22:01:47 UTC
[jira] [Commented] (MAPREDUCE-5462) In map-side sort, swap entire meta entries instead of indexes for better cache performance
[ https://issues.apache.org/jira/browse/MAPREDUCE-5462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13741418#comment-13741418 ]
Sandy Ryza commented on MAPREDUCE-5462:
---------------------------------------
Submitting a patch based on half of Todd's patch from MAPREDUCE-3235.
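As a rough illustration of the idea in the issue title, and not the actual MapTask code, the sketch below contrasts the two approaches. The layout (NMETA ints per record, with a SORT_KEY field), the class name, and the use of insertion sort are all assumptions made to keep the example short. Variant A sorts a separate index array and dereferences it on every comparison; variant B swaps whole metadata entries so that records adjacent in sort order are also adjacent in memory.

```java
import java.util.Arrays;

// Illustrative sketch only; names and layout are hypothetical.
final class MetaSwapSketch {
    static final int NMETA = 4;     // assumed number of ints of metadata per record
    static final int SORT_KEY = 0;  // assumed field that the sort compares on

    // Reads the sort key of the record stored at entry position `record`.
    static int key(int[] meta, int record) {
        return meta[record * NMETA + SORT_KEY];
    }

    // Variant A: reorder an index array; meta stays put.
    // Each comparison follows index[j] to a possibly distant meta entry.
    static void sortByIndex(int[] meta, int[] index) {
        for (int i = 1; i < index.length; i++) {
            for (int j = i; j > 0 && key(meta, index[j - 1]) > key(meta, index[j]); j--) {
                int tmp = index[j];
                index[j] = index[j - 1];
                index[j - 1] = tmp;
            }
        }
    }

    // Variant B: reorder the meta entries themselves.
    // Swaps move NMETA ints instead of one, but comparisons read neighbouring memory.
    static void sortByEntry(int[] meta) {
        int n = meta.length / NMETA;
        for (int i = 1; i < n; i++) {
            for (int j = i; j > 0 && key(meta, j - 1) > key(meta, j); j--) {
                swapEntries(meta, j - 1, j);
            }
        }
    }

    // Exchanges all NMETA ints of entries a and b.
    static void swapEntries(int[] meta, int a, int b) {
        int offA = a * NMETA;
        int offB = b * NMETA;
        for (int k = 0; k < NMETA; k++) {
            int tmp = meta[offA + k];
            meta[offA + k] = meta[offB + k];
            meta[offB + k] = tmp;
        }
    }

    public static void main(String[] args) {
        // Three records; the first int of each is the sort key, the rest is payload.
        int[] meta = {30, 300, 0, 0, 10, 100, 0, 0, 20, 200, 0, 0};
        int[] index = {0, 1, 2};
        sortByIndex(meta, index);
        System.out.println("index order: " + Arrays.toString(index));
        sortByEntry(meta);
        System.out.println("meta after entry swap: " + Arrays.toString(meta));
    }
}
```

The trade-off in this sketch is that variant B moves more data per swap in exchange for comparisons that touch contiguous memory.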
I benchmarked the change with the LocalJobRunner, using a WordCount job with a single map task on 64 MB of data. I did five runs with and without the patch. In all runs, the rest of the job after the map task finished took less than a second. I measured cache misses using the perf command.
Average cache misses without the change: 165,083,881 (stddev 986,099)
Average job run time without the change: 14.46 seconds (stddev 1.24)
Average cache misses with the change: 83,130,729 (stddev 342,826)
Average job run time with the change: 12.018 seconds (stddev 1.95)
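For reference, the relative change implied by the averages above can be worked out with a few lines of arithmetic. This is only a convenience sketch; the class and method names are made up for illustration, and it uses the reported means without accounting for the standard deviations.

```java
// Illustrative arithmetic on the reported averages; names are hypothetical.
final class BenchmarkDelta {
    // Fraction by which `after` is lower than `before`.
    static double relativeReduction(double before, double after) {
        return (before - after) / before;
    }

    public static void main(String[] args) {
        // Averages copied from the benchmark figures above.
        double missesBefore = 165_083_881.0;
        double missesAfter = 83_130_729.0;
        double secondsBefore = 14.46;
        double secondsAfter = 12.018;

        System.out.printf("cache misses: %.1f%% fewer%n",
                100 * relativeReduction(missesBefore, missesAfter));
        System.out.printf("job run time: %.1f%% shorter%n",
                100 * relativeReduction(secondsBefore, secondsAfter));
    }
}
```

With the figures above this comes to roughly 49.6 percent fewer cache misses and roughly 16.9 percent shorter average run time.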
> In map-side sort, swap entire meta entries instead of indexes for better cache performance
> -------------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-5462
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5462
> Project: Hadoop Map/Reduce
> Issue Type: Sub-task
> Components: performance, task
> Affects Versions: 2.1.0-beta
> Reporter: Sandy Ryza
> Assignee: Sandy Ryza
>