You are viewing a plain text version of this content. The canonical link for it is here.
Posted to mapreduce-issues@hadoop.apache.org by "Hal Mo (JIRA)" <ji...@apache.org> on 2012/07/04 13:29:35 UTC
[jira] [Commented] (MAPREDUCE-3235) Improve CPU cache behavior in map side sort

    [ https://issues.apache.org/jira/browse/MAPREDUCE-3235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13406452#comment-13406452 ] 

Hal Mo commented on MAPREDUCE-3235:
-----------------------------------

bq. I guess you're working with Krystal Mok? Cool stuff, I hope to see it make it into OpenJDK as well!
Yes. 
Krystal has just left, and will join Oracle JVM core team soon. 
He has done some work on binary array compare, yet not finished, seems there are much codes to "copy" than expected.

bq. You mean that if there are more rack-local (as opposed to data-local), right? If everything is data-local (eg terasort on an empty cluster) then I would expect the CPU difference to make a more noticeable difference.
Yes, I think so. 
We are trying to test it on hive.
                
> Improve CPU cache behavior in map side sort
> -------------------------------------------
>
>                 Key: MAPREDUCE-3235
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3235
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: performance, task
>    Affects Versions: 0.23.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>         Attachments: map_sort_perf.diff, mr-3235-poc.txt
>
>
> When running oprofile on a terasort workload, I noticed that a large amount of CPU usage was going to MapTask$MapOutputBuffer.compare. Upon disassembling this and looking at cycle counters, most of the cycles were going to memory loads dereferencing into the array of key-value data -- implying expensive cache misses. This can be avoided as follows:
> - rather than simply swapping indexes into the kv array, swap the entire meta entries in the meta array. Swapping 16 bytes is only negligibly slower than swapping 4 bytes. This requires adding the value-length into the meta array, since we used to rely on the previous-in-the-array meta entry to determine this. So we replace INDEX with VALUELEN and avoid one layer of indirection.
> - introduce an interface which allows key types to provide a 4-byte comparison proxy. For string keys, this can simply be the first 4 bytes of the string. The idea is that, if stringCompare(key1.proxy(), key2.proxy()) != 0, then compare(key1, key2) should have the same result. If the proxies are equal, the normal comparison method is used. We then include the 4-byte proxy as part of the metadata entry, so that for many cases the indirection into the data buffer can be avoided.
> On a terasort benchmark, these optimizations plus an optimization to WritableComparator.compareBytes dropped the aggregate mapside CPU millis by 40%, and the compare() routine mostly dropped off the oprofile results.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira