Posted to mapreduce-issues@hadoop.apache.org by "Wei-Chiu Chuang (JIRA)" <ji...@apache.org> on 2019/08/01 17:19:00 UTC

[jira] [Assigned] (MAPREDUCE-7223) Quicksort GetInt performance Issue in Terasort

     [ https://issues.apache.org/jira/browse/MAPREDUCE-7223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei-Chiu Chuang reassigned MAPREDUCE-7223:
------------------------------------------

    Assignee: WuZeyi

> Quicksort GetInt performance Issue in Terasort
> ----------------------------------------------
>
>                 Key: MAPREDUCE-7223
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7223
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: task
>    Affects Versions: 2.7.6, 3.1.2
>            Reporter: WuZeyi
>            Assignee: WuZeyi
>            Priority: Major
>              Labels: ByteBuffer, Quick-sort, Terasort, performance, unsafe
>         Attachments: MAPREDUCE-7223-001.patch, MAPREDUCE-7223-002.patch, MAPREDUCE-7223-003.patch, makeint.png, terasort.png
>
>
> I found a hot spot in 'java.nio.Bits.getIntL' when running Terasort on Hadoop.
>  Each call assembles an int by reading four bytes from the byte array and shifting them into place.
>  'getIntL' is called repeatedly during the quick-sort of the KVbuffer, which has O(N log N) complexity, and this causes the hot spot.
>  The same element may be read many times during the sort, so its bytes have to be shifted again and again.
>  After replacing 'java.nio.Bits.getIntL' with 'unsafe.getInt', quick-sort performance improves by about 30% and Terasort by about 10%.
>   !terasort.png!
> !makeint.png!
> Quick-sort performance: the average quick-sort time is 16515s using unsafe and 21643s using byteBuffer.
>         unsafe(s)   byteBuffer(s)   byteBuffer/unsafe
>  AVG    16515       21643           1.310481735
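
For illustration only, the two ways of reading an int that the description compares can be sketched as below. This is not taken from the attached patches: the class name GetIntSketch and both method names are hypothetical, the shifting version only approximates what 'java.nio.Bits.getIntL' is described as doing, and the Unsafe version assumes a JDK that still exposes 'sun.misc.Unsafe' through reflection.

```java
import java.lang.reflect.Field;
import java.nio.ByteOrder;
import sun.misc.Unsafe;

// Hypothetical sketch; not the code from MAPREDUCE-7223-00x.patch.
final class GetIntSketch {
    private static final Unsafe UNSAFE = loadUnsafe();
    // Offset of element 0 inside a byte[] object.
    private static final long BYTE_ARRAY_BASE = UNSAFE.arrayBaseOffset(byte[].class);

    private static Unsafe loadUnsafe() {
        try {
            // Assumption: the JDK still exposes the "theUnsafe" singleton field.
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not available", e);
        }
    }

    // Little-endian read assembled byte by byte: four loads, three shifts, three ORs.
    static int getIntByShifting(byte[] buf, int off) {
        return (buf[off] & 0xff)
                | (buf[off + 1] & 0xff) << 8
                | (buf[off + 2] & 0xff) << 16
                | (buf[off + 3] & 0xff) << 24;
    }

    // Single 4-byte read in native order. There is no bounds check here, so the
    // caller must guarantee that off + 3 is inside the array.
    static int getIntUnsafe(byte[] buf, int off) {
        int v = UNSAFE.getInt(buf, BYTE_ARRAY_BASE + off);
        // Normalize to little-endian so both methods agree on any platform.
        return ByteOrder.nativeOrder() == ByteOrder.LITTLE_ENDIAN ? v : Integer.reverseBytes(v);
    }
}
```

The Unsafe read skips the per-byte work and the bounds checks, which is presumably where the reported saving comes from. The cost is that an out-of-range offset is no longer caught by the JVM, and unaligned offsets may behave differently across CPU architectures.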



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
