Posted to mapreduce-dev@hadoop.apache.org by "WuZeyi (JIRA)" <ji...@apache.org> on 2019/07/12 03:49:00 UTC

[jira] [Created] (MAPREDUCE-7223) Quicksort GetInt performance Issue in Terasort

WuZeyi created MAPREDUCE-7223:
---------------------------------

             Summary: Quicksort GetInt performance Issue in Terasort
                 Key: MAPREDUCE-7223
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7223
             Project: Hadoop Map/Reduce
          Issue Type: Improvement
          Components: task
    Affects Versions: 3.1.2, 2.7.6
            Reporter: WuZeyi
             Fix For: 2.7.6


I found a hot spot in 'java.nio.Bits.getIntL' while profiling the Terasort case of Hadoop.
Each call builds an int by reading four bytes from the byte array and shifting them into place.
'getIntL' is called repeatedly during the quick-sort of KVbuffer, which has O(N log N) complexity, and this causes the hot spot.
The same element may be read many times during the quick-sort, so its bytes have to be shifted and combined again on every read.
After replacing 'java.nio.Bits.getIntL' with 'unsafe.getInt', quick-sort performance improves by 30% and Terasort performance improves by 10%.
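
To illustrate the difference between the two read paths, here is a minimal sketch, not the actual patch. The class name GetIntSketch and both method names are hypothetical. It assumes 'sun.misc.Unsafe' can be obtained reflectively through its 'theUnsafe' field, which is an unsupported internal API that newer JDKs deprecate, and that the platform tolerates unaligned int reads (as x86 does). The first method mirrors the byte-by-byte little-endian assembly described above; the second does a single native-order read.

```java
import java.lang.reflect.Field;
import java.nio.ByteOrder;
import sun.misc.Unsafe;

// Hypothetical illustration class; not part of Hadoop or the proposed patch.
final class GetIntSketch {
    private static final Unsafe UNSAFE = loadUnsafe();
    // Offset of element 0 inside a byte[] object.
    private static final long BYTE_ARRAY_BASE = UNSAFE.arrayBaseOffset(byte[].class);
    private static final boolean LITTLE_ENDIAN =
            ByteOrder.nativeOrder() == ByteOrder.LITTLE_ENDIAN;

    private static Unsafe loadUnsafe() {
        try {
            // Assumption: the JDK still exposes the 'theUnsafe' singleton field.
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not available", e);
        }
    }

    /** Little-endian read assembled from four bytes with shifts and ORs. */
    public static int getIntByShifting(byte[] buf, int off) {
        return ((buf[off + 3] & 0xff) << 24)
             | ((buf[off + 2] & 0xff) << 16)
             | ((buf[off + 1] & 0xff) << 8)
             |  (buf[off] & 0xff);
    }

    /** Single 4-byte read; Unsafe does no bounds checking, so check here. */
    public static int getIntUnsafe(byte[] buf, int off) {
        if (off < 0 || off > buf.length - 4) {
            throw new IndexOutOfBoundsException("offset " + off);
        }
        int v = UNSAFE.getInt(buf, BYTE_ARRAY_BASE + off);
        // Normalize to little-endian so both methods agree on any platform.
        return LITTLE_ENDIAN ? v : Integer.reverseBytes(v);
    }

    public static void main(String[] args) {
        byte[] buf = {0x78, 0x56, 0x34, 0x12};
        System.out.println(Integer.toHexString(getIntByShifting(buf, 0)));
        System.out.println(Integer.toHexString(getIntUnsafe(buf, 0)));
    }
}
```

Both methods return the same value for the same offset; any speed difference would have to be measured with a proper benchmark on the target JDK, since the JIT may already optimize the shifting form on some versions.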

!terasort.png!!makeint.png!



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

---------------------------------------------------------------------
To unsubscribe, e-mail: mapreduce-dev-unsubscribe@hadoop.apache.org
For additional commands, e-mail: mapreduce-dev-help@hadoop.apache.org