Posted to issues@hive.apache.org by "Prasanth Jayachandran (JIRA)" <ji...@apache.org> on 2018/05/17 07:50:00 UTC

[jira] [Comment Edited] (HIVE-19578) HLL merges tempList on every add

    [ https://issues.apache.org/jira/browse/HIVE-19578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16478651#comment-16478651 ] 

Prasanth Jayachandran edited comment on HIVE-19578 at 5/17/18 7:49 AM:
-----------------------------------------------------------------------

Ran a quick benchmark with the following scenarios:
{code:java}
testHLLAddHive - Current Hive implementation that uses TreeMap for HLL Sparse Register
testHLLAddInt2ByteSortedMap - TreeMap replaced with Int2ByteSortedMap for HLL Sparse Register
testHLLTreeMapPlusOpt - TreeMap + branch optimizations in HLL add() inner loop
testHLLAddInt2ByteSortedMapPlusOpt - Int2ByteSortedMap + branch optimizations in HLL add() inner loop{code}
{code:java}
Benchmark                                          Mode  Cnt   Score   Error  Units
HyperLogLogAdd.testHLLAddInt2ByteSortedMapPlusOpt  avgt   10  12.773 ± 0.382  ns/op
HyperLogLogAdd.testHLLAddInt2ByteSortedMap         avgt   10  25.675 ± 0.439  ns/op
HyperLogLogAdd.testHLLTreeMapPlusOpt               avgt   10  21.978 ± 0.562  ns/op
HyperLogLogAdd.testHLLAddHive                      avgt   10  37.559 ± 0.488  ns/op{code}
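For context, here is a minimal sketch of the kind of sparse-register update these scenarios exercise. It assumes the register maps an index to the largest value seen for that index. The class and method names are hypothetical and are not taken from the Hive code base; only the use of a TreeMap mirrors the testHLLAddHive scenario.

```java
import java.util.TreeMap;

// Hypothetical illustration only; not Hive's actual implementation.
public class SparseRegisterSketch {
    // Boxed sorted map: each lookup and store goes through Integer/Byte wrappers.
    private final TreeMap<Integer, Byte> registers = new TreeMap<>();

    /** Keeps the larger of the stored and new value; returns true if the register changed. */
    public boolean update(int index, byte value) {
        Byte current = registers.get(index);
        if (current != null && current.byteValue() >= value) {
            return false; // stored value already covers the new one, so skip the write
        }
        registers.put(index, value);
        return true;
    }

    /** Returns the stored value, or 0 if the register was never set. */
    public byte get(int index) {
        Byte v = registers.get(index);
        return v == null ? (byte) 0 : v.byteValue();
    }

    public int size() {
        return registers.size();
    }

    public static void main(String[] args) {
        SparseRegisterSketch r = new SparseRegisterSketch();
        r.update(7, (byte) 3);
        r.update(7, (byte) 2); // ignored: smaller than the stored value
        r.update(2, (byte) 5);
        System.out.println(r.size() + " registers, r[7]=" + r.get(7));
    }
}
```

A primitive-keyed sorted map such as fastutil's Int2ByteSortedMap stores int keys and byte values directly, so it avoids the wrapper objects used above. That is presumably part of the gap between testHLLAddHive and testHLLAddInt2ByteSortedMap, though the benchmark itself does not break the cost down.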


was (Author: prasanth_j):
Ran a quick benchmark with the following scenarios:
{code:java}
testHLLAddHive - Current Hive implementation that uses TreeMap for HLL Sparse Register
testHLLAddInt2ByteSortedMap - TreeMap replaced with Int2ByteSortedMap for HLL Sparse Register
testHLLAddInt2ByteSortedMapPlusOpt - Int2ByteSortedMap + branch optimizations in HLL add() inner loop{code}
{code:java}
Benchmark                                          Mode  Cnt   Score   Error  Units
HyperLogLogAdd.testHLLAddInt2ByteSortedMapPlusOpt  avgt   10  12.773 ± 0.382  ns/op
HyperLogLogAdd.testHLLAddInt2ByteSortedMap         avgt   10  25.675 ± 0.439  ns/op
HyperLogLogAdd.testHLLAddHive                      avgt   10  37.559 ± 0.488  ns/op{code}

> HLL merges tempList on every add
> --------------------------------
>
>                 Key: HIVE-19578
>                 URL: https://issues.apache.org/jira/browse/HIVE-19578
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Assignee: Prasanth Jayachandran
>            Priority: Major
>         Attachments: Screen Shot 2018-05-16 at 15.29.12 .png
>
>
>  See comments on HIVE-18866; this has significant perf overhead after the even bigger overhead from hashing is removed.  !Screen Shot 2018-05-16 at 15.29.12 .png!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)