Posted to issues@hive.apache.org by "Gopal V (JIRA)" <ji...@apache.org> on 2018/03/24 07:33:00 UTC

[jira] [Commented] (HIVE-18866) Semijoin: Implement a Long -> Hash64 vector fast-path

    [ https://issues.apache.org/jira/browse/HIVE-18866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16412467#comment-16412467 ] 

Gopal V commented on HIVE-18866:
--------------------------------

This code path also runs hot during the HLL (HyperLogLog) computation for column stats on Double and Long columns.

{code}
 15.83 │       movzbq 0x18(%r10,%rsi,1),%rax
  0.03 │       cmp    %ebx,%ecx
       │     . jae    32d
  0.03 │       vmovd  %ebx,%xmm0
  0.00 │       vmovd  %edx,%xmm1
  0.02 │       movslq %r13d,%r14
  2.81 │       movzbq 0x19(%r10,%r14,1),%r8
  2.79 │       movzbq 0x1f(%r10,%r14,1),%rcx
  0.45 │       movzbq 0x1a(%r10,%r14,1),%rbx
  0.60 │       movzbq 0x1b(%r10,%r14,1),%rdx
  0.41 │       movzbq 0x1c(%r10,%r14,1),%rsi
  1.63 │       movzbq 0x1d(%r10,%r14,1),%r13
  0.78 │       movzbq 0x1e(%r10,%r14,1),%r14
{code}
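
The run of single-byte movzbq loads in the profile is consistent with a 64-bit value being reassembled one byte at a time from a byte[]. The sketch below illustrates the kind of fast path the issue title asks for; it is not the committed patch. It assumes the long is serialized big-endian (the ByteBuffer default) and hashed with a Murmur3-style 64-bit mix that reads its block little-endian. The class name Hash64LongSketch, the method names, and the seed are hypothetical, and the constants are the public MurmurHash3 x64 constants rather than values confirmed against Hive's Murmur3 class.

```java
import java.nio.ByteBuffer;

final class Hash64LongSketch {
    // Public MurmurHash3 x64 constants (assumed, not checked against Hive).
    private static final long C1 = 0x87c37b91114253d5L;
    private static final long C2 = 0x4cf5ad432745937fL;
    private static final int R1 = 31;
    private static final int R2 = 27;
    private static final int M = 5;
    private static final int N1 = 0x52dce729;

    private Hash64LongSketch() {}

    // Standard 64-bit finalizer from MurmurHash3.
    static long fmix64(long h) {
        h ^= h >>> 33;
        h *= 0xff51afd7ed558ccdL;
        h ^= h >>> 33;
        h *= 0xc4ceb9fe1a85ec53L;
        h ^= h >>> 33;
        return h;
    }

    // Mixes exactly one 8-byte block into the seed and finalizes.
    private static long mixOneBlock(long k, int seed) {
        long hash = seed;
        k *= C1;
        k = Long.rotateLeft(k, R1);
        k *= C2;
        hash ^= k;
        hash = Long.rotateLeft(hash, R2) * M + N1;
        hash ^= Long.BYTES; // input length in bytes
        return fmix64(hash);
    }

    // Slow path: eight separate byte loads to rebuild the block (little-endian read).
    static long hash64Bytes(byte[] data, int seed) {
        if (data.length != Long.BYTES) {
            throw new IllegalArgumentException("sketch handles exactly 8 bytes");
        }
        long k = 0;
        for (int i = 0; i < Long.BYTES; i++) {
            k |= ((long) data[i] & 0xff) << (8 * i);
        }
        return mixOneBlock(k, seed);
    }

    // Fast path: no byte[] at all. A big-endian write followed by a
    // little-endian read is just a byte swap of the original value.
    static long hash64Long(long value, int seed) {
        return mixOneBlock(Long.reverseBytes(value), seed);
    }

    public static void main(String[] args) {
        int seed = 104729; // arbitrary seed for the demo
        long[] samples = {0L, 1L, -1L, 42L, Long.MIN_VALUE, Long.MAX_VALUE};
        for (long v : samples) {
            byte[] bytes = ByteBuffer.allocate(Long.BYTES).putLong(v).array();
            boolean same = hash64Bytes(bytes, seed) == hash64Long(v, seed);
            System.out.println(v + " -> paths agree: " + same);
        }
    }
}
```

Both methods feed the same block into the same mix, so the fast path stays bit-compatible with the byte[] path under the stated serialization assumption; only the intermediate array and its per-byte loads go away.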

> Semijoin: Implement a Long -> Hash64 vector fast-path
> -----------------------------------------------------
>
>                 Key: HIVE-18866
>                 URL: https://issues.apache.org/jira/browse/HIVE-18866
>             Project: Hive
>          Issue Type: Improvement
>          Components: Vectorization
>            Reporter: Gopal V
>            Priority: Major
>              Labels: performance
>         Attachments: perf-hash64-long.png
>
>
> A significant amount of CPU is wasted with JMM restrictions on byte[] arrays.
> To transform from one Long -> another Long, this goes into a byte[] array, which shows up as a hotspot.
> !perf-hash64-long.png!


