Posted to issues@hive.apache.org by "Gopal V (JIRA)" <ji...@apache.org> on 2018/03/24 07:34:00 UTC

[jira] [Comment Edited] (HIVE-18866) Semijoin: Implement a Long -> Hash64 vector fast-path

    [ https://issues.apache.org/jira/browse/HIVE-18866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16412467#comment-16412467 ] 

Gopal V edited comment on HIVE-18866 at 3/24/18 7:33 AM:
---------------------------------------------------------

This code path also runs hot during the HLL (HyperLogLog) computation for column stats on Double and Long columns.

{code}
 15.83 │       movzbq 0x18(%r10,%rsi,1),%rax                                                                                                                                                                      
  0.03 │       cmp    %ebx,%ecx                                                                                                                                                                                   
       │     . jae    32d                                                                                                                                                                                         
  0.03 │       vmovd  %ebx,%xmm0                                                                                                                                                                                  
  0.00 │       vmovd  %edx,%xmm1                                                                                                                                                                                  
  0.02 │       movslq %r13d,%r14                                                                                                                                                                                  
  2.81 │       movzbq 0x19(%r10,%r14,1),%r8                                                                                                                                                                       
  2.79 │       movzbq 0x1f(%r10,%r14,1),%rcx                                                                                                                                                                      
  0.45 │       movzbq 0x1a(%r10,%r14,1),%rbx                                                                                                                                                                      
  0.60 │       movzbq 0x1b(%r10,%r14,1),%rdx                                                                                                                                                                      
  0.41 │       movzbq 0x1c(%r10,%r14,1),%rsi                                                                                                                                                                      
  1.63 │       movzbq 0x1d(%r10,%r14,1),%r13                                                                                                                                                                      
  0.78 │       movzbq 0x1e(%r10,%r14,1),%r14  
{code}

{code}
  0.00 │       vmovq  %xmm2,%r10
  0.03 │       mov    %r10d,%r10d
       │       movslq %r11d,%rsi
       │       mov    %dil,0x19(%r8,%rsi,1)
  0.00 │       vmovq  %xmm2,%r11
  0.05 │       sar    $0x8,%r11
       │       vmovq  %xmm2,%r9
       │       sar    $0x10,%r9
  0.00 │       mov    %r11d,%r11d
  0.03 │       mov    %r9d,%r9d
       │       vmovq  %xmm2,%rdi
       │       sar    $0x18,%rdi
  0.00 │       vmovq  %xmm2,%rdx
  0.05 │       sar    $0x20,%rdx
       │       mov    %edi,%eax
       │       mov    %edx,%r13d
  0.00 │       vmovq  %xmm2,%rdi
  0.04 │       sar    $0x28,%rdi
       │       mov    %edi,%edi
  0.00 │       mov    %dil,0x1a(%r8,%rsi,1)
  0.00 │       mov    %r13b,0x1b(%r8,%rsi,1)
  0.05 │       mov    %al,0x1c(%r8,%rsi,1)
       │       mov    %r9b,0x1d(%r8,%rsi,1)
       │       mov    %r11b,0x1e(%r8,%rsi,1)
  0.00 │       mov    %r10b,0x1f(%r8,%rsi,1)
  0.03 │       test   %ebx,%ebx
       │     . jne    b99
  0.00 │       mov    %r8,%rsi
       │       xor    %edx,%edx
  0.00 │       mov    $0x19919,%r8d
  0.05 │       xchg   %ax,%ax
  0.00 │     + callq  Lorg/apache/hive/common/util/Murmur3;.hash64
  0.01 │       mov    %rax,%rdx
  0.02 │       mov    (%rsp),%rsi
  0.03 │     + callq  Lorg/apache/hadoop/hive/common/ndv/hll/HyperLogLog;.add
{code}


> Semijoin: Implement a Long -> Hash64 vector fast-path
> -----------------------------------------------------
>
>                 Key: HIVE-18866
>                 URL: https://issues.apache.org/jira/browse/HIVE-18866
>             Project: Hive
>          Issue Type: Improvement
>          Components: Vectorization
>            Reporter: Gopal V
>            Priority: Major
>              Labels: performance
>         Attachments: perf-hash64-long.png
>
>
> A significant amount of CPU is wasted on JMM restrictions around byte[] accesses.
> To transform one Long into another Long (its 64-bit hash), the value is first written into a byte[], which shows up as a hotspot.
> !perf-hash64-long.png!
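
The idea in the description above can be illustrated with a minimal, self-contained sketch. It is not Hive's actual Murmur3 class: the class name LongHash64Sketch and its helpers are hypothetical, the mixing constants are taken from the public MurmurHash3 x64 reference, and using 0x19919 as the seed is an assumption based on the constant loaded just before the Murmur3.hash64 call in the listing. The byte[] path assumes the long is serialized big-endian and read back as a little-endian 8-byte block, so the fast path replaces the whole round trip with a single Long.reverseBytes.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch: hashing a long directly instead of via a byte[] scratch buffer.
public class LongHash64Sketch {
    // Constants from the public MurmurHash3 x64 reference implementation.
    private static final long C1 = 0x87c37b91114253d5L;
    private static final long C2 = 0x4cf5ad432745937fL;
    // Assumption: seed value as seen in the disassembly (mov $0x19919,%r8d).
    private static final int SEED = 0x19919;

    // Mixes one 8-byte block into the running hash.
    static long mixBlock(long hash, long k) {
        k *= C1;
        k = Long.rotateLeft(k, 31);
        k *= C2;
        hash ^= k;
        return Long.rotateLeft(hash, 27) * 5 + 0x52dce729;
    }

    // Final avalanche step of MurmurHash3 (64-bit variant).
    static long fmix64(long h) {
        h ^= h >>> 33;
        h *= 0xff51afd7ed558ccdL;
        h ^= h >>> 33;
        h *= 0xc4ceb9fe1a85ec53L;
        h ^= h >>> 33;
        return h;
    }

    // Slow path: reassembles the block byte by byte (little-endian read),
    // which is the pattern behind the movzbq loads in the profile.
    static long hash64(byte[] data) {
        long k = 0;
        for (int i = 0; i < 8; i++) {
            k |= ((long) data[i] & 0xff) << (8 * i);
        }
        return fmix64(mixBlock(SEED, k) ^ 8); // 8 = input length in bytes
    }

    // Fast path: no byte[] at all. A big-endian write followed by a
    // little-endian read is equivalent to reversing the bytes of the long.
    static long hash64(long value) {
        long k = Long.reverseBytes(value);
        return fmix64(mixBlock(SEED, k) ^ 8);
    }

    // Big-endian serialization (ByteBuffer's default byte order).
    static byte[] toBytes(long value) {
        return ByteBuffer.allocate(8).putLong(value).array();
    }

    public static void main(String[] args) {
        long v = 0x0102030405060708L;
        System.out.println(hash64(v) == hash64(toBytes(v)));
    }
}
```

Since both paths produce the same value for the same input, a vectorized caller could loop over a long[] column and call hash64(long) per element without touching a scratch buffer.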


