Posted to issues@hive.apache.org by "Gopal V (JIRA)" <ji...@apache.org> on 2018/03/24 07:34:00 UTC
[jira] [Comment Edited] (HIVE-18866) Semijoin: Implement a Long -> Hash64 vector fast-path
[ https://issues.apache.org/jira/browse/HIVE-18866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16412467#comment-16412467 ]
Gopal V edited comment on HIVE-18866 at 3/24/18 7:33 AM:
---------------------------------------------------------
This codepath also runs hot during the HLL computation for column stats on Doubles and Longs.
{code}
15.83 │ movzbq 0x18(%r10,%rsi,1),%rax
0.03 │ cmp %ebx,%ecx
│ . jae 32d
0.03 │ vmovd %ebx,%xmm0
0.00 │ vmovd %edx,%xmm1
0.02 │ movslq %r13d,%r14
2.81 │ movzbq 0x19(%r10,%r14,1),%r8
2.79 │ movzbq 0x1f(%r10,%r14,1),%rcx
0.45 │ movzbq 0x1a(%r10,%r14,1),%rbx
0.60 │ movzbq 0x1b(%r10,%r14,1),%rdx
0.41 │ movzbq 0x1c(%r10,%r14,1),%rsi
1.63 │ movzbq 0x1d(%r10,%r14,1),%r13
0.78 │ movzbq 0x1e(%r10,%r14,1),%r14
{code}
{code}
0.00 │ vmovq %xmm2,%r10
0.03 │ mov %r10d,%r10d
│ movslq %r11d,%rsi
│ mov %dil,0x19(%r8,%rsi,1)
0.00 │ vmovq %xmm2,%r11
0.05 │ sar $0x8,%r11
│ vmovq %xmm2,%r9
│ sar $0x10,%r9
0.00 │ mov %r11d,%r11d
0.03 │ mov %r9d,%r9d
│ vmovq %xmm2,%rdi
│ sar $0x18,%rdi
0.00 │ vmovq %xmm2,%rdx
0.05 │ sar $0x20,%rdx
│ mov %edi,%eax
│ mov %edx,%r13d
0.00 │ vmovq %xmm2,%rdi
0.04 │ sar $0x28,%rdi
│ mov %edi,%edi
0.00 │ mov %dil,0x1a(%r8,%rsi,1)
0.00 │ mov %r13b,0x1b(%r8,%rsi,1)
0.05 │ mov %al,0x1c(%r8,%rsi,1)
│ mov %r9b,0x1d(%r8,%rsi,1)
│ mov %r11b,0x1e(%r8,%rsi,1)
0.00 │ mov %r10b,0x1f(%r8,%rsi,1)
0.03 │ test %ebx,%ebx
│ . jne b99
0.00 │ mov %r8,%rsi
│ xor %edx,%edx
0.00 │ mov $0x19919,%r8d
0.05 │ xchg %ax,%ax
0.00 │ + callq Lorg/apache/hive/common/util/Murmur3;.hash64
0.01 │ mov %rax,%rdx
0.02 │ mov (%rsp),%rsi
0.03 │ + callq Lorg/apache/hadoop/hive/common/ndv/hll/HyperLogLog;.add
{code}
> Semijoin: Implement a Long -> Hash64 vector fast-path
> -----------------------------------------------------
>
> Key: HIVE-18866
> URL: https://issues.apache.org/jira/browse/HIVE-18866
> Project: Hive
> Issue Type: Improvement
> Components: Vectorization
> Reporter: Gopal V
> Priority: Major
> Labels: performance
> Attachments: perf-hash64-long.png
>
>
> A significant amount of CPU time is wasted because of JMM restrictions on byte[] arrays.
> To hash one Long into another Long, the value is first written into a byte[] array, and that round trip shows up as a hotspot.
> !perf-hash64-long.png!
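
To make the round trip in the description concrete, here is a minimal sketch of what a Long -> Hash64 fast-path could look like. It is an illustration under assumptions, not the Hive implementation: the class name LongHashSketch and both method names are hypothetical, the mixing constants are the standard MurmurHash3 x64 ones, and the seed 0x19919 is taken from the constant loaded just before the hash64 call in the disassembly. The sketch also assumes the long is written big-endian (as the byte stores in the disassembly suggest) and read back one byte at a time in little-endian order, so the direct path only needs a byte swap.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch: hashes a single long with a Murmur3-style 64-bit mix,
// once through a temporary byte[] and once directly on the long.
final class LongHashSketch {
    // Standard MurmurHash3 x64 mixing constants.
    private static final long C1 = 0x87c37b91114253d5L;
    private static final long C2 = 0x4cf5ad432745937fL;
    private static final int R1 = 31;
    private static final int R2 = 27;
    private static final int M = 5;
    private static final int N1 = 0x52dce729;

    // Assumption: seed value read off the disassembly (mov $0x19919,%r8d).
    static final int SEED = 0x19919;

    // Standard MurmurHash3 64-bit finalizer.
    private static long fmix64(long h) {
        h ^= h >>> 33;
        h *= 0xff51afd7ed558ccdL;
        h ^= h >>> 33;
        h *= 0xc4ceb9fe1a85ec53L;
        h ^= h >>> 33;
        return h;
    }

    // Mixes one 8-byte block k into the running hash.
    private static long mixBlock(long hash, long k) {
        k *= C1;
        k = Long.rotateLeft(k, R1);
        k *= C2;
        hash ^= k;
        return Long.rotateLeft(hash, R2) * M + N1;
    }

    // Slow path: serialize the long into a byte[] (big-endian, the ByteBuffer
    // default), then reassemble it with eight single-byte loads.
    static long hash64ViaBytes(long value, int seed) {
        byte[] data = ByteBuffer.allocate(Long.BYTES).putLong(value).array();
        long k = 0;
        for (int i = 0; i < Long.BYTES; i++) {
            k |= ((long) data[i] & 0xff) << (8 * i); // little-endian read
        }
        long hash = mixBlock(seed, k);
        hash ^= Long.BYTES; // length of the input in bytes
        return fmix64(hash);
    }

    // Fast path: a big-endian write followed by a little-endian read is a
    // byte swap, so the byte[] can be skipped entirely.
    static long hash64Direct(long value, int seed) {
        long k = Long.reverseBytes(value);
        long hash = mixBlock(seed, k);
        hash ^= Long.BYTES;
        return fmix64(hash);
    }

    public static void main(String[] args) {
        long v = 0x0102030405060708L;
        System.out.println(hash64ViaBytes(v, SEED) == hash64Direct(v, SEED));
    }
}
```

Both methods in the sketch return the same value for any input, so a fast-path of this shape would not change existing hash values. Whether that holds for the real Murmur3.hash64 depends on its actual byte order and seed, which should be checked against the Hive source.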