Posted to issues@hive.apache.org by "Gopal V (JIRA)" <ji...@apache.org> on 2015/04/07 09:34:12 UTC
[jira] [Commented] (HIVE-10180) Loop optimization for SIMD in ColumnArithmeticColumn.txt
[ https://issues.apache.org/jira/browse/HIVE-10180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14482746#comment-14482746 ]
Gopal V commented on HIVE-10180:
--------------------------------
[~chengxiang li]: the patch looks self-explanatory; waiting for the test runs.
I ran bigint+bigint on TPC-H to try this out, and I'm seeing better assembly generated for the inner loop (but not AVX2; maybe I need a later JDK8?).
{code}
7fa5b6eced29: cmp %r8d,%r10d
7fa5b6eced2c: jge 0x7fa5b6eced8a
7fa5b6eced2e: xchg %ax,%ax
7fa5b6eced30: vmovdqu 0x10(%r11,%r10,8),%xmm0
7fa5b6eced37: vpaddq 0x10(%rdx,%r10,8),%xmm0,%xmm0
7fa5b6eced3e: vmovdqu %xmm0,0x10(%rcx,%r10,8)
7fa5b6eced45: movslq %r10d,%rsi
7fa5b6eced48: vmovdqu 0x20(%r11,%rsi,8),%xmm0
7fa5b6eced4f: vpaddq 0x20(%rdx,%rsi,8),%xmm0,%xmm0
7fa5b6eced55: vmovdqu %xmm0,0x20(%rcx,%rsi,8)
7fa5b6eced5b: vmovdqu 0x30(%r11,%rsi,8),%xmm0
7fa5b6eced62: vpaddq 0x30(%rdx,%rsi,8),%xmm0,%xmm0
7fa5b6eced68: vmovdqu %xmm0,0x30(%rcx,%rsi,8)
7fa5b6eced6e: vmovdqu 0x40(%r11,%rsi,8),%xmm0
7fa5b6eced75: vpaddq 0x40(%rdx,%rsi,8),%xmm0,%xmm0
7fa5b6eced7b: vmovdqu %xmm0,0x40(%rcx,%rsi,8)
7fa5b6eced81: add $0x8,%r10d
7fa5b6eced85: cmp %r8d,%r10d
7fa5b6eced88: jl 0x7fa5b6eced30
7fa5b6eced8a: cmp 0x14(%rsp),%r10d
7fa5b6eced8f: je 0x7fa5b6ecedc8
{code}
It looks like there is a branch miss on the jl back to the beginning of the loop.
I'm trying to get a Linux perf cycle count for this, to confirm whether that is actually real.
> Loop optimization for SIMD in ColumnArithmeticColumn.txt
> --------------------------------------------------------
>
> Key: HIVE-10180
> URL: https://issues.apache.org/jira/browse/HIVE-10180
> Project: Hive
> Issue Type: Sub-task
> Reporter: Chengxiang Li
> Assignee: Chengxiang Li
> Priority: Minor
> Attachments: HIVE-10180.1.patch, HIVE-10180.2.patch
>
>
> The JVM is quite strict about the code patterns that may be executed with SIMD instructions. Take a loop in DoubleColAddDoubleColumn.java, for example:
> {code:java}
> for (int i = 0; i != n; i++) {
>   outputVector[i] = vector1[0] + vector2[i];
> }
> {code}
> The "vector1[0]" reference prevents the JVM from executing this part of the code with vectorized instructions; we need to assign "vector1[0]" to a variable outside the loop and use that variable inside the loop.
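A minimal sketch of the rewrite described in the issue, for illustration only. The class name HoistedScalarAdd and the method addScalarToColumn are hypothetical and are not taken from the patch or from the generated Hive code. The only change from the loop quoted above is that vector1[0] is read once into a local before the loop. Whether the JIT then emits SIMD instructions depends on the JVM version and the hardware.

```java
import java.util.Arrays;

// Hypothetical illustration class; not part of Hive or of the HIVE-10180 patch.
class HoistedScalarAdd {

    // Adds the single repeated value vector1[0] to the first n entries of vector2.
    static void addScalarToColumn(double[] vector1, double[] vector2,
                                  double[] outputVector, int n) {
        // Read the loop-invariant array element once, outside the loop,
        // so the loop body only touches vector2[i] and outputVector[i].
        final double vector1Value = vector1[0];
        for (int i = 0; i != n; i++) {
            outputVector[i] = vector1Value + vector2[i];
        }
    }

    public static void main(String[] args) {
        double[] vector1 = {1.5};
        double[] vector2 = {1.0, 2.0, 3.0};
        double[] outputVector = new double[vector2.length];
        addScalarToColumn(vector1, vector2, outputVector, vector2.length);
        System.out.println(Arrays.toString(outputVector));
    }
}
```

The result is the same as in the original loop; only the location of the vector1[0] read changes, so the rewrite is safe to apply mechanically in a code template.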