Posted to issues@hive.apache.org by "Teddy Choi (JIRA)" <ji...@apache.org> on 2017/02/21 11:05:44 UTC

[jira] [Comment Edited] (HIVE-15824) Loop optimization for SIMD in double comparisons

    [ https://issues.apache.org/jira/browse/HIVE-15824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15875794#comment-15875794 ] 

Teddy Choi edited comment on HIVE-15824 at 2/21/17 11:05 AM:
-------------------------------------------------------------

I made a draft implementation with Double.doubleToRawLongBits, but it is slower than the original in 17 of the 18 benchmarks below. Unsafe.getLong may be worth trying as an alternative.

Before
{noformat}
Benchmark                                                                              Mode      Cnt     Score   Error  Units
o.a.h.b.v.VectorizedComparisonBench.DoubleColEqualDoubleColumnBench.bench              avgt        2  1114.343 ±   NaN  ms/op
o.a.h.b.v.VectorizedComparisonBench.DoubleColEqualDoubleScalarBench.bench              avgt        2   857.549 ±   NaN  ms/op
o.a.h.b.v.VectorizedComparisonBench.DoubleColGreaterDoubleColumnBench.bench            avgt        2  1410.078 ±   NaN  ms/op
o.a.h.b.v.VectorizedComparisonBench.DoubleColGreaterDoubleScalarBench.bench            avgt        2   705.738 ±   NaN  ms/op
o.a.h.b.v.VectorizedComparisonBench.DoubleColGreaterEqualDoubleColumnBench.bench       avgt        2  1134.613 ±   NaN  ms/op
o.a.h.b.v.VectorizedComparisonBench.DoubleColGreaterEqualDoubleScalarBench.bench       avgt        2   685.269 ±   NaN  ms/op
o.a.h.b.v.VectorizedComparisonBench.DoubleColLessDoubleColumnBench.bench               avgt        2  1248.419 ±   NaN  ms/op
o.a.h.b.v.VectorizedComparisonBench.DoubleColLessDoubleScalarBench.bench               avgt        2   664.593 ±   NaN  ms/op
o.a.h.b.v.VectorizedComparisonBench.DoubleColLessEqualDoubleColumnBench.bench          avgt        2  1048.175 ±   NaN  ms/op
o.a.h.b.v.VectorizedComparisonBench.DoubleColLessEqualDoubleScalarBench.bench          avgt        2   703.839 ±   NaN  ms/op
o.a.h.b.v.VectorizedComparisonBench.DoubleColNotEqualDoubleColumnBench.bench           avgt        2  1042.366 ±   NaN  ms/op
o.a.h.b.v.VectorizedComparisonBench.DoubleColNotEqualDoubleScalarBench.bench           avgt        2   823.484 ±   NaN  ms/op
o.a.h.b.v.VectorizedComparisonBench.DoubleScalarEqualDoubleColumnBench.bench           avgt        2   913.149 ±   NaN  ms/op
o.a.h.b.v.VectorizedComparisonBench.DoubleScalarGreaterDoubleColumnBench.bench         avgt        2   692.095 ±   NaN  ms/op
o.a.h.b.v.VectorizedComparisonBench.DoubleScalarGreaterEqualDoubleColumnBench.bench    avgt        2  3446.145 ±   NaN  ms/op
o.a.h.b.v.VectorizedComparisonBench.DoubleScalarLessDoubleColumnBench.bench            avgt        2   685.639 ±   NaN  ms/op
o.a.h.b.v.VectorizedComparisonBench.DoubleScalarLessEqualDoubleColumnBench.bench       avgt        2   683.063 ±   NaN  ms/op
o.a.h.b.v.VectorizedComparisonBench.DoubleScalarNotEqualDoubleColumnBench.bench        avgt        2   815.801 ±   NaN  ms/op
{noformat}

After
{noformat}
Benchmark                                                                              Mode      Cnt     Score   Error  Units
o.a.h.b.v.VectorizedComparisonBench.DoubleColEqualDoubleColumnBench.bench              avgt        2  2042.958 ±   NaN  ms/op
o.a.h.b.v.VectorizedComparisonBench.DoubleColEqualDoubleScalarBench.bench              avgt        2  1448.942 ±   NaN  ms/op
o.a.h.b.v.VectorizedComparisonBench.DoubleColGreaterDoubleColumnBench.bench            avgt        2  1413.319 ±   NaN  ms/op
o.a.h.b.v.VectorizedComparisonBench.DoubleColGreaterDoubleScalarBench.bench            avgt        2   806.678 ±   NaN  ms/op
o.a.h.b.v.VectorizedComparisonBench.DoubleColGreaterEqualDoubleColumnBench.bench       avgt        2  1512.124 ±   NaN  ms/op
o.a.h.b.v.VectorizedComparisonBench.DoubleColGreaterEqualDoubleScalarBench.bench       avgt        2   968.015 ±   NaN  ms/op
o.a.h.b.v.VectorizedComparisonBench.DoubleColLessDoubleColumnBench.bench               avgt        2  1395.853 ±   NaN  ms/op
o.a.h.b.v.VectorizedComparisonBench.DoubleColLessDoubleScalarBench.bench               avgt        2   818.353 ±   NaN  ms/op
o.a.h.b.v.VectorizedComparisonBench.DoubleColLessEqualDoubleColumnBench.bench          avgt        2  1493.037 ±   NaN  ms/op
o.a.h.b.v.VectorizedComparisonBench.DoubleColLessEqualDoubleScalarBench.bench          avgt        2   946.533 ±   NaN  ms/op
o.a.h.b.v.VectorizedComparisonBench.DoubleColNotEqualDoubleColumnBench.bench           avgt        2  1860.316 ±   NaN  ms/op
o.a.h.b.v.VectorizedComparisonBench.DoubleColNotEqualDoubleScalarBench.bench           avgt        2  1356.694 ±   NaN  ms/op
o.a.h.b.v.VectorizedComparisonBench.DoubleScalarEqualDoubleColumnBench.bench           avgt        2  1607.872 ±   NaN  ms/op
o.a.h.b.v.VectorizedComparisonBench.DoubleScalarGreaterDoubleColumnBench.bench         avgt        2   790.920 ±   NaN  ms/op
o.a.h.b.v.VectorizedComparisonBench.DoubleScalarGreaterEqualDoubleColumnBench.bench    avgt        2   971.054 ±   NaN  ms/op
o.a.h.b.v.VectorizedComparisonBench.DoubleScalarLessDoubleColumnBench.bench            avgt        2   795.728 ±   NaN  ms/op
o.a.h.b.v.VectorizedComparisonBench.DoubleScalarLessEqualDoubleColumnBench.bench       avgt        2   958.495 ±   NaN  ms/op
o.a.h.b.v.VectorizedComparisonBench.DoubleScalarNotEqualDoubleColumnBench.bench        avgt        2  1496.665 ±   NaN  ms/op
{noformat}
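
For readers unfamiliar with the idea, here is a minimal sketch of what a branch-free equality loop over raw bits could look like. It is not the draft patch from this issue: the class and method names (DoubleBitsEqualSketch, equalByBits) are hypothetical, and only Double.doubleToRawLongBits is a real JDK call. Note that comparing bit patterns is not the same as == on doubles: 0.0 and -0.0 differ by bits, and two NaNs with identical bits match, so a real implementation would have to handle those cases.

```java
import java.util.Arrays;

// Hypothetical illustration only; not the code benchmarked in this issue.
class DoubleBitsEqualSketch {

    // Writes 1 to out[i] when a[i] and b[i] have identical bit patterns, else 0.
    // The loop body has no branches, which is the shape a JIT auto-vectorizer prefers.
    static void equalByBits(double[] a, double[] b, long[] out, int n) {
        for (int i = 0; i < n; i++) {
            // XOR is zero exactly when both bit patterns are the same.
            long x = Double.doubleToRawLongBits(a[i]) ^ Double.doubleToRawLongBits(b[i]);
            // (x | -x) has its sign bit set exactly when x != 0,
            // so the shift yields 1 for "different"; XOR with 1 flips it to "equal".
            out[i] = ((x | -x) >>> 63) ^ 1L;
        }
    }

    public static void main(String[] args) {
        double[] a = {1.0, 2.5, 0.0, Double.NaN};
        double[] b = {1.0, 3.5, -0.0, Double.NaN};
        long[] out = new long[a.length];
        equalByBits(a, b, out, a.length);
        // Prints [1, 0, 0, 1]: note the 0.0 / -0.0 and NaN results differ from ==.
        System.out.println(Arrays.toString(out));
    }
}
```

Whether a loop like this beats the plain double comparison depends on what the JIT does with the bit conversion, which is what the numbers above are measuring.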


> Loop optimization for SIMD in double comparisons
> ------------------------------------------------
>
>                 Key: HIVE-15824
>                 URL: https://issues.apache.org/jira/browse/HIVE-15824
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Teddy Choi
>            Assignee: Teddy Choi
>
> Use Double.doubleToRawLongBits and the approach from HIVE-11533 to optimize double comparisons.


