You are viewing a plain text version of this content. The canonical link for it is here.

Posted to reviews@impala.apache.org by "Taras Bobrovytsky (Code Review)" <ge...@cloudera.org> on 2017/10/18 01:23:48 UTC

[Impala-ASF-CR] IMPALA-5019: Decimal V2 addition

Taras Bobrovytsky has uploaded this change for review. ( http://gerrit.cloudera.org:8080/8309


Change subject: IMPALA-5019: Decimal V2 addition
......................................................................

IMPALA-5019: Decimal V2 addition

In this patch, we implement the new decimal return type rules for
addition expressions. These rules become active when the query option
DECIMAL_V2 is enabled. The algorithm for determining the type of the
result is described in the JIRA.

DECIMAL V1:
+----------------------------------------------------------------+
| typeof(cast(1 as decimal(38,0)) + cast(0.1 as decimal(38,38))) |
+----------------------------------------------------------------+
| DECIMAL(38,38)                                                 |
+----------------------------------------------------------------+

DECIMAL V2:
+----------------------------------------------------------------+
| typeof(cast(1 as decimal(38,0)) + cast(0.1 as decimal(38,38))) |
+----------------------------------------------------------------+
| DECIMAL(38,6)                                                  |
+----------------------------------------------------------------+

This patch required backend changes. We implement an algorithm where
we handle the whole and fractional parts separately, and then combine
them to get the final result. This is more complex and slower. We try
to avoid this by first checking if the result would into int128.

Testing:
- Added expr tests.
- Tested locally on my machine with a script that generates random
  decimal numbers and checks that Impala adds them correctly.

Performance:

For the common case, performance remains the same.
  select cast(2.2 as decimal(18, 1) + cast(2.2 as decimal(18, 1)

  BEFORE: 4.74s
  AFTER:  4.73s

In this case, we check if it is necessary to do the complex addition,
and it turns out to be not necessary. We see a slowdown because the
result needs to be scaled down by dividing.
  select cast(2.2 as decimal(38, 19) + cast(2.2 as decimal(38, 19)

  BEFORE: 1.63s
  AFTER:  13.57s

In following case, we take the most complex path and see the most
signification perfmance hit.
  select cast(7.5 as decimal(38,37)) + cast(2.2 as decimal(38,37))

  BEFORE: 1.63s
  AFTER: 20.57

Change-Id: I401049c56d910eb1546a178c909c923b01239336
---
M be/src/exprs/expr-test.cc
M be/src/runtime/decimal-value.inline.h
M fe/src/main/java/org/apache/impala/analysis/TypesUtil.java
3 files changed, 282 insertions(+), 69 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/09/8309/1
-- 
To view, visit http://gerrit.cloudera.org:8080/8309
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I401049c56d910eb1546a178c909c923b01239336
Gerrit-Change-Number: 8309
Gerrit-PatchSet: 1
Gerrit-Owner: Taras Bobrovytsky <tb...@cloudera.com>

[Impala-ASF-CR] IMPALA-5019: Decimal V2 addition

Posted by "Taras Bobrovytsky (Code Review)" <ge...@cloudera.org>.

Taras Bobrovytsky has posted comments on this change. ( http://gerrit.cloudera.org:8080/8309 )

Change subject: IMPALA-5019: Decimal V2 addition
......................................................................


Patch Set 5:

Patch 5 is a rebase.


-- 
To view, visit http://gerrit.cloudera.org:8080/8309
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I401049c56d910eb1546a178c909c923b01239336
Gerrit-Change-Number: 8309
Gerrit-PatchSet: 5
Gerrit-Owner: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zach Amsden <za...@cloudera.com>
Gerrit-Comment-Date: Sat, 18 Nov 2017 00:25:22 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5019: Decimal V2 addition

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.

Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8309 )

Change subject: IMPALA-5019: Decimal V2 addition
......................................................................


Patch Set 3:

(5 comments)

I'm gradually understanding this better. Had some further suggestions to make it easier to follow but I think I'm getting close.

http://gerrit.cloudera.org:8080/#/c/8309/3/be/src/runtime/decimal-value.inline.h
File be/src/runtime/decimal-value.inline.h:

http://gerrit.cloudera.org:8080/#/c/8309/3/be/src/runtime/decimal-value.inline.h@179
PS3, Line 179:     int128_t& x_left, int128_t& x_right, int128_t& y_left, int128_t& y_right) {
Use pointers for output arguments: https://google.github.io/styleguide/cppguide.html#Reference_Arguments


http://gerrit.cloudera.org:8080/#/c/8309/3/be/src/runtime/decimal-value.inline.h@214
PS3, Line 214: right_overflow
carry_to_left might be a better name? Since the intent is to carry this into the left.


http://gerrit.cloudera.org:8080/#/c/8309/3/be/src/runtime/decimal-value.inline.h@243
PS3, Line 243: inline int128_t SubtractLarge(int128_t x, int x_scale, int128_t y, int y_scale,
What do you think about make it so that this always computes x - y, and making the caller responsible for switching the arguments? I think it would be easier to follow because there's one less case to handle.


http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h
File be/src/runtime/decimal-value.inline.h:

http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@305
PS2, Line 305:     RESULT_T y = 0;
> Delta scale is different in AdjustToSameScale(). There are two delta scales
Ah ok. I wonder if renaming the variable would make it clearer. Something like result_scale_delta to make it clearer that it's between the result scale and something else? Or result_scale_decrease to indicate that we need to scale down the result by this amount.


http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@310
PS2, Line 310:   }
> Added comment.
Thanks. I still think something is missing in the AdjustToSameScale() name or comment, since I couldn't tell that it was intended to adjust them to the max scale without reading the code. Maybe it just needs a comment explaining how the output scale is computed.



-- 
To view, visit http://gerrit.cloudera.org:8080/8309
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I401049c56d910eb1546a178c909c923b01239336
Gerrit-Change-Number: 8309
Gerrit-PatchSet: 3
Gerrit-Owner: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zach Amsden <za...@cloudera.com>
Gerrit-Comment-Date: Mon, 30 Oct 2017 21:12:06 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5019: Decimal V2 addition

Posted by "Taras Bobrovytsky (Code Review)" <ge...@cloudera.org>.

Taras Bobrovytsky has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/8309 )

Change subject: IMPALA-5019: Decimal V2 addition
......................................................................

IMPALA-5019: Decimal V2 addition

In this patch, we implement the new decimal return type rules for
addition expressions. These rules become active when the query option
DECIMAL_V2 is enabled. The algorithm for determining the type of the
result is described in the JIRA.

DECIMAL V1:
+----------------------------------------------------------------+
| typeof(cast(1 as decimal(38,0)) + cast(0.1 as decimal(38,38))) |
+----------------------------------------------------------------+
| DECIMAL(38,38)                                                 |
+----------------------------------------------------------------+

DECIMAL V2:
+----------------------------------------------------------------+
| typeof(cast(1 as decimal(38,0)) + cast(0.1 as decimal(38,38))) |
+----------------------------------------------------------------+
| DECIMAL(38,6)                                                  |
+----------------------------------------------------------------+

This patch required backend changes. We implement an algorithm where
we handle the whole and fractional parts separately, and then combine
them to get the final result. This is more complex and slower. We try
to avoid this by first checking if the result would fit into int128.

Testing:
- Added expr tests.
- Tested locally on my machine with a script that generates random
  decimal numbers and checks that Impala adds them correctly.

Performance:

For the common case, performance remains the same.
  select cast(2.2 as decimal(18, 1) + cast(2.2 as decimal(18, 1)

  BEFORE: 4.74s
  AFTER:  4.73s

In this case, we check if it is necessary to do the complex addition,
and it turns out to be not necessary. We see a slowdown because the
result needs to be scaled down by dividing.
  select cast(2.2 as decimal(38, 19) + cast(2.2 as decimal(38, 19)

  BEFORE: 1.63s
  AFTER:  13.57s

In following case, we take the most complex path and see the most
signification perfmance hit.
  select cast(7.5 as decimal(38,37)) + cast(2.2 as decimal(38,37))

  BEFORE: 1.63s
  AFTER: 20.57

Change-Id: I401049c56d910eb1546a178c909c923b01239336
---
M be/src/exprs/expr-test.cc
M be/src/runtime/decimal-value.h
M be/src/runtime/decimal-value.inline.h
M fe/src/main/java/org/apache/impala/analysis/TypesUtil.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java
5 files changed, 392 insertions(+), 87 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/09/8309/4
-- 
To view, visit http://gerrit.cloudera.org:8080/8309
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I401049c56d910eb1546a178c909c923b01239336
Gerrit-Change-Number: 8309
Gerrit-PatchSet: 4
Gerrit-Owner: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zach Amsden <za...@cloudera.com>

[Impala-ASF-CR] IMPALA-5019: Decimal V2 addition

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.

Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8309 )

Change subject: IMPALA-5019: Decimal V2 addition
......................................................................


Patch Set 4: Code-Review+1

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8309/3/be/src/runtime/decimal-value.inline.h
File be/src/runtime/decimal-value.inline.h:

http://gerrit.cloudera.org:8080/#/c/8309/3/be/src/runtime/decimal-value.inline.h@243
PS3, Line 243: // of the numbers can be zero and one must be positive and the other one negative.
> The way it is currently implemented, I don't see an advantage in doing that
I looked at it again and it seems fine as is.


http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h
File be/src/runtime/decimal-value.inline.h:

http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@310
PS2, Line 310:     DCHECK(!*overflow) << "Cannot overflow unless result is Decimal16Value";
> Modified the comment of AdjustToSameScale() slightly. I was  debating chang
wfm



-- 
To view, visit http://gerrit.cloudera.org:8080/8309
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I401049c56d910eb1546a178c909c923b01239336
Gerrit-Change-Number: 8309
Gerrit-PatchSet: 4
Gerrit-Owner: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zach Amsden <za...@cloudera.com>
Gerrit-Comment-Date: Thu, 16 Nov 2017 23:39:27 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5019: Decimal V2 addition

Posted by "Dan Hecht (Code Review)" <ge...@cloudera.org>.

Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8309 )

Change subject: IMPALA-5019: Decimal V2 addition
......................................................................


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8309/4/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

http://gerrit.cloudera.org:8080/#/c/8309/4/be/src/exprs/expr-test.cc@a2402
PS4, Line 2402: 
why does this rounding no longer happen in decimal v2?



-- 
To view, visit http://gerrit.cloudera.org:8080/8309
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I401049c56d910eb1546a178c909c923b01239336
Gerrit-Change-Number: 8309
Gerrit-PatchSet: 4
Gerrit-Owner: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zach Amsden <za...@cloudera.com>
Gerrit-Comment-Date: Fri, 17 Nov 2017 17:59:54 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5019: Decimal V2 addition

Posted by "Dan Hecht (Code Review)" <ge...@cloudera.org>.

Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8309 )

Change subject: IMPALA-5019: Decimal V2 addition
......................................................................


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8309/4/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

http://gerrit.cloudera.org:8080/#/c/8309/4/be/src/exprs/expr-test.cc@a2402
PS4, Line 2402: 
> Regarding the first point:
I'm not sure what the intent of the original test case is and why it was written this way.  Zach added this test case as part of the rounding for cast to DECIMAL, so it seems we may be losing some coverage he intended to get.  Could you check with Zach to see what he thinks and if we should rewrite this (and other) test cases?

If we leave this test case here, I think a comment explaining why 999.5 is the "correct" answer would be helpful, given that CAST(999.55 as DECIMAL(4.1)) should normally result in 999.6



-- 
To view, visit http://gerrit.cloudera.org:8080/8309
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I401049c56d910eb1546a178c909c923b01239336
Gerrit-Change-Number: 8309
Gerrit-PatchSet: 4
Gerrit-Owner: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zach Amsden <za...@cloudera.com>
Gerrit-Comment-Date: Mon, 20 Nov 2017 22:01:04 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5019: Decimal V2 addition

Posted by "Taras Bobrovytsky (Code Review)" <ge...@cloudera.org>.

Taras Bobrovytsky has posted comments on this change. ( http://gerrit.cloudera.org:8080/8309 )

Change subject: IMPALA-5019: Decimal V2 addition
......................................................................


Patch Set 2:

(12 comments)

http://gerrit.cloudera.org:8080/#/c/8309/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8309/1//COMMIT_MSG@31
PS1, Line 31: to avoid this by first checking if the result would into int128.
> fit?
Done


http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h
File be/src/runtime/decimal-value.inline.h:

http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@166
PS2, Line 166: LeadingZeroAdjustment
> I think the name could make it clearer what it's computing. Maybe something
Done. I like the new name.


http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@170
PS2, Line 170: floor(log2(b))
> The table values are actually floor(log2(10^b)) for i = 0, 1, 2, 3... etc, 
Yes, exactly. Fixed the comment.


http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@181
PS2, Line 181: same_sign
> We usually make numeric/bool template arguments upper case to make it clear
Separated this large function into 3 smaller ones, as you suggested.


http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@185
PS2, Line 185:   // This is expected to be handled by the caller.
> What about the case when they're zero? That's also meant to be handled by t
It wasn't possible for this condition to fail because at least one must be greater than zero for the following reasons:
- If they are both zero, we will not be calling this function, because of the leading zero test.
- If one is negative and the other is zero, we will invert it on line 327

I modified this check in the new implementation.

Yes we do have cases, where we add zero.


http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@272
PS2, Line 272:     DCHECK_EQ(result_scale, std::max(this_scale, other_scale));
> A brief comment that this is guaranteed by the frontend migh help people ne
Done


http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@277
PS2, Line 277:       DCHECK(!*overflow) << "Cannot overflow unless result is Decimal16Value";
> extra indent here
Done


http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@287
PS2, Line 287:   if (this_scale < other_scale) {
> It might be slightly easier to follow if the branches were(delta_scale < 0)
Rewrote the code to make it more clear.


http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@299
PS2, Line 299: into
> long line.
Done


http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@305
PS2, Line 305:     bool ovf = AdjustToSameScale(*this, this_scale, other, other_scale,
> It feels a bit weird that we're calculating delta_scale above and inside Ad
Delta scale is different in AdjustToSameScale(). There are two delta scales in this patch: between x and y and between max(x_scale, y_scale) and result_scale.

What we are doing here and in AddLarge() is first adjust to the same scale, do the addition, then scale down to result_scale.


http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@310
PS2, Line 310:     if (delta_scale > 0) {
> This might need a brief comment to explain why it's necessary (or perhaps t
Added comment.


http://gerrit.cloudera.org:8080/#/c/8309/2/fe/src/main/java/org/apache/impala/analysis/TypesUtil.java
File fe/src/main/java/org/apache/impala/analysis/TypesUtil.java:

http://gerrit.cloudera.org:8080/#/c/8309/2/fe/src/main/java/org/apache/impala/analysis/TypesUtil.java@228
PS2, Line 228:    * TODO: implement V2 rules for ADD/SUB.
> TODO not needed.
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/8309
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I401049c56d910eb1546a178c909c923b01239336
Gerrit-Change-Number: 8309
Gerrit-PatchSet: 2
Gerrit-Owner: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zach Amsden <za...@cloudera.com>
Gerrit-Comment-Date: Sat, 28 Oct 2017 00:02:20 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5019: Decimal V2 addition

Posted by "Taras Bobrovytsky (Code Review)" <ge...@cloudera.org>.

Taras Bobrovytsky has posted comments on this change. ( http://gerrit.cloudera.org:8080/8309 )

Change subject: IMPALA-5019: Decimal V2 addition
......................................................................


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8309/4/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

http://gerrit.cloudera.org:8080/#/c/8309/4/be/src/exprs/expr-test.cc@a2402
PS4, Line 2402: 
> I'm not sure what the intent of the original test case is and why it was wr
I think it's fine to leave as is actually because this problem is fixed in IMPALA-6183, look at patch set 5 (it was rebased on top of the fix). After that fix, these tests work as intended I think.



-- 
To view, visit http://gerrit.cloudera.org:8080/8309
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I401049c56d910eb1546a178c909c923b01239336
Gerrit-Change-Number: 8309
Gerrit-PatchSet: 4
Gerrit-Owner: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zach Amsden <za...@cloudera.com>
Gerrit-Comment-Date: Mon, 20 Nov 2017 22:54:20 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5019: Decimal V2 addition

Posted by "Dan Hecht (Code Review)" <ge...@cloudera.org>.

Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8309 )

Change subject: IMPALA-5019: Decimal V2 addition
......................................................................


Patch Set 5: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8309/4/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

http://gerrit.cloudera.org:8080/#/c/8309/4/be/src/exprs/expr-test.cc@a2402
PS4, Line 2402: 
> I think it's fine to leave as is actually because this problem is fixed in 
Ah okay, yeah that looks better now.



-- 
To view, visit http://gerrit.cloudera.org:8080/8309
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I401049c56d910eb1546a178c909c923b01239336
Gerrit-Change-Number: 8309
Gerrit-PatchSet: 5
Gerrit-Owner: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zach Amsden <za...@cloudera.com>
Gerrit-Comment-Date: Tue, 21 Nov 2017 17:51:27 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5019: Decimal V2 addition

Posted by "Taras Bobrovytsky (Code Review)" <ge...@cloudera.org>.

Taras Bobrovytsky has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/8309 )

Change subject: IMPALA-5019: Decimal V2 addition
......................................................................

IMPALA-5019: Decimal V2 addition

In this patch, we implement the new decimal return type rules for
addition expressions. These rules become active when the query option
DECIMAL_V2 is enabled. The algorithm for determining the type of the
result is described in the JIRA.

DECIMAL V1:
+----------------------------------------------------------------+
| typeof(cast(1 as decimal(38,0)) + cast(0.1 as decimal(38,38))) |
+----------------------------------------------------------------+
| DECIMAL(38,38)                                                 |
+----------------------------------------------------------------+

DECIMAL V2:
+----------------------------------------------------------------+
| typeof(cast(1 as decimal(38,0)) + cast(0.1 as decimal(38,38))) |
+----------------------------------------------------------------+
| DECIMAL(38,6)                                                  |
+----------------------------------------------------------------+

This patch required backend changes. We implement an algorithm where
we handle the whole and fractional parts separately, and then combine
them to get the final result. This is more complex and slower. We try
to avoid this by first checking if the result would fit into int128.

Testing:
- Added expr tests.
- Tested locally on my machine with a script that generates random
  decimal numbers and checks that Impala adds them correctly.

Performance:

For the common case, performance remains the same.
  select cast(2.2 as decimal(18, 1) + cast(2.2 as decimal(18, 1)

  BEFORE: 4.74s
  AFTER:  4.73s

In this case, we check if it is necessary to do the complex addition,
and it turns out to be not necessary. We see a slowdown because the
result needs to be scaled down by dividing.
  select cast(2.2 as decimal(38, 19) + cast(2.2 as decimal(38, 19)

  BEFORE: 1.63s
  AFTER:  13.57s

In following case, we take the most complex path and see the most
signification perfmance hit.
  select cast(7.5 as decimal(38,37)) + cast(2.2 as decimal(38,37))

  BEFORE: 1.63s
  AFTER: 20.57

Change-Id: I401049c56d910eb1546a178c909c923b01239336
---
M be/src/exprs/expr-test.cc
M be/src/runtime/decimal-value.h
M be/src/runtime/decimal-value.inline.h
M fe/src/main/java/org/apache/impala/analysis/TypesUtil.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java
5 files changed, 386 insertions(+), 76 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/09/8309/5
-- 
To view, visit http://gerrit.cloudera.org:8080/8309
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I401049c56d910eb1546a178c909c923b01239336
Gerrit-Change-Number: 8309
Gerrit-PatchSet: 5
Gerrit-Owner: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zach Amsden <za...@cloudera.com>

[Impala-ASF-CR] IMPALA-5019: Decimal V2 addition

Posted by "Taras Bobrovytsky (Code Review)" <ge...@cloudera.org>.

Taras Bobrovytsky has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/8309 )

Change subject: IMPALA-5019: Decimal V2 addition
......................................................................

IMPALA-5019: Decimal V2 addition

In this patch, we implement the new decimal return type rules for
addition expressions. These rules become active when the query option
DECIMAL_V2 is enabled. The algorithm for determining the type of the
result is described in the JIRA.

DECIMAL V1:
+----------------------------------------------------------------+
| typeof(cast(1 as decimal(38,0)) + cast(0.1 as decimal(38,38))) |
+----------------------------------------------------------------+
| DECIMAL(38,38)                                                 |
+----------------------------------------------------------------+

DECIMAL V2:
+----------------------------------------------------------------+
| typeof(cast(1 as decimal(38,0)) + cast(0.1 as decimal(38,38))) |
+----------------------------------------------------------------+
| DECIMAL(38,6)                                                  |
+----------------------------------------------------------------+

This patch required backend changes. We implement an algorithm where
we handle the whole and fractional parts separately, and then combine
them to get the final result. This is more complex and slower. We try
to avoid this by first checking if the result would fit into int128.

Testing:
- Added expr tests.
- Tested locally on my machine with a script that generates random
  decimal numbers and checks that Impala adds them correctly.

Performance:

For the common case, performance remains the same.
  select cast(2.2 as decimal(18, 1) + cast(2.2 as decimal(18, 1)

  BEFORE: 4.74s
  AFTER:  4.73s

In this case, we check if it is necessary to do the complex addition,
and it turns out to be not necessary. We see a slowdown because the
result needs to be scaled down by dividing.
  select cast(2.2 as decimal(38, 19) + cast(2.2 as decimal(38, 19)

  BEFORE: 1.63s
  AFTER:  13.57s

In following case, we take the most complex path and see the most
signification perfmance hit.
  select cast(7.5 as decimal(38,37)) + cast(2.2 as decimal(38,37))

  BEFORE: 1.63s
  AFTER: 20.57

Change-Id: I401049c56d910eb1546a178c909c923b01239336
---
M be/src/exprs/expr-test.cc
M be/src/runtime/decimal-value.inline.h
M fe/src/main/java/org/apache/impala/analysis/TypesUtil.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java
4 files changed, 387 insertions(+), 84 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/09/8309/3
-- 
To view, visit http://gerrit.cloudera.org:8080/8309
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I401049c56d910eb1546a178c909c923b01239336
Gerrit-Change-Number: 8309
Gerrit-PatchSet: 3
Gerrit-Owner: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zach Amsden <za...@cloudera.com>

[Impala-ASF-CR] IMPALA-5019: Decimal V2 addition

Posted by "Taras Bobrovytsky (Code Review)" <ge...@cloudera.org>.

Taras Bobrovytsky has posted comments on this change. ( http://gerrit.cloudera.org:8080/8309 )

Change subject: IMPALA-5019: Decimal V2 addition
......................................................................


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8309/4/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

http://gerrit.cloudera.org:8080/#/c/8309/4/be/src/exprs/expr-test.cc@a2402
PS4, Line 2402: 
> But 1000 - 0.45 = 999.55. When casting that to DECIMAL(4,1), why doesn't th
the result of power(10, 3) - 0.45 was a double. Converted to decimal it looked like this: 999.5500000000000000 (decimal(38,16)). Converted back to double it was something like 999.4999999 (due to IMPALA-6183), then converting to decimal made it 999.5. This is no longer the case.



-- 
To view, visit http://gerrit.cloudera.org:8080/8309
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I401049c56d910eb1546a178c909c923b01239336
Gerrit-Change-Number: 8309
Gerrit-PatchSet: 4
Gerrit-Owner: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zach Amsden <za...@cloudera.com>
Gerrit-Comment-Date: Sat, 18 Nov 2017 00:41:16 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5019: Decimal V2 addition

Posted by "Taras Bobrovytsky (Code Review)" <ge...@cloudera.org>.

Taras Bobrovytsky has posted comments on this change. ( http://gerrit.cloudera.org:8080/8309 )

Change subject: IMPALA-5019: Decimal V2 addition
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8309/1/fe/src/main/java/org/apache/impala/analysis/TypesUtil.java
File fe/src/main/java/org/apache/impala/analysis/TypesUtil.java:

http://gerrit.cloudera.org:8080/#/c/8309/1/fe/src/main/java/org/apache/impala/analysis/TypesUtil.java@268
PS1, Line 268:         // Not yet implemented - fall back to V1 rules.
I think this can now be removed now.



-- 
To view, visit http://gerrit.cloudera.org:8080/8309
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I401049c56d910eb1546a178c909c923b01239336
Gerrit-Change-Number: 8309
Gerrit-PatchSet: 1
Gerrit-Owner: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zach Amsden <za...@cloudera.com>
Gerrit-Comment-Date: Wed, 18 Oct 2017 02:30:22 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5019: Decimal V2 addition

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.

Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8309 )

Change subject: IMPALA-5019: Decimal V2 addition
......................................................................


Patch Set 4:

I'm happy with this - did anyone else want to take a pass over it?


-- 
To view, visit http://gerrit.cloudera.org:8080/8309
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I401049c56d910eb1546a178c909c923b01239336
Gerrit-Change-Number: 8309
Gerrit-PatchSet: 4
Gerrit-Owner: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zach Amsden <za...@cloudera.com>
Gerrit-Comment-Date: Thu, 16 Nov 2017 23:40:19 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5019: Decimal V2 addition

Posted by "Taras Bobrovytsky (Code Review)" <ge...@cloudera.org>.

Taras Bobrovytsky has posted comments on this change. ( http://gerrit.cloudera.org:8080/8309 )

Change subject: IMPALA-5019: Decimal V2 addition
......................................................................


Patch Set 3:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/8309/3/be/src/runtime/decimal-value.inline.h
File be/src/runtime/decimal-value.inline.h:

http://gerrit.cloudera.org:8080/#/c/8309/3/be/src/runtime/decimal-value.inline.h@179
PS3, Line 179:     int128_t& x_left, int128_t& x_right, int128_t& y_left, int128_t& y_right) {
> Use pointers for output arguments: https://google.github.io/styleguide/cppg
Done


http://gerrit.cloudera.org:8080/#/c/8309/3/be/src/runtime/decimal-value.inline.h@214
PS3, Line 214: right_overflow
> carry_to_left might be a better name? Since the intent is to carry this int
Agreed, changed.


http://gerrit.cloudera.org:8080/#/c/8309/3/be/src/runtime/decimal-value.inline.h@243
PS3, Line 243: inline int128_t SubtractLarge(int128_t x, int x_scale, int128_t y, int y_scale,
> What do you think about make it so that this always computes x - y, and mak
The way it is currently implemented, I don't see an advantage in doing that. We do not use the subtraction operation in this function, we still add x and y together.

Are you suggesting to rewrite it, so that x and y are both expected to be positive and we will be doing x - y here? It's not quite clear to me which case would be eliminated if we rewrite it that way. (Both cases on line 264 would still need to be present)


http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h
File be/src/runtime/decimal-value.inline.h:

http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@305
PS2, Line 305:     RESULT_T y = 0;
> Ah ok. I wonder if renaming the variable would make it clearer. Something l
Ranamed to result_scale_decrease


http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@310
PS2, Line 310:   }
> Thanks. I still think something is missing in the AdjustToSameScale() name 
Modified the comment of AdjustToSameScale() slightly. I was  debating changing the name to AdjustToMaxScale(), but that might be confusing (because max scale is 38)).



-- 
To view, visit http://gerrit.cloudera.org:8080/8309
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I401049c56d910eb1546a178c909c923b01239336
Gerrit-Change-Number: 8309
Gerrit-PatchSet: 3
Gerrit-Owner: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zach Amsden <za...@cloudera.com>
Gerrit-Comment-Date: Tue, 07 Nov 2017 01:43:37 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5019: Decimal V2 addition

Posted by "Taras Bobrovytsky (Code Review)" <ge...@cloudera.org>.

Taras Bobrovytsky has posted comments on this change. ( http://gerrit.cloudera.org:8080/8309 )

Change subject: IMPALA-5019: Decimal V2 addition
......................................................................


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8309/4/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

http://gerrit.cloudera.org:8080/#/c/8309/4/be/src/exprs/expr-test.cc@a2402
PS4, Line 2402: 
> Still not following. The old V2 expected value was 999.6 not 999.5. 
Regarding the first point:
the type of "power(10, 3) - 0.45" in decimal v2 before this patch is decimal(38, 17).
the type of "power(10, 3) - 0.45" in decimal v2 after this patch is decimal(38, 16).

This causes the result of the conversion to double to be slightly different. 

Before this patch (in decimal v2) after converting to double it looks like this: 999.5500156861889
After this patch, it looks like this: 999.5499727558438

This is why the result was 999.5 and not 999.6 after the patch.

This is all due to IMPALA-6183, which was fixed (and is reflected in the latest patch).



-- 
To view, visit http://gerrit.cloudera.org:8080/8309
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I401049c56d910eb1546a178c909c923b01239336
Gerrit-Change-Number: 8309
Gerrit-PatchSet: 4
Gerrit-Owner: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zach Amsden <za...@cloudera.com>
Gerrit-Comment-Date: Mon, 20 Nov 2017 21:34:17 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5019: Decimal V2 addition

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.

Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/8309 )

Change subject: IMPALA-5019: Decimal V2 addition
......................................................................

IMPALA-5019: Decimal V2 addition

In this patch, we implement the new decimal return type rules for
addition expressions. These rules become active when the query option
DECIMAL_V2 is enabled. The algorithm for determining the type of the
result is described in the JIRA.

DECIMAL V1:
+----------------------------------------------------------------+
| typeof(cast(1 as decimal(38,0)) + cast(0.1 as decimal(38,38))) |
+----------------------------------------------------------------+
| DECIMAL(38,38)                                                 |
+----------------------------------------------------------------+

DECIMAL V2:
+----------------------------------------------------------------+
| typeof(cast(1 as decimal(38,0)) + cast(0.1 as decimal(38,38))) |
+----------------------------------------------------------------+
| DECIMAL(38,6)                                                  |
+----------------------------------------------------------------+

This patch required backend changes. We implement an algorithm where
we handle the whole and fractional parts separately, and then combine
them to get the final result. This is more complex and slower. We try
to avoid this by first checking if the result would fit into int128.

Testing:
- Added expr tests.
- Tested locally on my machine with a script that generates random
  decimal numbers and checks that Impala adds them correctly.

Performance:

For the common case, performance remains the same.
  select cast(2.2 as decimal(18, 1) + cast(2.2 as decimal(18, 1)

  BEFORE: 4.74s
  AFTER:  4.73s

In this case, we check if it is necessary to do the complex addition,
and it turns out to be not necessary. We see a slowdown because the
result needs to be scaled down by dividing.
  select cast(2.2 as decimal(38, 19) + cast(2.2 as decimal(38, 19)

  BEFORE: 1.63s
  AFTER:  13.57s

In following case, we take the most complex path and see the most
signification perfmance hit.
  select cast(7.5 as decimal(38,37)) + cast(2.2 as decimal(38,37))

  BEFORE: 1.63s
  AFTER: 20.57

Change-Id: I401049c56d910eb1546a178c909c923b01239336
Reviewed-on: http://gerrit.cloudera.org:8080/8309
Reviewed-by: Taras Bobrovytsky <tb...@cloudera.com>
Reviewed-by: Dan Hecht <dh...@cloudera.com>
Tested-by: Impala Public Jenkins
---
M be/src/exprs/expr-test.cc
M be/src/runtime/decimal-value.h
M be/src/runtime/decimal-value.inline.h
M fe/src/main/java/org/apache/impala/analysis/TypesUtil.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java
5 files changed, 386 insertions(+), 76 deletions(-)

Approvals:
  Taras Bobrovytsky: Looks good to me, but someone else must approve
  Dan Hecht: Looks good to me, approved
  Impala Public Jenkins: Verified

-- 
To view, visit http://gerrit.cloudera.org:8080/8309
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I401049c56d910eb1546a178c909c923b01239336
Gerrit-Change-Number: 8309
Gerrit-PatchSet: 6
Gerrit-Owner: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zach Amsden <za...@cloudera.com>

[Impala-ASF-CR] IMPALA-5019: Decimal V2 addition

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.

Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8309 )

Change subject: IMPALA-5019: Decimal V2 addition
......................................................................


Patch Set 2:

(12 comments)

I spent some time getting to understand the code. Still don't 100% comprehend it but getting closer.

http://gerrit.cloudera.org:8080/#/c/8309/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8309/1//COMMIT_MSG@31
PS1, Line 31: to avoid this by first checking if the result would into int128.
fit?


http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h
File be/src/runtime/decimal-value.inline.h:

http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@166
PS2, Line 166: LeadingZeroAdjustment
I think the name could make it clearer what it's computing. Maybe something like MinLeadingZerosAfterScaling()


http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@170
PS2, Line 170: floor(log2(b))
The table values are actually floor(log2(10^b)) for i = 0, 1, 2, 3... etc, right? I found this notation a bit confusing.


http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@181
PS2, Line 181: same_sign
We usually make numeric/bool template arguments upper case to make it clear they're constants.

I'm not sure that templates are really necessary here though. It seems like we could just factor out the shared parts into helper functions and have different functions implementing the same sign/different sign algorithms. E.g.

  SeparateFractional(x, y, x_scale, y_scale, &x_left, &x_right, &y_left, &y_right);
  int max_scale = std::max(x_scale, y_scale);
  int delta_scale = max_scale - result_scale;
  ... do the same/different sign algorithm ...
 ...


http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@185
PS2, Line 185:   // This is expected to be handled by the caller.
What about the case when they're zero? That's also meant to be handled by the caller, right? Do we test cases with multiplying by zero?


http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@272
PS2, Line 272:     DCHECK_EQ(result_scale, std::max(this_scale, other_scale));
A brief comment that this is guaranteed by the frontend migh help people new to the code.


http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@277
PS2, Line 277:       DCHECK(!*overflow) << "Cannot overflow unless result is Decimal16Value";
extra indent here


http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@287
PS2, Line 287:   if (this_scale < other_scale) {
It might be slightly easier to follow if the branches were(delta_scale < 0) and (delta_scale > 0) - it's less obvious this way that it's forcing delta_scale to be positive.


http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@299
PS2, Line 299: into
long line.


http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@305
PS2, Line 305:     bool ovf = AdjustToSameScale(*this, this_scale, other, other_scale,
It feels a bit weird that we're calculating delta_scale above and inside AdjustToSameScale() - it seems like the delta_scale logic in this function and AdjustToSameScale() are tightly coupled - AdjustToSameScale() doesn't really abstract it away if the caller has to know the details to correctly adjust the output result. I'm not sure if there's a better way to factor it.


http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@310
PS2, Line 310:     if (delta_scale > 0) {
This might need a brief comment to explain why it's necessary (or perhaps there's some way we can improve how AdjustToSameScale() is factored so that it's clearer what the expected result scale of that is).


http://gerrit.cloudera.org:8080/#/c/8309/2/fe/src/main/java/org/apache/impala/analysis/TypesUtil.java
File fe/src/main/java/org/apache/impala/analysis/TypesUtil.java:

http://gerrit.cloudera.org:8080/#/c/8309/2/fe/src/main/java/org/apache/impala/analysis/TypesUtil.java@228
PS2, Line 228:    * TODO: implement V2 rules for ADD/SUB.
TODO not needed.



-- 
To view, visit http://gerrit.cloudera.org:8080/8309
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I401049c56d910eb1546a178c909c923b01239336
Gerrit-Change-Number: 8309
Gerrit-PatchSet: 2
Gerrit-Owner: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zach Amsden <za...@cloudera.com>
Gerrit-Comment-Date: Thu, 26 Oct 2017 00:18:26 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5019: Decimal V2 addition

Posted by "Taras Bobrovytsky (Code Review)" <ge...@cloudera.org>.

Taras Bobrovytsky has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/8309 )

Change subject: IMPALA-5019: Decimal V2 addition
......................................................................

IMPALA-5019: Decimal V2 addition

In this patch, we implement the new decimal return type rules for
addition expressions. These rules become active when the query option
DECIMAL_V2 is enabled. The algorithm for determining the type of the
result is described in the JIRA.

DECIMAL V1:
+----------------------------------------------------------------+
| typeof(cast(1 as decimal(38,0)) + cast(0.1 as decimal(38,38))) |
+----------------------------------------------------------------+
| DECIMAL(38,38)                                                 |
+----------------------------------------------------------------+

DECIMAL V2:
+----------------------------------------------------------------+
| typeof(cast(1 as decimal(38,0)) + cast(0.1 as decimal(38,38))) |
+----------------------------------------------------------------+
| DECIMAL(38,6)                                                  |
+----------------------------------------------------------------+

This patch required backend changes. We implement an algorithm where
we handle the whole and fractional parts separately, and then combine
them to get the final result. This is more complex and slower. We try
to avoid this by first checking if the result would into int128.

Testing:
- Added expr tests.
- Tested locally on my machine with a script that generates random
  decimal numbers and checks that Impala adds them correctly.

Performance:

For the common case, performance remains the same.
  select cast(2.2 as decimal(18, 1) + cast(2.2 as decimal(18, 1)

  BEFORE: 4.74s
  AFTER:  4.73s

In this case, we check if it is necessary to do the complex addition,
and it turns out to be not necessary. We see a slowdown because the
result needs to be scaled down by dividing.
  select cast(2.2 as decimal(38, 19) + cast(2.2 as decimal(38, 19)

  BEFORE: 1.63s
  AFTER:  13.57s

In following case, we take the most complex path and see the most
signification perfmance hit.
  select cast(7.5 as decimal(38,37)) + cast(2.2 as decimal(38,37))

  BEFORE: 1.63s
  AFTER: 20.57

Change-Id: I401049c56d910eb1546a178c909c923b01239336
---
M be/src/exprs/expr-test.cc
M be/src/runtime/decimal-value.inline.h
M fe/src/main/java/org/apache/impala/analysis/TypesUtil.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java
4 files changed, 297 insertions(+), 82 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/09/8309/2
-- 
To view, visit http://gerrit.cloudera.org:8080/8309
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I401049c56d910eb1546a178c909c923b01239336
Gerrit-Change-Number: 8309
Gerrit-PatchSet: 2
Gerrit-Owner: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zach Amsden <za...@cloudera.com>

[Impala-ASF-CR] IMPALA-5019: Decimal V2 addition

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.

Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8309 )

Change subject: IMPALA-5019: Decimal V2 addition
......................................................................


Patch Set 5: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/8309
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I401049c56d910eb1546a178c909c923b01239336
Gerrit-Change-Number: 8309
Gerrit-PatchSet: 5
Gerrit-Owner: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zach Amsden <za...@cloudera.com>
Gerrit-Comment-Date: Tue, 21 Nov 2017 22:08:19 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5019: Decimal V2 addition

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.

Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8309 )

Change subject: IMPALA-5019: Decimal V2 addition
......................................................................


Patch Set 5:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1507/


-- 
To view, visit http://gerrit.cloudera.org:8080/8309
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I401049c56d910eb1546a178c909c923b01239336
Gerrit-Change-Number: 8309
Gerrit-PatchSet: 5
Gerrit-Owner: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zach Amsden <za...@cloudera.com>
Gerrit-Comment-Date: Tue, 21 Nov 2017 18:32:55 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5019: Decimal V2 addition

Posted by "Taras Bobrovytsky (Code Review)" <ge...@cloudera.org>.

Taras Bobrovytsky has posted comments on this change. ( http://gerrit.cloudera.org:8080/8309 )

Change subject: IMPALA-5019: Decimal V2 addition
......................................................................


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8309/4/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

http://gerrit.cloudera.org:8080/#/c/8309/4/be/src/exprs/expr-test.cc@a2402
PS4, Line 2402: 
> why does this rounding no longer happen in decimal v2?
This went away after I fixed IMPALA-6183.

What was happening is that power(10, 3) - 0.45 returned a decimal in Decimal V2 and was being converted to a double, which was imprecise. After converting back to decimal, rounding didn't happen because the value happened to be a little smaller than necessary for rounding.

I rebased the patch and fixed the conflicts.



-- 
To view, visit http://gerrit.cloudera.org:8080/8309
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I401049c56d910eb1546a178c909c923b01239336
Gerrit-Change-Number: 8309
Gerrit-PatchSet: 4
Gerrit-Owner: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zach Amsden <za...@cloudera.com>
Gerrit-Comment-Date: Sat, 18 Nov 2017 00:24:59 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5019: Decimal V2 addition

Posted by "Dan Hecht (Code Review)" <ge...@cloudera.org>.

Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8309 )

Change subject: IMPALA-5019: Decimal V2 addition
......................................................................


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8309/4/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

http://gerrit.cloudera.org:8080/#/c/8309/4/be/src/exprs/expr-test.cc@a2402
PS4, Line 2402: 
> the result of power(10, 3) - 0.45 was a double. Converted to decimal it loo
Still not following. The old V2 expected value was 999.6 not 999.5. 

Stepping back, it seems the intent of this test case is to test rounding for CAST DOUBLE -> DECIMAL and using the double value of 999.55 for that.  So the other question is whether this test is no longer testing what it meant to test.



-- 
To view, visit http://gerrit.cloudera.org:8080/8309
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I401049c56d910eb1546a178c909c923b01239336
Gerrit-Change-Number: 8309
Gerrit-PatchSet: 4
Gerrit-Owner: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zach Amsden <za...@cloudera.com>
Gerrit-Comment-Date: Sat, 18 Nov 2017 00:50:39 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5019: Decimal V2 addition

Posted by "Dan Hecht (Code Review)" <ge...@cloudera.org>.

Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8309 )

Change subject: IMPALA-5019: Decimal V2 addition
......................................................................


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8309/4/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

http://gerrit.cloudera.org:8080/#/c/8309/4/be/src/exprs/expr-test.cc@a2402
PS4, Line 2402: 
> This went away after I fixed IMPALA-6183.
But 1000 - 0.45 = 999.55. When casting that to DECIMAL(4,1), why doesn't that round to 999.6? What am I missing?



-- 
To view, visit http://gerrit.cloudera.org:8080/8309
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I401049c56d910eb1546a178c909c923b01239336
Gerrit-Change-Number: 8309
Gerrit-PatchSet: 4
Gerrit-Owner: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zach Amsden <za...@cloudera.com>
Gerrit-Comment-Date: Sat, 18 Nov 2017 00:35:56 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5019: Decimal V2 addition

Posted by "Taras Bobrovytsky (Code Review)" <ge...@cloudera.org>.

Taras Bobrovytsky has posted comments on this change. ( http://gerrit.cloudera.org:8080/8309 )

Change subject: IMPALA-5019: Decimal V2 addition
......................................................................


Patch Set 5: Code-Review+1

Carrying the +1 from Tim


-- 
To view, visit http://gerrit.cloudera.org:8080/8309
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I401049c56d910eb1546a178c909c923b01239336
Gerrit-Change-Number: 8309
Gerrit-PatchSet: 5
Gerrit-Owner: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zach Amsden <za...@cloudera.com>
Gerrit-Comment-Date: Sat, 18 Nov 2017 00:27:29 +0000
Gerrit-HasComments: No