You are viewing a plain text version of this content. The canonical link for it is here.

Posted to reviews@impala.apache.org by "anujphadke (Code Review)" <ge...@cloudera.org> on 2017/02/15 23:42:03 UTC

[Impala-ASF-CR] IMPALA-4848: Add WIDHT BUCKET() function

anujphadke has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/6023

Change subject: IMPALA-4848: Add WIDHT_BUCKET() function
......................................................................

IMPALA-4848: Add WIDHT_BUCKET() function

Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
---
M be/src/exprs/expr-test.cc
M be/src/exprs/math-functions-ir.cc
M be/src/exprs/math-functions.h
M common/function-registry/impala_functions.py
4 files changed, 32 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/6023/1
-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: anujphadke <ap...@cloudera.com>

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "Taras Bobrovytsky (Code Review)" <ge...@cloudera.org>.

Taras Bobrovytsky has posted comments on this change. ( http://gerrit.cloudera.org:8080/6023 )

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................


Patch Set 11:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/6023/10/be/src/exprs/math-functions-ir.cc
File be/src/exprs/math-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/6023/10/be/src/exprs/math-functions-ir.cc@455
PS10, Line 455: a
> Done
This wasn't fixed.


http://gerrit.cloudera.org:8080/#/c/6023/10/be/src/exprs/math-functions-ir.cc@505
PS10, Line 505: f 
> If expr < min_range, we return 0
Suppose min_range = -8888... and max_range=8888..., so the subtraction overflows. This function would return only 3 possible values then. If value=min_range-1, then we return 0. If value = max_range+1, then we return num_buckets. Otherwise, we return null. This seems like odd behavior to me.

I think we should always return null if there is an overflow when computing range_size. What do you think? (btw, if we're going to do this, this needs to be moved up to the top of the function)


http://gerrit.cloudera.org:8080/#/c/6023/11/be/src/exprs/math-functions-ir.cc
File be/src/exprs/math-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/6023/11/be/src/exprs/math-functions-ir.cc@502
PS11, Line 502: Decimal16Value range_size;
              :   range_size =
nit: why not combine the two lines?


http://gerrit.cloudera.org:8080/#/c/6023/11/be/src/exprs/math-functions-ir.cc@524
PS11, Line 524:   if (int total_leading_zeros = BitUtil::CountLeadingZeros(abs(buckets.value())) +
Why do we need to declare total_leading_zeros here? Also, make this UNLIKELY


http://gerrit.cloudera.org:8080/#/c/6023/11/be/src/exprs/math-functions-ir.cc@529
PS11, Line 529: c
nit: capital letter


http://gerrit.cloudera.org:8080/#/c/6023/11/be/src/exprs/math-functions-ir.cc@531
PS11, Line 531:   if (!needs_int256 && BitUtil::CountLeadingZeros(range_size.value()) +
You should use MinLeadingZerosAfterScaling() here. Also, make this UNLIKELY


http://gerrit.cloudera.org:8080/#/c/6023/10/be/src/exprs/math-functions.h
File be/src/exprs/math-functions.h:

http://gerrit.cloudera.org:8080/#/c/6023/10/be/src/exprs/math-functions.h@115
PS10, Line 115:   static BigIntVal WidthBucket(FunctionContext* ctx, const DecimalVal& expr,
> I made this private. Not sure, any pointers where I can move this function 
I think the place you moved it to is ok.



-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-Change-Number: 6023
Gerrit-PatchSet: 11
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>
Gerrit-Comment-Date: Fri, 23 Mar 2018 00:27:16 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "anujphadke (Code Review)" <ge...@cloudera.org>.

Hello Taras Bobrovytsky, Michael Brown, Alex Behm, Dan Hecht, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/6023

to look at the new patch set (#8).

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................

IMPALA-4848: Add WIDTH_BUCKET() function

Syntax :
width_bucket (expr decimal, min_val decimal, max_val decimal, num_buckets int)

This function creates equiwidth histograms , where the histogram range
is divided into num_buckets buckets having identical sizes. This function
returns the bucket in which the expr value would fall. min_val and
max_val are the minimum and maximum value of the histogram range
respectively.

-> This function returns NULL if expr is a NULL.
-> This function returns 0 if expr < min_val
-> This function returns num_buckets + 1 if expr > max_val

 E.g.
[localhost:21000] > select width_bucket(8, 1, 20, 3);
+---------------------------+
| width_bucket(8, 1, 20, 3) |
+---------------------------+
| 2                         |
+---------------------------+

Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
---
M be/src/exprs/expr-test.cc
M be/src/exprs/math-functions-ir.cc
M be/src/exprs/math-functions.h
M common/function-registry/impala_functions.py
M fe/src/main/java/org/apache/impala/analysis/Expr.java
5 files changed, 192 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/6023/8
-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-Change-Number: 6023
Gerrit-PatchSet: 8
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "anujphadke (Code Review)" <ge...@cloudera.org>.

Hello Taras Bobrovytsky, Michael Brown, Alex Behm, Dan Hecht, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/6023

to look at the new patch set (#9).

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................

IMPALA-4848: Add WIDTH_BUCKET() function

Syntax :
width_bucket(expr decimal, min_val decimal, max_val decimal, num_buckets int)

This function creates equiwidth histograms , where the histogram range
is divided into num_buckets buckets having identical sizes. This function
returns the bucket in which the expr value would fall. min_val and
max_val are the minimum and maximum value of the histogram range
respectively.

-> This function returns NULL if expr is a NULL.
-> This function returns 0 if expr < min_val
-> This function returns num_buckets + 1 if expr > max_val

 E.g.
[localhost:21000] > select width_bucket(8, 1, 20, 3);
+---------------------------+
| width_bucket(8, 1, 20, 3) |
+---------------------------+
| 2                         |
+---------------------------+

Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
---
M be/src/exprs/expr-test.cc
M be/src/exprs/math-functions-ir.cc
M be/src/exprs/math-functions.h
M common/function-registry/impala_functions.py
M fe/src/main/java/org/apache/impala/analysis/Expr.java
5 files changed, 192 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/6023/9
-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-Change-Number: 6023
Gerrit-PatchSet: 9
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.

Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/6023 )

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................


Patch Set 17:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2737/ DRY_RUN=false


-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-Change-Number: 6023
Gerrit-PatchSet: 17
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>
Gerrit-Comment-Date: Mon, 25 Jun 2018 19:55:50 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4848: Add WIDHT BUCKET() function

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.

Alex Behm has posted comments on this change.

Change subject: IMPALA-4848: Add WIDHT_BUCKET() function
......................................................................


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6023/3/be/src/exprs/math-functions-ir.cc
File be/src/exprs/math-functions-ir.cc:

Line 456:   if (bucket_width == numeric_limits<float>::infinity()) {
> I don't think infinity is possible unless our min/max range are already wei
To simplify the handling of these strange cases, you might consider using DECIMAL instead of double throughout.


-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "Taras Bobrovytsky (Code Review)" <ge...@cloudera.org>.

Taras Bobrovytsky has posted comments on this change. ( http://gerrit.cloudera.org:8080/6023 )

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................


Patch Set 12:

(4 comments)

This change is getting close. Overall I'm pretty happy with it.

http://gerrit.cloudera.org:8080/#/c/6023/12/be/src/exprs/math-functions-ir.cc
File be/src/exprs/math-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/6023/12/be/src/exprs/math-functions-ir.cc@502
PS12, Line 502: 
Nit: This empty line is in the wrong place, it should be above bool overflow = false


http://gerrit.cloudera.org:8080/#/c/6023/12/be/src/exprs/math-functions-ir.cc@503
PS12, Line 503:   Decimal16Value range_size = max_range.template Subtract<int128_t>(input_scale, min_range,
Nit: long line


http://gerrit.cloudera.org:8080/#/c/6023/12/be/src/exprs/math-functions-ir.cc@544
PS12, Line 544:   if (UNLIKELY(BitUtil::CountLeadingZeros(abs(buckets.value())) +
              :       BitUtil::CountLeadingZeros(abs(dist_from_min.value())) <= 128)) {
              :     needs_int256 = true;
              :   }
              : 
              :   // Check if scaling up range size overflows and if there is a need to store the
              :   // intermediate results in int256_t to avoid the overflow
              :   if (UNLIKELY(!needs_int256 && BitUtil::CountLeadingZeros(range_size.value()) +
              :       detail::MinLeadingZerosAfterScaling(BitUtil::CountLeadingZeros(
              :       range_size.value()), input_scale) <= 128)) {
              :     needs_int256 = true;
These two if statements should be combined into 1. It's weird to check !needs_int256 in the second one. Combine the two conditions with an OR.


http://gerrit.cloudera.org:8080/#/c/6023/12/be/src/exprs/math-functions-ir.cc@582
PS12, Line 582: )+1
nit: put spaces around the +



-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-Change-Number: 6023
Gerrit-PatchSet: 12
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>
Gerrit-Comment-Date: Sat, 31 Mar 2018 00:56:55 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "anujphadke (Code Review)" <ge...@cloudera.org>.

Hello Taras Bobrovytsky, Michael Brown, Alex Behm, Dan Hecht, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/6023

to look at the new patch set (#12).

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................

IMPALA-4848: Add WIDTH_BUCKET() function

Syntax :
width_bucket(expr decimal, min_val decimal, max_val decimal,
  num_buckets int)

This function creates equiwidth histograms , where the histogram range
is divided into num_buckets buckets having identical sizes. This
function returns the bucket in which the expr value would fall. min_val
and max_val are the minimum and maximum value of the histogram range
respectively.

-> This function returns NULL if expr is a NULL.
-> This function returns 0 if expr < min_val
-> This function returns num_buckets + 1 if expr > max_val

 E.g.
[localhost:21000] > select width_bucket(8, 1, 20, 3);
+---------------------------+
| width_bucket(8, 1, 20, 3) |
+---------------------------+
| 2                         |
+---------------------------+

Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
---
M be/src/exprs/expr-test.cc
M be/src/exprs/math-functions-ir.cc
M be/src/exprs/math-functions.h
M common/function-registry/impala_functions.py
M fe/src/main/java/org/apache/impala/analysis/Expr.java
5 files changed, 224 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/6023/12
-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-Change-Number: 6023
Gerrit-PatchSet: 12
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>

[Impala-ASF-CR] IMPALA-4848: Add WIDHT BUCKET() function

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.

Alex Behm has posted comments on this change.

Change subject: IMPALA-4848: Add WIDHT_BUCKET() function
......................................................................


Patch Set 3:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/6023/3/be/src/exprs/math-functions-ir.cc
File be/src/exprs/math-functions-ir.cc:

Line 437:     error_msg << "Number of buckets:" << num_buckets.val <<
We usually follow this format.

Number of buckets should be greater than zero: <given value goes here>


Line 445:     if (num_buckets.val < INT32_MAX) {
use numeric_limits instead of INT32_MAX


Line 456:   if (bucket_width == numeric_limits<float>::infinity()) {
I don't think infinity is possible unless our min/max range are already weird. I suggest we check the min/max range for infinity and NaN earlier and remove the check here.

Also, for any such interesting cases we should have tests.


Line 460:   if(bucket_width == 0) return IntVal::null();
UNLIKELY


Line 461:   double result = (expr.val - min_range.val) / bucket_width + 1;
put parenthesis around the division to make the intended evaluation order clear


Line 462:   if (result == numeric_limits<float>::infinity()) {
Same comment as above for infinity. I don't think we can have infinity here


-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "anujphadke (Code Review)" <ge...@cloudera.org>.

anujphadke has uploaded a new patch set (#4).

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................

IMPALA-4848: Add WIDTH_BUCKET() function

Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
---
M be/src/exprs/expr-test.cc
M be/src/exprs/math-functions-ir.cc
M be/src/exprs/math-functions.h
M common/function-registry/impala_functions.py
M fe/src/main/java/org/apache/impala/analysis/Expr.java
5 files changed, 124 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/6023/4
-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>

[Impala-ASF-CR] IMPALA-4848: Add WIDHT BUCKET() function

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.

Alex Behm has posted comments on this change.

Change subject: IMPALA-4848: Add WIDHT_BUCKET() function
......................................................................


Patch Set 1:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/6023/1/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

Line 3456:   TestIsNull("width_bucket(NULL, 2, 14, 4)", TYPE_INT);
Test NULL for all args.

Add tests with extreme values to trigger edge cases. Also see my comments on the code.


http://gerrit.cloudera.org:8080/#/c/6023/1/be/src/exprs/math-functions-ir.cc
File be/src/exprs/math-functions-ir.cc:

Line 427:   if (expr.is_null) return IntVal::null();
do all null checks in one if


Line 429:   if (min_range == max_range) {
use min_range.val directly because the == operator will check for null again


Line 430:     ctx->SetError("lower bound cannot be equal to upper bound");
Lower...


Line 434:     ctx->SetError("Number of buckets should be greater than zero");
should -> must

also print which value was given


Line 437:   if (expr.val >= max_range.val) return num_buckets.val + 1;
Comment that these cases go into the special under/overflow buckets.

We need to be mindful of the case where num_buckets is already MAX_INT, adding +1 here will overflow it. We should return null in that case. Also add a test.


Line 439:   double num_elems = (max_range.val - min_range.val) / num_buckets.val;
bucket_width?

This could become 0 in extreme cases, and then we'd get infinity in the next line. Casting from infinity to int results in undefined behavior, so I think we should handle this case, and add a test that triggers it.


http://gerrit.cloudera.org:8080/#/c/6023/1/be/src/exprs/math-functions.h
File be/src/exprs/math-functions.h:

Line 128:   static IntVal WidthBucket(FunctionContext* ctx, const DoubleVal& expr,
Move in the public section like the other functions.


-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "anujphadke (Code Review)" <ge...@cloudera.org>.

anujphadke has uploaded a new patch set (#5).

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................

IMPALA-4848: Add WIDTH_BUCKET() function

Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
---
M be/src/exprs/expr-test.cc
M be/src/exprs/math-functions-ir.cc
M be/src/exprs/math-functions.h
M common/function-registry/impala_functions.py
M fe/src/main/java/org/apache/impala/analysis/Expr.java
5 files changed, 130 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/6023/5
-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.

Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/6023 )

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................


Patch Set 16: Code-Review+1


-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-Change-Number: 6023
Gerrit-PatchSet: 16
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>
Gerrit-Comment-Date: Mon, 25 Jun 2018 19:55:37 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4848: Add WIDHT BUCKET() function

Posted by "anujphadke (Code Review)" <ge...@cloudera.org>.

anujphadke has posted comments on this change.

Change subject: IMPALA-4848: Add WIDHT_BUCKET() function
......................................................................


Patch Set 1:

Will add a commit message.

Results of the tests match the postgres output.
aphadke-MBP-2:~ aphadke$ "/Applications/Postgres.app/Contents/Versions/9.6/bin/psql" -p5432 -d "aphadke"
psql (9.6.1)
Type "help" for help.

aphadke=# 
aphadke=# 
aphadke=# 
aphadke=# 
aphadke=# select width_bucket(6.3, 2, 17, 2);
 width_bucket 
--------------
            1
(1 row)

aphadke=# select width_bucket(11, 6, 14, 3);
 width_bucket 
--------------
            2
(1 row)

aphadke=# select width_bucket(-1, -5, 5, 3);
 width_bucket 
--------------
            2
(1 row)

aphadke=# select width_bucket(1, -5, 5, 3);
 width_bucket 
--------------
            2
(1 row)

aphadke=# select width_bucket(3, 5, 20.1, 4);
 width_bucket 
--------------
            0
(1 row)

aphadke=# select width_bucket(22, 5, 20.1, 4);
 width_bucket 
--------------
            5
(1 row)

aphadke=# select width_bucket(NULL, 2, 14, 4);
 width_bucket 
--------------
             
(1 row)

aphadke=# select width_bucket(NULL, 2, 14, 0);
 width_bucket 
--------------
             
(1 row)

aphadke=# select width_bucket(22, 5, 20.1, 0);
ERROR:  count must be greater than zero
aphadke=# select width_bucket(22, 5, 5, 0);
ERROR:  count must be greater than zero
aphadke=#

-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "anujphadke (Code Review)" <ge...@cloudera.org>.

anujphadke has posted comments on this change. ( http://gerrit.cloudera.org:8080/6023 )

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................


Patch Set 15:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/6023/13/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

http://gerrit.cloudera.org:8080/#/c/6023/13/be/src/exprs/expr-test.cc@5331
PS13, Line 5331:     ", 40)", TYPE_BIGINT, 1);
> We should test the two conditions that can lead to requiring int256_t separ
Yes,
I tested individual scenarios by having 2 separate loops. I dont have the exact change. But something like this.
Made sure we hit only one case is true.
+  if(BitUtil::CountLeadingZeros(abs(buckets.value())) +
+            BitUtil::CountLeadingZeros(abs(dist_from_min.value())) <= 128) {
+
+   log(INFO)<< "Overflow while multiplying buckets.value * dist_from_min.value")";
+  }
+
+  if(BitUtil::CountLeadingZeros(range_size.value()) +
+            detail::MinLeadingZerosAfterScaling(BitUtil::CountLeadingZeros(
+                      range_size.value()), input_scale) <= 128) {
+
+    LOG(INFO) << "Overflow while scaling up range size";
+  }
+


http://gerrit.cloudera.org:8080/#/c/6023/13/be/src/exprs/expr-test.cc@5333
PS13, Line 5333:   // evaluated with int128_t. Incrementing one of them will require using int256_t for
> Maybe add a few more cases where int256_t is used? I would add these tests:
Done. 

smallest values where int256_t is needed and there should be an overflow - 
Since the result of the subtraction dist_from_min =  (expr - min_val) is stored in decimal16Val
 Since, dist_from_min * num_buckets might need to be stored in int256_t to avoid overflowing int128_t. 
Multiplying  Decimal16Val by intVal should not cause overflows when we store results in  int256_t



-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-Change-Number: 6023
Gerrit-PatchSet: 15
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>
Gerrit-Comment-Date: Wed, 09 May 2018 00:01:46 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.

Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/6023 )

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................


Patch Set 22: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-Change-Number: 6023
Gerrit-PatchSet: 22
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>
Gerrit-Comment-Date: Sat, 30 Jun 2018 04:47:22 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "anujphadke (Code Review)" <ge...@cloudera.org>.

anujphadke has posted comments on this change. ( http://gerrit.cloudera.org:8080/6023 )

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6023/7/be/src/exprs/math-functions-ir.cc
File be/src/exprs/math-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/6023/7/be/src/exprs/math-functions-ir.cc@516
PS7, Line 516:   int256_t x = ConvertToInt256(buckets.value()) * ConvertToInt256(width_size.value());
> converting to int256 every time is going to be slow. If we care about speed
Won't this require a lot of memory to create such a array?
Ex: 
WidthBucketImpl(... min_range=3, max_range=2^31 - 1 , num_buckets=2^31-1)



-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-Change-Number: 6023
Gerrit-PatchSet: 7
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>
Gerrit-Comment-Date: Mon, 09 Oct 2017 22:40:24 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "anujphadke (Code Review)" <ge...@cloudera.org>.

Hello Taras Bobrovytsky, Michael Brown, Alex Behm, Dan Hecht, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/6023

to look at the new patch set (#15).

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................

IMPALA-4848: Add WIDTH_BUCKET() function

Syntax :
width_bucket(expr decimal, min_val decimal, max_val decimal,
  num_buckets int)

This function creates equiwidth histograms , where the histogram range
is divided into num_buckets buckets having identical sizes. This
function returns the bucket in which the expr value would fall. min_val
and max_val are the minimum and maximum value of the histogram range
respectively.

-> This function returns NULL if expr is a NULL.
-> This function returns 0 if expr < min_val
-> This function returns num_buckets + 1 if expr > max_val

 E.g.
[localhost:21000] > select width_bucket(8, 1, 20, 3);
+---------------------------+
| width_bucket(8, 1, 20, 3) |
+---------------------------+
| 2                         |
+---------------------------+

Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
---
M be/src/exprs/expr-test.cc
M be/src/exprs/math-functions-ir.cc
M be/src/exprs/math-functions.h
M common/function-registry/impala_functions.py
M fe/src/main/java/org/apache/impala/analysis/Expr.java
5 files changed, 232 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/6023/15
-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-Change-Number: 6023
Gerrit-PatchSet: 15
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "Dan Hecht (Code Review)" <ge...@cloudera.org>.

Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/6023 )

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................


Patch Set 7:

Is this still moving forward or should it be abandoned?


-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-Change-Number: 6023
Gerrit-PatchSet: 7
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>
Gerrit-Comment-Date: Thu, 16 Nov 2017 01:21:45 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "Taras Bobrovytsky (Code Review)" <ge...@cloudera.org>.

Taras Bobrovytsky has posted comments on this change. ( http://gerrit.cloudera.org:8080/6023 )

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................


Patch Set 21: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-Change-Number: 6023
Gerrit-PatchSet: 21
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>
Gerrit-Comment-Date: Sat, 30 Jun 2018 01:25:08 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "Taras Bobrovytsky (Code Review)" <ge...@cloudera.org>.

Taras Bobrovytsky has posted comments on this change. ( http://gerrit.cloudera.org:8080/6023 )

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................


Patch Set 10:

(19 comments)

http://gerrit.cloudera.org:8080/#/c/6023/10//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/6023/10//COMMIT_MSG@10
PS10, Line 10: width_bucket(expr decimal, min_val decimal, max_val decimal, num_buckets int)
nit: long line


http://gerrit.cloudera.org:8080/#/c/6023/10//COMMIT_MSG@13
PS10, Line 13: is divided into num_buckets buckets having identical sizes. This function
nit: long line


http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc
File be/src/exprs/math-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@514
PS9, Line 514:   dist_from_min = expr.template Subtract<int128_t>(input_scale, min_range, input_scale,
> Right, this overflow won't happen without overflowing the subtraction on l 
If we never expect this to happen, make this a DCHECK and add a comment.


http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@557
PS9, Line 557:       return BigIntVal::null();
> Moved it above.
I think it's bad practice to have checks "for safety". It makes the code slower and harder to reason about for people who are reading the code. If something is impossible, we should not be checking for it. Add a DCHECK instead.


http://gerrit.cloudera.org:8080/#/c/6023/10/be/src/exprs/math-functions-ir.cc
File be/src/exprs/math-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/6023/10/be/src/exprs/math-functions-ir.cc@438
PS10, Line 438: buckets
nit: should be num_buckets instead of buckets


http://gerrit.cloudera.org:8080/#/c/6023/10/be/src/exprs/math-functions-ir.cc@444
PS10, Line 444: result
nit: "results"


http://gerrit.cloudera.org:8080/#/c/6023/10/be/src/exprs/math-functions-ir.cc@455
PS10, Line 455: a
nit: "an overflow"


http://gerrit.cloudera.org:8080/#/c/6023/10/be/src/exprs/math-functions-ir.cc@482
PS10, Line 482:   if (min_range >= max_range) {
UNLIKELY


http://gerrit.cloudera.org:8080/#/c/6023/10/be/src/exprs/math-functions-ir.cc@483
PS10, Line 483:     ctx->SetError("Lower bound cannot be greater than or equal to the upper bound");
Add an end to end test for each error message.


http://gerrit.cloudera.org:8080/#/c/6023/10/be/src/exprs/math-functions-ir.cc@487
PS10, Line 487:   if (expr < min_range) return 0;
UNLIKELY


http://gerrit.cloudera.org:8080/#/c/6023/10/be/src/exprs/math-functions-ir.cc@489
PS10, Line 489:   if (expr >= max_range) {
UNLIKELY


http://gerrit.cloudera.org:8080/#/c/6023/10/be/src/exprs/math-functions-ir.cc@504
PS10, Line 504:       input_scale, input_precision, input_scale, false, &overflow);
UNLIKELY(overflow)


http://gerrit.cloudera.org:8080/#/c/6023/10/be/src/exprs/math-functions-ir.cc@505
PS10, Line 505: f(
I think this check needs to happen above the line 487. Imagine if this overflows, but expr < min_range, then we will return zero. If we increase expr to be equal to min_range, this function will start returning null.

Add a test case for this.


http://gerrit.cloudera.org:8080/#/c/6023/10/be/src/exprs/math-functions-ir.cc@507
PS10, Line 507:     error_msg << "Overflow while evaluating the difference between min_range: " <<
Add an end to end test for this error message (and others).


http://gerrit.cloudera.org:8080/#/c/6023/10/be/src/exprs/math-functions-ir.cc@518
PS10, Line 518:     error_msg << "Overflow while evaluating the difference between expr: " <<
Add an end to end test for this error message.


http://gerrit.cloudera.org:8080/#/c/6023/10/be/src/exprs/math-functions-ir.cc@536
PS10, Line 536:   if (abs(range_size.value()) * DecimalUtil::GetScaleMultiplier<int128_t>(input_scale) <
I am not convinced this check is correct. In which case will this check be false?


http://gerrit.cloudera.org:8080/#/c/6023/10/be/src/exprs/math-functions-ir.cc@538
PS10, Line 538:     needs_int256 = true;
Is it possible to tell statically that needs_int256 is false? For example, by looking at the precision and scale of min_range and max_range.

We want to order the checks from fastest to slowest. First, do the static check. If static check did not rule out int256, do the leading zero check. If the leading zero check did not rule out the need for int256, do this check. If All the checks did not rule out int256, then set needs_int256 to true.


http://gerrit.cloudera.org:8080/#/c/6023/10/be/src/exprs/math-functions-ir.cc@566
PS10, Line 566:     int128_t y = DecimalUtil::MultiplyByScale<int128_t>(range_size.value(),
We need to DCHECK that it did not overflow here. (It should be easy to do if you rebase this patch because I modified this function recently and added an optional overflow dcheck)


http://gerrit.cloudera.org:8080/#/c/6023/10/be/src/exprs/math-functions.h
File be/src/exprs/math-functions.h:

http://gerrit.cloudera.org:8080/#/c/6023/10/be/src/exprs/math-functions.h@115
PS10, Line 115:   static BigIntVal WidthBucketImpl(FunctionContext* ctx,const T1& expr,
Does this need to be declared in this file? Isn't it an internal function?



-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-Change-Number: 6023
Gerrit-PatchSet: 10
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>
Gerrit-Comment-Date: Tue, 06 Feb 2018 00:00:05 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "Taras Bobrovytsky (Code Review)" <ge...@cloudera.org>.

Taras Bobrovytsky has posted comments on this change. ( http://gerrit.cloudera.org:8080/6023 )

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................


Patch Set 7:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/6023/7/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

http://gerrit.cloudera.org:8080/#/c/6023/7/be/src/exprs/expr-test.cc@4247
PS7, Line 4247:   TestValue("width_bucket(6.3, 2, 17, 2)", TYPE_BIGINT, 1);
You should include a case like (10000, 1, 6, 3)


http://gerrit.cloudera.org:8080/#/c/6023/7/be/src/exprs/math-functions-ir.cc
File be/src/exprs/math-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/6023/7/be/src/exprs/math-functions-ir.cc@429
PS7, Line 429: bucket_width
This should be called bucket_number to make it more clear


http://gerrit.cloudera.org:8080/#/c/6023/7/be/src/exprs/math-functions-ir.cc@431
PS7, Line 431: width_size
width_size is a confusing name. This should be called something like "distance_from_min".


http://gerrit.cloudera.org:8080/#/c/6023/7/be/src/exprs/math-functions-ir.cc@516
PS7, Line 516:   int256_t x = ConvertToInt256(buckets.value()) * ConvertToInt256(width_size.value());
> Won't this require a lot of memory to create such a array?
Sure, that's true. I don't think that this function is meant to be used that way though. It's difficult to imagine someone wanting to use over 1000 buckets. In that case, we could fall back to the current implementation.



-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-Change-Number: 6023
Gerrit-PatchSet: 7
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>
Gerrit-Comment-Date: Tue, 10 Oct 2017 00:41:56 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4848: Add WIDHT BUCKET() function

Posted by "anujphadke (Code Review)" <ge...@cloudera.org>.

anujphadke has uploaded a new patch set (#2).

Change subject: IMPALA-4848: Add WIDHT_BUCKET() function
......................................................................

IMPALA-4848: Add WIDHT_BUCKET() function

Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
---
M be/src/exprs/expr-test.cc
M be/src/exprs/math-functions-ir.cc
M be/src/exprs/math-functions.h
M common/function-registry/impala_functions.py
4 files changed, 61 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/6023/2
-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "Taras Bobrovytsky (Code Review)" <ge...@cloudera.org>.

Taras Bobrovytsky has posted comments on this change. ( http://gerrit.cloudera.org:8080/6023 )

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6023/7/be/src/exprs/math-functions-ir.cc
File be/src/exprs/math-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/6023/7/be/src/exprs/math-functions-ir.cc@516
PS7, Line 516:   int256_t x = ConvertToInt256(buckets.value()) * ConvertToInt256(width_size.value());
> Sure, that's true. I don't think that this function is meant to be used tha
This idea may give a nice performance boost (if it works) because all the heavy lifting (such as dividing and maybe converting to int256) is done once at the beginning when we are constructing the array. The array will contain values that have the same precision and scale as the input expression so all we have to do is O(log(number of buckets)) comparisons of expr to array elements in case of binary search. Or O(number of buckets) comparisons if number of buckets is small.

If number of buckets is very large, we can fall back to the current implementation. (we can have 2 versions WidthBucketImpl such as WidthBucketImplSmall WidthBucketImplLarge)



-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-Change-Number: 6023
Gerrit-PatchSet: 7
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>
Gerrit-Comment-Date: Tue, 10 Oct 2017 00:49:40 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "anujphadke (Code Review)" <ge...@cloudera.org>.

anujphadke has posted comments on this change.

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6023/5/be/src/exprs/math-functions-ir.cc
File be/src/exprs/math-functions-ir.cc:

Line 452: //         We can store the results in a decimal8Value. But this change hard codes to
> It would be good to summarize the calculations and comment on the chosen ty
Done. 
Do you think it would useful to move the comments at relevant places inside the functions instead of adding it before the definition?


-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4848: Add WIDHT BUCKET() function

Posted by "Michael Brown (Code Review)" <ge...@cloudera.org>.

Michael Brown has posted comments on this change.

Change subject: IMPALA-4848: Add WIDHT_BUCKET() function
......................................................................


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6023/3//COMMIT_MSG
Commit Message:

PS3, Line 7: WIDHT_BUCKET()
Nit that's bugging me: spelling :)


-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "anujphadke (Code Review)" <ge...@cloudera.org>.

anujphadke has posted comments on this change. ( http://gerrit.cloudera.org:8080/6023 )

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................


Patch Set 7:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/6023/7/be/src/exprs/math-functions-ir.cc
File be/src/exprs/math-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/6023/7/be/src/exprs/math-functions-ir.cc@429
PS7, Line 429: bucket_width
> Done
Please ignore my last comment. Accidentally got posted when replying to Dan's comment. I still havn't uploaded my latest patch.


http://gerrit.cloudera.org:8080/#/c/6023/7/be/src/exprs/math-functions-ir.cc@431
PS7, Line 431: width_size
> Done
Please ignore my last comment. Accidentally got posted when replying to Dan's comment. I still havn't uploaded my latest patch.


http://gerrit.cloudera.org:8080/#/c/6023/7/be/src/exprs/math-functions-ir.cc@479
PS7, Line 479:     result.val = num_buckets.val;
> Done
Please ignore my last comment. Accidentally got posted when replying to Dan's comment. I still haven't uploaded my latest patch.


http://gerrit.cloudera.org:8080/#/c/6023/7/be/src/exprs/math-functions-ir.cc@516
PS7, Line 516:   int256_t x = ConvertToInt256(buckets.value()) * ConvertToInt256(width_size.value());
> This patch stores intermediate results in int256_t only when needed. Uses i
Please ignore my last comment. Accidentally got posted when replying to Dan's comment. I still havn't uploaded my latest patch.



-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-Change-Number: 6023
Gerrit-PatchSet: 7
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>
Gerrit-Comment-Date: Thu, 16 Nov 2017 06:20:27 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "anujphadke (Code Review)" <ge...@cloudera.org>.

anujphadke has posted comments on this change. ( http://gerrit.cloudera.org:8080/6023 )

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................


Patch Set 15:

(1 comment)

After rebasing I am seeing a few test failures because of a particular commit -

https://github.com/cloudera/Impala/blob/cdh5-trunk/fe/src/main/java/org/apache/impala/analysis/Expr.java#L513

If the input (children type is non decimal it throws an exception because we hit the precondition. The chilType is a TinyInt in this case.

[localhost:21000] default> select width_bucket(NULL, 5, 20.1, 4);
Query: select width_bucket(NULL, 5, 20.1, 4)
Query submitted at: 2018-06-19 15:10:25 (Coordinator: http://anuj-OptiPlex-9020:25000)
ERROR: IllegalStateException: null

Works fine this way -
Query: select width_bucket(NULL, 5.0, 20.1, 4)
Query submitted at: 2018-06-19 15:10:21 (Coordinator: http://anuj-OptiPlex-9020:25000)
Query progress can be monitored at: http://anuj-OptiPlex-9020:25000/query_plan?query_id=c8413395d2645a21:d1ff71de00000000
+----------------------------------+
| width_bucket(null, 5.0, 20.1, 4) |
+----------------------------------+
| NULL                             |
+----------------------------------+
Fetched 1 row(s) in 0.11s

http://gerrit.cloudera.org:8080/#/c/6023/15/fe/src/main/java/org/apache/impala/analysis/Expr.java
File fe/src/main/java/org/apache/impala/analysis/Expr.java:

http://gerrit.cloudera.org:8080/#/c/6023/15/fe/src/main/java/org/apache/impala/analysis/Expr.java@490
PS15, Line 490:       result = ((ScalarType)result).getMinResolutionDecimal();
> When I implemented this using decimalVal. I found a bug.
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-Change-Number: 6023
Gerrit-PatchSet: 15
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>
Gerrit-Comment-Date: Tue, 19 Jun 2018 23:56:15 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.

Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/6023 )

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................


Patch Set 15:

(3 comments)

This looks ok to me. I trust Taras has a deeper understanding of the decimal logic than I do.

Is there some way we can add this to the decimal fuzz test?

http://gerrit.cloudera.org:8080/#/c/6023/15/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

http://gerrit.cloudera.org:8080/#/c/6023/15/be/src/exprs/expr-test.cc@582
PS15, Line 582:   void TestErrorString(const string& expr, const string& error_string) {
Can you document briefly what the function does?

E.g. "Execute 'expr' and check that the returned error ends with 'error_string'"


http://gerrit.cloudera.org:8080/#/c/6023/15/be/src/exprs/expr-test.cc@587
PS15, Line 587:     status = executor_->FetchResult(&result_row);
ASSERT_FALSE(status.ok())

Otherwise we'll deference the null message below and crash.


http://gerrit.cloudera.org:8080/#/c/6023/15/be/src/exprs/expr-test.cc@588
PS15, Line 588:     ASSERT_TRUE(status.msg().msg().compare(status.msg().msg().size() -
This is basically EndsWith(), right? Can you factor out into a utility in StringUtil, just to make it more obvious what the code is doing.



-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-Change-Number: 6023
Gerrit-PatchSet: 15
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>
Gerrit-Comment-Date: Fri, 15 Jun 2018 00:48:48 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.

Alex Behm has posted comments on this change.

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................


Patch Set 4:

(14 comments)

http://gerrit.cloudera.org:8080/#/c/6023/4/be/src/exprs/math-functions-ir.cc
File be/src/exprs/math-functions-ir.cc:

Line 31: #include "exprs/math-functions.h"
order alphabetically


Line 430:   Decimal16Value width_size, range_size;
move declarations closer to where they are used


Line 432:   //FE casts all the arguments to same time.
// FE casts all the arguments to the same type.


Line 433:   int input_scale = ctx->impl()->GetConstFnAttr(FunctionContextImpl::ARG_TYPE_SCALE,1);
formatting, space after ','


Line 434:   int input_precision = ctx->impl()->GetConstFnAttr(
formatting, space after ','


Line 442:   if(expr < min_range) {
single line and formatting (please fix formatting everywhere)


Line 447:     if (num_buckets.val < std::numeric_limits<int32_t>::max()) {
How about we make the return value a BIGINT and simplify this code (no need to check for overflow)


Line 448:       return num_buckets.val + 1;
Create an IntVal, that's clearer to read


Line 457:   range_size = max_range.template Subtract<__int128_t>(input_scale, min_range,
use int128_t consistently


Line 460:     ctx->SetError("Overflow while bucket_width evaluation");
Please make the error more specific, so we can see where the overflow happened.


Line 472:   int256_t x = ConvertToInt256(width_size.value() *  buckets.value());
Don't you need to convert the arguments before the multiplication?


Line 475:   // to avoid avoid computing reulting scale and precision
typo: resulting


Line 476:   int256_t y = DecimalUtil::MultiplyByScale<int256_t>(ConvertToInt256(range_size.value()),
The current approach seems fine, but let's avoid going to int256_t if we can. Operations with that type are very expensive.


http://gerrit.cloudera.org:8080/#/c/6023/4/be/src/exprs/math-functions.h
File be/src/exprs/math-functions.h:

Line 111:         const IntVal& num_buckets);
weird indentation


-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "anujphadke (Code Review)" <ge...@cloudera.org>.

Hello Taras Bobrovytsky, Michael Brown, Alex Behm, Dan Hecht, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/6023

to look at the new patch set (#10).

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................

IMPALA-4848: Add WIDTH_BUCKET() function

Syntax :
width_bucket(expr decimal, min_val decimal, max_val decimal, num_buckets int)

This function creates equiwidth histograms , where the histogram range
is divided into num_buckets buckets having identical sizes. This function
returns the bucket in which the expr value would fall. min_val and
max_val are the minimum and maximum value of the histogram range
respectively.

-> This function returns NULL if expr is a NULL.
-> This function returns 0 if expr < min_val
-> This function returns num_buckets + 1 if expr > max_val

 E.g.
[localhost:21000] > select width_bucket(8, 1, 20, 3);
+---------------------------+
| width_bucket(8, 1, 20, 3) |
+---------------------------+
| 2                         |
+---------------------------+

Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
---
M be/src/exprs/expr-test.cc
M be/src/exprs/math-functions-ir.cc
M be/src/exprs/math-functions.h
M common/function-registry/impala_functions.py
M fe/src/main/java/org/apache/impala/analysis/Expr.java
5 files changed, 210 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/6023/10
-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-Change-Number: 6023
Gerrit-PatchSet: 10
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "anujphadke (Code Review)" <ge...@cloudera.org>.

anujphadke has uploaded a new patch set (#7).

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................

IMPALA-4848: Add WIDTH_BUCKET() function

Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
---
M be/src/exprs/expr-test.cc
M be/src/exprs/math-functions-ir.cc
M be/src/exprs/math-functions.h
M common/function-registry/impala_functions.py
M fe/src/main/java/org/apache/impala/analysis/Expr.java
5 files changed, 169 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/6023/7
-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.

Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/6023 )

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................


Patch Set 18: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-Change-Number: 6023
Gerrit-PatchSet: 18
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>
Gerrit-Comment-Date: Tue, 26 Jun 2018 21:56:37 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.

Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/6023 )

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................


Patch Set 15:

We should find a reviewer for this, or maybe Taras can +2 if he's comfortable with it.


-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-Change-Number: 6023
Gerrit-PatchSet: 15
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>
Gerrit-Comment-Date: Mon, 21 May 2018 23:43:24 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "Taras Bobrovytsky (Code Review)" <ge...@cloudera.org>.

Taras Bobrovytsky has posted comments on this change. ( http://gerrit.cloudera.org:8080/6023 )

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................


Patch Set 13: Code-Review+1

(2 comments)

I'm happy with this change. Alex, can you take a look please?

http://gerrit.cloudera.org:8080/#/c/6023/13/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

http://gerrit.cloudera.org:8080/#/c/6023/13/be/src/exprs/expr-test.cc@5331
PS13, Line 5331:   TestValue("width_bucket(100000000, 199999.77777777777777777777777777, 99999999999.99999"
We should test the two conditions that can lead to requiring int256_t separately, which is what we are trying to do here. Are we sure that one condition is true and the other is false in these two tests?


http://gerrit.cloudera.org:8080/#/c/6023/13/be/src/exprs/expr-test.cc@5333
PS13, Line 5333: 
Maybe add a few more cases where int256_t is used? I would add these tests:
- smallest values where int256_t is needed.
- largest values where int256_t is not needed
- largest values where int256_t is needed and there is no overflow
- smallest values where int256_t is needed and there should be an overflow



-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-Change-Number: 6023
Gerrit-PatchSet: 13
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>
Gerrit-Comment-Date: Mon, 02 Apr 2018 23:41:06 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "Taras Bobrovytsky (Code Review)" <ge...@cloudera.org>.

Taras Bobrovytsky has posted comments on this change. ( http://gerrit.cloudera.org:8080/6023 )

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................


Patch Set 7:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/6023/7/be/src/exprs/math-functions-ir.cc
File be/src/exprs/math-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/6023/7/be/src/exprs/math-functions-ir.cc@479
PS7, Line 479:     result.val = num_buckets.val;
I think it's clearer and simpler to write:
result.val = num_buckets.val + 1;


http://gerrit.cloudera.org:8080/#/c/6023/7/be/src/exprs/math-functions-ir.cc@516
PS7, Line 516:   int256_t x = ConvertToInt256(buckets.value()) * ConvertToInt256(width_size.value());
converting to int256 every time is going to be slow. If we care about speed (and I think we do), it would be a good idea to implement this without having to convert to int256.

One approach that I'm thinking is to construct an array of Decimals with precision and scale the same as the input expression. Each entry indicates the boundaries of the buckets.

For example, we are calling WidthBucketImpl(... min_range=3, max_range=13, num_buckets=2) we would construct an array that looks like [3, 8, 13]. This would be a one time expensive operation that hopefully gets codegened away. For each value, we would have to do a binary search to find where it fits in the array. (For example, 10 is between 8 and 13, so it goes to bucket 2). If the array is small enough, we can simply iterate over it.

There may be other better approaches, but it looks to me that converting to int256 (and then doing int 256 multiplications and divisions) every time even if we are dealing with tiny decimals is not the right way to go.



-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-Change-Number: 6023
Gerrit-PatchSet: 7
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>
Gerrit-Comment-Date: Mon, 09 Oct 2017 21:38:23 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "Taras Bobrovytsky (Code Review)" <ge...@cloudera.org>.

Taras Bobrovytsky has posted comments on this change. ( http://gerrit.cloudera.org:8080/6023 )

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................


Patch Set 20:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6023/20/be/src/exprs/math-functions-ir.cc
File be/src/exprs/math-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/6023/20/be/src/exprs/math-functions-ir.cc@565
PS20, Line 565:       ctx->SetError("Encountered division by 0");
> clang fails with division by 0. Since we check for 
This is not the right way to deal with this. I looked at it again, and I really don't think it's possible for y to be zero. This needless check will make the code slower.

Include this comment on the line that is causing the problem to suppress that clang error:

// NOLINT: clang-tidy thinks y may equal zero here.

You just need to put this on line 584 I think



-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-Change-Number: 6023
Gerrit-PatchSet: 20
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>
Gerrit-Comment-Date: Fri, 29 Jun 2018 21:54:10 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "anujphadke (Code Review)" <ge...@cloudera.org>.

anujphadke has posted comments on this change. ( http://gerrit.cloudera.org:8080/6023 )

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................


Patch Set 7:

(4 comments)

Yes, I have been discussing these approaches with Taras and Alex. 
I have benchmarked these  approaches -

Created a large table with1073741824 rows . The patch with 
DoubleVal outperforms (patch set 3)

Using just int 256
444.58s
453.40s

Using int128
159.28s
155.25s

Binary search approach // This was done with float array and need to change this to using decimalVal
109.21s
109.20s

DoubleVal (patch set 3)
104.20s
104.20s

Current status -
Will send out a patch which uses  int128_t (int256_t in case of overflows) for storing the intermediate results very soon. Will continue working on exploring the binary search approach later and will send out a follow up patch if we see performance improvements.

http://gerrit.cloudera.org:8080/#/c/6023/7/be/src/exprs/math-functions-ir.cc
File be/src/exprs/math-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/6023/7/be/src/exprs/math-functions-ir.cc@429
PS7, Line 429: bucket_width
> This should be called bucket_number to make it more clear
Done


http://gerrit.cloudera.org:8080/#/c/6023/7/be/src/exprs/math-functions-ir.cc@431
PS7, Line 431: width_size
> width_size is a confusing name. This should be called something like "dista
Done


http://gerrit.cloudera.org:8080/#/c/6023/7/be/src/exprs/math-functions-ir.cc@479
PS7, Line 479:     result.val = num_buckets.val;
> I think it's clearer and simpler to write:
Done


http://gerrit.cloudera.org:8080/#/c/6023/7/be/src/exprs/math-functions-ir.cc@516
PS7, Line 516:   int256_t x = ConvertToInt256(buckets.value()) * ConvertToInt256(width_size.value());
> This idea may give a nice performance boost (if it works) because all the h
This patch stores intermediate results in int256_t only when needed. Uses int128_t otherwise.
Will do some more benchmarking and tests for the binary search approach and will post a follow up patch.



-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-Change-Number: 6023
Gerrit-PatchSet: 7
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>
Gerrit-Comment-Date: Thu, 16 Nov 2017 06:12:55 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "Taras Bobrovytsky (Code Review)" <ge...@cloudera.org>.

Taras Bobrovytsky has posted comments on this change. ( http://gerrit.cloudera.org:8080/6023 )

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................


Patch Set 15:

Also, I agree with Tim that it's a good idea to add this to the fuzz test. I think that it is ok to do that in a separate commit.


-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-Change-Number: 6023
Gerrit-PatchSet: 15
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>
Gerrit-Comment-Date: Wed, 20 Jun 2018 22:57:10 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "anujphadke (Code Review)" <ge...@cloudera.org>.

anujphadke has posted comments on this change. ( http://gerrit.cloudera.org:8080/6023 )

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................


Patch Set 20:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6023/20/be/src/exprs/math-functions-ir.cc
File be/src/exprs/math-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/6023/20/be/src/exprs/math-functions-ir.cc@565
PS20, Line 565:       ctx->SetError("Encountered division by 0");
clang fails with division by 0. Since we check for 
We have a check above min_range >= max_range on line 519. Can't think of a case to hit this.
I added this to get pass the clang failure.
@taras - Can you take an another look? Thanks!



-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-Change-Number: 6023
Gerrit-PatchSet: 20
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>
Gerrit-Comment-Date: Fri, 29 Jun 2018 00:30:15 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "anujphadke (Code Review)" <ge...@cloudera.org>.

anujphadke has posted comments on this change.

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6023/4/be/src/exprs/math-functions-ir.cc
File be/src/exprs/math-functions-ir.cc:

Line 476:   int256_t y = DecimalUtil::MultiplyByScale<int256_t>(ConvertToInt256(range_size.value()),
> The current approach seems fine, but let's avoid going to int256_t if we ca
Agree.
But I think we could overflow when performing the multiplication to evaluate x for both decimal8Value and decimal16Value (if x is not int256_t).

And since we are avoiding evaluating the scale and precision for the division,  the numerator and the denominator have to be the same type(again int256_t)?
What do you think?


-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "Taras Bobrovytsky (Code Review)" <ge...@cloudera.org>.

Taras Bobrovytsky has posted comments on this change. ( http://gerrit.cloudera.org:8080/6023 )

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................


Patch Set 9:

(20 comments)

http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/expr-test.cc@4653
PS9, Line 4653:   TestValue("width_bucket(22, 5, 20.1, 4)", TYPE_BIGINT, 5);
Add some tests with nulls.

For example:
(NULL, 5, 20, 3)
(12, NULL, 20, 3)
(12, 5, NULL, 3)
(12, 5, 20, NULL)


http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc
File be/src/exprs/math-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@455
PS9, Line 455: min_range = -1 (or any negative value) 
You can just write:
min_range < 0


http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@460
PS9, Line 460: Incase
"In case"


http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@460
PS9, Line 460:  t
extra space here


http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@460
PS9, Line 460: functions
function*


http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@461
PS9, Line 461: decimal8Value
I am not sure it makes sense to even mention decimal8Value here. Why would you even think about storing it in a decimal 8?


http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@461
PS9, Line 461: stroring
spelling


http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@480
PS9, Line 480:   if (min_range == max_range) {
What if min_range > max_range?

Also, add a test case.


http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@481
PS9, Line 481: Lower bound cannot be equal to upper bound
Would it make sense to include the name of the function in the error message? Do we normally do that?


http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@489
PS9, Line 489: num_buckets.val;
how about: num_buckets.val + 1

Also, add a test case where num_buckets is a maximally large integer, so adding one causes an overflow.


http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@494
PS9, Line 494:   // FE casts all the arguments to the same type.
Maybe expand this comment a little? E.g

"expr", "min_range" and "max_range" are guaranteed to be the same scale and precision by the FE


http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@503
PS9, Line 503:   if(overflow) {
Have we considered what happens when subtraction is successful (and overflow is false), but the result is larger than the largest allowed decimal (10^38 - 1). Add a test case like this.

For example:
min range = -100
max_range = 10^38 - 1

The result of that subtraction should not overflow, because it fits into int128_t


http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@514
PS9, Line 514:   if(overflow) {
Can this overflow ever happen without overflowing on line 503? Add a test case for this overflow and error message. (You might need to add some End to end tests to test for the error message)


http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@533
PS9, Line 533:   if(needs_int256) {
Nit: Put a space between "if" and the opening brace. (here and elsewhere)


http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@539
PS9, Line 539: avoid
avoid written twice


http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@549
PS9, Line 549:     // bump up the denominator scale so that the scale of the numerator and denominator
Can you explain this a little better? It's not really clear why we have to scale it up.


http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@551
PS9, Line 551:     int128_t y = DecimalUtil::MultiplyByScale<int128_t>(range_size.value(),
Is it possible for this to overflow?


http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@557
PS9, Line 557:     ctx->SetError("Overflow while dividing by the range_size");
This should be moved up to around line 543.

Add a test case for this error message.


http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@571
PS9, Line 571:   if (UNLIKELY(num_buckets.val <= 0)) {
Why add this check here, instead of around line 480 where other similar check are located?


http://gerrit.cloudera.org:8080/#/c/6023/9/fe/src/main/java/org/apache/impala/analysis/Expr.java
File fe/src/main/java/org/apache/impala/analysis/Expr.java:

http://gerrit.cloudera.org:8080/#/c/6023/9/fe/src/main/java/org/apache/impala/analysis/Expr.java@459
PS9, Line 459: 
Why this empty line?



-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-Change-Number: 6023
Gerrit-PatchSet: 9
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>
Gerrit-Comment-Date: Tue, 21 Nov 2017 00:08:38 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4848: Add WIDHT BUCKET() function

Posted by "anujphadke (Code Review)" <ge...@cloudera.org>.

anujphadke has posted comments on this change.

Change subject: IMPALA-4848: Add WIDHT_BUCKET() function
......................................................................


Patch Set 2:

(7 comments)

Please review this change after  I push a new patch which addresses some overflow bugs.

http://gerrit.cloudera.org:8080/#/c/6023/1/be/src/exprs/math-functions-ir.cc
File be/src/exprs/math-functions-ir.cc:

Line 427:     const IntVal& num_buckets) {
> do all null checks in one if
Done


Line 429:     return IntVal::null();
> use min_range.val directly because the == operator will check for null agai
Done


Line 430:   }
> Lower...
Done


Line 434:   }
> should -> must
Done


http://gerrit.cloudera.org:8080/#/c/6023/2/be/src/exprs/math-functions-ir.cc
File be/src/exprs/math-functions-ir.cc:

Line 455:   double bucket_width = (max_range.val - min_range.val) / num_buckets.val;
This can overflow. Working on handling this.
select width_bucket(1.5e+200, -1.7e+308, 1.2e+308, 900);


Line 457:   int64_t result = static_cast<int64_t>((expr.val - min_range.val) / bucket_width) + 1;
This can overflow. Working on handling this.


http://gerrit.cloudera.org:8080/#/c/6023/1/be/src/exprs/math-functions.h
File be/src/exprs/math-functions.h:

Line 128:   /// Returns true if no parse_res == PARSE_SUCCESS || parse_res == PARSE_OVERFLOW.
> Move in the public section like the other functions.
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "anujphadke (Code Review)" <ge...@cloudera.org>.

Hello Taras Bobrovytsky, Michael Brown, Tim Armstrong, Alex Behm, Dan Hecht, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/6023

to look at the new patch set (#20).

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................

IMPALA-4848: Add WIDTH_BUCKET() function

Syntax :
width_bucket(expr decimal, min_val decimal, max_val decimal,
  num_buckets int)

This function creates equiwidth histograms , where the histogram range
is divided into num_buckets buckets having identical sizes. This
function returns the bucket in which the expr value would fall. min_val
and max_val are the minimum and maximum value of the histogram range
respectively.

-> This function returns NULL if expr is a NULL.
-> This function returns 0 if expr < min_val
-> This function returns num_buckets + 1 if expr > max_val

 E.g.
[localhost:21000] > select width_bucket(8, 1, 20, 3);
+---------------------------+
| width_bucket(8, 1, 20, 3) |
+---------------------------+
| 2                         |
+---------------------------+

Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
---
M be/src/exprs/expr-test.cc
M be/src/exprs/math-functions-ir.cc
M be/src/exprs/math-functions.h
M be/src/util/string-util.cc
M be/src/util/string-util.h
M common/function-registry/impala_functions.py
M fe/src/main/java/org/apache/impala/analysis/Expr.java
7 files changed, 254 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/6023/20
-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-Change-Number: 6023
Gerrit-PatchSet: 20
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.

Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/6023 )

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................


Patch Set 22:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2762/ DRY_RUN=false


-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-Change-Number: 6023
Gerrit-PatchSet: 22
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>
Gerrit-Comment-Date: Sat, 30 Jun 2018 01:25:45 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.

Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/6023 )

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................

IMPALA-4848: Add WIDTH_BUCKET() function

Syntax :
width_bucket(expr decimal, min_val decimal, max_val decimal,
  num_buckets int)

This function creates equiwidth histograms , where the histogram range
is divided into num_buckets buckets having identical sizes. This
function returns the bucket in which the expr value would fall. min_val
and max_val are the minimum and maximum value of the histogram range
respectively.

-> This function returns NULL if expr is a NULL.
-> This function returns 0 if expr < min_val
-> This function returns num_buckets + 1 if expr > max_val

 E.g.
[localhost:21000] > select width_bucket(8, 1, 20, 3);
+---------------------------+
| width_bucket(8, 1, 20, 3) |
+---------------------------+
| 2                         |
+---------------------------+

Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Reviewed-on: http://gerrit.cloudera.org:8080/6023
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
M be/src/exprs/expr-test.cc
M be/src/exprs/math-functions-ir.cc
M be/src/exprs/math-functions.h
M be/src/util/string-util.cc
M be/src/util/string-util.h
M common/function-registry/impala_functions.py
M fe/src/main/java/org/apache/impala/analysis/Expr.java
7 files changed, 246 insertions(+), 7 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-Change-Number: 6023
Gerrit-PatchSet: 23
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "Taras Bobrovytsky (Code Review)" <ge...@cloudera.org>.

Taras Bobrovytsky has posted comments on this change. ( http://gerrit.cloudera.org:8080/6023 )

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................


Patch Set 15:

I already let Anuj know offline that it's ok to remove the precondition.


-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-Change-Number: 6023
Gerrit-PatchSet: 15
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>
Gerrit-Comment-Date: Wed, 20 Jun 2018 22:55:22 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "anujphadke (Code Review)" <ge...@cloudera.org>.

Hello Taras Bobrovytsky, Michael Brown, Alex Behm, Dan Hecht, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/6023

to look at the new patch set (#14).

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................

IMPALA-4848: Add WIDTH_BUCKET() function

Syntax :
width_bucket(expr decimal, min_val decimal, max_val decimal,
  num_buckets int)

This function creates equiwidth histograms , where the histogram range
is divided into num_buckets buckets having identical sizes. This
function returns the bucket in which the expr value would fall. min_val
and max_val are the minimum and maximum value of the histogram range
respectively.

-> This function returns NULL if expr is a NULL.
-> This function returns 0 if expr < min_val
-> This function returns num_buckets + 1 if expr > max_val

 E.g.
[localhost:21000] > select width_bucket(8, 1, 20, 3);
+---------------------------+
| width_bucket(8, 1, 20, 3) |
+---------------------------+
| 2                         |
+---------------------------+

Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
---
M be/src/exprs/expr-test.cc
M be/src/exprs/math-functions-ir.cc
M be/src/exprs/math-functions.h
M common/function-registry/impala_functions.py
M fe/src/main/java/org/apache/impala/analysis/Expr.java
5 files changed, 234 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/6023/14
-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-Change-Number: 6023
Gerrit-PatchSet: 14
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "anujphadke (Code Review)" <ge...@cloudera.org>.

anujphadke has posted comments on this change. ( http://gerrit.cloudera.org:8080/6023 )

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................


Patch Set 15:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6023/15/fe/src/main/java/org/apache/impala/analysis/Expr.java
File fe/src/main/java/org/apache/impala/analysis/Expr.java:

http://gerrit.cloudera.org:8080/#/c/6023/15/fe/src/main/java/org/apache/impala/analysis/Expr.java@490
PS15, Line 490:       result = ((ScalarType)result).getMinResolutionDecimal();
> Why do we need this?
When I implemented this using decimalVal. I found a bug.
Without this change -
[localhost:21000] > select width_bucket(29 , 1, 400, 7);
Query: select width_bucket(29 , 1, 400, 7)
Query submitted at: 2017-02-28 18:41:40 (Coordinator: http://anuj-OptiPlex-9020:25000)
ERROR: AnalysisException: null
CAUSED BY: IllegalStateException: null

I debugged it and found that  
getREsolvedWildCardType() does not return a decimalType for this specific sequence when
child(0) -> TinyInt 
chile(1) -> tinyInt
child(2) -> smalInt 

result = Type.getAssignmentCompatibleType(result, childType, false);

when the result is Decimal(3,0) and childType is SMALLINT the result  it returns is a result which is SMALLINT.

The query hits this check and fails -
Preconditions.checkState(result.isDecimal() && !result.isWildcardDecimal());

Had discussed this with Alex and he mentioned that we can fix this as part of this change.



-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-Change-Number: 6023
Gerrit-PatchSet: 15
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>
Gerrit-Comment-Date: Sat, 26 May 2018 03:59:38 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "anujphadke (Code Review)" <ge...@cloudera.org>.

Hello Taras Bobrovytsky, Michael Brown, Alex Behm, Dan Hecht, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/6023

to look at the new patch set (#13).

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................

IMPALA-4848: Add WIDTH_BUCKET() function

Syntax :
width_bucket(expr decimal, min_val decimal, max_val decimal,
  num_buckets int)

This function creates equiwidth histograms , where the histogram range
is divided into num_buckets buckets having identical sizes. This
function returns the bucket in which the expr value would fall. min_val
and max_val are the minimum and maximum value of the histogram range
respectively.

-> This function returns NULL if expr is a NULL.
-> This function returns 0 if expr < min_val
-> This function returns num_buckets + 1 if expr > max_val

 E.g.
[localhost:21000] > select width_bucket(8, 1, 20, 3);
+---------------------------+
| width_bucket(8, 1, 20, 3) |
+---------------------------+
| 2                         |
+---------------------------+

Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
---
M be/src/exprs/expr-test.cc
M be/src/exprs/math-functions-ir.cc
M be/src/exprs/math-functions.h
M common/function-registry/impala_functions.py
M fe/src/main/java/org/apache/impala/analysis/Expr.java
5 files changed, 221 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/6023/13
-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-Change-Number: 6023
Gerrit-PatchSet: 13
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.

Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/6023 )

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................


Patch Set 22: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-Change-Number: 6023
Gerrit-PatchSet: 22
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>
Gerrit-Comment-Date: Sat, 30 Jun 2018 01:25:44 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "anujphadke (Code Review)" <ge...@cloudera.org>.

Hello Taras Bobrovytsky, Michael Brown, Tim Armstrong, Alex Behm, Dan Hecht, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/6023

to look at the new patch set (#21).

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................

IMPALA-4848: Add WIDTH_BUCKET() function

Syntax :
width_bucket(expr decimal, min_val decimal, max_val decimal,
  num_buckets int)

This function creates equiwidth histograms , where the histogram range
is divided into num_buckets buckets having identical sizes. This
function returns the bucket in which the expr value would fall. min_val
and max_val are the minimum and maximum value of the histogram range
respectively.

-> This function returns NULL if expr is a NULL.
-> This function returns 0 if expr < min_val
-> This function returns num_buckets + 1 if expr > max_val

 E.g.
[localhost:21000] > select width_bucket(8, 1, 20, 3);
+---------------------------+
| width_bucket(8, 1, 20, 3) |
+---------------------------+
| 2                         |
+---------------------------+

Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
---
M be/src/exprs/expr-test.cc
M be/src/exprs/math-functions-ir.cc
M be/src/exprs/math-functions.h
M be/src/util/string-util.cc
M be/src/util/string-util.h
M common/function-registry/impala_functions.py
M fe/src/main/java/org/apache/impala/analysis/Expr.java
7 files changed, 246 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/6023/21
-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-Change-Number: 6023
Gerrit-PatchSet: 21
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.

Alex Behm has posted comments on this change.

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6023/5/be/src/exprs/math-functions-ir.cc
File be/src/exprs/math-functions-ir.cc:

Line 452:   Decimal16Value buckets = Decimal16Value::FromInt(input_precision, input_scale,
It would be good to summarize the calculations and comment on the chosen types, e.g. why are we using Dec16 here?

In general, please move definitions closer to where they are used. The buckets are only used a lot further down where it might become clearer why it should be Dec16.

Adding those comments and doing some cleanup will make reviewing much easier. I'll still need to think more carefully about the logic.


-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "anujphadke (Code Review)" <ge...@cloudera.org>.

anujphadke has uploaded a new patch set (#6).

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................

IMPALA-4848: Add WIDTH_BUCKET() function

Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
---
M be/src/exprs/expr-test.cc
M be/src/exprs/math-functions-ir.cc
M be/src/exprs/math-functions.h
M common/function-registry/impala_functions.py
M fe/src/main/java/org/apache/impala/analysis/Expr.java
5 files changed, 169 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/6023/6
-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "Taras Bobrovytsky (Code Review)" <ge...@cloudera.org>.

Taras Bobrovytsky has posted comments on this change. ( http://gerrit.cloudera.org:8080/6023 )

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................


Patch Set 16: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-Change-Number: 6023
Gerrit-PatchSet: 16
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>
Gerrit-Comment-Date: Fri, 22 Jun 2018 23:46:11 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "anujphadke (Code Review)" <ge...@cloudera.org>.

anujphadke has posted comments on this change.

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................


Patch Set 5:

(13 comments)

http://gerrit.cloudera.org:8080/#/c/6023/4/be/src/exprs/math-functions-ir.cc
File be/src/exprs/math-functions-ir.cc:

Line 31: #include "exprs/decimal-operators.h"
> order alphabetically
Done


Line 430:     const T1& max_range, const IntVal& num_buckets) {
> move declarations closer to where they are used
Done


Line 432:   bool overflow = false;
> // FE casts all the arguments to the same type.
Done


Line 433:   // FE casts all the arguments to the same type.
> formatting, space after ','
Done


Line 434:   int input_scale = ctx->impl()->GetConstFnAttr(FunctionContextImpl::ARG_TYPE_SCALE, 1);
> formatting, space after ','
Done


Line 442: 
> single line and formatting (please fix formatting everywhere)
Done


Line 447:     result.val = num_buckets.val;
> How about we make the return value a BIGINT and simplify this code (no need
Done


Line 457:       input_scale, input_precision, input_scale, false, &overflow);
> use int128_t consistently
Done


Line 460:     error_msg << "Overflow while evaluating the difference between min_range: " <<
> Please make the error more specific, so we can see where the overflow happe
Done


Line 472:     ctx->SetError(error_msg.str().c_str());
> Don't you need to convert the arguments before the multiplication?
Done


Line 475: 
> typo: resulting
Done


Line 476:   // resulting scale should be 2 * input_scale as per multiplication rules
> I think we'll need to be more stingy with the types. For example, the input
Discussed this with Alex.

For this  particular scenario where we might need to store results with decimal8Values inputs into int256_t.

In this particular case- where expr and min_range are decimal8Values. This subtraction can overflow
into a decimal16Values 

ex:
expr = max(decimal8Value) 
min_range = -1 (or any negative value)


width_size = expr.template Subtract<__int128_t>(input_scale, min_range, input_scale,
   input_precision, input_scale, false, &overflow);

width_size has to be decimal16values (to store this overflowed value) even for decimal8values.

Should we templetize  this function? Since we convert intermediate results to decimal16value .


http://gerrit.cloudera.org:8080/#/c/6023/4/be/src/exprs/math-functions.h
File be/src/exprs/math-functions.h:

Line 111:       const IntVal& num_buckets);
> weird indentation
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4848: Add WIDHT BUCKET() function

Posted by "anujphadke (Code Review)" <ge...@cloudera.org>.

anujphadke has uploaded a new patch set (#3).

Change subject: IMPALA-4848: Add WIDHT_BUCKET() function
......................................................................

IMPALA-4848: Add WIDHT_BUCKET() function

Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
---
M be/src/exprs/expr-test.cc
M be/src/exprs/math-functions-ir.cc
M be/src/exprs/math-functions.h
M common/function-registry/impala_functions.py
4 files changed, 74 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/6023/3
-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "anujphadke (Code Review)" <ge...@cloudera.org>.

Hello Taras Bobrovytsky, Michael Brown, Alex Behm, Dan Hecht, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/6023

to look at the new patch set (#11).

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................

IMPALA-4848: Add WIDTH_BUCKET() function

Syntax :
width_bucket(expr decimal, min_val decimal, max_val decimal,
  num_buckets int)

This function creates equiwidth histograms , where the histogram range
is divided into num_buckets buckets having identical sizes. This
function returns the bucket in which the expr value would fall. min_val
and max_val are the minimum and maximum value of the histogram range
respectively.

-> This function returns NULL if expr is a NULL.
-> This function returns 0 if expr < min_val
-> This function returns num_buckets + 1 if expr > max_val

 E.g.
[localhost:21000] > select width_bucket(8, 1, 20, 3);
+---------------------------+
| width_bucket(8, 1, 20, 3) |
+---------------------------+
| 2                         |
+---------------------------+

Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
---
M be/src/exprs/expr-test.cc
M be/src/exprs/math-functions-ir.cc
M be/src/exprs/math-functions.h
M common/function-registry/impala_functions.py
M fe/src/main/java/org/apache/impala/analysis/Expr.java
5 files changed, 227 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/6023/11
-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-Change-Number: 6023
Gerrit-PatchSet: 11
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "anujphadke (Code Review)" <ge...@cloudera.org>.

anujphadke has posted comments on this change. ( http://gerrit.cloudera.org:8080/6023 )

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................


Patch Set 11:

(16 comments)

http://gerrit.cloudera.org:8080/#/c/6023/11/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

http://gerrit.cloudera.org:8080/#/c/6023/11/be/src/exprs/expr-test.cc@585
PS11, Line 585:     ASSERT_TRUE(status.msg().msg().compare(status.msg().msg().size() -
status.msg().msg() is prepended by SQLSTATE_GENERAL_ERROR string. I am using to compare to check if status.msg().msg() ends with the error string.
https://github.com/apache/impala/blob/19ab465b391167e99658c9270a2a3d76b2b2652f/be/src/service/impala-server.cc#L231


http://gerrit.cloudera.org:8080/#/c/6023/10/be/src/exprs/math-functions-ir.cc
File be/src/exprs/math-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/6023/10/be/src/exprs/math-functions-ir.cc@438
PS10, Line 438: num_buc
> nit: should be num_buckets instead of buckets
Done


http://gerrit.cloudera.org:8080/#/c/6023/10/be/src/exprs/math-functions-ir.cc@444
PS10, Line 444: result
> nit: "results"
Done


http://gerrit.cloudera.org:8080/#/c/6023/10/be/src/exprs/math-functions-ir.cc@455
PS10, Line 455: a
> nit: "an overflow"
Done


http://gerrit.cloudera.org:8080/#/c/6023/10/be/src/exprs/math-functions-ir.cc@482
PS10, Line 482:   if (UNLIKELY(min_range >= max_range)) {
> UNLIKELY
Done


http://gerrit.cloudera.org:8080/#/c/6023/10/be/src/exprs/math-functions-ir.cc@487
PS10, Line 487:   if (UNLIKELY(expr < min_range)) return 0;
> UNLIKELY
Done


http://gerrit.cloudera.org:8080/#/c/6023/10/be/src/exprs/math-functions-ir.cc@489
PS10, Line 489:   if (UNLIKELY(expr >= max_range)) {
> UNLIKELY
Done


http://gerrit.cloudera.org:8080/#/c/6023/10/be/src/exprs/math-functions-ir.cc@504
PS10, Line 504:       input_scale, input_precision, input_scale, false, &overflow);
> UNLIKELY(overflow)
Done


http://gerrit.cloudera.org:8080/#/c/6023/10/be/src/exprs/math-functions-ir.cc@505
PS10, Line 505: f 
> I think this check needs to happen above the line 487. Imagine if this over
If expr < min_range, we return 0
so If I move this above l487 it will start returning null if max_range - min_range overflows.
If  expr < min_range why should we perform this subtraction if the bucket number can be evaluated without performing this subtraction


http://gerrit.cloudera.org:8080/#/c/6023/10/be/src/exprs/math-functions-ir.cc@507
PS10, Line 507:     error_msg << "Overflow while evaluating the difference between min_range: " <<
> Add an end to end test for this error message (and others).
Done


http://gerrit.cloudera.org:8080/#/c/6023/10/be/src/exprs/math-functions-ir.cc@518
PS10, Line 518:   Decimal16Value buckets = Decimal16Value::FromInt(input_precision, input_scale,
> Add an end to end test for this error message.
Since we won't hit this without overflowing above. 
Added a test  where expr - min will overflow, but we detect the  overflow during subtraction above and return.


http://gerrit.cloudera.org:8080/#/c/6023/10/be/src/exprs/math-functions-ir.cc@536
PS10, Line 536: 
> I am not convinced this check is correct. In which case will this check be 
Removed and added the leading zero check.


http://gerrit.cloudera.org:8080/#/c/6023/10/be/src/exprs/math-functions-ir.cc@536
PS10, Line 536: 
> I am not convinced this check is correct. In which case will this check be 
Removed this and just added the leading zero check.


http://gerrit.cloudera.org:8080/#/c/6023/10/be/src/exprs/math-functions-ir.cc@538
PS10, Line 538:   if (needs_int256) {
> Is it possible to tell statically that needs_int256 is false? For example, 
Added the leading zero check to detect if we need int256. Would it suffice?


http://gerrit.cloudera.org:8080/#/c/6023/10/be/src/exprs/math-functions-ir.cc@566
PS10, Line 566:     const DecimalVal& min_range, const DecimalVal& max_range,
> We need to DCHECK that it did not overflow here. (It should be easy to do i
Done


http://gerrit.cloudera.org:8080/#/c/6023/10/be/src/exprs/math-functions.h
File be/src/exprs/math-functions.h:

http://gerrit.cloudera.org:8080/#/c/6023/10/be/src/exprs/math-functions.h@115
PS10, Line 115:   static BigIntVal WidthBucket(FunctionContext* ctx, const DecimalVal& expr,
> Does this need to be declared in this file? Isn't it an internal function?
I made this private. Not sure, any pointers where I can move this function to?



-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-Change-Number: 6023
Gerrit-PatchSet: 11
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>
Gerrit-Comment-Date: Thu, 22 Feb 2018 00:26:32 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "anujphadke (Code Review)" <ge...@cloudera.org>.

anujphadke has posted comments on this change.

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................


Patch Set 4:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/6023/3//COMMIT_MSG
Commit Message:

PS3, Line 7: WIDTH_BUCKET()
> Nit that's bugging me: spelling :)
Done


http://gerrit.cloudera.org:8080/#/c/6023/3/be/src/exprs/math-functions-ir.cc
File be/src/exprs/math-functions-ir.cc:

Line 445: 
> use numeric_limits instead of INT32_MAX
Done


Line 456: 
> To simplify the handling of these strange cases, you might consider using D
Done


Line 460:     ctx->SetError("Overflow while bucket_width evaluation");
> UNLIKELY
Done


Line 461:     return IntVal::null();
> put parenthesis around the division to make the intended evaluation order c
This code has changed.


Line 462:   }
> Same comment as above for infinity. I don't think we can have infinity here
not needed with the decimalVal changes


-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.

Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/6023 )

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................


Patch Set 17: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/2737/


-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-Change-Number: 6023
Gerrit-PatchSet: 17
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>
Gerrit-Comment-Date: Mon, 25 Jun 2018 23:23:58 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.

Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/6023 )

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................


Patch Set 15:

Maybe Taras can comment on whether that precondition is valid in this case.


-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-Change-Number: 6023
Gerrit-PatchSet: 15
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>
Gerrit-Comment-Date: Wed, 20 Jun 2018 17:53:16 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "anujphadke (Code Review)" <ge...@cloudera.org>.

Hello Taras Bobrovytsky, Michael Brown, Tim Armstrong, Alex Behm, Dan Hecht, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/6023

to look at the new patch set (#16).

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................

IMPALA-4848: Add WIDTH_BUCKET() function

Syntax :
width_bucket(expr decimal, min_val decimal, max_val decimal,
  num_buckets int)

This function creates equiwidth histograms , where the histogram range
is divided into num_buckets buckets having identical sizes. This
function returns the bucket in which the expr value would fall. min_val
and max_val are the minimum and maximum value of the histogram range
respectively.

-> This function returns NULL if expr is a NULL.
-> This function returns 0 if expr < min_val
-> This function returns num_buckets + 1 if expr > max_val

 E.g.
[localhost:21000] > select width_bucket(8, 1, 20, 3);
+---------------------------+
| width_bucket(8, 1, 20, 3) |
+---------------------------+
| 2                         |
+---------------------------+

Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
---
M be/src/exprs/expr-test.cc
M be/src/exprs/math-functions-ir.cc
M be/src/exprs/math-functions.h
M be/src/util/string-util.cc
M be/src/util/string-util.h
M common/function-registry/impala_functions.py
M fe/src/main/java/org/apache/impala/analysis/Expr.java
7 files changed, 246 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/6023/16
-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-Change-Number: 6023
Gerrit-PatchSet: 16
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "Taras Bobrovytsky (Code Review)" <ge...@cloudera.org>.

Taras Bobrovytsky has posted comments on this change. ( http://gerrit.cloudera.org:8080/6023 )

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................


Patch Set 15:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6023/15/fe/src/main/java/org/apache/impala/analysis/Expr.java
File fe/src/main/java/org/apache/impala/analysis/Expr.java:

http://gerrit.cloudera.org:8080/#/c/6023/15/fe/src/main/java/org/apache/impala/analysis/Expr.java@490
PS15, Line 490:       result = ((ScalarType)result).getMinResolutionDecimal();
Why do we need this?



-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-Change-Number: 6023
Gerrit-PatchSet: 15
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>
Gerrit-Comment-Date: Tue, 22 May 2018 00:17:54 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.

Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/6023 )

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................


Patch Set 17: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-Change-Number: 6023
Gerrit-PatchSet: 17
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>
Gerrit-Comment-Date: Mon, 25 Jun 2018 19:55:49 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.

Alex Behm has posted comments on this change.

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6023/4/be/src/exprs/math-functions-ir.cc
File be/src/exprs/math-functions-ir.cc:

Line 476:   int256_t y = DecimalUtil::MultiplyByScale<int256_t>(ConvertToInt256(range_size.value()),
> Agree.
I think we'll need to be more stingy with the types. For example, the input args are converted to Decimal16 here, but I think if we used the smallest types, then we can avoid going to int256_t for Decimal4 and Decimal8.


-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "anujphadke (Code Review)" <ge...@cloudera.org>.

Hello Taras Bobrovytsky, Michael Brown, Tim Armstrong, Alex Behm, Dan Hecht, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/6023

to look at the new patch set (#19).

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................

IMPALA-4848: Add WIDTH_BUCKET() function

Syntax :
width_bucket(expr decimal, min_val decimal, max_val decimal,
  num_buckets int)

This function creates equiwidth histograms , where the histogram range
is divided into num_buckets buckets having identical sizes. This
function returns the bucket in which the expr value would fall. min_val
and max_val are the minimum and maximum value of the histogram range
respectively.

-> This function returns NULL if expr is a NULL.
-> This function returns 0 if expr < min_val
-> This function returns num_buckets + 1 if expr > max_val

 E.g.
[localhost:21000] > select width_bucket(8, 1, 20, 3);
+---------------------------+
| width_bucket(8, 1, 20, 3) |
+---------------------------+
| 2                         |
+---------------------------+

Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
---
M be/src/exprs/expr-test.cc
M be/src/exprs/math-functions-ir.cc
M be/src/exprs/math-functions.h
M be/src/util/string-util.cc
M be/src/util/string-util.h
M common/function-registry/impala_functions.py
M fe/src/main/java/org/apache/impala/analysis/Expr.java
7 files changed, 254 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/6023/19
-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-Change-Number: 6023
Gerrit-PatchSet: 19
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "anujphadke (Code Review)" <ge...@cloudera.org>.

anujphadke has posted comments on this change. ( http://gerrit.cloudera.org:8080/6023 )

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................


Patch Set 10:

(19 comments)

http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/expr-test.cc@4653
PS9, Line 4653:   TestValue("is_nan(0.0)", TYPE_BOOLEAN, false);
> Add some tests with nulls.
Done


http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc
File be/src/exprs/math-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@455
PS9, Line 455: Subtracting a negative min_range from e
> You can just write:
Done


http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@460
PS9, Line 460: cimal8
> "In case"
Done


http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@460
PS9, Line 460: re the re
> function*
Done


http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@460
PS9, Line 460: n 
> extra space here
Done


http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@461
PS9, Line 461: store the res
> I am not sure it makes sense to even mention decimal8Value here. Why would 
It doesn't. 
I was trying to detail out why everything needs to be stored as a decimal16Value (asked in PS5).
I was trying to convey that this cannot be stored in a decimal8value if it does not overflow. 
I ll remove it since it is sounding confusing.


http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@461
PS9, Line 461: ith the 
> spelling
Done


http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@480
PS9, Line 480:   }
> What if min_range > max_range?
Done


http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@481
PS9, Line 481: 
> Would it make sense to include the name of the function in the error messag
We don't in all the cases.
https://github.com/apache/incubator-impala/blob/master/be/src/exprs/string-functions-ir.cc#L338


http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@489
PS9, Line 489: _range) {
> how about: num_buckets.val + 1
I am storing in num_buckets in a BigIntVal first and then incrementing to avoid the overflow issue.

result.val = num_buckets.val + 1 // will evaluate the RHS first and assign the overflowed value to result.

Adding 1 later after storing it in a BigIntVal to avoid this.

I have a test case -
TestValue("width_bucket(22, 5, 20.1, 2147483647)", TYPE_BIGINT, 2147483648)


http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@494
PS9, Line 494:   }
> Maybe expand this comment a little? E.g
Done


http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@503
PS9, Line 503:   range_size = max_range.template Subtract<int128_t>(input_scale, min_range,
> Actually I'm not correct here. It does overflow in the case I provided.
We have a test case for this. 
 // Test max - min will overflow during  width_bucket evaluation
  TestError("width_bucket(11, -9, 99999999999999999999999999999999999999, 4000)");


http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@514
PS9, Line 514:   dist_from_min = expr.template Subtract<int128_t>(input_scale, min_range, input_scale,
> Can this overflow ever happen without overflowing on line 503? Add a test c
Right, this overflow won't happen without overflowing the subtraction on l 503. 
It's not needed. Should we have this extra check for safety?


http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@533
PS9, Line 533: 
> Nit: Put a space between "if" and the opening brace. (here and elsewhere)
Done


http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@539
PS9, Line 539: 
> avoid written twice
Done


http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@549
PS9, Line 549:     // match the scale of the numerator.
> Can you explain this a little better? It's not really clear why we have to 
Done


http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@551
PS9, Line 551:         range_size.value()), input_scale);
> Is it possible for this to overflow?
Yes. I added a check for this and set needs_int256 to true if it overflows.
Added a a test for this.


http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@557
PS9, Line 557:       return BigIntVal::null();
> This should be moved up to around line 543.
Moved it above.
This ideally should not overflow  since the result of the division is the bucket number which is a IntVal.
Are you fine with having this check for safety?


http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@571
PS9, Line 571: }
> Why add this check here, instead of around line 480 where other similar che
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-Change-Number: 6023
Gerrit-PatchSet: 10
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>
Gerrit-Comment-Date: Wed, 17 Jan 2018 03:34:32 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.

Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/6023 )

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................


Patch Set 18:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2748/ DRY_RUN=false


-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-Change-Number: 6023
Gerrit-PatchSet: 18
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>
Gerrit-Comment-Date: Tue, 26 Jun 2018 21:56:38 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "anujphadke (Code Review)" <ge...@cloudera.org>.

anujphadke has posted comments on this change. ( http://gerrit.cloudera.org:8080/6023 )

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................


Patch Set 9:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/6023/7/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

http://gerrit.cloudera.org:8080/#/c/6023/7/be/src/exprs/expr-test.cc@4247
PS7, Line 4247:   TestStringValue("typeOf(4294967296)", "BIGINT");
> You should include a case like (10000, 1, 6, 3)
Do you mean to add a test case when expr > max_val of the width bucket?

We have a similar case here -
TestValue("width_bucket(22, 5, 20.1, 4)", TYPE_BIGINT, 5);


http://gerrit.cloudera.org:8080/#/c/6023/7/be/src/exprs/math-functions-ir.cc
File be/src/exprs/math-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/6023/7/be/src/exprs/math-functions-ir.cc@429
PS7, Line 429: onst FloatVa
> Please ignore my last comment. Accidentally got posted when replying to Dan
Done


http://gerrit.cloudera.org:8080/#/c/6023/7/be/src/exprs/math-functions-ir.cc@431
PS7, Line 431: loatVal(fm
> Please ignore my last comment. Accidentally got posted when replying to Dan
Done


http://gerrit.cloudera.org:8080/#/c/6023/7/be/src/exprs/math-functions-ir.cc@479
PS7, Line 479: 
> Please ignore my last comment. Accidentally got posted when replying to Dan
if expr > max_val and num_buckets is the max value that can be represented by intVal. Adding 1 to num_buckets will cause an overflow.
Hence, I am storing it in a BigIntVal and then adding 1


http://gerrit.cloudera.org:8080/#/c/6023/7/be/src/exprs/math-functions-ir.cc@516
PS7, Line 516:     error_msg << "Overflow while evaluating the difference between expr: " <<
> Please ignore my last comment. Accidentally got posted when replying to Dan
Switched to using int128_t. Using int256_t incase of overflows.
Will investigate the binary search approach and post a follow up patch if there are perf improvements.



-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-Change-Number: 6023
Gerrit-PatchSet: 9
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>
Gerrit-Comment-Date: Fri, 17 Nov 2017 09:11:45 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "Taras Bobrovytsky (Code Review)" <ge...@cloudera.org>.

Taras Bobrovytsky has posted comments on this change. ( http://gerrit.cloudera.org:8080/6023 )

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................


Patch Set 9:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc
File be/src/exprs/math-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@503
PS9, Line 503:   if(overflow) {
> Have we considered what happens when subtraction is successful (and overflo
Actually I'm not correct here. It does overflow in the case I provided.

Wouldn't hurt to add the test case anyways.



-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-Change-Number: 6023
Gerrit-PatchSet: 9
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>
Gerrit-Comment-Date: Tue, 21 Nov 2017 01:43:24 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "anujphadke (Code Review)" <ge...@cloudera.org>.

anujphadke has posted comments on this change. ( http://gerrit.cloudera.org:8080/6023 )

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................


Patch Set 12:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/6023/10/be/src/exprs/math-functions-ir.cc
File be/src/exprs/math-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/6023/10/be/src/exprs/math-functions-ir.cc@505
PS10, Line 505: f 
> Suppose min_range = -8888... and max_range=8888..., so the subtraction over
I am fine with it. I have moved the check above.


http://gerrit.cloudera.org:8080/#/c/6023/11/be/src/exprs/math-functions-ir.cc
File be/src/exprs/math-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/6023/11/be/src/exprs/math-functions-ir.cc@502
PS11, Line 502: 
              :   Decimal16Val
> nit: why not combine the two lines?
Done


http://gerrit.cloudera.org:8080/#/c/6023/11/be/src/exprs/math-functions-ir.cc@524
PS11, Line 524: 
> Why do we need to declare total_leading_zeros here? Also, make this UNLIKEL
Done


http://gerrit.cloudera.org:8080/#/c/6023/11/be/src/exprs/math-functions-ir.cc@529
PS11, Line 529: e
> nit: capital letter
Done


http://gerrit.cloudera.org:8080/#/c/6023/11/be/src/exprs/math-functions-ir.cc@531
PS11, Line 531:     return result;
> You should use MinLeadingZerosAfterScaling() here. Also, make this UNLIKELY
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-Change-Number: 6023
Gerrit-PatchSet: 12
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>
Gerrit-Comment-Date: Tue, 27 Mar 2018 20:59:52 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "anujphadke (Code Review)" <ge...@cloudera.org>.

anujphadke has posted comments on this change. ( http://gerrit.cloudera.org:8080/6023 )

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................


Patch Set 16:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/6023/15/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

http://gerrit.cloudera.org:8080/#/c/6023/15/be/src/exprs/expr-test.cc@582
PS15, Line 582: 
> Can you document briefly what the function does?
Done


http://gerrit.cloudera.org:8080/#/c/6023/15/be/src/exprs/expr-test.cc@587
PS15, Line 587:   void TestError(const string& expr) {
> ASSERT_FALSE(status.ok())
Done


http://gerrit.cloudera.org:8080/#/c/6023/15/be/src/exprs/expr-test.cc@588
PS15, Line 588:     GetValue(expr, INVALID_TYPE, /* expect_error */ true);
> This is basically EndsWith(), right? Can you factor out into a utility in S
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-Change-Number: 6023
Gerrit-PatchSet: 16
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>
Gerrit-Comment-Date: Fri, 22 Jun 2018 23:39:58 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.

Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/6023 )

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................


Patch Set 18: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/2748/


-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-Change-Number: 6023
Gerrit-PatchSet: 18
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>
Gerrit-Comment-Date: Wed, 27 Jun 2018 01:13:14 +0000
Gerrit-HasComments: No