You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Taras Bobrovytsky (Code Review)" <ge...@cloudera.org> on 2017/10/09 21:38:23 UTC

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Taras Bobrovytsky has posted comments on this change. ( http://gerrit.cloudera.org:8080/6023 )

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................


Patch Set 7:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/6023/7/be/src/exprs/math-functions-ir.cc
File be/src/exprs/math-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/6023/7/be/src/exprs/math-functions-ir.cc@479
PS7, Line 479:     result.val = num_buckets.val;
I think it's clearer and simpler to write:
result.val = num_buckets.val + 1;


http://gerrit.cloudera.org:8080/#/c/6023/7/be/src/exprs/math-functions-ir.cc@516
PS7, Line 516:   int256_t x = ConvertToInt256(buckets.value()) * ConvertToInt256(width_size.value());
converting to int256 every time is going to be slow. If we care about speed (and I think we do), it would be a good idea to implement this without having to convert to int256.

One approach that I'm thinking is to construct an array of Decimals with precision and scale the same as the input expression. Each entry indicates the boundaries of the buckets.

For example, we are calling WidthBucketImpl(... min_range=3, max_range=13, num_buckets=2) we would construct an array that looks like [3, 8, 13]. This would be a one time expensive operation that hopefully gets codegened away. For each value, we would have to do a binary search to find where it fits in the array. (For example, 10 is between 8 and 13, so it goes to bucket 2). If the array is small enough, we can simply iterate over it.

There may be other better approaches, but it looks to me that converting to int256 (and then doing int 256 multiplications and divisions) every time even if we are dealing with tiny decimals is not the right way to go.



-- 
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-Change-Number: 6023
Gerrit-PatchSet: 7
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>
Gerrit-Comment-Date: Mon, 09 Oct 2017 21:38:23 +0000
Gerrit-HasComments: Yes