Posted to issues-all@impala.apache.org by "ASF subversion and git services (JIRA)" <ji...@apache.org> on 2018/08/17 15:49:00 UTC

[jira] [Commented] (IMPALA-7202) Add WIDTH_BUCKET() function to the decimal fuzz test

    [ https://issues.apache.org/jira/browse/IMPALA-7202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16584100#comment-16584100 ] 

ASF subversion and git services commented on IMPALA-7202:
---------------------------------------------------------

Commit a8b32dbafa1f015c8316f205e32bbdce349f2474 in impala's branch refs/heads/master from Zoltan Borok-Nagy
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=a8b32db ]

IMPALA-7412: width_bucket() function overflows too easily

The tests of https://gerrit.cloudera.org/#/c/10859/
showed that the width_bucket() function overflows
very often.

A common problem is that the function tries to cast the
'num_buckets' parameter to the decimal type determined by
the Frontend. When the Frontend determined the precision
and scale of this decimal, it considered only the decimal
arguments and ignored everything else. Therefore, the
resulting precision and scale are often not suitable for
the 'num_buckets' parameter.
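
To illustrate why such a cast can overflow, here is a minimal Python
sketch (not Impala code). It assumes the usual fixed-point model in
which a DECIMAL(precision, scale) value is stored as an unscaled
integer of at most 'precision' digits. The helper name
fits_in_decimal() is hypothetical.

```python
def fits_in_decimal(value: int, precision: int, scale: int) -> bool:
    """Return True if the integer `value` is representable as
    DECIMAL(precision, scale) under the assumed fixed-point model."""
    if not 0 <= scale <= precision:
        raise ValueError("require 0 <= scale <= precision")
    # Casting multiplies the integer by 10**scale to get the unscaled value.
    unscaled = abs(value) * 10 ** scale
    # The unscaled value may have at most `precision` decimal digits.
    return unscaled < 10 ** precision


# A bucket count of 1000000 fits comfortably in DECIMAL(30, 10) ...
print(fits_in_decimal(1000000, 30, 10))  # True
# ... but not in a type that leaves only one integer digit.
print(fits_in_decimal(1000000, 9, 8))    # False
```

Under this model, a type chosen only from the decimal arguments can
leave too few integer digits for a large bucket count.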

WidthBucketImpl() has three decimal arguments, all of which
have the same byte size, precision, and scale. So it is
possible to interpret them as plain integers and still
calculate the proper bucket.
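
The idea can be sketched in Python as follows. This is only an
illustration under stated assumptions, not the actual WidthBucketImpl()
code: the function name width_bucket_unscaled() is hypothetical, the
bucket rule follows the common SQL WIDTH_BUCKET definition (0 below the
range, num_buckets + 1 at or above it), and the error handling is
assumed. Because all three values share one scale, the factor 10^scale
cancels in the ratio, so 'num_buckets' never has to be cast to a
decimal.

```python
def width_bucket_unscaled(expr: int, min_val: int, max_val: int,
                          num_buckets: int) -> int:
    """Compute the bucket from unscaled decimal values (plain integers).

    All three decimal inputs are assumed to share the same scale, so
    their unscaled integers can be compared and subtracted directly.
    """
    if num_buckets <= 0:
        raise ValueError("num_buckets must be positive")
    if min_val >= max_val:
        raise ValueError("min_val must be less than max_val")
    if expr < min_val:
        return 0                      # below the range
    if expr >= max_val:
        return num_buckets + 1        # at or above the range
    # Integer arithmetic only; Python ints do not overflow.
    return (expr - min_val) * num_buckets // (max_val - min_val) + 1


# 12.34 in the range [0.00, 100.00) with 10 buckets, scale 2:
print(width_bucket_unscaled(1234, 0, 10000, 10))  # 2
```

Python integers have arbitrary precision, so this sketch hides the
intermediate-width concerns a fixed-width implementation has to handle.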

I included the Python test cases from IMPALA-7202 developed
by Taras Bobrovytsky.

For the performance test, I used the following query:

SELECT sum(width_bucket(cast(l_orderkey AS DECIMAL(30, 10)),
           0, 5500000, 1000000))
FROM tpch_parquet.lineitem;

The new implementation executed it in ~0.3 seconds.
The old implementation executed it in ~0.8 seconds.

Change-Id: I728cc05d9aef8d081e6f2da66146f6d7b75dbb57
Reviewed-on: http://gerrit.cloudera.org:8080/11160
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>


> Add WIDTH_BUCKET() function to the decimal fuzz test
> ----------------------------------------------------
>
>                 Key: IMPALA-7202
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7202
>             Project: IMPALA
>          Issue Type: Task
>          Components: Infrastructure
>            Reporter: Taras Bobrovytsky
>            Assignee: Zoltán Borók-Nagy
>            Priority: Critical
>
> We need to add the new WIDTH_BUCKET() function to the decimal fuzz test for better coverage.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
