Posted to dev@hive.apache.org by "Robbie Zhang (Jira)" <ji...@apache.org> on 2021/09/03 13:45:00 UTC

[jira] [Created] (HIVE-25498) Query with more than 32 count distinct functions returns wrong result

Robbie Zhang created HIVE-25498:
-----------------------------------

             Summary: Query with more than 32 count distinct functions returns wrong result
                 Key: HIVE-25498
                 URL: https://issues.apache.org/jira/browse/HIVE-25498
             Project: Hive
          Issue Type: Bug
            Reporter: Robbie Zhang


If a query contains more than 32 "COUNT(DISTINCT COL)" functions, every one of those COUNT functions returns 0 instead of the correct value.

Here are the queries to reproduce this issue:
{code:sql}
set hive.cbo.enable=true;
create table test_count (c0 string, c1 string, c2 string, c3 string, c4 string, c5 string, c6 string, c7 string, c8 string, c9 string, c10 string, c11 string, c12 string, c13 string, c14 string, c15 string, c16 string, c17 string, c18 string, c19 string, c20 string, c21 string, c22 string, c23 string, c24 string, c25 string, c26 string, c27 string, c28 string, c29 string, c30 string, c31 string, c32 string);
INSERT INTO test_count values ('c0', 'c1', 'c2', 'c3', 'c4', 'c5', 'c6', 'c7', 'c8', 'c9', 'c10', 'c11', 'c12', 'c13', 'c14', 'c15', 'c16', 'c17', 'c18', 'c19', 'c20', 'c21', 'c22', 'c23', 'c24', 'c25', 'c26', 'c27', 'c28', 'c29', 'c30', 'c31', 'c32'); 
select count (distinct c0), count(distinct c1), count(distinct c2), count(distinct c3), count(distinct c4), count(distinct c5), count(distinct c6), count(distinct c7), count(distinct c8), count(distinct c9), count(distinct c10), count(distinct c11), count(distinct c12), count(distinct c13), count(distinct c14), count(distinct c15), count(distinct c16), count(distinct c17), count(distinct c18), count(distinct c19), count(distinct c20), count(distinct c21), count(distinct c22), count(distinct c23), count(distinct c24), count(distinct c25), count(distinct c26), count(distinct c27), count(distinct c28), count(distinct c29), count(distinct c30), count(distinct c31), count(distinct c32) from test_count;
{code}
This bug is caused by HiveExpandDistinctAggregatesRule.getGroupingIdValue(), which uses the int type. An int holds only 32 bits, so when there are more than 32 groupings the value overflows.
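
The overflow can be illustrated in isolation. The following is only a minimal sketch, not the actual Hive code: it assumes the grouping ID is built by setting one bit per grouping position with a left shift. The class GroupingIdOverflow and the helpers bitAsInt and bitAsLong are hypothetical names introduced for this example.
{code:java}
```java
public class GroupingIdOverflow {

    // Hypothetical helper: the bit for grouping position `pos`, computed as an int.
    // Java masks an int shift count to its low 5 bits, so pos = 32 acts like pos = 0.
    static int bitAsInt(int pos) {
        return 1 << pos;
    }

    // Hypothetical helper: the same bit computed as a long (shift count masked to 6 bits).
    static long bitAsLong(int pos) {
        return 1L << pos;
    }

    public static void main(String[] args) {
        // Position 31 lands on the sign bit of an int.
        System.out.println(bitAsInt(31));   // -2147483648
        // Position 32 wraps around and collides with position 0.
        System.out.println(bitAsInt(32));   // 1
        // A long keeps position 32 distinct.
        System.out.println(bitAsLong(32));  // 4294967296
    }
}
```
{code}
Under this assumption, the 33rd grouping gets the same bit as the first one, so grouping IDs are no longer unique. Widening the computation to long would keep them distinct for up to 64 groupings. Whether that matches the eventual fix is not stated in this report.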



--
This message was sent by Atlassian Jira
(v8.3.4#803005)