Posted to issues@drill.apache.org by "manabu nagamine (Jira)" <ji...@apache.org> on 2022/05/22 23:51:00 UTC

[jira] [Created] (DRILL-8231) Wrong result in the COUNT function position.

manabu nagamine created DRILL-8231:
--------------------------------------

             Summary: Wrong result in the COUNT function position.
                 Key: DRILL-8231
                 URL: https://issues.apache.org/jira/browse/DRILL-8231
             Project: Apache Drill
          Issue Type: Improvement
    Affects Versions: 1.18.0
            Reporter: manabu nagamine
         Attachments: drill.zip

Hi Team.
We are using Drill 1.18.

The following two queries return different values for COL4452, the COUNT(DISTINCT val2) column.
The only difference between them is the order of COL4452 and COL6408 in the select list.
{code:java}
-- 1. COL4452 first
select COUNT(DISTINCT val2) COL4452,
       SUM(CAST(val11 as BIGINT)+CAST(val12 as BIGINT)) COL6408
from dfs.root.`/drill/data/*/log_15872_R_79_*.parquet`
WHERE 1 = 1
  and ( ( dir0 between '01' and '10' ) )
  and ( LOG_DATE >= '2022-04-01 00:00:00.000000' and LOG_DATE <= '2022-04-30 23:59:59.000000');

-- 2. COL6408 first
select SUM(CAST(val11 as BIGINT)+CAST(val12 as BIGINT)) COL6408,
       COUNT(DISTINCT val2) COL4452
from dfs.root.`/drill/data/*/log_15872_R_79_*.parquet`
WHERE 1 = 1
  and ( ( dir0 between '01' and '10' ) )
  and ( LOG_DATE >= '2022-04-01 00:00:00.000000' and LOG_DATE <= '2022-04-30 23:59:59.000000');
{code}
Checked against the actual data, the count returned by query 1, where COL4452 comes first, is the correct one.
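One possible way to narrow this down is to compute the distinct count on its own, against the same files and the same filter, and compare it with the two values above. The queries below are only a sketch: they reuse the path and predicates from the queries above, the aliases CNT_ONLY and CNT_GROUPED and the subquery alias t are hypothetical names introduced here, and the GROUP BY form in B is an alternative way of writing the distinct count, not something confirmed to avoid the problem. No results are shown because these have not been run.
{code:java}
-- A. Distinct count alone, without the SUM column
select COUNT(DISTINCT val2) CNT_ONLY
from dfs.root.`/drill/data/*/log_15872_R_79_*.parquet`
WHERE 1 = 1
  and ( ( dir0 between '01' and '10' ) )
  and ( LOG_DATE >= '2022-04-01 00:00:00.000000' and LOG_DATE <= '2022-04-30 23:59:59.000000');

-- B. Same count written as GROUP BY in a subquery.
--    NULL val2 rows are filtered out because COUNT(DISTINCT ...) ignores NULLs.
select COUNT(*) CNT_GROUPED
from (
  select val2
  from dfs.root.`/drill/data/*/log_15872_R_79_*.parquet`
  WHERE 1 = 1
    and ( ( dir0 between '01' and '10' ) )
    and ( LOG_DATE >= '2022-04-01 00:00:00.000000' and LOG_DATE <= '2022-04-30 23:59:59.000000')
    and val2 is not null
  group by val2
) t;
{code}
If A and B agree with query 1 but not with query 2, that would suggest the difference comes from how the two aggregates are combined in one select list rather than from the data or the filter.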
I do not understand what causes this difference.

Can anybody help me? Thanks in advance.

I have attached the Parquet log files (drill.zip).



--
This message was sent by Atlassian Jira
(v8.20.7#820007)