You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@drill.apache.org by "James Turton (Jira)" <ji...@apache.org> on 2023/02/21 11:40:00 UTC

[jira] [Closed] (DRILL-8381) Add support for filtered aggregate calls

     [ https://issues.apache.org/jira/browse/DRILL-8381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

James Turton closed DRILL-8381.
-------------------------------

> Add support for filtered aggregate calls
> ----------------------------------------
>
>                 Key: DRILL-8381
>                 URL: https://issues.apache.org/jira/browse/DRILL-8381
>             Project: Apache Drill
>          Issue Type: New Feature
>    Affects Versions: 1.20.3
>            Reporter: Vova Vysotskyi
>            Assignee: Vova Vysotskyi
>            Priority: Major
>             Fix For: 1.21.0
>
>
> Currently, Drill ignores filters for filtered aggregate calls and returns incorrect results.
> Here is the example query for which Drill will return incorrect results:
> {code:sql}
> SELECT count(n_name) FILTER(WHERE n_regionkey = 1) AS nations_count_in_1_region,
> count(n_name) FILTER(WHERE n_regionkey = 2) AS nations_count_in_2_region,
> count(n_name) FILTER(WHERE n_regionkey = 3) AS nations_count_in_3_region,
> count(n_name) FILTER(WHERE n_regionkey = 4) AS nations_count_in_4_region,
> count(n_name) FILTER(WHERE n_regionkey = 0) AS nations_count_in_0_region
> FROM cp.`tpch/nation.parquet`
> {code}
> {noformat}
> +---------------------------+---------------------------+---------------------------+---------------------------+---------------------------+
> | nations_count_in_1_region | nations_count_in_2_region | nations_count_in_3_region | nations_count_in_4_region | nations_count_in_0_region |
> +---------------------------+---------------------------+---------------------------+---------------------------+---------------------------+
> | 25                        | 25                        | 25                        | 25                        | 25                        |
> +---------------------------+---------------------------+---------------------------+---------------------------+---------------------------+
> {noformat}
> But the correct result is
> {noformat}
> +---------------------------+---------------------------+---------------------------+---------------------------+---------------------------+
> | nations_count_in_1_region | nations_count_in_2_region | nations_count_in_3_region | nations_count_in_4_region | nations_count_in_0_region |
> +---------------------------+---------------------------+---------------------------+---------------------------+---------------------------+
> | 5                         | 5                         | 5                         | 5                         | 5                         |
> +---------------------------+---------------------------+---------------------------+---------------------------+---------------------------+
> {noformat}
> Side note:
> The query above could be rewritten using PIVOT:
> {code:sql}
> SELECT `1` nations_count_in_1_region, `2` nations_count_in_2_region, `3` nations_count_in_3_region, `4` nations_count_in_4_region, `0` nations_count_in_0_region
> FROM (SELECT n_name, n_regionkey FROM cp.`tpch/nation.parquet`) 
> PIVOT(count(n_name) FOR n_regionkey IN (0, 1, 2, 3, 4))
> {code}
> And will return correct results when this issue is fixed and Calcite is updated to 1.33.0



--
This message was sent by Atlassian Jira
(v8.20.10#820010)