Posted to dev@hive.apache.org by "Gopal Vijayaraghavan (Jira)" <ji...@apache.org> on 2021/04/05 21:13:00 UTC

[jira] [Created] (HIVE-24976) CBO: count(distinct) in a window function fails CBO

Gopal Vijayaraghavan created HIVE-24976:
-------------------------------------------

             Summary: CBO: count(distinct) in a window function fails CBO
                 Key: HIVE-24976
                 URL: https://issues.apache.org/jira/browse/HIVE-24976
             Project: Hive
          Issue Type: Bug
          Components: CBO
            Reporter: Gopal Vijayaraghavan


{code}
create temporary table tmp_tbl(
`rule_id` string,
`severity` string,
`alert_id` string,
`alert_type` string);

explain cbo
select `k`.`rule_id`,
count(distinct `k`.`alert_id`) over(partition by `k`.`rule_id`) `subj_cnt`
from tmp_tbl k
;


explain
select `k`.`rule_id`,
count(distinct `k`.`alert_id`) over(partition by `k`.`rule_id`) `subj_cnt`
from tmp_tbl k
;
{code}

CBO fails on this query because the count(distinct) is not recognized as belonging to a windowing operation.

As a result, the planner throws the following exception:

{code}
            throw new CalciteSemanticException("Distinct without an aggregation.",
                    UnsupportedFeature.Distinct_without_an_aggreggation);
{code}

https://github.com/apache/hive/blob/73c3770d858b063c69dea6c64a759f8fdacad460/ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java#L4914

Because the query falls out of CBO, it cannot be rewritten to use a materialized view that already exists in the system. The MV does not contain this expression; it represents a complex transform from a JSON structure into a columnar layout.
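
For reference, the result the windowed query is expected to produce can be sketched with an aggregate-and-join rewrite that avoids count(distinct) inside over(). This is only an illustration, not a verified workaround: it has not been run against Hive or checked against CBO, it runs on SQLite through Python's sqlite3 module, and the sample rows and the run_rewrite helper are made up for the example.

```python
import sqlite3

# Hypothetical sample rows; the column layout mirrors tmp_tbl from the report.
ROWS = [
    ("r1", "high", "a1", "t"),
    ("r1", "high", "a1", "t"),
    ("r1", "low", "a2", "t"),
    ("r2", "low", "a3", "t"),
]

# Compute the distinct count once per rule_id, then join it back so that
# every input row keeps its own output row, as the windowed form would.
REWRITTEN = """
select k.rule_id, c.subj_cnt
from tmp_tbl k
join (
  select rule_id, count(distinct alert_id) as subj_cnt
  from tmp_tbl
  group by rule_id
) c on k.rule_id = c.rule_id
order by k.rule_id
"""


def run_rewrite(rows):
    """Load rows into an in-memory SQLite table and run the rewritten query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table tmp_tbl("
        "rule_id text, severity text, alert_id text, alert_type text)"
    )
    conn.executemany("insert into tmp_tbl values (?, ?, ?, ?)", rows)
    result = conn.execute(REWRITTEN).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in run_rewrite(ROWS):
        print(row)
```

Note that the equality join drops rows whose rule_id is NULL, whereas partition by would place them in a partition of their own, so the two forms agree only when rule_id is never NULL.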




--
This message was sent by Atlassian Jira
(v8.3.4#803005)