Posted to dev@calcite.apache.org by Vladimir Sitnikov <si...@gmail.com> on 2020/01/04 19:58:32 UTC

[DISCUSS] MaterializationTest#testAggregateMaterializationOnCountDistinctQuery1 is very fragile

Hi,

It looks like testAggregateMaterializationOnCountDistinctQuery1 is invalid.

The test creates a materialization for
select deptno, empid, salary from emps group by deptno, empid, salary

Then it issues the SQL:

select deptno, count(distinct empid) as c from (
select deptno, empid
from emps
group by deptno, empid)
group by deptno


The expected plan is
EnumerableAggregate(group=[{0}], C=[COUNT($1)])
  EnumerableTableScan(table=[[hr, m0]])

However, that does not work if the optimizer knows that emps.empid is a
unique key for the table.
In that case the materialized view is created as "select deptno, empid,
salary from emps" (because the grouping is not needed),
and the materialized view loses the uniqueness information, so the
optimizer can't effectively use the materialized view later (see
https://issues.apache.org/jira/browse/CALCITE-3682 ).
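To make the point concrete, here is a minimal sketch that uses SQLite (via Python's sqlite3 module) rather than Calcite, so it only illustrates the SQL semantics, not the planner behaviour. The sample rows, the REAL type for salary, and the pk_columns helper are made up for illustration. It shows that when empid is a key the grouping in the view definition is a no-op, that a table materialized from the view no longer carries the key, and that COUNT(empid) over m0 still matches the original query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data; empid is declared as the unique key of emps.
conn.executescript("""
    CREATE TABLE emps (empid INTEGER PRIMARY KEY, deptno INTEGER, salary REAL);
    INSERT INTO emps VALUES
        (100, 10, 10000), (110, 10, 11500), (150, 10, 7000), (200, 20, 8000);
""")

# View definition from the test, with and without the (redundant) grouping.
grouped = conn.execute(
    "SELECT deptno, empid, salary FROM emps "
    "GROUP BY deptno, empid, salary ORDER BY empid").fetchall()
plain = conn.execute(
    "SELECT deptno, empid, salary FROM emps ORDER BY empid").fetchall()

# Materialize the simplified definition; the new table has no declared key.
conn.execute("CREATE TABLE m0 AS SELECT deptno, empid, salary FROM emps")

def pk_columns(table):
    """Names of the columns SQLite reports as part of the primary key."""
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return [name for (_cid, name, _type, _notnull, _default, pk) in rows if pk]

# Query issued by the test.
original = conn.execute("""
    SELECT deptno, COUNT(DISTINCT empid) AS c FROM (
        SELECT deptno, empid FROM emps GROUP BY deptno, empid)
    GROUP BY deptno ORDER BY deptno""").fetchall()

# Shape of the expected plan: a plain COUNT over the materialization.
from_mv = conn.execute(
    "SELECT deptno, COUNT(empid) AS c FROM m0 "
    "GROUP BY deptno ORDER BY deptno").fetchall()

print(grouped == plain)                      # grouping on a key changes nothing
print(pk_columns("emps"), pk_columns("m0"))  # key is known on emps, not on m0
print(original, from_mv)
```

The plain COUNT over m0 is only correct because (deptno, empid) is unique in m0, which is exactly the fact that is no longer recorded once the grouping has been dropped from the view definition.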

I'm inclined to either disable the test or remove empid from the grouping
columns.
However, if I remove empid, then the distinct should probably be removed as
well.

Any thoughts?

Vladimir