Posted to dev@calcite.apache.org by "Stamatis Zampetakis (Jira)" <ji...@apache.org> on 2020/09/02 10:41:00 UTC

[jira] [Created] (CALCITE-4213) Druid plans with small intervals should be chosen over full interval scan plus filter

Stamatis Zampetakis created CALCITE-4213:
--------------------------------------------

             Summary: Druid plans with small intervals should be chosen over full interval scan plus filter
                 Key: CALCITE-4213
                 URL: https://issues.apache.org/jira/browse/CALCITE-4213
             Project: Calcite
          Issue Type: Bug
          Components: druid-adapter
            Reporter: Stamatis Zampetakis


The problem surfaced as a failure of DruidAdapterIT#testFilterTimestamp for the following query:
{code:sql}
select count(*) as c
from "foodmart"
where extract(year from "timestamp") = 1997
and extract(month from "timestamp") in (4, 6)
{code}
+Expected+
{noformat}
EnumerableInterpreter
  DruidQuery(table=[[foodmart, foodmart]], intervals=[[1997-04-01T00:00:00.000Z/1997-05-01T00:00:00.000Z, 1997-06-01T00:00:00.000Z/1997-07-01T00:00:00.000Z]], projects=[[0]], groups=[{}], aggs=[[COUNT()]])
{noformat}
+Actual+
{noformat}
EnumerableInterpreter
  DruidQuery(table=[[foodmart, foodmart]], intervals=[[1900-01-09T00:00:00.000Z/2992-01-10T00:00:00.000Z]], filter=[AND(=(EXTRACT(FLAG(YEAR), $0), 1997), OR(=(EXTRACT(FLAG(MONTH), $0), 4), =(EXTRACT(FLAG(MONTH), $0), 6)))], groups=[{}], aggs=[[COUNT()]])
{noformat}

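Observe that the actual plan scans a single interval (1900-01-09/2992-01-10) that covers essentially all data and evaluates the EXTRACT conditions as a filter, so in most cases it is less efficient than the expected plan, which restricts the scan to April and June 1997.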
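
For illustration only, the following Python sketch shows how the year and month conditions of the query correspond to the two half-open intervals in the expected plan. It is not Calcite code: the helper names month_intervals and fmt are hypothetical, and UTC is assumed because the plan prints its timestamps with a Z suffix.

```python
from datetime import datetime, timezone


def fmt(ts):
    # Render a timestamp in the form used in the plans above,
    # e.g. 1997-04-01T00:00:00.000Z (hypothetical helper).
    return ts.strftime("%Y-%m-%dT%H:%M:%S.000Z")


def month_intervals(year, months):
    # Build one half-open interval [start of month, start of next month)
    # per requested month (hypothetical helper, assumes UTC).
    intervals = []
    for month in sorted(months):
        start = datetime(year, month, 1, tzinfo=timezone.utc)
        # December rolls over to January of the following year.
        end = datetime(year + month // 12, month % 12 + 1, 1, tzinfo=timezone.utc)
        intervals.append(f"{fmt(start)}/{fmt(end)}")
    return intervals


# extract(year) = 1997 and extract(month) in (4, 6)
for interval in month_intervals(1997, [4, 6]):
    print(interval)
```

Each interval covers exactly one month, so a scan restricted to these intervals reads only the April and June 1997 data instead of the full 1900 to 2992 range.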



--
This message was sent by Atlassian Jira
(v8.3.4#803005)