Posted to issues@calcite.apache.org by "Jesus Camacho Rodriguez (JIRA)" <ji...@apache.org> on 2017/07/10 18:22:00 UTC
[jira] [Resolved] (CALCITE-1828) Push the FILTER clause into Druid as a Filtered Aggregator
[ https://issues.apache.org/jira/browse/CALCITE-1828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jesus Camacho Rodriguez resolved CALCITE-1828.
----------------------------------------------
Resolution: Fixed
Fix Version/s: 1.14.0
Fixed in http://git-wip-us.apache.org/repos/asf/calcite/commit/551b562, thanks [~zhumayun]!
> Push the FILTER clause into Druid as a Filtered Aggregator
> -----------------------------------------------------------
>
> Key: CALCITE-1828
> URL: https://issues.apache.org/jira/browse/CALCITE-1828
> Project: Calcite
> Issue Type: Improvement
> Components: druid
> Affects Versions: 1.12.0
> Reporter: Zain Humayun
> Assignee: Zain Humayun
> Fix For: 1.14.0
>
>
> Druid supports a special aggregator it calls the [Filtered Aggregator|http://druid.io/docs/latest/querying/aggregations.html], which lets each aggregation apply its own filter, independent of the other filters in the Druid query.
> An example where the filtered aggregator is useful:
> {code:sql}
> SELECT
> sum("col1") FILTER (WHERE <condition1>),
> sum("col2") FILTER (WHERE <condition2>)
> FROM "table";
> {code}
> Currently, Calcite will scan Druid, then do the filtering and aggregation itself. With filtered aggregators, both the filter and the aggregation can be pushed into Druid.
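For illustration, a minimal sketch of what the pushed-down Druid query could look like for the SQL above. The "filtered" aggregator and "selector" filter shapes follow Druid's JSON query format; everything else is an assumption made for the example: the dimensions dim1/dim2 and values 'a'/'b' stand in for <condition1> and <condition2>, and the doubleSum type, the timeseries query type and the interval are hypothetical choices, not what the Calcite adapter necessarily emits.

```python
import json


def filtered_sum(name, field, dimension, value):
    """Wrap a doubleSum aggregator in a Druid 'filtered' aggregator.

    The selector filter (dimension = value) is a hypothetical stand-in
    for the <condition> placeholders in the SQL example.
    """
    return {
        "type": "filtered",
        "filter": {"type": "selector", "dimension": dimension, "value": value},
        "aggregator": {"type": "doubleSum", "name": name, "fieldName": field},
    }


# One filtered aggregator per SQL aggregate; each carries its own filter.
aggregations = [
    filtered_sum("sum_col1", "col1", "dim1", "a"),  # assumed <condition1>
    filtered_sum("sum_col2", "col2", "dim2", "b"),  # assumed <condition2>
]

# Assumed query envelope: a timeseries query over an arbitrary wide interval.
query = {
    "queryType": "timeseries",
    "dataSource": "table",
    "granularity": "all",
    "intervals": ["1900-01-01T00:00:00.000Z/3000-01-01T00:00:00.000Z"],
    "aggregations": aggregations,
}

print(json.dumps(query, indent=2))
```

Note that the query carries no top-level "filter": each condition travels with its own aggregator, which is what allows the two sums to use different filters in a single Druid query.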
> *A few comments/questions:*
> 1) If all conditions in the filter clauses are the same, then instead of pushing filtered aggregators individually, it would make more sense to push one single filter into the Druid query, i.e. the filters can be factored out into one filter. I don't see Calcite currently doing this. Does it already have such a rule in place?
> 2) The filters can/should only be pushed if they are filtering on dimension columns.
> 3) Currently, the above query would create the following relation:
> DruidQuery -> Project -> Aggregate. There is already a rule called {{DruidAggregateProjectRule}} which matches the previous relation. Is it better to add logic to that rule, or to create a new rule that also matches that relation?
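A rough sketch of the factoring idea from point 1), written as plain Python over Druid-style JSON dictionaries. This is not Calcite's rule or any existing API: plan_aggregations and the sample aggregators and filters are hypothetical names introduced only for the example. It also assumes a query without GROUP BY; with grouping, a query-level filter can drop groups that a per-aggregate FILTER would keep, so the rewrite is not always equivalent.

```python
def plan_aggregations(aggs):
    """Decide where the filters go.

    aggs is a list of (aggregator, filter_or_None) pairs.
    Returns (query_filter, aggregations):
      - if every aggregate has the same non-empty filter, that filter is
        factored out to the query level and the aggregators stay plain;
      - otherwise each filtered aggregate is wrapped in a 'filtered'
        aggregator and no query-level filter is produced.
    """
    filters = [f for _, f in aggs]
    first = filters[0] if filters else None
    if first is not None and all(f == first for f in filters):
        return first, [a for a, _ in aggs]
    wrapped = [
        a if f is None else {"type": "filtered", "filter": f, "aggregator": a}
        for a, f in aggs
    ]
    return None, wrapped


# Hypothetical aggregators and filters for the two sums in the example query.
sum1 = {"type": "doubleSum", "name": "sum_col1", "fieldName": "col1"}
sum2 = {"type": "doubleSum", "name": "sum_col2", "fieldName": "col2"}
cond_a = {"type": "selector", "dimension": "dim1", "value": "a"}
cond_b = {"type": "selector", "dimension": "dim2", "value": "b"}

# Same condition on both aggregates: one query-level filter.
same_filter, same_aggs = plan_aggregations([(sum1, cond_a), (sum2, dict(cond_a))])

# Different conditions: two filtered aggregators, no query-level filter.
diff_filter, diff_aggs = plan_aggregations([(sum1, cond_a), (sum2, cond_b)])
```

In the first call the shared condition becomes the query's filter and both sums stay unwrapped; in the second, each sum keeps its own condition inside a filtered aggregator.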
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)