You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@calcite.apache.org by Muhammad Gelbana <m....@gmail.com> on 2017/06/01 11:20:02 UTC

How to pushdown aggregates to JDBC datasources

​​
The following query
*SELECT COUNT(**CNTRn**) FROM incorta.SLS.CNTR*

Is translated to the following
*SELECT COUNT("**CNTRn**") AS "EXPR$0"*
*FROM (SELECT "**CNTRn**" FROM "**SLS**"."**CNTR**") AS "t"*

But I want to pushdown the whole aggregation so the output query would be
*SELECT COUNT("**CNTRn**") FROM "**SLS**"."**CNTR**"*

To do so, I wrote a rule that matches *LogicalAggregate* with input
*JdbcProject*, and transformed the *LogicalAggregate* to a *JdbcAggregate*
node with it's input being the *JdbcProject*.

public void onMatch(RelOptRuleCall call) {
>     LogicalAggregate aggregate = call.rel(0);
>     RelNode input = ((HepRelVertex) aggregate.getInput()).getCurrentRel();
>     JdbcAggregate newAggregate;
>     try {
>         newAggregate = new JdbcRules.JdbcAggregate(aggregate.getCluster(),
> aggregate.getTraitSet().replace(this.outConvention), *input*,
> aggregate.indicator, aggregate.getGroupSet(), aggregate.getGroupSets(),
> aggregate.getAggCallList());
>         call.transformTo(newAggregate);
>     } catch (InvalidRelException e) {
>         e.printStackTrace();
>     }
> }


But that threw the following exception because the aggregation class type
of the aggregate call is of type *DrillCalciteSqlAggFunctionWrapper* while
*JdbcRules.AGG_FUNCS* (i.e. *ImmutableList*) doesn't contain this type
(Check *org.apache.calcite.adapter.jdbc.JdbcRules.JdbcAggregate.canImplement(SqlAggFunction,
SqlDialect)* for details)

org.apache.calcite.rel.InvalidRelException: cannot implement aggregate
> function COUNT
> at
> org.apache.calcite.adapter.jdbc.JdbcRules$JdbcAggregate.<init>(JdbcRules.java:710)


So I tried providing a different aggregate call list *manually*:

public void onMatch(RelOptRuleCall call) {
>     LogicalAggregate aggregate = call.rel(0);
>     RelNode input = ((HepRelVertex) aggregate.getInput()).getCurrentRel();
>     JdbcAggregate newAggregate;
>     try {
>         AggregateCall agg = aggregate.getAggCallList().get(0);
>         AggregateCall newAgg =
> AggregateCall.create(SqlStdOperatorTable.COUNT, agg.isDistinct(),
> agg.getArgList(), agg.filterArg, agg.type, agg.name);
>         List<AggregateCall> *sqlAggCallList* = Arrays.asList(newAgg);
>         newAggregate = new JdbcRules.JdbcAggregate(aggregate.getCluster(),
> aggregate.getTraitSet().replace(this.outConvention), input,
> aggregate.indicator, aggregate.getGroupSet(), aggregate.getGroupSets(),
> *sqlAggCallList*);
>         call.transformTo(newAggregate);
>     } catch (InvalidRelException e) {
>         e.printStackTrace();
>     }
> }


And this output the following query
*SELECT COUNT("**CNTRn**") AS "EXPR$0"*
*FROM (SELECT "**CNTRn**" FROM "**SLS**"."**CNTR**") AS "t"*

But this isn't the syntax I was hoping for because the datasource I'm using
doesn't support subqueries.

Again, this is what I'm aiming for
*SELECT COUNT("**CNTRn**") FROM "**SLS**"."**CNTR**"*

*---------------------*
*Muhammad Gelbana*
http://www.linkedin.com/in/mgelbana

Re: How to pushdown aggregates to JDBC datasources

Posted by Muhammad Gelbana <m....@gmail.com>.
You're correct, my mistake.

Now I unwrapped this drill-specific class and converted the
*LogicalAggregate* to *JdbcAggregate* my self. But the unparsed SQL is
still using a subquery.

Drill seems to use a custom build of Calcite v.1.4. The difference between
the *implement* method of JdbcAggregate in v1.4 and the master branch is
considerably different.

Thank you.


*---------------------*
*Muhammad Gelbana*
http://www.linkedin.com/in/mgelbana

On Thu, Jun 1, 2017 at 5:31 PM, Julian Hyde <jh...@apache.org> wrote:

> There is already a rule that does this: JdbcRules.JdbcAggregateRule.
>
> I don’t know what DrillCalciteSqlAggFunctionWrapper is; it seems
> Drill-specific.
>
> > On Jun 1, 2017, at 4:20 AM, Muhammad Gelbana <m....@gmail.com>
> wrote:
> >
> > ​​
> > The following query
> > *SELECT COUNT(**CNTRn**) FROM incorta.SLS.CNTR*
> >
> > Is translated to the following
> > *SELECT COUNT("**CNTRn**") AS "EXPR$0"*
> > *FROM (SELECT "**CNTRn**" FROM "**SLS**"."**CNTR**") AS "t"*
> >
> > But I want to pushdown the whole aggregation so the output query would be
> > *SELECT COUNT("**CNTRn**") FROM "**SLS**"."**CNTR**"*
> >
> > To do so, I wrote a rule that matches *LogicalAggregate* with input
> > *JdbcProject*, and transformed the *LogicalAggregate* to a
> *JdbcAggregate*
> > node with it's input being the *JdbcProject*.
> >
> > public void onMatch(RelOptRuleCall call) {
> >>    LogicalAggregate aggregate = call.rel(0);
> >>    RelNode input = ((HepRelVertex) aggregate.getInput()).
> getCurrentRel();
> >>    JdbcAggregate newAggregate;
> >>    try {
> >>        newAggregate = new JdbcRules.JdbcAggregate(
> aggregate.getCluster(),
> >> aggregate.getTraitSet().replace(this.outConvention), *input*,
> >> aggregate.indicator, aggregate.getGroupSet(), aggregate.getGroupSets(),
> >> aggregate.getAggCallList());
> >>        call.transformTo(newAggregate);
> >>    } catch (InvalidRelException e) {
> >>        e.printStackTrace();
> >>    }
> >> }
> >
> >
> > But that threw the following exception because the aggregation class type
> > of the aggregate call is of type *DrillCalciteSqlAggFunctionWrapper*
> while
> > *JdbcRules.AGG_FUNCS* (i.e. *ImmutableList*) doesn't contain this type
> > (Check *org.apache.calcite.adapter.jdbc.JdbcRules.JdbcAggregate.
> canImplement(SqlAggFunction,
> > SqlDialect)* for details)
> >
> > org.apache.calcite.rel.InvalidRelException: cannot implement aggregate
> >> function COUNT
> >> at
> >> org.apache.calcite.adapter.jdbc.JdbcRules$JdbcAggregate.<
> init>(JdbcRules.java:710)
> >
> >
> > So I tried providing a different aggregate call list *manually*:
> >
> > public void onMatch(RelOptRuleCall call) {
> >>    LogicalAggregate aggregate = call.rel(0);
> >>    RelNode input = ((HepRelVertex) aggregate.getInput()).
> getCurrentRel();
> >>    JdbcAggregate newAggregate;
> >>    try {
> >>        AggregateCall agg = aggregate.getAggCallList().get(0);
> >>        AggregateCall newAgg =
> >> AggregateCall.create(SqlStdOperatorTable.COUNT, agg.isDistinct(),
> >> agg.getArgList(), agg.filterArg, agg.type, agg.name);
> >>        List<AggregateCall> *sqlAggCallList* = Arrays.asList(newAgg);
> >>        newAggregate = new JdbcRules.JdbcAggregate(
> aggregate.getCluster(),
> >> aggregate.getTraitSet().replace(this.outConvention), input,
> >> aggregate.indicator, aggregate.getGroupSet(), aggregate.getGroupSets(),
> >> *sqlAggCallList*);
> >>        call.transformTo(newAggregate);
> >>    } catch (InvalidRelException e) {
> >>        e.printStackTrace();
> >>    }
> >> }
> >
> >
> > And this output the following query
> > *SELECT COUNT("**CNTRn**") AS "EXPR$0"*
> > *FROM (SELECT "**CNTRn**" FROM "**SLS**"."**CNTR**") AS "t"*
> >
> > But this isn't the syntax I was hoping for because the datasource I'm
> using
> > doesn't support subqueries.
> >
> > Again, this is what I'm aiming for
> > *SELECT COUNT("**CNTRn**") FROM "**SLS**"."**CNTR**"*
> >
> > *---------------------*
> > *Muhammad Gelbana*
> > http://www.linkedin.com/in/mgelbana
>
>

Re: How to pushdown aggregates to JDBC datasources

Posted by Julian Hyde <jh...@apache.org>.
There is already a rule that does this: JdbcRules.JdbcAggregateRule.

I don’t know what DrillCalciteSqlAggFunctionWrapper is; it seems Drill-specific.

> On Jun 1, 2017, at 4:20 AM, Muhammad Gelbana <m....@gmail.com> wrote:
> 
> ​​
> The following query
> *SELECT COUNT(**CNTRn**) FROM incorta.SLS.CNTR*
> 
> Is translated to the following
> *SELECT COUNT("**CNTRn**") AS "EXPR$0"*
> *FROM (SELECT "**CNTRn**" FROM "**SLS**"."**CNTR**") AS "t"*
> 
> But I want to pushdown the whole aggregation so the output query would be
> *SELECT COUNT("**CNTRn**") FROM "**SLS**"."**CNTR**"*
> 
> To do so, I wrote a rule that matches *LogicalAggregate* with input
> *JdbcProject*, and transformed the *LogicalAggregate* to a *JdbcAggregate*
> node with it's input being the *JdbcProject*.
> 
> public void onMatch(RelOptRuleCall call) {
>>    LogicalAggregate aggregate = call.rel(0);
>>    RelNode input = ((HepRelVertex) aggregate.getInput()).getCurrentRel();
>>    JdbcAggregate newAggregate;
>>    try {
>>        newAggregate = new JdbcRules.JdbcAggregate(aggregate.getCluster(),
>> aggregate.getTraitSet().replace(this.outConvention), *input*,
>> aggregate.indicator, aggregate.getGroupSet(), aggregate.getGroupSets(),
>> aggregate.getAggCallList());
>>        call.transformTo(newAggregate);
>>    } catch (InvalidRelException e) {
>>        e.printStackTrace();
>>    }
>> }
> 
> 
> But that threw the following exception because the aggregation class type
> of the aggregate call is of type *DrillCalciteSqlAggFunctionWrapper* while
> *JdbcRules.AGG_FUNCS* (i.e. *ImmutableList*) doesn't contain this type
> (Check *org.apache.calcite.adapter.jdbc.JdbcRules.JdbcAggregate.canImplement(SqlAggFunction,
> SqlDialect)* for details)
> 
> org.apache.calcite.rel.InvalidRelException: cannot implement aggregate
>> function COUNT
>> at
>> org.apache.calcite.adapter.jdbc.JdbcRules$JdbcAggregate.<init>(JdbcRules.java:710)
> 
> 
> So I tried providing a different aggregate call list *manually*:
> 
> public void onMatch(RelOptRuleCall call) {
>>    LogicalAggregate aggregate = call.rel(0);
>>    RelNode input = ((HepRelVertex) aggregate.getInput()).getCurrentRel();
>>    JdbcAggregate newAggregate;
>>    try {
>>        AggregateCall agg = aggregate.getAggCallList().get(0);
>>        AggregateCall newAgg =
>> AggregateCall.create(SqlStdOperatorTable.COUNT, agg.isDistinct(),
>> agg.getArgList(), agg.filterArg, agg.type, agg.name);
>>        List<AggregateCall> *sqlAggCallList* = Arrays.asList(newAgg);
>>        newAggregate = new JdbcRules.JdbcAggregate(aggregate.getCluster(),
>> aggregate.getTraitSet().replace(this.outConvention), input,
>> aggregate.indicator, aggregate.getGroupSet(), aggregate.getGroupSets(),
>> *sqlAggCallList*);
>>        call.transformTo(newAggregate);
>>    } catch (InvalidRelException e) {
>>        e.printStackTrace();
>>    }
>> }
> 
> 
> And this output the following query
> *SELECT COUNT("**CNTRn**") AS "EXPR$0"*
> *FROM (SELECT "**CNTRn**" FROM "**SLS**"."**CNTR**") AS "t"*
> 
> But this isn't the syntax I was hoping for because the datasource I'm using
> doesn't support subqueries.
> 
> Again, this is what I'm aiming for
> *SELECT COUNT("**CNTRn**") FROM "**SLS**"."**CNTR**"*
> 
> *---------------------*
> *Muhammad Gelbana*
> http://www.linkedin.com/in/mgelbana