Posted to dev@calcite.apache.org by Han Mingcong <ha...@hotmail.com> on 2019/09/12 07:06:33 UTC

How does Calcite implement `Column Prune`?

Hi all,
I’ve been learning about query optimization recently. As I understand it, Calcite uses a Volcano-style optimizer, which differs from other optimizers such as SparkSQL’s Catalyst. I’m curious how the Volcano optimizer implements rules like Catalyst’s `ColumnPruning`. Which transformation rule does Calcite use to achieve it?
For example, take this SQL query:

select a from t where b > 10;

If the schema of t is `a int, b int, c int, …`, we only need the two columns ‘a’ and ‘b’ when scanning table ‘t’.


Mingcong Han
2019.9.12

Re: How does Calcite implement `Column Prune`?

Posted by Danny Chan <yu...@gmail.com>.
Hi, Han Mingcong ~

I guess the topic you are touching on is “project push down”.

One way, as Stamatis said, is to use the RelFieldTrimmer, which currently belongs to the sql-to-rel conversion phase. It is off by default (the default value is “false”); you can turn it on through [1].

Another way, as Chunwei said, is to write the rules yourself. You may need the ProjectXXXTransposeRule rules to transpose the project all the way down to the source (scan) node, plus a rule, perhaps named ProjectScanRule, to pass the required fields to the source.


[1] https://github.com/apache/calcite/blob/f95f74a13a20413bb0074f0a3c94901a7a88305c/core/src/main/java/org/apache/calcite/sql2rel/SqlToRelConverter.java#L5681

Best,
Danny Chan

Re: How does Calcite implement `Column Prune`?

Posted by Chunwei Lei <ch...@gmail.com>.
Hi, Han and Stamatis.

In our use case, we use transformation rules like ProjectXXXTransposeRule
to do column pruning.

For example, if we have this SQL query:

      select a from t where b > 10;

the logical plan is:

LogicalProject(a)
    LogicalFilter(b>10)
        LogicalTableScan(a,b,c)


Then we get LogicalTableScan(a,b) after applying
ProjectFilterTransposeRule and ProjectTableScanTransposeRule.
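
To make the two rewrites above concrete, here is a minimal Java sketch on a toy plan tree. The classes Node, Scan, Filter, Project and ProjectPushDownSketch are hypothetical and are not Calcite's API, and each toy filter is assumed to read exactly one column. It only illustrates the idea behind the two rules: push the project below the filter while keeping the column the filter needs, then fold the project into the scan.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Toy plan nodes for illustration only; these are NOT Calcite classes.
abstract class Node {
    abstract String explain();
}

class Scan extends Node {
    final List<String> columns;
    Scan(List<String> columns) { this.columns = columns; }
    String explain() { return "Scan" + columns; }
}

class Filter extends Node {
    final String column;     // assumption: the predicate reads exactly one column
    final String predicate;
    final Node input;
    Filter(String column, String predicate, Node input) {
        this.column = column; this.predicate = predicate; this.input = input;
    }
    String explain() { return "Filter(" + predicate + ") <- " + input.explain(); }
}

class Project extends Node {
    final List<String> columns;
    final Node input;
    Project(List<String> columns, Node input) { this.columns = columns; this.input = input; }
    String explain() { return "Project" + columns + " <- " + input.explain(); }
}

class ProjectPushDownSketch {
    // Project(Filter(x)) => Project(Filter(Project'(x))), where Project' keeps
    // the projected columns plus the column the predicate reads.
    static Node pushProjectBelowFilter(Node n) {
        if (!(n instanceof Project) || !(((Project) n).input instanceof Filter)) return n;
        Project p = (Project) n;
        Filter f = (Filter) p.input;
        Set<String> needed = new LinkedHashSet<>(p.columns);
        needed.add(f.column);
        Node pushed = new Project(new ArrayList<>(needed), f.input);
        return new Project(p.columns, new Filter(f.column, f.predicate, pushed));
    }

    // Project(Scan) => Scan that reads only the projected columns
    // (assumes every projected column exists in the scanned table).
    static Node foldProjectIntoScan(Node n) {
        if (!(n instanceof Project) || !(((Project) n).input instanceof Scan)) return n;
        return new Scan(((Project) n).columns);
    }

    // Tries both rewrites at this node, then recurses into the input.
    static Node optimize(Node n) {
        n = pushProjectBelowFilter(n);
        n = foldProjectIntoScan(n);
        if (n instanceof Project) {
            Project p = (Project) n;
            return new Project(p.columns, optimize(p.input));
        }
        if (n instanceof Filter) {
            Filter f = (Filter) n;
            return new Filter(f.column, f.predicate, optimize(f.input));
        }
        return n;
    }

    public static void main(String[] args) {
        // select a from t where b > 10, with t(a, b, c)
        Node plan = new Project(Arrays.asList("a"),
                new Filter("b", "b > 10", new Scan(Arrays.asList("a", "b", "c"))));
        System.out.println(optimize(plan).explain());
    }
}
```

For the example query, the program prints Project[a] <- Filter(b > 10) <- Scan[a, b], so the toy scan no longer reads column c.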

I hope it helps.


Best,
Chunwei



Re: How does Calcite implement `Column Prune`?

Posted by Stamatis Zampetakis <za...@gmail.com>.
Hi Han,

I guess what you are looking for is RelFieldTrimmer [1]. It is not
implemented as a transformation rule but as a separate optimization phase.
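
As a rough illustration of the difference from a rule, the following Java sketch trims fields in a single top-down pass that carries the set of fields the parent needs. It is only a toy under stated assumptions: Rel and FieldTrimmerSketch are hypothetical names, not Calcite classes, and this is not how RelFieldTrimmer itself is written.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Toy relational operator; NOT a Calcite class. kind is "scan", "filter" or "project".
class Rel {
    final String kind;
    // scan: columns read; filter: columns the predicate uses; project: output columns
    final List<String> fields;
    final Rel input; // null for a scan

    Rel(String kind, List<String> fields, Rel input) {
        this.kind = kind; this.fields = fields; this.input = input;
    }

    @Override
    public String toString() {
        return kind + fields + (input == null ? "" : " <- " + input);
    }
}

class FieldTrimmerSketch {
    // Rebuilds the tree top-down; 'required' is the set of fields the parent uses.
    static Rel trim(Rel rel, Set<String> required) {
        switch (rel.kind) {
            case "project": {
                // Keep only the outputs the parent uses; the input must supply exactly those.
                List<String> kept = new ArrayList<>(rel.fields);
                kept.retainAll(required);
                return new Rel("project", kept, trim(rel.input, new LinkedHashSet<>(kept)));
            }
            case "filter": {
                // The input must supply what the parent uses plus what the predicate reads.
                Set<String> needed = new LinkedHashSet<>(required);
                needed.addAll(rel.fields);
                return new Rel("filter", rel.fields, trim(rel.input, needed));
            }
            case "scan": {
                // Read only the required columns, preserving the table's column order.
                List<String> kept = new ArrayList<>(rel.fields);
                kept.retainAll(required);
                return new Rel("scan", kept, null);
            }
            default:
                throw new IllegalArgumentException("unknown kind: " + rel.kind);
        }
    }

    public static void main(String[] args) {
        // select a from t where b > 10, with t(a, b, c)
        Rel plan = new Rel("project", Arrays.asList("a"),
                new Rel("filter", Arrays.asList("b"),
                        new Rel("scan", Arrays.asList("a", "b", "c"), null)));
        Set<String> wanted = new LinkedHashSet<>(Arrays.asList("a"));
        System.out.println(trim(plan, wanted));
    }
}
```

For the example query this prints project[a] <- filter[b] <- scan[a, b]. Unlike a rule, which matches a local pattern, the pass visits the whole tree once and decides at each node what its input has to produce.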

Best,
Stamatis

[1]
https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/sql2rel/RelFieldTrimmer.java
