Posted to dev@calcite.apache.org by James Daniel <dj...@gmail.com> on 2021/05/18 16:35:35 UTC
Tracking column's origin
Hi, all.
I am trying to rewrite a query plan by removing some nodes, but I am running
into issues with manipulating column ref indexes.
Consider the following Calcite plan:
LogicalProject([...])
  LogicalJoin(condition=[=($0, $4)], joinType=[inner])
    LogicalTableScan(table=[[DB, R]])
    LogicalAggregate(group=[{0}], agg#0=[MIN($1)])
      LogicalProject(a=[$3], $f0=[true])
        LogicalTableScan(table=[[DB, S]])
In the join condition =($0, $4), the RHS column ref $4 actually comes
from $3 of table S.
To find this out, we have to trace down through a few nodes in the tree,
starting from the RHS child of the join node.
But this becomes tricky in more complex situations.
1) Is there a utility class or method that supports this?
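To make the manual walk concrete, here is a minimal sketch of it in plain Java. The classes below (Scan, Project, Aggregate, Join) are hypothetical toy stand-ins, not Calcite classes, and they only handle the node types in the plan above. The sketch assumes that R has four fields (which is what makes $4 the first column of the join's right input) and that S has at least four fields.

```java
/** Toy plan model for illustration only; these are NOT Calcite classes. */
final class ColumnOriginSketch {

  /** A plan node that can report where one of its output columns comes from. */
  interface Node {
    int fieldCount();

    /** Returns "TABLE.$n" for a column copied from a base table, or null if it is computed. */
    String origin(int column);
  }

  static final class Scan implements Node {
    final String table;
    final int fields;

    Scan(String table, int fields) {
      this.table = table;
      this.fields = fields;
    }

    public int fieldCount() { return fields; }

    public String origin(int column) { return table + ".$" + column; }
  }

  /** Project: each output is an input ref (index >= 0) or a computed expression (-1). */
  static final class Project implements Node {
    final Node input;
    final int[] refs;

    Project(Node input, int[] refs) {
      this.input = input;
      this.refs = refs;
    }

    public int fieldCount() { return refs.length; }

    public String origin(int column) {
      int ref = refs[column];
      return ref < 0 ? null : input.origin(ref);
    }
  }

  /** Aggregate: group keys come first in the output row, aggregate calls follow. */
  static final class Aggregate implements Node {
    final Node input;
    final int[] groupKeys;
    final int aggCalls;

    Aggregate(Node input, int[] groupKeys, int aggCalls) {
      this.input = input;
      this.groupKeys = groupKeys;
      this.aggCalls = aggCalls;
    }

    public int fieldCount() { return groupKeys.length + aggCalls; }

    public String origin(int column) {
      // Simplification: aggregate call results are treated as computed.
      return column < groupKeys.length ? input.origin(groupKeys[column]) : null;
    }
  }

  /** Inner join: left fields first, then right fields. */
  static final class Join implements Node {
    final Node left;
    final Node right;

    Join(Node left, Node right) {
      this.left = left;
      this.right = right;
    }

    public int fieldCount() { return left.fieldCount() + right.fieldCount(); }

    public String origin(int column) {
      int leftCount = left.fieldCount();
      return column < leftCount ? left.origin(column) : right.origin(column - leftCount);
    }
  }

  /** Builds the join subtree of the plan above. */
  static Node examplePlan() {
    Node r = new Scan("R", 4);                            // assumption: R has 4 fields
    Node s = new Scan("S", 4);                            // assumption: S has at least 4 fields
    Node project = new Project(s, new int[] {3, -1});     // a=[$3], $f0=[true]
    Node aggregate = new Aggregate(project, new int[] {0}, 1); // group={0}, MIN($1)
    return new Join(r, aggregate);
  }

  public static void main(String[] args) {
    Node join = examplePlan();
    System.out.println(join.origin(0)); // left side of the join condition
    System.out.println(join.origin(4)); // right side of the join condition
  }
}
```

Each node only knows how to translate its own output index into an index on its input, so the trace is a chain of small local steps. That is also why the manual version gets harder as more node types appear in the plan.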
Furthermore, when we remove LogicalProject(a=[$3], $f0=[true]), we have to
adjust every related column ref index from the parent of that project node
up to the root node, and manually tracking and shifting column ref indexes
is hard to get right because of its complexity.
2) Does the current Calcite implementation have a utility class or methods
to help with this?
3) Also, could you give me some general guidelines for implementing this
kind of thing in Calcite?
Thanks,
James
Re: Tracking column's origin
Posted by JiaTao Tao <ta...@gmail.com>.
Hi
org.apache.calcite.rel.metadata.RelMetadataQuery#getColumnOrigins may help
Regards!
Aron Tao
Re: Tracking column's origin
Posted by 953396112 <13...@qq.com>.
Hi James:
1) I guess you want to trace a column back to its origin in the original table. In Calcite, you can use `RelMetadataQuery.getColumnOrigin()` for this. See the unit test `org.apache.calcite.test.RelMetadataTest#testCalcColumnOriginsTable` for reference.
2) After a specific operator is removed, the column references of its parent operators are affected. As far as I know, there is no utility class that does this for you. What I usually do is traverse the plan to find the specific operator pattern, modify the related column references, and generate a new RelNode. A `RelOptRule` or a `RelShuttle` could be used to do this.
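The index rewriting in point 2 can be sketched as a small expression visitor in plain Java. Everything here (Expr, InputRef, Call, throughRemovedProject, shiftRightSide) is a hypothetical toy, not Calcite API. It assumes the removed project only forwards plain input references. A computed output such as $f0=[true] has no index to map to, so it would need expression substitution instead.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.IntUnaryOperator;

/** Toy expression model for illustration only; these are NOT Calcite classes. */
final class RefRemapSketch {

  /** An expression that can rebuild itself with every input ref renumbered. */
  interface Expr {
    Expr remap(IntUnaryOperator mapping);
  }

  static final class InputRef implements Expr {
    final int index;

    InputRef(int index) { this.index = index; }

    public Expr remap(IntUnaryOperator mapping) {
      return new InputRef(mapping.applyAsInt(index));
    }

    @Override public String toString() { return "$" + index; }
  }

  static final class Call implements Expr {
    final String op;
    final List<Expr> operands;

    Call(String op, Expr... operands) {
      this.op = op;
      this.operands = Arrays.asList(operands);
    }

    public Expr remap(IntUnaryOperator mapping) {
      // Rebuild the call bottom-up so the original expression is left untouched.
      Expr[] rewritten = new Expr[operands.size()];
      for (int i = 0; i < rewritten.length; i++) {
        rewritten[i] = operands.get(i).remap(mapping);
      }
      return new Call(op, rewritten);
    }

    @Override public String toString() {
      StringBuilder sb = new StringBuilder(op).append('(');
      for (int i = 0; i < operands.size(); i++) {
        if (i > 0) {
          sb.append(", ");
        }
        sb.append(operands.get(i));
      }
      return sb.append(')').toString();
    }
  }

  /**
   * Mapping for the parent of a removed project: a ref to project output i
   * becomes a ref to the input column that output i used to forward.
   */
  static IntUnaryOperator throughRemovedProject(int[] projectRefs) {
    return i -> projectRefs[i];
  }

  /**
   * Mapping for a join condition when the left input's width changes by delta:
   * refs into the left side stay put, refs into the right side shift.
   */
  static IntUnaryOperator shiftRightSide(int oldLeftCount, int delta) {
    return i -> i < oldLeftCount ? i : i + delta;
  }

  public static void main(String[] args) {
    // Hypothetical filter condition sitting on top of Project($3, $1).
    Expr condition = new Call("=", new InputRef(0), new InputRef(1));
    System.out.println(condition.remap(throughRemovedProject(new int[] {3, 1})));
  }
}
```

Building a new expression instead of mutating the old one matches the "generate a new RelNode" approach described above: each affected parent gets a rewritten copy, and the mapping is the only thing that changes from one level of the tree to the next.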
I hope this helps.
Xu