You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@calcite.apache.org by Anton Haidai <an...@gmail.com> on 2018/10/29 08:18:36 UTC

VolcanoPlanner changes Joins structure only once

Hello!
I have the following problem. Sample SQL input is the following:
SELECT *
FROM X
INNER JOIN A
ON X.id = A.id
INNER JOIN X2
ON X.id = X2.id
INNER JOIN X3
ON X.id = X3.id

According to a custom cost model used, it would beneficial to move the
select from the table "A" as high as possible in LogicalJoins tree. So the
optimal solution is to transform the query into the following form (move
the table "A" into the last join making it a right node of a top
LogicalJoin):
SELECT *
FROM X
INNER JOIN X2
ON X.id = X2.id
INNER JOIN X3
ON X.id = X3.id
INNER JOIN A
ON X.id = A.id

In order to achieve this, JoinPushThroughJoinRule.RIGHT rule should be
applied twice with the table "A" as a the "B" node (in terms of rule's
source code comments). However, when trying to optimize the SQL input
listed using VolcanoPlanner, the result is the following:
SELECT *
FROM X
INNER JOIN X2
ON X.id = X2.id
INNER JOIN A
ON X.id = A.id
INNER JOIN X3
ON X.id = X3.id

So while the cost was improved, VolcanoPlanner did only the first step
towards the optimal solution. According to my debugging, there were no
attempts to apply JoinPushThroughJoinRule.RIGHT one more time to this
modified structure with the first step done.
But if using this modified structure as a new input SQL, the optimal result
is achieved.
So looks like VolcanoPlanner is able to make only one step towards optimal
solution applying JoinPushThroughJoinRule.RIGHT, but can't reach the
optimal solution when it is required to apply JoinPushThroughJoinRule.RIGHT
twice.

Are there any known VolcanoPlanner limitations that could be related to
this behavior?
-- 
Best regards,
Anton.

Re: VolcanoPlanner changes Joins structure only once

Posted by Julian Hyde <jh...@apache.org>.
There’s nothing intrinsic in VolcanoPlanner that prevents a rule from firing more than once. (HepPlanner is a different story.)

It would be useful to see the relational algebra. (E.g. are these right-deep trees? What column is used for “X.id <http://x.id/>” in each case? Those are important details that affect whether the rule can fire.) So, enable tracing for the planner.

> On Oct 29, 2018, at 1:18 AM, Anton Haidai <an...@gmail.com> wrote:
> 
> Hello!
> I have the following problem. Sample SQL input is the following:
> SELECT *
> FROM X
> INNER JOIN A
> ON X.id = A.id
> INNER JOIN X2
> ON X.id = X2.id
> INNER JOIN X3
> ON X.id = X3.id
> 
> According to a custom cost model used, it would beneficial to move the
> select from the table "A" as high as possible in LogicalJoins tree. So the
> optimal solution is to transform the query into the following form (move
> the table "A" into the last join making it a right node of a top
> LogicalJoin):
> SELECT *
> FROM X
> INNER JOIN X2
> ON X.id = X2.id
> INNER JOIN X3
> ON X.id = X3.id
> INNER JOIN A
> ON X.id = A.id
> 
> In order to achieve this, JoinPushThroughJoinRule.RIGHT rule should be
> applied twice with the table "A" as a the "B" node (in terms of rule's
> source code comments). However, when trying to optimize the SQL input
> listed using VolcanoPlanner, the result is the following:
> SELECT *
> FROM X
> INNER JOIN X2
> ON X.id = X2.id
> INNER JOIN A
> ON X.id = A.id
> INNER JOIN X3
> ON X.id = X3.id
> 
> So while the cost was improved, VolcanoPlanner did only the first step
> towards the optimal solution. According to my debugging, there were no
> attempts to apply JoinPushThroughJoinRule.RIGHT one more time to this
> modified structure with the first step done.
> But if using this modified structure as a new input SQL, the optimal result
> is achieved.
> So looks like VolcanoPlanner is able to make only one step towards optimal
> solution applying JoinPushThroughJoinRule.RIGHT, but can't reach the
> optimal solution when it is required to apply JoinPushThroughJoinRule.RIGHT
> twice.
> 
> Are there any known VolcanoPlanner limitations that could be related to
> this behavior?
> -- 
> Best regards,
> Anton.