You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@calcite.apache.org by Anton Haidai <an...@gmail.com> on 2018/10/26 16:15:22 UTC

Can't apply JoinPushThroughJoinRule.RIGHT twice

Hello!
I have the following problem. Sample SQL input is the following:
SELECT *
FROM X
INNER JOIN A
ON X.id = A.id
INNER JOIN X2
ON X.id = X2.id
INNER JOIN X3
ON X.id = X3.id

According to a custom cost model used, it would beneficial to move select
from the table "A" as high as possible in LogicalJoins tree. So the optimal
solution is to transform the query into the following form (move the table
"A" into the last join making it a right node of a top LogicalJoin):
SELECT *
FROM X
INNER JOIN X2
ON X.id = X2.id
INNER JOIN X3
ON X.id = X3.id
INNER JOIN A
ON X.id = A.id
In order to achieve this, JoinPushThroughJoinRule.RIGHT rule should be
applied twice with table "A" as a "B" node (in terms of rule's source code
comments). However, when trying to optimize the SQL input listed using
VolcanoPlanner, the result is the following:
SELECT *
FROM X
INNER JOIN X2
ON X.id = X2.id
INNER JOIN A
ON X.id = A.id
INNER JOIN X3
ON X.id = X3.id

So while the cost was improved, VolcanoPlanner did only the first step
towards the optimal solution. According to my debugging, there were no
attempts to apply JoinPushThroughJoinRule.RIGHT one more time to this
modified structure with the first step done.
But if using this modified structure as a new input SQL, the optimal result
is achieved.
So looks like VolcanoPlanner is able to make one step towards optimal
solution applying JoinPushThroughJoinRule.RIGHT, but can't reach the
optimal solution when it is required to apply
JoinPushThroughJoinRule.RIGHT  twice in order to achieve it.

Are there any hints regarding this problem?
-- 
Best regards,
Anton.