Posted to issues@calcite.apache.org by "Xiening Dai (Jira)" <ji...@apache.org> on 2019/11/06 19:28:00 UTC
[jira] [Comment Edited] (CALCITE-3479) Stack overflow error thrown when running join query

    [ https://issues.apache.org/jira/browse/CALCITE-3479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16968631#comment-16968631 ] 

Xiening Dai edited comment on CALCITE-3479 at 11/6/19 7:27 PM:
---------------------------------------------------------------

I took a brief look at this and, unfortunately, I believe it is a regression caused by the fix for CALCITE-2166.

As [~julianhyde] mentioned, there can be cycles in the memo. I drew a quick example to show the problem. Originally, RelSubset A has a best rel that uses subset C as its input. If, for some reason, the cost of C or D increases, then with the CALCITE-2166 patch we have to recalculate and choose a new best rel for subset A. Now there is a rel node (labeled "New Best") that uses subset B as its input and forms a cycle with subset A. During the recalculation, subset A's best cost has not been updated yet (it is still the value from before the increase), so the "New Best" node is likely to become the new best rel node for subset A. You then end up with a cycle in your best plan.
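
To make the failure mode concrete, here is a minimal sketch of how a cycle among "best" pointers turns a recursive plan walk into unbounded recursion. It is a toy model, not Calcite code: the Subset and Rel classes, the method names, and the rel names "ScanC", "OldBest" and "RelB" are all hypothetical stand-ins, and only subsets A, B and C from the description above are modeled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model only. Subset and Rel are hypothetical stand-ins, not Calcite's classes.
class MemoCycleSketch {
  static final class Rel {
    final String name;
    final List<Subset> inputs;
    Rel(String name, Subset... inputs) {
      this.name = name;
      this.inputs = new ArrayList<>(Arrays.asList(inputs));
    }
  }

  static final class Subset {
    final String name;
    Rel best; // the rel currently considered cheapest for this subset
    Subset(String name) { this.name = name; }
  }

  // Naive walk over the best plan. It recurses forever if best pointers form a cycle.
  static int countRels(Subset s) {
    int n = 1;
    for (Subset input : s.best.inputs) {
      n += countRels(input);
    }
    return n;
  }

  // Depth-first check: true if following best pointers revisits a subset on the current path.
  static boolean bestPlanHasCycle(Subset s, Set<Subset> onPath) {
    if (!onPath.add(s)) {
      return true;
    }
    for (Subset input : s.best.inputs) {
      if (bestPlanHasCycle(input, onPath)) {
        return true;
      }
    }
    onPath.remove(s);
    return false;
  }

  // Builds subsets A, B, C and returns A. If cyclic, A's best is switched to "NewBest".
  static Subset example(boolean cyclic) {
    Subset a = new Subset("A");
    Subset b = new Subset("B");
    Subset c = new Subset("C");
    c.best = new Rel("ScanC");
    a.best = new Rel("OldBest", c);
    b.best = new Rel("RelB", a);
    if (cyclic) {
      a.best = new Rel("NewBest", b);
    }
    return a;
  }

  public static void main(String[] args) {
    System.out.println(bestPlanHasCycle(example(false), new HashSet<>()));
    System.out.println(bestPlanHasCycle(example(true), new HashSet<>()));
  }
}
```

In this sketch, calling countRels on the cyclic example would recurse until the stack is exhausted, which resembles the repeated visit frames in the reported trace. The guarded check is only one possible way to detect the situation; it does not say which rel should be chosen instead.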

I don't think there is an easy way to fix this. It is a chicken-and-egg problem: without recalculating the rel node costs in subset A, we cannot determine the new best cost; but recalculating those costs requires the best cost of subset A itself, because of the cyclic reference. For now, I can revert the CALCITE-2166 patch to mitigate the problem. In the longer run, we need to determine whether the framework really needs cyclic references and whether we can remove them (CALCITE-790?).
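
The chicken-and-egg problem can be illustrated with some toy arithmetic. This is a sketch under made-up assumptions, not the planner's cost model: every rel is assumed to cost 1.0 plus the cost of its single input subset, and the names pickBestForA, cachedCostA and costC are hypothetical.

```java
// Toy arithmetic only; the costs and names are made up for illustration.
class StaleCostSketch {
  static final double SELF_COST = 1.0; // assumed per-rel cost

  // Chooses a best rel for subset A using A's cached (possibly stale) best cost.
  static String pickBestForA(double cachedCostA, double costC) {
    double costB = SELF_COST + cachedCostA; // B's best rel takes A as input
    double oldBest = SELF_COST + costC;     // "Old Best" takes C as input
    double newBest = SELF_COST + costB;     // "New Best" takes B as input
    return newBest < oldBest ? "NewBest" : "OldBest";
  }

  public static void main(String[] args) {
    // Before the increase: C costs 9, so A's best cost is 10 via "Old Best".
    System.out.println(pickBestForA(10.0, 9.0));
    // After C's cost rises to 100, A's cached cost of 10 is stale.
    System.out.println(pickBestForA(10.0, 100.0));
  }
}
```

With the stale value, "New Best" appears to cost 12 and wins, even though that figure was derived from A's own outdated cost. In this toy model, once A adopts "New Best" its cost becomes 12, which raises B to 13 and "New Best" to 14, so the numbers never settle.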

!IMG_5089.jpg|width=532,height=398!



> Stack overflow error thrown when running join query
> ---------------------------------------------------
>
>                 Key: CALCITE-3479
>                 URL: https://issues.apache.org/jira/browse/CALCITE-3479
>             Project: Calcite
>          Issue Type: Bug
>            Reporter: Amit Chavan
>            Priority: Major
>         Attachments: IMG_5089.jpg, TestCalcite.java
>
>
> A unit test in our project is failing; the details are below.
> The query in question is:
> {code}SELECT * FROM tblspace1.t1 t10, tblspace1.t2 t20 WHERE t20.n1 = 3 AND t10.n1 = 3 AND t20.n1 = t10.n1{code}
> I get a stack overflow error:
> {noformat}
> -- Mid Plan-- Mid Plan
> LogicalProject(subset=[rel#19:Subset#4.ENUMERABLE.[]], k1=[$0], n1=[$1], s1=[$2], k2=[$3], n10=[$4], s2=[$5])
>   LogicalFilter(subset=[rel#16:Subset#3.NONE.[]], condition=[AND(=($4, 3), =($1, 3), =($4, $1))])
>     LogicalJoin(subset=[rel#14:Subset#2.NONE.[]], condition=[true], joinType=[inner])
>       EnumerableTableScan(subset=[rel#11:Subset#0.ENUMERABLE.[]], table=[[tblspace1, t1]])
>       EnumerableTableScan(subset=[rel#12:Subset#1.ENUMERABLE.[]], table=[[tblspace1, t2]])
> {noformat}
> {code}
> java.lang.StackOverflowError
> 	at org.apache.calcite.plan.volcano.RelSubset$CheapestPlanReplacer.visit(RelSubset.java:639)
> 	at org.apache.calcite.plan.volcano.RelSubset$CheapestPlanReplacer.visit(RelSubset.java:643)
> 	at org.apache.calcite.plan.volcano.RelSubset$CheapestPlanReplacer.visit(RelSubset.java:643)
> 	at org.apache.calcite.plan.volcano.RelSubset$CheapestPlanReplacer.visit(RelSubset.java:643)
> 	at org.apache.calcite.plan.volcano.RelSubset$CheapestPlanReplacer.visit(RelSubset.java:643)
> 	at org.apache.calcite.plan.volcano.RelSubset$CheapestPlanReplacer.visit(RelSubset.java:643)
> {code}
> I am also attaching the unit test code to the ticket.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)