Posted to issues@calcite.apache.org by "Aitozi (Jira)" <ji...@apache.org> on 2023/06/15 15:23:00 UTC
[jira] [Updated] (CALCITE-5784) Generate the same correlationId for the same query

     [ https://issues.apache.org/jira/browse/CALCITE-5784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aitozi updated CALCITE-5784:
----------------------------
    Description: 
Currently, a CTE query block goes through sql2rel once per reference, so a CTE that is referenced multiple times is converted multiple times. If the block contains a lateral join, each conversion generates a different correlation id. See below:


{code:java}
    String sql = "WITH a AS (SELECT ename, job, empno, r FROM emp, LATERAL TABLE (ramp(empno)) as T(r))"
        + " SELECT * from a a1, a a2 WHERE a1.r = a2.empno";
    sql(sql).ok();
{code}


{code:java}
LogicalProject(ENAME=[$0], JOB=[$1], EMPNO=[$2], R=[$3], ENAME0=[$4], JOB0=[$5], EMPNO0=[$6], R0=[$7])
  LogicalJoin(condition=[=($3, $6)], joinType=[inner])
    LogicalProject(ENAME=[$1], JOB=[$2], EMPNO=[$0], R=[$9])
      LogicalCorrelate(correlation=[$cor0], joinType=[inner], requiredColumns=[{0}])
        LogicalTableScan(table=[[CATALOG, SALES, EMP]])
        LogicalTableFunctionScan(invocation=[RAMP($cor0.EMPNO)], rowType=[RecordType(INTEGER I)])
    LogicalProject(ENAME=[$1], JOB=[$2], EMPNO=[$0], R=[$9])
      LogicalCorrelate(correlation=[$cor1], joinType=[inner], requiredColumns=[{0}])
        LogicalTableScan(table=[[CATALOG, SALES, EMP]])
        LogicalTableFunctionScan(invocation=[RAMP($cor1.EMPNO)], rowType=[RecordType(INTEGER I)])

{code}

The two references to the same CTE get two different correlation ids ($cor0 and $cor1). Flink has a subplan reuse feature based on the digest; because the same query block now produces different digests, this functionality breaks.


  was:
Currently, a CTE query block goes through sql2rel once per reference, so a CTE that is referenced multiple times is converted multiple times. If the block contains a lateral join, each conversion generates a different correlation id. See below:


{code:java}
    String sql = "WITH a AS (SELECT ename, job, empno, r FROM emp, LATERAL TABLE (ramp(empno)) as T(r))"
        + " SELECT * from a a1, a a2 WHERE a1.r = a2.empno";
    sql(sql).ok();
{code}


{code:java}
LogicalProject(ENAME=[$0], JOB=[$1], EMPNO=[$2], R=[$3], ENAME0=[$4], JOB0=[$5], EMPNO0=[$6], R0=[$7])
  LogicalJoin(condition=[=($3, $6)], joinType=[inner])
    LogicalProject(ENAME=[$1], JOB=[$2], EMPNO=[$0], R=[$9])
      LogicalCorrelate(correlation=[$cor0], joinType=[inner], requiredColumns=[{0}])
        LogicalTableScan(table=[[CATALOG, SALES, EMP]])
        LogicalTableFunctionScan(invocation=[RAMP($cor0.EMPNO)], rowType=[RecordType(INTEGER I)])
    LogicalProject(ENAME=[$1], JOB=[$2], EMPNO=[$0], R=[$9])
      LogicalCorrelate(correlation=[$cor1], joinType=[inner], requiredColumns=[{0}])
        LogicalTableScan(table=[[CATALOG, SALES, EMP]])
        LogicalTableFunctionScan(invocation=[RAMP($cor1.EMPNO)], rowType=[RecordType(INTEGER I)])

{code}

The two references to the same CTE get two different correlation ids ($cor0 and $cor1). Flink has a subplan reuse feature based on the digest, and this breaks that functionality.



> Generate the same correlationId for the same query
> --------------------------------------------------
>
>                 Key: CALCITE-5784
>                 URL: https://issues.apache.org/jira/browse/CALCITE-5784
>             Project: Calcite
>          Issue Type: Improvement
>          Components: core
>            Reporter: Aitozi
>            Priority: Major
>
> Currently, a CTE query block goes through sql2rel once per reference, so a CTE that is referenced multiple times is converted multiple times. If the block contains a lateral join, each conversion generates a different correlation id. See below:
> {code:java}
>     String sql = "WITH a AS (SELECT ename, job, empno, r FROM emp, LATERAL TABLE (ramp(empno)) as T(r))"
>         + " SELECT * from a a1, a a2 WHERE a1.r = a2.empno";
>     sql(sql).ok();
> {code}
> {code:java}
> LogicalProject(ENAME=[$0], JOB=[$1], EMPNO=[$2], R=[$3], ENAME0=[$4], JOB0=[$5], EMPNO0=[$6], R0=[$7])
>   LogicalJoin(condition=[=($3, $6)], joinType=[inner])
>     LogicalProject(ENAME=[$1], JOB=[$2], EMPNO=[$0], R=[$9])
>       LogicalCorrelate(correlation=[$cor0], joinType=[inner], requiredColumns=[{0}])
>         LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>         LogicalTableFunctionScan(invocation=[RAMP($cor0.EMPNO)], rowType=[RecordType(INTEGER I)])
>     LogicalProject(ENAME=[$1], JOB=[$2], EMPNO=[$0], R=[$9])
>       LogicalCorrelate(correlation=[$cor1], joinType=[inner], requiredColumns=[{0}])
>         LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>         LogicalTableFunctionScan(invocation=[RAMP($cor1.EMPNO)], rowType=[RecordType(INTEGER I)])
> {code}
> The two references to the same CTE get two different correlation ids ($cor0 and $cor1). Flink has a subplan reuse feature based on the digest; because the same query block now produces different digests, this functionality breaks.
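
To make the digest problem concrete, here is a minimal sketch in plain Java. It does not use Calcite or Flink. It treats a digest as nothing more than the printed plan text from the issue, and every name in it (CorrelationDigestSketch, subplan, normalize) is hypothetical. The normalize step is only one assumed way to compare subplans regardless of id numbering; it is not the fix proposed in this issue, which is about generating the same id during conversion.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: models a plan digest as a plain string.
// This is not Calcite's digest implementation.
final class CorrelationDigestSketch {
  private static final Pattern COR = Pattern.compile("\\$cor(\\d+)");

  // Builds the text of the correlated subplan from the issue for a given id.
  static String subplan(String corId) {
    return "LogicalCorrelate(correlation=[" + corId
        + "], joinType=[inner], requiredColumns=[{0}])\n"
        + "  LogicalTableScan(table=[[CATALOG, SALES, EMP]])\n"
        + "  LogicalTableFunctionScan(invocation=[RAMP(" + corId
        + ".EMPNO)], rowType=[RecordType(INTEGER I)])";
  }

  // Renumbers $corN ids in order of first appearance, starting at 0.
  // Assumption: ids only need to be consistent within one subplan.
  static String normalize(String digest) {
    Map<String, Integer> renumbered = new HashMap<>();
    Matcher m = COR.matcher(digest);
    StringBuffer sb = new StringBuffer();
    while (m.find()) {
      Integer id = renumbered.get(m.group(1));
      if (id == null) {
        id = renumbered.size();
        renumbered.put(m.group(1), id);
      }
      // quoteReplacement keeps '$' from being read as a group reference.
      m.appendReplacement(sb, Matcher.quoteReplacement("$cor" + id));
    }
    m.appendTail(sb);
    return sb.toString();
  }

  public static void main(String[] args) {
    String first = subplan("$cor0");
    String second = subplan("$cor1");
    // The raw digests differ only in the correlation id.
    System.out.println(first.equals(second));                       // false
    // After renumbering, the two subplans compare equal.
    System.out.println(normalize(first).equals(normalize(second))); // true
  }
}
```

The first comparison fails only because of the id, which is the situation in which digest-based reuse cannot recognise the two subplans as the same.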



--
This message was sent by Atlassian Jira
(v8.20.10#820010)