Posted to issues@hive.apache.org by "Stamatis Zampetakis (Jira)" <ji...@apache.org> on 2023/04/25 15:29:00 UTC

[jira] [Updated] (HIVE-27296) HiveRelDecorrelator does not handle correlation with Values

     [ https://issues.apache.org/jira/browse/HIVE-27296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stamatis Zampetakis updated HIVE-27296:
---------------------------------------
    Description: 
The {{HiveRelDecorrelator}} does not handle {{Values}} expressions correctly: when such an expression exists in the plan, it fails to remove the respective {{Correlate}}.

In HIVE-27278, we discovered a query that has a correlation over an empty {{Values}} expression.
{code:sql}
EXPLAIN CBO SELECT id FROM t1 WHERE NULL IN (SELECT NULL FROM t2 where t1.id = t2.id);{code}
The CBO plan after decorrelation is shown below.
{noformat}
HiveProject(id=[$0])
  LogicalCorrelate(correlation=[$cor0], joinType=[semi], requiredColumns=[{}])
    HiveTableScan(table=[[default, t1]], table:alias=[t1])
    HiveValues(tuples=[[]])
{noformat}
Although in HIVE-27278 we could find a solution for a plan that contains an empty {{Values}}, there can be queries with correlations on non-empty {{Values}}, and for those we don't have a solution at the moment.

Normally after decorrelation we shouldn't have any {{Correlate}} expressions in the plan.

The problem starts in [HiveRelDecorrelator.decorrelate(Values)|https://github.com/apache/hive/blob/59058c65457fb7ab9d8575a555034e6633962661/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelDecorrelator.java#L471], which returns null when it encounters the {{Values}} expression.

Later, [HiveRelDecorrelator.decorrelate(Correlate)|https://github.com/apache/hive/blob/59058c65457fb7ab9d8575a555034e6633962661/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelDecorrelator.java#L1247] bails out when treating the {{Correlate}}, since one of its inputs has not been rewritten.
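
The interaction between the two methods can be illustrated with a toy model. This is only a sketch and not the actual Hive or Calcite code: the names {{DecorrelateSketch}}, {{Rel}}, {{Frame}}, {{rewrite}} and {{containsCorrelate}} are hypothetical, and the only behavior taken from the description above is that {{Values}} yields null and that {{Correlate}} gives up when an input was not rewritten.

```java
import java.util.Arrays;
import java.util.List;

/** Toy model of the bail-out described above; NOT the real HiveRelDecorrelator. */
class DecorrelateSketch {

    /** Hypothetical stand-in for a relational expression. */
    static class Rel {
        final String kind;
        final List<Rel> inputs;

        Rel(String kind, Rel... inputs) {
            this.kind = kind;
            this.inputs = Arrays.asList(inputs);
        }
    }

    /** Hypothetical stand-in for the record of a successful rewrite. */
    static class Frame {
        final Rel rewritten;

        Frame(Rel rewritten) {
            this.rewritten = rewritten;
        }
    }

    /** Returns a Frame when the expression was rewritten, or null otherwise. */
    static Frame decorrelate(Rel rel) {
        switch (rel.kind) {
            case "Values":
                // Assumption taken from the description: Values yields null.
                return null;
            case "TableScan":
                // Assumption of this sketch: a scan is registered unchanged.
                return new Frame(rel);
            case "Correlate": {
                Frame left = decorrelate(rel.inputs.get(0));
                Frame right = decorrelate(rel.inputs.get(1));
                if (left == null || right == null) {
                    // One input was not rewritten, so the Correlate is left as is.
                    return null;
                }
                return new Frame(new Rel("Join", left.rewritten, right.rewritten));
            }
            default:
                return null;
        }
    }

    /** Returns the rewritten plan, or the original plan when nothing was rewritten. */
    static Rel rewrite(Rel root) {
        Frame frame = decorrelate(root);
        return frame == null ? root : frame.rewritten;
    }

    /** Checks whether any Correlate is left in the plan. */
    static boolean containsCorrelate(Rel rel) {
        if (rel.kind.equals("Correlate")) {
            return true;
        }
        for (Rel input : rel.inputs) {
            if (containsCorrelate(input)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Rel overValues = new Rel("Correlate", new Rel("TableScan"), new Rel("Values"));
        Rel overScan = new Rel("Correlate", new Rel("TableScan"), new Rel("TableScan"));
        System.out.println("Correlate over Values remains: " + containsCorrelate(rewrite(overValues)));
        System.out.println("Correlate over scan remains: " + containsCorrelate(rewrite(overScan)));
    }
}
```

In this model the {{Correlate}} over {{Values}} survives the rewrite, while the one over two scans is replaced by a join, which mirrors the plan shown earlier.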

The problem is still present in the latest Calcite (CALCITE-5568).

  was:
The {{HiveRelDecorrelator}} does not cope well with {{Values}} expressions and when such expression exists in the plan it fails to remove the respective {{Correlate}}.

In HIVE-27298, we discovered a query that has a correlation over an empty {{Values}} expression.
{code:sql}
EXPLAIN CBO SELECT id FROM t1 WHERE NULL IN (SELECT NULL FROM t2 where t1.id = t2.id);{code}

The CBO plan after decorrelation is shown below.
{noformat}
HiveProject(id=[$0])
  LogicalCorrelate(correlation=[$cor0], joinType=[semi], requiredColumns=[{}])
    HiveTableScan(table=[[default, t1]], table:alias=[t1])
    HiveValues(tuples=[[]])
{noformat}

Although, in HIVE-27298 we could find a solution for a plan that contains an empty {{Values}} there can be queries with correlations on non-empty {{Values}} and for those we don't have a solution at the moment.

Normally after decorrelation we shouldn't have any {{Correlate}} expressions in the plan.

The problem starts from [HiveRelDecorrelator.decorrelate(Values)|https://github.com/apache/hive/blob/59058c65457fb7ab9d8575a555034e6633962661/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelDecorrelator.java#L471] that returns null when it encounters the {{Values}} expression.

Later, in [HiveRelDecorrelator.decorrelate(Correlate)|https://github.com/apache/hive/blob/59058c65457fb7ab9d8575a555034e6633962661/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelDecorrelator.java#L1247] it will bail out when treating the {{Correlate}} since one of the inputs is not rewritten. 

The problem is still there in latest Calcite (CALCITE-5568).


> HiveRelDecorrelator does not handle correlation with Values
> -----------------------------------------------------------
>
>                 Key: HIVE-27296
>                 URL: https://issues.apache.org/jira/browse/HIVE-27296
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Stamatis Zampetakis
>            Assignee: Stamatis Zampetakis
>            Priority: Major


