Posted to issues@hive.apache.org by "Stamatis Zampetakis (Jira)" <ji...@apache.org> on 2023/04/25 15:29:00 UTC
[jira] [Updated] (HIVE-27296) HiveRelDecorrelator does not handle correlation with Values
[ https://issues.apache.org/jira/browse/HIVE-27296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Stamatis Zampetakis updated HIVE-27296:
---------------------------------------
Description:
The {{HiveRelDecorrelator}} does not cope well with {{Values}} expressions: when such an expression exists in the plan, it fails to remove the respective {{Correlate}}.
In HIVE-27278, we discovered a query that has a correlation over an empty {{Values}} expression.
{code:sql}
EXPLAIN CBO SELECT id FROM t1 WHERE NULL IN (SELECT NULL FROM t2 WHERE t1.id = t2.id);
{code}
The CBO plan after decorrelation is shown below.
{noformat}
HiveProject(id=[$0])
  LogicalCorrelate(correlation=[$cor0], joinType=[semi], requiredColumns=[{}])
    HiveTableScan(table=[[default, t1]], table:alias=[t1])
    HiveValues(tuples=[[]])
{noformat}
Although in HIVE-27278 we could find a solution for a plan that contains an empty {{Values}}, there can be queries with correlations on non-empty {{Values}}, and for those we don't have a solution at the moment.
Normally, after decorrelation we shouldn't have any {{Correlate}} expressions in the plan.
The problem starts in [HiveRelDecorrelator.decorrelate(Values)|https://github.com/apache/hive/blob/59058c65457fb7ab9d8575a555034e6633962661/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelDecorrelator.java#L471], which returns null when it encounters a {{Values}} expression.
Later, [HiveRelDecorrelator.decorrelate(Correlate)|https://github.com/apache/hive/blob/59058c65457fb7ab9d8575a555034e6633962661/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelDecorrelator.java#L1247] bails out when processing the {{Correlate}} because one of its inputs was not rewritten.
The problem is still present in the latest Calcite (CALCITE-5568).
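The failure mode described above can be illustrated with a small toy model. This is only a sketch and not the Hive or Calcite code: the {{Rel}} class, the string node kinds, the {{correlateOver}} helper, and the choice to keep the original input when a rewrite returns null are all assumptions made for illustration. It shows how a null result for {{Values}} propagates upwards and leaves the {{Correlate}} in the plan.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model (NOT Hive/Calcite code) of a decorrelator that gives up on Values. */
final class DecorrelateSketch {

  /** Hypothetical stand-in for a relational expression: a kind plus its inputs. */
  static final class Rel {
    final String kind;
    final List<Rel> inputs;

    Rel(String kind, Rel... inputs) {
      this.kind = kind;
      this.inputs = Arrays.asList(inputs);
    }
  }

  /** Returns the rewritten node, or null when the node could not be rewritten. */
  static Rel decorrelate(Rel rel) {
    if (rel.kind.equals("Values")) {
      // Assumption mirroring the issue text: Values is never rewritten.
      return null;
    }
    if (rel.kind.equals("Correlate")) {
      Rel left = decorrelate(rel.inputs.get(0));
      Rel right = decorrelate(rel.inputs.get(1));
      if (left == null || right == null) {
        // Bail out because one of the inputs was not rewritten.
        return null;
      }
      return new Rel("Join", left, right);
    }
    // Any other node: rewrite the inputs, keeping the original input on failure.
    List<Rel> newInputs = new ArrayList<>();
    for (Rel input : rel.inputs) {
      Rel rewritten = decorrelate(input);
      newInputs.add(rewritten != null ? rewritten : input);
    }
    return new Rel(rel.kind, newInputs.toArray(new Rel[0]));
  }

  /** True if a Correlate node is still present anywhere in the plan. */
  static boolean containsCorrelate(Rel rel) {
    if (rel.kind.equals("Correlate")) {
      return true;
    }
    for (Rel input : rel.inputs) {
      if (containsCorrelate(input)) {
        return true;
      }
    }
    return false;
  }

  /** Builds Project -> Correlate -> (TableScan, rightKind), shaped like the plan above. */
  static Rel correlateOver(String rightKind) {
    return new Rel("Project",
        new Rel("Correlate", new Rel("TableScan"), new Rel(rightKind)));
  }

  public static void main(String[] args) {
    // Correlate over Values survives the rewrite: prints "true".
    System.out.println(containsCorrelate(decorrelate(correlateOver("Values"))));
    // Correlate over a node that can be rewritten is removed: prints "false".
    System.out.println(containsCorrelate(decorrelate(correlateOver("TableScan"))));
  }
}
```

In this model the only way to get rid of the {{Correlate}} is for the {{Values}} case to return a rewritten node instead of null. How that should be done for non-empty {{Values}} in the real decorrelator is exactly the open question of this ticket.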
was:
The {{HiveRelDecorrelator}} does not cope well with {{Values}} expressions and when such expression exists in the plan it fails to remove the respective {{Correlate}}.
In HIVE-27298, we discovered a query that has a correlation over an empty {{Values}} expression.
{code:sql}
EXPLAIN CBO SELECT id FROM t1 WHERE NULL IN (SELECT NULL FROM t2 where t1.id = t2.id);{code}
The CBO plan after decorrelation is shown below.
{noformat}
HiveProject(id=[$0])
  LogicalCorrelate(correlation=[$cor0], joinType=[semi], requiredColumns=[{}])
    HiveTableScan(table=[[default, t1]], table:alias=[t1])
    HiveValues(tuples=[[]])
{noformat}
Although, in HIVE-27298 we could find a solution for a plan that contains an empty {{Values}} there can be queries with correlations on non-empty {{Values}} and for those we don't have a solution at the moment.
Normally after decorrelation we shouldn't have any {{Correlate}} expressions in the plan.
The problem starts from [HiveRelDecorrelator.decorrelate(Values)|https://github.com/apache/hive/blob/59058c65457fb7ab9d8575a555034e6633962661/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelDecorrelator.java#L471] that returns null when it encounters the {{Values}} expression.
Later, in [HiveRelDecorrelator.decorrelate(Correlate)|https://github.com/apache/hive/blob/59058c65457fb7ab9d8575a555034e6633962661/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelDecorrelator.java#L1247] it will bail out when treating the {{Correlate}} since one of the inputs is not rewritten.
The problem is still there in latest Calcite (CALCITE-5568).
> HiveRelDecorrelator does not handle correlation with Values
> -----------------------------------------------------------
>
> Key: HIVE-27296
> URL: https://issues.apache.org/jira/browse/HIVE-27296
> Project: Hive
> Issue Type: Bug
> Reporter: Stamatis Zampetakis
> Assignee: Stamatis Zampetakis
> Priority: Major