You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@calcite.apache.org by "Maytas Monsereenusorn (Jira)" <ji...@apache.org> on 2020/06/07 00:34:00 UTC
[jira] [Updated] (CALCITE-4048) Allow NULL in IN clause to be
converted to OR
[ https://issues.apache.org/jira/browse/CALCITE-4048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Maytas Monsereenusorn updated CALCITE-4048:
-------------------------------------------
Description:
Currently, Calcite convert a value list with NULL in the IN clause into an inline table. The code is in SqlToRelConverter, where we call !containsNullLiteral(valueList). For example, if I have a SQL select * from druid.foo where dim in ('a', null) then Calcite will not convert this to ...dim='a' OR dim=null but to a join on top of an aggregate. Furthermore, this can (I think?) be simplify to just dim='a' since dim=null is always null. (https://issues.apache.org/jira/browse/CALCITE-373 and [https://www.techrepublic.com/article/oracle-tip-understand-how-nulls-affect-in-and-exists/] -> _"When you use IN, you're telling SQL to take a value and compare it against every value or set of values in a list using =. If any NULL values exist, a row will not be returned__-even if both values are NULL."_ and [https://stackoverflow.com/questions/6362112/in-clause-with-null-or-is-null])
The plan I got after converting SqlNode to RelNode for the above example:
{code:java}
LogicalProject(__time=[$0], cnt=[$1], dim1=[$2], dim2=[$3], dim3=[$4], m1=[$5], m2=[$6], unique_dim1=[$7])
LogicalJoin(condition=[=($3, $8)], joinType=[inner])
LogicalTableScan(table=[[druid, foo]])
LogicalAggregate(group=[{0}])
LogicalValues(tuples=[[{ 'a' }, { null }]]){code}
however, i expect:
{code:java}
LogicalProject(__time=[$0], cnt=[$1], dim1=[$2], dim2=[$3], dim3=[$4], m1=[$5], m2=[$6], unique_dim1=[$7])
LogicalFilter(condition=[OR(=($3, 'a'), =($3, null))])
LogicalTableScan(table=[[druid, foo]]){code}
or something like:
{code:java}
LogicalProject(__time=[$0], cnt=[$1], dim1=[$2], dim2=[$3], dim3=[$4], m1=[$5], m2=[$6], unique_dim1=[$7])
LogicalFilter(condition=[=($3, 'a')])
LogicalTableScan(table=[[druid, foo]]){code}
I think line [https://github.com/apache/calcite/blob/feae6fbc328e3a7c87693951d1623f8b47ccea59/core/src/main/java/org/apache/calcite/sql2rel/SqlToRelConverter.java#L1102] can be promoted, we can move the null literals comparison to the last of composition predicates (OR/AND) instead of forbidden all the IN to OR(AND) conversion if the IN value list contains nulls.
was:
Currently, Calcite convert a value list with NULL in the IN clause into an inline table. The code is in SqlToRelConverter, where we call !containsNullLiteral(valueList). For example, if I have a SQL select * from druid.foo where dim in ('a', null) then Calcite will not convert this to ...dim='a' OR dim=null but to a join on top of an aggregate. Furthermore, this can (I think?) be simplify to just dim='a' since dim=null is always null. (https://issues.apache.org/jira/browse/CALCITE-373 and [https://www.techrepublic.com/article/oracle-tip-understand-how-nulls-affect-in-and-exists/] -> _"When you use IN, you're telling SQL to take a value and compare it against every value or set of values in a list using =. If any NULL values exist, a row will not be returned--even if both values are NULL."_ and [https://stackoverflow.com/questions/6362112/in-clause-with-null-or-is-null])
The plan I got after converting SqlNode to RelNode for the above example:
{code:java}
LogicalProject(__time=[$0], cnt=[$1], dim1=[$2], dim2=[$3], dim3=[$4], m1=[$5], m2=[$6], unique_dim1=[$7])
LogicalJoin(condition=[=($3, $8)], joinType=[inner])
LogicalTableScan(table=[[druid, foo]])
LogicalAggregate(group=[{0}])
LogicalValues(tuples=[[{ 'a' }, { null }]]){code}
however, i expect:
{code:java}
LogicalProject(__time=[$0], cnt=[$1], dim1=[$2], dim2=[$3], dim3=[$4], m1=[$5], m2=[$6], unique_dim1=[$7])
LogicalFilter(condition=[OR(=($3, 'a'), =($3, null))])
LogicalTableScan(table=[[druid, foo]]){code}
or something like:
{code:java}
LogicalProject(__time=[$0], cnt=[$1], dim1=[$2], dim2=[$3], dim3=[$4], m1=[$5], m2=[$6], unique_dim1=[$7])
LogicalFilter(condition=[=($3, 'a')])
LogicalTableScan(table=[[druid, foo]]){code}
I think line [https://github.com/apache/calcite/blob/feae6fbc328e3a7c87693951d1623f8b47ccea59/core/src/main/java/org/apache/calcite/sql2rel/SqlToRelConverter.java#L1102] can be promoted, we can move the null literals comparison to the last of composition predicates (OR/AND) instead of forbidden all the IN to OR(AND) conversion if the IN value list contains nulls.
> Allow NULL in IN clause to be converted to OR
> ---------------------------------------------
>
> Key: CALCITE-4048
> URL: https://issues.apache.org/jira/browse/CALCITE-4048
> Project: Calcite
> Issue Type: Improvement
> Reporter: Maytas Monsereenusorn
> Priority: Minor
>
> Currently, Calcite convert a value list with NULL in the IN clause into an inline table. The code is in SqlToRelConverter, where we call !containsNullLiteral(valueList). For example, if I have a SQL select * from druid.foo where dim in ('a', null) then Calcite will not convert this to ...dim='a' OR dim=null but to a join on top of an aggregate. Furthermore, this can (I think?) be simplify to just dim='a' since dim=null is always null. (https://issues.apache.org/jira/browse/CALCITE-373 and [https://www.techrepublic.com/article/oracle-tip-understand-how-nulls-affect-in-and-exists/] -> _"When you use IN, you're telling SQL to take a value and compare it against every value or set of values in a list using =. If any NULL values exist, a row will not be returned__-even if both values are NULL."_ and [https://stackoverflow.com/questions/6362112/in-clause-with-null-or-is-null])
>
> The plan I got after converting SqlNode to RelNode for the above example:
> {code:java}
> LogicalProject(__time=[$0], cnt=[$1], dim1=[$2], dim2=[$3], dim3=[$4], m1=[$5], m2=[$6], unique_dim1=[$7])
> LogicalJoin(condition=[=($3, $8)], joinType=[inner])
> LogicalTableScan(table=[[druid, foo]])
> LogicalAggregate(group=[{0}])
> LogicalValues(tuples=[[{ 'a' }, { null }]]){code}
> however, i expect:
> {code:java}
> LogicalProject(__time=[$0], cnt=[$1], dim1=[$2], dim2=[$3], dim3=[$4], m1=[$5], m2=[$6], unique_dim1=[$7])
> LogicalFilter(condition=[OR(=($3, 'a'), =($3, null))])
> LogicalTableScan(table=[[druid, foo]]){code}
> or something like:
> {code:java}
> LogicalProject(__time=[$0], cnt=[$1], dim1=[$2], dim2=[$3], dim3=[$4], m1=[$5], m2=[$6], unique_dim1=[$7])
> LogicalFilter(condition=[=($3, 'a')])
> LogicalTableScan(table=[[druid, foo]]){code}
> I think line [https://github.com/apache/calcite/blob/feae6fbc328e3a7c87693951d1623f8b47ccea59/core/src/main/java/org/apache/calcite/sql2rel/SqlToRelConverter.java#L1102] can be promoted, we can move the null literals comparison to the last of composition predicates (OR/AND) instead of forbidden all the IN to OR(AND) conversion if the IN value list contains nulls.
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)