You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@calcite.apache.org by "Maytas Monsereenusorn (Jira)" <ji...@apache.org> on 2020/06/07 00:33:00 UTC

[jira] [Created] (CALCITE-4048) Allow NULL in IN clause to be converted to OR

Maytas Monsereenusorn created CALCITE-4048:
----------------------------------------------

             Summary: Allow NULL in IN clause to be converted to OR
                 Key: CALCITE-4048
                 URL: https://issues.apache.org/jira/browse/CALCITE-4048
             Project: Calcite
          Issue Type: Improvement
            Reporter: Maytas Monsereenusorn


Currently, Calcite convert a value list with NULL in the IN clause into an inline table. The code is in SqlToRelConverter, where we call !containsNullLiteral(valueList). For example, if I have a SQL select * from druid.foo where dim in ('a', null) then Calcite will not convert this to ...dim='a' OR dim=null but to a join on top of an aggregate. Furthermore, this can (I think?) be simplify to just dim='a' since dim=null is always null. (https://issues.apache.org/jira/browse/CALCITE-373 and [https://www.techrepublic.com/article/oracle-tip-understand-how-nulls-affect-in-and-exists/] -> _"When you use IN, you're telling SQL to take a value and compare it against every value or set of values in a list using =. If any NULL values exist, a row will not be returned--even if both values are NULL."_ and [https://stackoverflow.com/questions/6362112/in-clause-with-null-or-is-null]) 

 
The plan I got after converting SqlNode to RelNode for the above example:
{code:java}
LogicalProject(__time=[$0], cnt=[$1], dim1=[$2], dim2=[$3], dim3=[$4], m1=[$5], m2=[$6], unique_dim1=[$7])
  LogicalJoin(condition=[=($3, $8)], joinType=[inner])
    LogicalTableScan(table=[[druid, foo]])
    LogicalAggregate(group=[{0}])
      LogicalValues(tuples=[[{ 'a' }, { null }]]){code}
however, i expect:
{code:java}
LogicalProject(__time=[$0], cnt=[$1], dim1=[$2], dim2=[$3], dim3=[$4], m1=[$5], m2=[$6], unique_dim1=[$7])
  LogicalFilter(condition=[OR(=($3, 'a'), =($3, null))])
    LogicalTableScan(table=[[druid, foo]]){code}
or something like:
{code:java}
LogicalProject(__time=[$0], cnt=[$1], dim1=[$2], dim2=[$3], dim3=[$4], m1=[$5], m2=[$6], unique_dim1=[$7])
  LogicalFilter(condition=[=($3, 'a')])
    LogicalTableScan(table=[[druid, foo]]){code}
I think line [https://github.com/apache/calcite/blob/feae6fbc328e3a7c87693951d1623f8b47ccea59/core/src/main/java/org/apache/calcite/sql2rel/SqlToRelConverter.java#L1102] can be promoted, we can move the null literals comparison to the last of composition predicates (OR/AND) instead of forbidden all the IN to OR(AND) conversion if the IN value list contains nulls.

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)