You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@calcite.apache.org by Vladimir Sitnikov <si...@gmail.com> on 2020/11/08 18:46:04 UTC

ProjectReduceExpressionsRule vs matchNullability: should RelRule be allowed to change field nullability?

Hi,

Can anybody clarify if RelRule should be allowed to change RelNode field
nullability?

For instance, RelOptRulesTest#testReduceNullableToNotNull has the following
SQL:
...empno + *case* when 'a' = 'a' then 1 else *null end* as newcol...

Of course, the initial type is nullable int (since the last CASE branch is
null), however,
you might see that the branch is never taken.

A year or so ago the test ensured that the simplified output was +($0,
CAST(1):INTEGER,
however CALCITE-2852 RexNode simplification does not traverse unknown
functions (see [1])
<https://github.com/apache/calcite/pull/1053/files#diff-bc6d90f2392477eb02e40d8bc26ad2d8b0d6bd6dd2714ea0c2b94b3043ff9a8eR6952>
altered the behavior, so the actual test output became NEWCOL=[+($0, 1)]
which means the rule was allowed
to modify row type.

It is strange if a rule could replace a RelNode with a different type, so I
would expect all transformations to be type-preserving.
Should those transformations be allowed?
Should the planner deny such transformations?
Should the planner add "cast to nullable" in case the original field was
nullable?

[1]:
https://github.com/apache/calcite/pull/1053/files#diff-bc6d90f2392477eb02e40d8bc26ad2d8b0d6bd6dd2714ea0c2b94b3043ff9a8eR6952

Vladimir

Re: ProjectReduceExpressionsRule vs matchNullability: should RelRule be allowed to change field nullability?

Posted by Julian Hyde <jh...@apache.org>.
Probably RelRule should not change nullability. But we have done so in
the past, and I suspect we continue to do so today. It would be
interesting to find a list of rules that do this.

Deducing that a field is never null is very useful. With that
knowledge, further optimizations become possible. So, we must continue
to do it. But we also have to abide by the contract that we don't
change row type (except for field names).

How to square that circle? I think we should keep the columns' type as
nullable, but generate predicates (see RelMdPredicates,
RelOptPredicateList) that indicate that the field is never null.

We already use 'not null' predicates during the simplify process. If
we are simplifying 'x is null or x > 5', by the time we process the 'x
> 5' factor we will have deduced that x is not null. Predicates are
good, and we're making ever more use of them these days.

Similar issues could occur with other type constraints. Imagine simplifying

  select 'abc' from t1 where false
  union all
  select 'xy' from t2

We could tighten the type from CHAR(3) to CHAR(2) when we notice that
no rows will come from t1. But should we?

Julian

On Sun, Nov 8, 2020 at 10:46 AM Vladimir Sitnikov
<si...@gmail.com> wrote:
>
> Hi,
>
> Can anybody clarify if RelRule should be allowed to change RelNode field
> nullability?
>
> For instance, RelOptRulesTest#testReduceNullableToNotNull has the following
> SQL:
> ...empno + *case* when 'a' = 'a' then 1 else *null end* as newcol...
>
> Of course, the initial type is nullable int (since the last CASE branch is
> null), however,
> you might see that the branch is never taken.
>
> A year or so ago the test ensured that the simplified output was +($0,
> CAST(1):INTEGER,
> however CALCITE-2852 RexNode simplification does not traverse unknown
> functions (see [1])
> <https://github.com/apache/calcite/pull/1053/files#diff-bc6d90f2392477eb02e40d8bc26ad2d8b0d6bd6dd2714ea0c2b94b3043ff9a8eR6952>
> altered the behavior, so the actual test output became NEWCOL=[+($0, 1)]
> which means the rule was allowed
> to modify row type.
>
> It is strange if a rule could replace a RelNode with a different type, so I
> would expect all transformations to be type-preserving.
> Should those transformations be allowed?
> Should the planner deny such transformations?
> Should the planner add "cast to nullable" in case the original field was
> nullable?
>
> [1]:
> https://github.com/apache/calcite/pull/1053/files#diff-bc6d90f2392477eb02e40d8bc26ad2d8b0d6bd6dd2714ea0c2b94b3043ff9a8eR6952
>
> Vladimir