You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@calcite.apache.org by Christian Beikov <ch...@gmail.com> on 2017/08/23 09:08:15 UTC

Casts because of nullability differences

Hello,

I came to notice that the use of a nullable column in a context where a 
non-nullable is expected, causes a cast that is essentially useless. Am 
I missing something and is there reason to why that is done?

-- 

Mit freundlichen Grüßen,
------------------------------------------------------------------------
*Christian Beikov*

Re: Casts because of nullability differences

Posted by Christian Beikov <ch...@gmail.com>.
It apparently has something to do with MV substitutions as I couldn't 
reproduce it in any other way than using a query that can be substituted 
by an MV. Hard to say where the problem might be. Is there a specific 
phase in the planning for this optimization that might not be triggered 
anymore after substitutions? Any idea where I should dig into?


Mit freundlichen Grüßen,
------------------------------------------------------------------------
*Christian Beikov*
Am 24.08.2017 um 21:31 schrieb Julian Hyde:
> Optimized away later. In fact during SQL generation. SQL has no concept of “casting away nullability”.
>
>> On Aug 23, 2017, at 6:10 PM, Christian Beikov <ch...@gmail.com> wrote:
>>
>> RelOptMaterialization creates the cast which calls eventually into RexUtil.generateCastExpressions. That method, probably rightfully, is strict about nulls. The question is, whether the cast should be optimized away later or it shouldn't be created in the first place. What do you say?
>>
>>
>> Mit freundlichen Grüßen,
>> ------------------------------------------------------------------------
>> *Christian Beikov*
>> Am 24.08.2017 um 02:05 schrieb Julian Hyde:
>>> You didn’t say that the casts were ending up in generated SQL. That would be a bug. Probably in RelToSqlConverter.
>>>
>>>
>>>> On Aug 23, 2017, at 4:35 PM, Christian Beikov <ch...@gmail.com> wrote:
>>>>
>>>> I get that the cast is done "internally" to handle nulls properly, but why would such a cast end up in the SQL that gets sent to the DBMS? Can you point me to a direction where to look for these generated casts? I'd like to try if I can avoid rendering these casts, as some DBMS planners might get confused.
>>>>
>>>>
>>>> Mit freundlichen Grüßen,
>>>> ------------------------------------------------------------------------
>>>> *Christian Beikov*
>>>> Am 23.08.2017 um 21:20 schrieb Julian Hyde:
>>>>> It’s difficult to say for sure whether the cast is necessary. For some of us consistency, is an end in itself. :)
>>>>>
>>>>> If a null was to occur, then yes, the cast would throw. But it’s also possible that Calcite can deduce that a null is impossible, in which case it wouldn’t even generate the logic.
>>>>>
>>>>> I have this argument with Ashutosh from time to time. He tends to say we can never be sure which expressions might or might not produce nulls, so why bother being so strict? I’d rather use every piece of information we can get, because it might allow us to optimize.
>>>>>
>>>>> By the way, there are cases where we can deduce that columns are not-nullable but we cannot change the row-type due to the RelOptRule contract. We ought to be generating RelMdPredicates in these cases.
>>>>>
>>>>> Julian
>>>>>
>>>>>
>>>>>
>>>>>> On Aug 23, 2017, at 11:33 AM, Christian Beikov <ch...@gmail.com> wrote:
>>>>>>
>>>>>> I am using materializations and my materialization table has nullable columns. Calcite correctly determines that the underlying query can never produce null for these columns, but I didn't encode that into the table definitino. I was just wondering why the casts are generated in the SQL that is sent to the owner of the materialization table.
>>>>>>
>>>>>> Thanks for the explanation, I suppose the cast causes proper exceptions to be thrown when null is encountered instead of simply a NPE? Or is the cast in such a case really unnecessary?
>>>>>>
>>>>>>
>>>>>> Mit freundlichen Grüßen,
>>>>>> ------------------------------------------------------------------------
>>>>>> *Christian Beikov*
>>>>>> Am 23.08.2017 um 20:01 schrieb Julian Hyde:
>>>>>>> I presume you’re talking about RexNodes. It makes a lot of things easier if the argument to function call are EXACTLY the type required by that function. The system inserts implicit casts where necessary.
>>>>>>>
>>>>>>> If you’re generating Java code (in Enumerable convention) there is a lot of difference between INTEGER NOT NULL and INTEGER. The former can be represented by int, the latter only by java.lang.Integer.
>>>>>>>
>>>>>>> Julian
>>>>>>>
>>>>>>>
>>>>>>>> On Aug 23, 2017, at 2:08 AM, Christian Beikov <ch...@gmail.com> wrote:
>>>>>>>>
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> I came to notice that the use of a nullable column in a context where a non-nullable is expected, causes a cast that is essentially useless. Am I missing something and is there reason to why that is done?
>>>>>>>>
>>>>>>>> -- 
>>>>>>>>
>>>>>>>> Mit freundlichen Grüßen,
>>>>>>>> ------------------------------------------------------------------------
>>>>>>>> *Christian Beikov*


Re: Casts because of nullability differences

Posted by Julian Hyde <jh...@apache.org>.
Optimized away later. In fact during SQL generation. SQL has no concept of “casting away nullability”.

> On Aug 23, 2017, at 6:10 PM, Christian Beikov <ch...@gmail.com> wrote:
> 
> RelOptMaterialization creates the cast which calls eventually into RexUtil.generateCastExpressions. That method, probably rightfully, is strict about nulls. The question is, whether the cast should be optimized away later or it shouldn't be created in the first place. What do you say?
> 
> 
> Mit freundlichen Grüßen,
> ------------------------------------------------------------------------
> *Christian Beikov*
> Am 24.08.2017 um 02:05 schrieb Julian Hyde:
>> You didn’t say that the casts were ending up in generated SQL. That would be a bug. Probably in RelToSqlConverter.
>> 
>> 
>>> On Aug 23, 2017, at 4:35 PM, Christian Beikov <ch...@gmail.com> wrote:
>>> 
>>> I get that the cast is done "internally" to handle nulls properly, but why would such a cast end up in the SQL that gets sent to the DBMS? Can you point me to a direction where to look for these generated casts? I'd like to try if I can avoid rendering these casts, as some DBMS planners might get confused.
>>> 
>>> 
>>> Mit freundlichen Grüßen,
>>> ------------------------------------------------------------------------
>>> *Christian Beikov*
>>> Am 23.08.2017 um 21:20 schrieb Julian Hyde:
>>>> It’s difficult to say for sure whether the cast is necessary. For some of us consistency, is an end in itself. :)
>>>> 
>>>> If a null was to occur, then yes, the cast would throw. But it’s also possible that Calcite can deduce that a null is impossible, in which case it wouldn’t even generate the logic.
>>>> 
>>>> I have this argument with Ashutosh from time to time. He tends to say we can never be sure which expressions might or might not produce nulls, so why bother being so strict? I’d rather use every piece of information we can get, because it might allow us to optimize.
>>>> 
>>>> By the way, there are cases where we can deduce that columns are not-nullable but we cannot change the row-type due to the RelOptRule contract. We ought to be generating RelMdPredicates in these cases.
>>>> 
>>>> Julian
>>>> 
>>>> 
>>>> 
>>>>> On Aug 23, 2017, at 11:33 AM, Christian Beikov <ch...@gmail.com> wrote:
>>>>> 
>>>>> I am using materializations and my materialization table has nullable columns. Calcite correctly determines that the underlying query can never produce null for these columns, but I didn't encode that into the table definitino. I was just wondering why the casts are generated in the SQL that is sent to the owner of the materialization table.
>>>>> 
>>>>> Thanks for the explanation, I suppose the cast causes proper exceptions to be thrown when null is encountered instead of simply a NPE? Or is the cast in such a case really unnecessary?
>>>>> 
>>>>> 
>>>>> Mit freundlichen Grüßen,
>>>>> ------------------------------------------------------------------------
>>>>> *Christian Beikov*
>>>>> Am 23.08.2017 um 20:01 schrieb Julian Hyde:
>>>>>> I presume you’re talking about RexNodes. It makes a lot of things easier if the argument to function call are EXACTLY the type required by that function. The system inserts implicit casts where necessary.
>>>>>> 
>>>>>> If you’re generating Java code (in Enumerable convention) there is a lot of difference between INTEGER NOT NULL and INTEGER. The former can be represented by int, the latter only by java.lang.Integer.
>>>>>> 
>>>>>> Julian
>>>>>> 
>>>>>> 
>>>>>>> On Aug 23, 2017, at 2:08 AM, Christian Beikov <ch...@gmail.com> wrote:
>>>>>>> 
>>>>>>> Hello,
>>>>>>> 
>>>>>>> I came to notice that the use of a nullable column in a context where a non-nullable is expected, causes a cast that is essentially useless. Am I missing something and is there reason to why that is done?
>>>>>>> 
>>>>>>> -- 
>>>>>>> 
>>>>>>> Mit freundlichen Grüßen,
>>>>>>> ------------------------------------------------------------------------
>>>>>>> *Christian Beikov*
> 


Re: Casts because of nullability differences

Posted by Christian Beikov <ch...@gmail.com>.
RelOptMaterialization creates the cast which calls eventually into 
RexUtil.generateCastExpressions. That method, probably rightfully, is 
strict about nulls. The question is, whether the cast should be 
optimized away later or it shouldn't be created in the first place. What 
do you say?


Mit freundlichen Grüßen,
------------------------------------------------------------------------
*Christian Beikov*
Am 24.08.2017 um 02:05 schrieb Julian Hyde:
> You didn’t say that the casts were ending up in generated SQL. That would be a bug. Probably in RelToSqlConverter.
>
>
>> On Aug 23, 2017, at 4:35 PM, Christian Beikov <ch...@gmail.com> wrote:
>>
>> I get that the cast is done "internally" to handle nulls properly, but why would such a cast end up in the SQL that gets sent to the DBMS? Can you point me to a direction where to look for these generated casts? I'd like to try if I can avoid rendering these casts, as some DBMS planners might get confused.
>>
>>
>> Mit freundlichen Grüßen,
>> ------------------------------------------------------------------------
>> *Christian Beikov*
>> Am 23.08.2017 um 21:20 schrieb Julian Hyde:
>>> It’s difficult to say for sure whether the cast is necessary. For some of us consistency, is an end in itself. :)
>>>
>>> If a null was to occur, then yes, the cast would throw. But it’s also possible that Calcite can deduce that a null is impossible, in which case it wouldn’t even generate the logic.
>>>
>>> I have this argument with Ashutosh from time to time. He tends to say we can never be sure which expressions might or might not produce nulls, so why bother being so strict? I’d rather use every piece of information we can get, because it might allow us to optimize.
>>>
>>> By the way, there are cases where we can deduce that columns are not-nullable but we cannot change the row-type due to the RelOptRule contract. We ought to be generating RelMdPredicates in these cases.
>>>
>>> Julian
>>>
>>>
>>>
>>>> On Aug 23, 2017, at 11:33 AM, Christian Beikov <ch...@gmail.com> wrote:
>>>>
>>>> I am using materializations and my materialization table has nullable columns. Calcite correctly determines that the underlying query can never produce null for these columns, but I didn't encode that into the table definitino. I was just wondering why the casts are generated in the SQL that is sent to the owner of the materialization table.
>>>>
>>>> Thanks for the explanation, I suppose the cast causes proper exceptions to be thrown when null is encountered instead of simply a NPE? Or is the cast in such a case really unnecessary?
>>>>
>>>>
>>>> Mit freundlichen Grüßen,
>>>> ------------------------------------------------------------------------
>>>> *Christian Beikov*
>>>> Am 23.08.2017 um 20:01 schrieb Julian Hyde:
>>>>> I presume you’re talking about RexNodes. It makes a lot of things easier if the argument to function call are EXACTLY the type required by that function. The system inserts implicit casts where necessary.
>>>>>
>>>>> If you’re generating Java code (in Enumerable convention) there is a lot of difference between INTEGER NOT NULL and INTEGER. The former can be represented by int, the latter only by java.lang.Integer.
>>>>>
>>>>> Julian
>>>>>
>>>>>
>>>>>> On Aug 23, 2017, at 2:08 AM, Christian Beikov <ch...@gmail.com> wrote:
>>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> I came to notice that the use of a nullable column in a context where a non-nullable is expected, causes a cast that is essentially useless. Am I missing something and is there reason to why that is done?
>>>>>>
>>>>>> -- 
>>>>>>
>>>>>> Mit freundlichen Grüßen,
>>>>>> ------------------------------------------------------------------------
>>>>>> *Christian Beikov*


Re: Casts because of nullability differences

Posted by Julian Hyde <jh...@apache.org>.
You didn’t say that the casts were ending up in generated SQL. That would be a bug. Probably in RelToSqlConverter.


> On Aug 23, 2017, at 4:35 PM, Christian Beikov <ch...@gmail.com> wrote:
> 
> I get that the cast is done "internally" to handle nulls properly, but why would such a cast end up in the SQL that gets sent to the DBMS? Can you point me to a direction where to look for these generated casts? I'd like to try if I can avoid rendering these casts, as some DBMS planners might get confused.
> 
> 
> Mit freundlichen Grüßen,
> ------------------------------------------------------------------------
> *Christian Beikov*
> Am 23.08.2017 um 21:20 schrieb Julian Hyde:
>> It’s difficult to say for sure whether the cast is necessary. For some of us consistency, is an end in itself. :)
>> 
>> If a null was to occur, then yes, the cast would throw. But it’s also possible that Calcite can deduce that a null is impossible, in which case it wouldn’t even generate the logic.
>> 
>> I have this argument with Ashutosh from time to time. He tends to say we can never be sure which expressions might or might not produce nulls, so why bother being so strict? I’d rather use every piece of information we can get, because it might allow us to optimize.
>> 
>> By the way, there are cases where we can deduce that columns are not-nullable but we cannot change the row-type due to the RelOptRule contract. We ought to be generating RelMdPredicates in these cases.
>> 
>> Julian
>> 
>> 
>> 
>>> On Aug 23, 2017, at 11:33 AM, Christian Beikov <ch...@gmail.com> wrote:
>>> 
>>> I am using materializations and my materialization table has nullable columns. Calcite correctly determines that the underlying query can never produce null for these columns, but I didn't encode that into the table definitino. I was just wondering why the casts are generated in the SQL that is sent to the owner of the materialization table.
>>> 
>>> Thanks for the explanation, I suppose the cast causes proper exceptions to be thrown when null is encountered instead of simply a NPE? Or is the cast in such a case really unnecessary?
>>> 
>>> 
>>> Mit freundlichen Grüßen,
>>> ------------------------------------------------------------------------
>>> *Christian Beikov*
>>> Am 23.08.2017 um 20:01 schrieb Julian Hyde:
>>>> I presume you’re talking about RexNodes. It makes a lot of things easier if the argument to function call are EXACTLY the type required by that function. The system inserts implicit casts where necessary.
>>>> 
>>>> If you’re generating Java code (in Enumerable convention) there is a lot of difference between INTEGER NOT NULL and INTEGER. The former can be represented by int, the latter only by java.lang.Integer.
>>>> 
>>>> Julian
>>>> 
>>>> 
>>>>> On Aug 23, 2017, at 2:08 AM, Christian Beikov <ch...@gmail.com> wrote:
>>>>> 
>>>>> Hello,
>>>>> 
>>>>> I came to notice that the use of a nullable column in a context where a non-nullable is expected, causes a cast that is essentially useless. Am I missing something and is there reason to why that is done?
>>>>> 
>>>>> -- 
>>>>> 
>>>>> Mit freundlichen Grüßen,
>>>>> ------------------------------------------------------------------------
>>>>> *Christian Beikov*
> 


Re: Casts because of nullability differences

Posted by Christian Beikov <ch...@gmail.com>.
I get that the cast is done "internally" to handle nulls properly, but 
why would such a cast end up in the SQL that gets sent to the DBMS? Can 
you point me to a direction where to look for these generated casts? I'd 
like to try if I can avoid rendering these casts, as some DBMS planners 
might get confused.


Mit freundlichen Grüßen,
------------------------------------------------------------------------
*Christian Beikov*
Am 23.08.2017 um 21:20 schrieb Julian Hyde:
> It’s difficult to say for sure whether the cast is necessary. For some of us consistency, is an end in itself. :)
>
> If a null was to occur, then yes, the cast would throw. But it’s also possible that Calcite can deduce that a null is impossible, in which case it wouldn’t even generate the logic.
>
> I have this argument with Ashutosh from time to time. He tends to say we can never be sure which expressions might or might not produce nulls, so why bother being so strict? I’d rather use every piece of information we can get, because it might allow us to optimize.
>
> By the way, there are cases where we can deduce that columns are not-nullable but we cannot change the row-type due to the RelOptRule contract. We ought to be generating RelMdPredicates in these cases.
>
> Julian
>
>
>
>> On Aug 23, 2017, at 11:33 AM, Christian Beikov <ch...@gmail.com> wrote:
>>
>> I am using materializations and my materialization table has nullable columns. Calcite correctly determines that the underlying query can never produce null for these columns, but I didn't encode that into the table definitino. I was just wondering why the casts are generated in the SQL that is sent to the owner of the materialization table.
>>
>> Thanks for the explanation, I suppose the cast causes proper exceptions to be thrown when null is encountered instead of simply a NPE? Or is the cast in such a case really unnecessary?
>>
>>
>> Mit freundlichen Grüßen,
>> ------------------------------------------------------------------------
>> *Christian Beikov*
>> Am 23.08.2017 um 20:01 schrieb Julian Hyde:
>>> I presume you’re talking about RexNodes. It makes a lot of things easier if the argument to function call are EXACTLY the type required by that function. The system inserts implicit casts where necessary.
>>>
>>> If you’re generating Java code (in Enumerable convention) there is a lot of difference between INTEGER NOT NULL and INTEGER. The former can be represented by int, the latter only by java.lang.Integer.
>>>
>>> Julian
>>>
>>>
>>>> On Aug 23, 2017, at 2:08 AM, Christian Beikov <ch...@gmail.com> wrote:
>>>>
>>>> Hello,
>>>>
>>>> I came to notice that the use of a nullable column in a context where a non-nullable is expected, causes a cast that is essentially useless. Am I missing something and is there reason to why that is done?
>>>>
>>>> -- 
>>>>
>>>> Mit freundlichen Grüßen,
>>>> ------------------------------------------------------------------------
>>>> *Christian Beikov*


Re: Casts because of nullability differences

Posted by Julian Hyde <jh...@apache.org>.
It’s difficult to say for sure whether the cast is necessary. For some of us consistency, is an end in itself. :)

If a null was to occur, then yes, the cast would throw. But it’s also possible that Calcite can deduce that a null is impossible, in which case it wouldn’t even generate the logic.

I have this argument with Ashutosh from time to time. He tends to say we can never be sure which expressions might or might not produce nulls, so why bother being so strict? I’d rather use every piece of information we can get, because it might allow us to optimize.

By the way, there are cases where we can deduce that columns are not-nullable but we cannot change the row-type due to the RelOptRule contract. We ought to be generating RelMdPredicates in these cases.

Julian



> On Aug 23, 2017, at 11:33 AM, Christian Beikov <ch...@gmail.com> wrote:
> 
> I am using materializations and my materialization table has nullable columns. Calcite correctly determines that the underlying query can never produce null for these columns, but I didn't encode that into the table definitino. I was just wondering why the casts are generated in the SQL that is sent to the owner of the materialization table.
> 
> Thanks for the explanation, I suppose the cast causes proper exceptions to be thrown when null is encountered instead of simply a NPE? Or is the cast in such a case really unnecessary?
> 
> 
> Mit freundlichen Grüßen,
> ------------------------------------------------------------------------
> *Christian Beikov*
> Am 23.08.2017 um 20:01 schrieb Julian Hyde:
>> I presume you’re talking about RexNodes. It makes a lot of things easier if the argument to function call are EXACTLY the type required by that function. The system inserts implicit casts where necessary.
>> 
>> If you’re generating Java code (in Enumerable convention) there is a lot of difference between INTEGER NOT NULL and INTEGER. The former can be represented by int, the latter only by java.lang.Integer.
>> 
>> Julian
>> 
>> 
>>> On Aug 23, 2017, at 2:08 AM, Christian Beikov <ch...@gmail.com> wrote:
>>> 
>>> Hello,
>>> 
>>> I came to notice that the use of a nullable column in a context where a non-nullable is expected, causes a cast that is essentially useless. Am I missing something and is there reason to why that is done?
>>> 
>>> -- 
>>> 
>>> Mit freundlichen Grüßen,
>>> ------------------------------------------------------------------------
>>> *Christian Beikov*
> 


Re: Casts because of nullability differences

Posted by Christian Beikov <ch...@gmail.com>.
I am using materializations and my materialization table has nullable 
columns. Calcite correctly determines that the underlying query can 
never produce null for these columns, but I didn't encode that into the 
table definitino. I was just wondering why the casts are generated in 
the SQL that is sent to the owner of the materialization table.

Thanks for the explanation, I suppose the cast causes proper exceptions 
to be thrown when null is encountered instead of simply a NPE? Or is the 
cast in such a case really unnecessary?


Mit freundlichen Grüßen,
------------------------------------------------------------------------
*Christian Beikov*
Am 23.08.2017 um 20:01 schrieb Julian Hyde:
> I presume you’re talking about RexNodes. It makes a lot of things easier if the argument to function call are EXACTLY the type required by that function. The system inserts implicit casts where necessary.
>
> If you’re generating Java code (in Enumerable convention) there is a lot of difference between INTEGER NOT NULL and INTEGER. The former can be represented by int, the latter only by java.lang.Integer.
>
> Julian
>
>
>> On Aug 23, 2017, at 2:08 AM, Christian Beikov <ch...@gmail.com> wrote:
>>
>> Hello,
>>
>> I came to notice that the use of a nullable column in a context where a non-nullable is expected, causes a cast that is essentially useless. Am I missing something and is there reason to why that is done?
>>
>> -- 
>>
>> Mit freundlichen Grüßen,
>> ------------------------------------------------------------------------
>> *Christian Beikov*


Re: Casts because of nullability differences

Posted by Julian Hyde <jh...@apache.org>.
I presume you’re talking about RexNodes. It makes a lot of things easier if the argument to function call are EXACTLY the type required by that function. The system inserts implicit casts where necessary.

If you’re generating Java code (in Enumerable convention) there is a lot of difference between INTEGER NOT NULL and INTEGER. The former can be represented by int, the latter only by java.lang.Integer. 

Julian


> On Aug 23, 2017, at 2:08 AM, Christian Beikov <ch...@gmail.com> wrote:
> 
> Hello,
> 
> I came to notice that the use of a nullable column in a context where a non-nullable is expected, causes a cast that is essentially useless. Am I missing something and is there reason to why that is done?
> 
> -- 
> 
> Mit freundlichen Grüßen,
> ------------------------------------------------------------------------
> *Christian Beikov*