Posted to issues@drill.apache.org by "Bohdan Kazydub (JIRA)" <ji...@apache.org> on 2018/10/29 13:59:00 UTC

[jira] [Updated] (DRILL-6815) Improve code generation to handle functions with NullHandling.NULL_IF_NULL better

     [ https://issues.apache.org/jira/browse/DRILL-6815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bohdan Kazydub updated DRILL-6815:
----------------------------------
    Description: 
If a (simple) function is declared with the NULL_IF_NULL null handling strategy (`nulls = NullHandling.NULL_IF_NULL`), additional code is generated that checks whether any of the inputs is NULL (not set). If one is, the output is set to NULL; otherwise the function's code is executed and, at the end, the output value is marked as set if ANY of the inputs is OPTIONAL (see https://github.com/apache/drill/blob/8edeb49873d1a1710cfe28e0b49364d07eb1aef4/exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/DrillSimpleFuncHolder.java#L143).
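
As a rough illustration of the flow described above, the generated wrapper can be pictured as the plain-Java sketch below. The holder class and all names in it (`IntHolderSketch`, `NullIfNullFlowSketch`, `call`) are hypothetical stand-ins chosen for this example; this is not Drill's actual generated code or one of its holder classes.

```java
// Hypothetical, simplified stand-in for a nullable value holder.
final class IntHolderSketch {
    int isSet;   // 1 = value present, 0 = NULL
    int value;

    static IntHolderSketch of(int v) {
        IntHolderSketch h = new IntHolderSketch();
        h.isSet = 1;
        h.value = v;
        return h;
    }

    static IntHolderSketch ofNull() {
        return new IntHolderSketch(); // isSet defaults to 0
    }
}

final class NullIfNullFlowSketch {
    // Stand-in for a user function's eval() body: adds two inputs.
    static void eval(IntHolderSketch a, IntHolderSketch b, IntHolderSketch out) {
        out.value = a.value + b.value;
    }

    // Assumed shape of the wrapper: guard on NULL inputs, run the body,
    // then mark the output as set.
    static IntHolderSketch call(IntHolderSketch a, IntHolderSketch b) {
        IntHolderSketch out = new IntHolderSketch();
        if (a.isSet == 0 || b.isSet == 0) {
            out.isSet = 0;          // any NULL input: output is NULL, body skipped
        } else {
            eval(a, b, out);        // the function's own code
            out.isSet = 1;          // output marked as set AFTER the body ran
        }
        return out;
    }

    public static void main(String[] args) {
        IntHolderSketch sum = call(IntHolderSketch.of(2), IntHolderSketch.of(3));
        System.out.println("isSet=" + sum.isSet + " value=" + sum.value);
        IntHolderSketch nul = call(IntHolderSketch.ofNull(), IntHolderSketch.of(3));
        System.out.println("isSet=" + nul.isSet);
    }
}
```

Because the sketch sets `out.isSet` only after `eval` returns, anything the body itself writes to `out.isSet` is overwritten.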

The problem is that this behavior makes it impossible to set the output value to NULL from within the function's evaluation body, which may prove useful in certain situations, e.g. when the input is an empty string and the output should be NULL in that case. Sometimes this results in the creation of two separate functions with NullHandling.INTERNAL (one for OPTIONAL and one for REQUIRED inputs) instead of one with NULL_IF_NULL. This does not follow the Principle of Least Astonishment, as the strategy effectively behaves more like "null if and only if null", while the documentation for NULL_IF_NULL is as follows:
{code}
enum NullHandling {
    ...

    /**
     * Null output if any null input:
     * Indicates that a method's associated logical operation returns NULL if
     * either input is NULL, and therefore that the method must not be called
     * with null inputs.  (The calling framework must handle NULLs.)
     */
    NULL_IF_NULL
}
{code}
It looks as if this behavior was not intended.

The intent of this improvement is to allow a function's eval() method to produce NULL output values when the NULL_IF_NULL null handling strategy is chosen.
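
To make the difference concrete, here is a small plain-Java sketch of the empty-string case, contrasting a wrapper that marks the output as set after the body with one that marks it before the body. Setting the flag before the body is only one possible way to reach the stated intent and is an assumption of this example, not a description of the eventual fix. All names (`StringHolderSketch`, `EmptyToNullSketch`, `currentWrapper`, `proposedWrapper`) are hypothetical and not part of Drill.

```java
// Hypothetical, simplified stand-in for a nullable string holder.
final class StringHolderSketch {
    int isSet;      // 1 = value present, 0 = NULL
    String value;

    static StringHolderSketch of(String s) {
        StringHolderSketch h = new StringHolderSketch();
        h.isSet = 1;
        h.value = s;
        return h;
    }

    static StringHolderSketch ofNull() {
        return new StringHolderSketch(); // isSet defaults to 0
    }
}

final class EmptyToNullSketch {
    // Function body that wants to return NULL for an empty input string.
    static void eval(StringHolderSketch in, StringHolderSketch out) {
        if (in.value.isEmpty()) {
            out.isSet = 0;                      // request a NULL result
        } else {
            out.value = in.value.toUpperCase();
        }
    }

    // Flag set AFTER the body: the body's NULL request is overwritten.
    static StringHolderSketch currentWrapper(StringHolderSketch in) {
        StringHolderSketch out = new StringHolderSketch();
        if (in.isSet == 0) {
            out.isSet = 0;
        } else {
            eval(in, out);
            out.isSet = 1;
        }
        return out;
    }

    // Flag set BEFORE the body (assumed approach): the body may reset it to 0.
    static StringHolderSketch proposedWrapper(StringHolderSketch in) {
        StringHolderSketch out = new StringHolderSketch();
        if (in.isSet == 0) {
            out.isSet = 0;
        } else {
            out.isSet = 1;
            eval(in, out);
        }
        return out;
    }

    public static void main(String[] args) {
        StringHolderSketch empty = StringHolderSketch.of("");
        System.out.println("current:  isSet=" + currentWrapper(empty).isSet);  // prints "current:  isSet=1"
        System.out.println("proposed: isSet=" + proposedWrapper(empty).isSet); // prints "proposed: isSet=0"
    }
}
```

In both variants a NULL input still short-circuits to a NULL output, so the documented "null output if any null input" contract is unchanged; only the body's ability to produce NULL differs.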

  was:
If a (simple) function is declared with the NULL_IF_NULL null handling strategy (`nulls = NullHandling.NULL_IF_NULL`), additional code is generated that checks whether any of the inputs is NULL (not set). If one is, the output is set to NULL; otherwise the function's code is executed and, at the end, the output value is marked as set if ANY of the inputs is OPTIONAL (see [https://github.com/apache/drill/blob/8edeb49873d1a1710cfe28e0b49364d07eb1aef4/exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/DrillSimpleFuncHolder.java#L143).]

The problem is that this behavior makes it impossible to set the output value to NULL from within the [function's evaluation body|https://github.com/apache/drill/blob/7b0c9034753a8c5035fd1c0f1f84a37b376e6748/exec/java-exec/src/main/java/org/apache/drill/exec/expr/DrillSimpleFunc.java#L22], which may prove useful in certain situations, e.g. when the input is an empty string and the output should be NULL in that case. Sometimes this results in two separate functions instead of one with NULL_IF_NULL. This does not follow the [Principle of Least Astonishment|https://en.wikipedia.org/wiki/Principle_of_least_astonishment], as the strategy effectively behaves more like "null if and only if null" and documentation for NULL_IF_NULL is currently


> Improve code generation to handle functions with NullHandling.NULL_IF_NULL better
> ---------------------------------------------------------------------------------
>
>                 Key: DRILL-6815
>                 URL: https://issues.apache.org/jira/browse/DRILL-6815
>             Project: Apache Drill
>          Issue Type: Improvement
>            Reporter: Bohdan Kazydub
>            Priority: Major
>
> If a (simple) function is declared with the NULL_IF_NULL null handling strategy (`nulls = NullHandling.NULL_IF_NULL`), additional code is generated that checks whether any of the inputs is NULL (not set). If one is, the output is set to NULL; otherwise the function's code is executed and, at the end, the output value is marked as set if ANY of the inputs is OPTIONAL (see https://github.com/apache/drill/blob/8edeb49873d1a1710cfe28e0b49364d07eb1aef4/exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/DrillSimpleFuncHolder.java#L143).
> The problem is that this behavior makes it impossible to set the output value to NULL from within the function's evaluation body, which may prove useful in certain situations, e.g. when the input is an empty string and the output should be NULL in that case. Sometimes this results in the creation of two separate functions with NullHandling.INTERNAL (one for OPTIONAL and one for REQUIRED inputs) instead of one with NULL_IF_NULL. This does not follow the Principle of Least Astonishment, as the strategy effectively behaves more like "null if and only if null", while the documentation for NULL_IF_NULL is as follows:
> {code}
> enum NullHandling {
>     ...
>     /**
>      * Null output if any null input:
>      * Indicates that a method's associated logical operation returns NULL if
>      * either input is NULL, and therefore that the method must not be called
>      * with null inputs.  (The calling framework must handle NULLs.)
>      */
>     NULL_IF_NULL
> }
> {code}
> It looks as if this behavior was not intended.
> The intent of this improvement is to allow a function's eval() method to produce NULL output values when the NULL_IF_NULL null handling strategy is chosen.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)