You are viewing a plain text version of this content. The canonical link for it is here.
Posted to derby-dev@db.apache.org by "Knut Anders Hatlen (JIRA)" <ji...@apache.org> on 2014/05/11 00:16:36 UTC
[jira] [Updated] (DERBY-6566) Simplify handling of untyped nulls in CASE and NULLIF expressions

     [ https://issues.apache.org/jira/browse/DERBY-6566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Knut Anders Hatlen updated DERBY-6566:
--------------------------------------

    Attachment: d6566-1a.diff

Attaching [^d6566-1a.diff], which simplifies the handling of untyped NULLs in CASE and NULLIF expressions. The patch removes 245 lines more from the engine code than it adds to it. It is primarily ConditionalNode that's reduced in size.

First I'd like to point out that there is one behaviour change introduced by the patch. In Derby versions from 10.3 up to current, CASE expressions where one branch has a typed NULL of a type not compatible with the expression in the other branch, such as {{CASE WHEN 1=1 THEN 'abc' ELSE CAST(NULL AS SMALLINT) END}}, are silently rewritten during the bind phase so that {{CAST(NULL AS SMALLINT)}} becomes {{CAST(NULL AS CHAR(3))}}. The THEN branch and the ELSE branch are believed to have compatible types because of this rewrite, and the expression is successfully evaluated.

With the patch, this rewrite no longer happens, and the behaviour is reverted to what it was in 10.2 and earlier. Such an expression will now cause the following error to be raised:

{noformat}
ERROR 42X89: Types 'CHAR' and 'SMALLINT' are not type compatible. Neither type is assignable to the other type.
{noformat}

I believe 10.3 and higher accept such expressions because of an unintended fallout caused by the fix for DERBY-1620. That fix was supposed to make untyped NULLs get their type from the context, but ended up changing the type of already typed NULLs as well.

Here's a description of what the patch does:

1) It makes CASE expression represent untyped NULLs as UntypedNullConstantNodes, which is the same as NULLIF already does. This allows ConditionalNode to handle CASE and NULLIF exactly the same way.

2) It changes how the parser builds the AST so that a single CASE expression is now represented by a single ConditionalNode.

Previously, an expression such as {{CASE WHEN a THEN b WHEN c THEN d ELSE e END}} would be represented by two nested ConditionalNodes, as if the expression had actually been {{CASE WHEN a THEN b ELSE (CASE WHEN c THEN d ELSE e END) END}}. Those two expressions aren't always completely equivalent.

Take for example {{CASE WHEN a THEN 1 WHEN b THEN NULL ELSE NULL END}}, which would be represented as {{CASE WHEN a THEN 1 ELSE (CASE WHEN b THEN NULL ELSE NULL END) END}}. The inner conditional node in the rewritten expression is {{CASE WHEN b THEN NULL ELSE NULL}}, which has no information about what the return type is. Because of this, the current code needs some extra complexity to transfer type information from the outer ConditionalNode to the inner ConditionalNode. The patch is able to remove much of this complexity because it has information about all the branches of the CASE expression in one ConditionalNode.

Another benefit from this change, is that the AST is not so deeply nested, which reduces the risk of getting a StackOverflowError during compilation of CASE expressions with lots of WHEN clauses.

3) It changes some of the CASE expressions in the ODBC metadata queries. Some CASTs to the incorrect type had sneaked into the queries, and the previously described behaviour change revealed those bugs.

4) Add a test case to verify that some edge cases still work as before.


All the regression tests passed with the patch.

> Simplify handling of untyped nulls in CASE and NULLIF expressions
> -----------------------------------------------------------------
>
>                 Key: DERBY-6566
>                 URL: https://issues.apache.org/jira/browse/DERBY-6566
>             Project: Derby
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 11.0.0.0
>            Reporter: Knut Anders Hatlen
>            Assignee: Knut Anders Hatlen
>            Priority: Minor
>         Attachments: d6566-1a.diff
>
>
> The parser translates both CASE and NULLIF expressions into ConditionalNodes, but it represents untyped NULLs differently in the two cases.
> In a CASE expression, any branch that is an untyped NULL, is translated into an UntypedNullConstantNode that's wrapped in a CastNode that casts the value to CHAR(1). The CastNode is replaced with a cast to the correct type during the bind phase.
> A NULLIF expression is turned into a CASE expression that has a THEN NULL clause. The parser simply creates an UntypedNullConstantNode for that clause, without wrapping it in a CastNode. A CastNode is instead added during the bind phase.
> This slight difference in how NULLs are represented by the parser in the two cases, means that ConditionalNode needs to handle the two cases differently during the bind phase. It would be better if the parser generated NULLs in the same way for the two cases, so that ConditionalNode didn't need to know if it was generated for a CASE expression or a NULLIF expression.



--
This message was sent by Atlassian JIRA
(v6.2#6252)