You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@phoenix.apache.org by "William Shen (JIRA)" <ji...@apache.org> on 2018/12/08 01:16:00 UTC

[jira] [Comment Edited] (PHOENIX-5065) Inconsistent treatment of NULL and empty string

    [ https://issues.apache.org/jira/browse/PHOENIX-5065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16713453#comment-16713453 ] 

William Shen edited comment on PHOENIX-5065 at 12/8/18 1:15 AM:
----------------------------------------------------------------

With the explain plan, it seems like when there are multiple values involved, the IN operator translates IS NULL instead of = NULL for the empty string?
{noformat}
0: jdbc:phoenix:labs-darth-journalnode-lv-101> explain SELECT COUNT(*) FROM SYSTEM.CATALOG WHERE TENANT_ID = '';
+--------------------------------------+-----------------+----------------+--------------+
|                 PLAN                 | EST_BYTES_READ  | EST_ROWS_READ  | EST_INFO_TS  |
+--------------------------------------+-----------------+----------------+--------------+
| DEGENERATE SCAN OVER SYSTEM.CATALOG  | null            | null           | null         |
+--------------------------------------+-----------------+----------------+--------------+
1 row selected (0.022 seconds)
0: jdbc:phoenix:labs-darth-journalnode-lv-101> explain SELECT COUNT(*) FROM SYSTEM.CATALOG WHERE TENANT_ID = null;
+--------------------------------------+-----------------+----------------+--------------+
|                 PLAN                 | EST_BYTES_READ  | EST_ROWS_READ  | EST_INFO_TS  |
+--------------------------------------+-----------------+----------------+--------------+
| DEGENERATE SCAN OVER SYSTEM.CATALOG  | null            | null           | null         |
+--------------------------------------+-----------------+----------------+--------------+
1 row selected (0.027 seconds)
0: jdbc:phoenix:labs-darth-journalnode-lv-101> explain SELECT COUNT(*) FROM SYSTEM.CATALOG WHERE TENANT_ID is null;
+----------------------------------------------------------------------+-----------------+----------------+--------------+
|                                 PLAN                                 | EST_BYTES_READ  | EST_ROWS_READ  | EST_INFO_TS  |
+----------------------------------------------------------------------+-----------------+----------------+--------------+
| CLIENT 1-CHUNK PARALLEL 1-WAY RANGE SCAN OVER SYSTEM.CATALOG [null]  | null            | null           | null         |
|     SERVER FILTER BY FIRST KEY ONLY                                  | null            | null           | null         |
|     SERVER AGGREGATE INTO SINGLE ROW                                 | null            | null           | null         |
+----------------------------------------------------------------------+-----------------+----------------+--------------+
3 rows selected (0.02 seconds)
0: jdbc:phoenix:labs-darth-journalnode-lv-101> explain SELECT COUNT(*) FROM SYSTEM.CATALOG WHERE TENANT_ID in (null);
+--------------------------------------+-----------------+----------------+--------------+
|                 PLAN                 | EST_BYTES_READ  | EST_ROWS_READ  | EST_INFO_TS  |
+--------------------------------------+-----------------+----------------+--------------+
| DEGENERATE SCAN OVER SYSTEM.CATALOG  | null            | null           | null         |
+--------------------------------------+-----------------+----------------+--------------+
1 row selected (0.04 seconds)
0: jdbc:phoenix:labs-darth-journalnode-lv-101> explain SELECT COUNT(*) FROM SYSTEM.CATALOG WHERE TENANT_ID in ('');
+--------------------------------------+-----------------+----------------+--------------+
|                 PLAN                 | EST_BYTES_READ  | EST_ROWS_READ  | EST_INFO_TS  |
+--------------------------------------+-----------------+----------------+--------------+
| DEGENERATE SCAN OVER SYSTEM.CATALOG  | null            | null           | null         |
+--------------------------------------+-----------------+----------------+--------------+
1 row selected (0.02 seconds)
0: jdbc:phoenix:labs-darth-journalnode-lv-101> explain SELECT COUNT(*) FROM SYSTEM.CATALOG WHERE TENANT_ID in ('', 'FOO');
+-----------------------------------------------------------------------------------------+-----------------+----------------+--------------+
|                                          PLAN                                           | EST_BYTES_READ  | EST_ROWS_READ  | EST_INFO_TS  |
+-----------------------------------------------------------------------------------------+-----------------+----------------+--------------+
| CLIENT 1-CHUNK PARALLEL 1-WAY SKIP SCAN ON 2 KEYS OVER SYSTEM.CATALOG [null] - ['FOO']  | null            | null           | null         |
|     SERVER FILTER BY FIRST KEY ONLY                                                     | null            | null           | null         |
|     SERVER AGGREGATE INTO SINGLE ROW                                                    | null            | null           | null         |
+-----------------------------------------------------------------------------------------+-----------------+----------------+--------------+
{noformat}


Instead of evaluating to 
{noformat}
0: jdbc:phoenix:labs-darth-journalnode-lv-101> explain SELECT COUNT(*) FROM SYSTEM.CATALOG WHERE TENANT_ID = '' OR TENANT_ID = 'FOO';
+------------------------------------------------------------------+-----------------+----------------+--------------+
|                               PLAN                               | EST_BYTES_READ  | EST_ROWS_READ  | EST_INFO_TS  |
+------------------------------------------------------------------+-----------------+----------------+--------------+
| CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER SYSTEM.CATALOG      | null            | null           | null         |
|     SERVER FILTER BY FIRST KEY ONLY AND ( OR TENANT_ID = 'FOO')  | null            | null           | null         |
|     SERVER AGGREGATE INTO SINGLE ROW                             | null            | null           | null         |
+------------------------------------------------------------------+-----------------+----------------+--------------+
3 rows selected (0.025 seconds)
{noformat}

It evaluated to
{noformat}
0: jdbc:phoenix:labs-darth-journalnode-lv-101> explain SELECT COUNT(*) FROM SYSTEM.CATALOG WHERE TENANT_ID IS NULL OR TENANT_ID = 'FOO';
+-----------------------------------------------------------------------------------------+-----------------+----------------+--------------+
|                                          PLAN                                           | EST_BYTES_READ  | EST_ROWS_READ  | EST_INFO_TS  |
+-----------------------------------------------------------------------------------------+-----------------+----------------+--------------+
| CLIENT 1-CHUNK PARALLEL 1-WAY SKIP SCAN ON 2 KEYS OVER SYSTEM.CATALOG [null] - ['FOO']  | null            | null           | null         |
|     SERVER FILTER BY FIRST KEY ONLY                                                     | null            | null           | null         |
|     SERVER AGGREGATE INTO SINGLE ROW                                                    | null            | null           | null         |
+-----------------------------------------------------------------------------------------+-----------------+----------------+--------------+
3 rows selected (0.021 seconds)

{noformat}


was (Author: willshen):
With the explain plan, it seems like when there are multiple values involved, the IN operator translates IS NULL instead of = NULL for the empty string?
{noformat}
0: jdbc:phoenix:labs-darth-journalnode-lv-101> explain SELECT COUNT(*) FROM SYSTEM.CATALOG WHERE TENANT_ID = '';
+--------------------------------------+-----------------+----------------+--------------+
|                 PLAN                 | EST_BYTES_READ  | EST_ROWS_READ  | EST_INFO_TS  |
+--------------------------------------+-----------------+----------------+--------------+
| DEGENERATE SCAN OVER SYSTEM.CATALOG  | null            | null           | null         |
+--------------------------------------+-----------------+----------------+--------------+
1 row selected (0.022 seconds)
0: jdbc:phoenix:labs-darth-journalnode-lv-101> explain SELECT COUNT(*) FROM SYSTEM.CATALOG WHERE TENANT_ID = null;
+--------------------------------------+-----------------+----------------+--------------+
|                 PLAN                 | EST_BYTES_READ  | EST_ROWS_READ  | EST_INFO_TS  |
+--------------------------------------+-----------------+----------------+--------------+
| DEGENERATE SCAN OVER SYSTEM.CATALOG  | null            | null           | null         |
+--------------------------------------+-----------------+----------------+--------------+
1 row selected (0.027 seconds)
0: jdbc:phoenix:labs-darth-journalnode-lv-101> explain SELECT COUNT(*) FROM SYSTEM.CATALOG WHERE TENANT_ID is null;
+----------------------------------------------------------------------+-----------------+----------------+--------------+
|                                 PLAN                                 | EST_BYTES_READ  | EST_ROWS_READ  | EST_INFO_TS  |
+----------------------------------------------------------------------+-----------------+----------------+--------------+
| CLIENT 1-CHUNK PARALLEL 1-WAY RANGE SCAN OVER SYSTEM.CATALOG [null]  | null            | null           | null         |
|     SERVER FILTER BY FIRST KEY ONLY                                  | null            | null           | null         |
|     SERVER AGGREGATE INTO SINGLE ROW                                 | null            | null           | null         |
+----------------------------------------------------------------------+-----------------+----------------+--------------+
3 rows selected (0.02 seconds)
0: jdbc:phoenix:labs-darth-journalnode-lv-101> explain SELECT COUNT(*) FROM SYSTEM.CATALOG WHERE TENANT_ID in (null);
+--------------------------------------+-----------------+----------------+--------------+
|                 PLAN                 | EST_BYTES_READ  | EST_ROWS_READ  | EST_INFO_TS  |
+--------------------------------------+-----------------+----------------+--------------+
| DEGENERATE SCAN OVER SYSTEM.CATALOG  | null            | null           | null         |
+--------------------------------------+-----------------+----------------+--------------+
1 row selected (0.04 seconds)
0: jdbc:phoenix:labs-darth-journalnode-lv-101> explain SELECT COUNT(*) FROM SYSTEM.CATALOG WHERE TENANT_ID in ('');
+--------------------------------------+-----------------+----------------+--------------+
|                 PLAN                 | EST_BYTES_READ  | EST_ROWS_READ  | EST_INFO_TS  |
+--------------------------------------+-----------------+----------------+--------------+
| DEGENERATE SCAN OVER SYSTEM.CATALOG  | null            | null           | null         |
+--------------------------------------+-----------------+----------------+--------------+
1 row selected (0.02 seconds)
0: jdbc:phoenix:labs-darth-journalnode-lv-101> explain SELECT COUNT(*) FROM SYSTEM.CATALOG WHERE TENANT_ID in ('', 'FOO');
+-----------------------------------------------------------------------------------------+-----------------+----------------+--------------+
|                                          PLAN                                           | EST_BYTES_READ  | EST_ROWS_READ  | EST_INFO_TS  |
+-----------------------------------------------------------------------------------------+-----------------+----------------+--------------+
| CLIENT 1-CHUNK PARALLEL 1-WAY SKIP SCAN ON 2 KEYS OVER SYSTEM.CATALOG [null] - ['FOO']  | null            | null           | null         |
|     SERVER FILTER BY FIRST KEY ONLY                                                     | null            | null           | null         |
|     SERVER AGGREGATE INTO SINGLE ROW                                                    | null            | null           | null         |
+-----------------------------------------------------------------------------------------+-----------------+----------------+--------------+
{noformat}

> Inconsistent treatment of NULL and empty string
> -----------------------------------------------
>
>                 Key: PHOENIX-5065
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-5065
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.14.1
>            Reporter: Geoffrey Jacoby
>            Priority: Major
>
> Phoenix doesn't handle NULLs consistently with other SQL dialects, and it doesn't handle them consistently internally either. 
> In PHOENIX-2422, [~jamestaylor] mentioned that Phoenix's intended behavior is for empty string and NULL to be equivalent. That's inconsistent with other SQL dialects (in which NULL is never equal to anything, including itself), but if that's our documented behavior, then that's fine unless PHOENIX-2422 to change it is ever worked. 
> But consider the following queries:
> {code:java}
> SELECT COUNT(*) FROM SYSTEM.CATALOG WHERE TENANT_ID = '';
> -- Returns 0 rows
> SELECT COUNT(*) FROM SYSTEM.CATALOG WHERE TENANT_ID IS NULL;
> -- Returns some number of rows. Call it N
> SELECT COUNT(*) FROM SYSTEM.CATALOG WHERE TENANT_ID IN ('');
> -- Returns 0 rows
> SELECT COUNT(*) FROM SYSTEM.CATALOG WHERE TENANT_ID IN ('', 'FOO');
> -- Returns N rows. Note that FOO does not exist, and is just a nonsense string
> SELECT COUNT(*) FROM SYSTEM.CATALOG WHERE TENANT_ID = '' OR TENANT_ID = 'FOO'
> --Returns 0 rows, but slowly
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)