You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@drill.apache.org by "Aman Sinha (JIRA)" <ji...@apache.org> on 2015/04/29 06:02:06 UTC

[jira] [Resolved] (DRILL-1957) NULL values in NOT IN sub-queries could lead to incorrect query result.

     [ https://issues.apache.org/jira/browse/DRILL-1957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aman Sinha resolved DRILL-1957.
-------------------------------
    Resolution: Fixed

Fixed in 3a232d81e.

> NULL values in NOT IN sub-queries could lead to incorrect query result. 
> ------------------------------------------------------------------------
>
>                 Key: DRILL-1957
>                 URL: https://issues.apache.org/jira/browse/DRILL-1957
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>            Reporter: Jinfeng Ni
>            Assignee: Aman Sinha
>            Priority: Critical
>             Fix For: 0.9.0
>
>
> NULL values in NOT IN sub-queries will disqualify all the rows, and hence return 0 row for a query. However, Drill could return some rows, due to issue in the query planning.
> For example, consider the following sample data :
> {code}
> select deptno from dfs.`/Users/jni/work/data/json/e.json` ;
> +------------+
> |   deptno   |
> +------------+
> | 10         |
> | 20         |
> | null       |
> +------------+
> 3 rows selected (0.062 seconds)
> 0: jdbc:drill:zk=local> select deptno from dfs.`/Users/jni/work/data/json/d.json` ;
> +------------+
> |   deptno   |
> +------------+
> | 10         |
> | null       |
> +------------+
> 2 rows selected (0.062 seconds)
> {code}
> The following query should return 0 row, but Drill actually return 1 row.
> {code}
> select deptno from dfs.`/Users/jni/work/data/json/e.json` where deptno not in (select deptno from dfs.`/Users/jni/work/data/json/d.json`);
> +------------+
> |   deptno   |
> +------------+
> | 20         |
> +------------+
> 1 row selected (0.286 seconds)
> {code}
> In comparison, here is the result run on posture.
> {code}
> mydb=# select * from e;
>  deptno | ename
> --------+-------
>      10 | Alice
>    NULL | Tom
>      20 | Bob
> (3 rows)
> mydb=# delete from d where deptno = 20;
> DELETE 1
> mydb=# select * from d;
>  deptno | dname
> --------+-------
>      10 | Sales
>    NULL | Eng
> (2 rows)
> mydb=# select deptno from e where deptno not in (select deptno from d);
>  deptno
> --------
> (0 rows)
> {code}
> The fix seems to be in OPTIQ-373, which will produce a valid plan. However, such plan contain a cross-join, which is not supported Drill, and hence hit CanNotPlanException in Drill.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)