You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@drill.apache.org by "Aman Sinha (JIRA)" <ji...@apache.org> on 2015/04/24 19:42:38 UTC
[jira] [Commented] (DRILL-1957) NULL values in NOT IN sub-queries
could lead to incorrect query result.
[ https://issues.apache.org/jira/browse/DRILL-1957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14511410#comment-14511410 ]
Aman Sinha commented on DRILL-1957:
-----------------------------------
Review board link for the patch: https://reviews.apache.org/r/33523/
Assigning to [~jni] for review.
> NULL values in NOT IN sub-queries could lead to incorrect query result.
> ------------------------------------------------------------------------
>
> Key: DRILL-1957
> URL: https://issues.apache.org/jira/browse/DRILL-1957
> Project: Apache Drill
> Issue Type: Bug
> Components: Query Planning & Optimization
> Reporter: Jinfeng Ni
> Assignee: Aman Sinha
> Priority: Critical
> Fix For: 0.9.0
>
>
> NULL values in NOT IN sub-queries will disqualify all the rows, and hence return 0 row for a query. However, Drill could return some rows, due to issue in the query planning.
> For example, consider the following sample data :
> {code}
> select deptno from dfs.`/Users/jni/work/data/json/e.json` ;
> +------------+
> | deptno |
> +------------+
> | 10 |
> | 20 |
> | null |
> +------------+
> 3 rows selected (0.062 seconds)
> 0: jdbc:drill:zk=local> select deptno from dfs.`/Users/jni/work/data/json/d.json` ;
> +------------+
> | deptno |
> +------------+
> | 10 |
> | null |
> +------------+
> 2 rows selected (0.062 seconds)
> {code}
> The following query should return 0 row, but Drill actually return 1 row.
> {code}
> select deptno from dfs.`/Users/jni/work/data/json/e.json` where deptno not in (select deptno from dfs.`/Users/jni/work/data/json/d.json`);
> +------------+
> | deptno |
> +------------+
> | 20 |
> +------------+
> 1 row selected (0.286 seconds)
> {code}
> In comparison, here is the result run on posture.
> {code}
> mydb=# select * from e;
> deptno | ename
> --------+-------
> 10 | Alice
> NULL | Tom
> 20 | Bob
> (3 rows)
> mydb=# delete from d where deptno = 20;
> DELETE 1
> mydb=# select * from d;
> deptno | dname
> --------+-------
> 10 | Sales
> NULL | Eng
> (2 rows)
> mydb=# select deptno from e where deptno not in (select deptno from d);
> deptno
> --------
> (0 rows)
> {code}
> The fix seems to be in OPTIQ-373, which will produce a valid plan. However, such plan contain a cross-join, which is not supported Drill, and hence hit CanNotPlanException in Drill.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)