You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@drill.apache.org by "Vitalii Diravka (JIRA)" <ji...@apache.org> on 2017/11/24 19:45:00 UTC
[jira] [Assigned] (DRILL-5681) Incorrect query result when query
uses star and correlated subquery
[ https://issues.apache.org/jira/browse/DRILL-5681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vitalii Diravka reassigned DRILL-5681:
--------------------------------------
Assignee: Vitalii Diravka (was: Jinfeng Ni)
> Incorrect query result when query uses star and correlated subquery
> -------------------------------------------------------------------
>
> Key: DRILL-5681
> URL: https://issues.apache.org/jira/browse/DRILL-5681
> Project: Apache Drill
> Issue Type: Bug
> Reporter: Jinfeng Ni
> Assignee: Vitalii Diravka
>
> The following repo was based on a testcase provided by Arjun Rajan(arajan@mapr.com).
> Drill returns incorrect query result, when the query has a correlated subquery and querying against a view defined with select *, or querying a subquery with select *.
> Case 1: Querying view with select * + correlated subquery
> {code}
> create view dfs.tmp.nation_view as select * from cp.`tpch/nation.parquet`;
> {code}
> //Q1 : return 25 rows. The correct answer is 0 row.
> {code}
> SELECT n_nationkey, n_name
> FROM dfs.tmp.nation_view a
> WHERE NOT EXISTS (SELECT 1
> FROM cp.`tpch/region.parquet` b
> WHERE b.r_regionkey = a.n_regionkey
> )
> +--------------+-----------------+
> | n_nationkey | n_name |
> +--------------+-----------------+
> | 0 | ALGERIA |
> | 1 | ARGENTINA |
> | 2 | BRAZIL |
> ...
> | 24 | UNITED STATES |
> +--------------+-----------------+
> 25 rows selected (0.614 seconds)
> {code}
> // Q2: return 0 row. The correct answer is 25 rows.
> {code}
> SELECT n_nationkey, n_name
> FROM dfs.tmp.nation_view a
> WHERE EXISTS (SELECT 1
> FROM cp.`tpch/region.parquet` b
> WHERE b.r_regionkey = a.n_regionkey
> )
> +--------------+---------+
> | n_nationkey | n_name |
> +--------------+---------+
> +--------------+---------+
> No rows selected (0.4 seconds)
> {code}
> Case 2: Querying a table expression with select *
> // Q3: return 25 rows. The correct result is 0 row
> {code}
> SELECT n_nationkey, n_name
> FROM (
> SELECT * FROM cp.`tpch/nation.parquet`
> ) a
> WHERE NOT EXISTS (SELECT 1
> FROM cp.`tpch/region.parquet` b
> WHERE b.r_regionkey = a.n_regionkey
> )
> +--------------+-----------------+
> | n_nationkey | n_name |
> +--------------+-----------------+
> | 0 | ALGERIA |
> | 1 | ARGENTINA |
> ...
> | 24 | UNITED STATES |
> +--------------+-----------------+
> 25 rows selected (0.451 seconds)
> {code}
> Q4: return 0 row. The correct result is 25 rows.
> {code}
> SELECT n_nationkey, n_name
> FROM (
> SELECT * FROM cp.`tpch/nation.parquet`
> ) a
> WHERE EXISTS (SELECT 1
> FROM cp.`tpch/region.parquet` b
> WHERE b.r_regionkey = a.n_regionkey
> )
> +--------------+---------+
> | n_nationkey | n_name |
> +--------------+---------+
> +--------------+---------+
> No rows selected (0.515 seconds)
> {code}
> All cases can be reproduced without view usage, using sub-select with star is enough.
> For example:
> {code}
> SELECT n_nationkey, n_name
> FROM (select * from cp.`tpch/nation.parquet`) a
> WHERE NOT EXISTS (SELECT 1
> FROM cp.`tpch/region.parquet` b
> WHERE b.r_regionkey = a.n_regionkey
> )
> {code}
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)