Posted to issues@impala.apache.org by "Tim Armstrong (Jira)" <ji...@apache.org> on 2020/07/11 17:26:00 UTC
[jira] [Created] (IMPALA-9949) Subqueries in select can result in rows not being returned
Tim Armstrong created IMPALA-9949:
-------------------------------------
Summary: Subqueries in select can result in rows not being returned
Key: IMPALA-9949
URL: https://issues.apache.org/jira/browse/IMPALA-9949
Project: IMPALA
Issue Type: Bug
Components: Frontend
Reporter: Tim Armstrong
Assignee: Tim Armstrong
IMPALA-8954 added support for uncorrelated subqueries in the select list, but some of them do not return correct results. Both of the following queries should return rows containing NULL, because the subquery returns 0 rows. Instead they return no rows at all.
{noformat}
[localhost.EXAMPLE.COM:21000] default> select (select min(int_col) from functional.alltypes having min(int_col) < 0) from functional.alltypestiny;
Fetched 0 row(s) in 0.16s
[localhost.EXAMPLE.COM:21000] default> select (select min(int_col) from functional.alltypes limit 0) from functional.alltypestiny;
Fetched 0 row(s) in 0.14s
{noformat}
The problem is that the CROSS JOIN will return 0 rows if the subquery returns 0 rows.
{noformat}
[localhost.EXAMPLE.COM:21000] default> explain select (select min(int_col) from functional.alltypes having min(int_col) < 0) from functional.alltypestiny;
Query: explain select (select min(int_col) from functional.alltypes having min(int_col) < 0) from functional.alltypestiny
+-------------------------------------------------------------+
| Explain String                                              |
+-------------------------------------------------------------+
| Max Per-Host Resource Reservation: Memory=40.00KB Threads=5 |
| Per-Host Resource Estimates: Memory=180MB                   |
| Codegen disabled by planner                                 |
|                                                             |
| PLAN-ROOT SINK                                              |
| |                                                           |
| 03:NESTED LOOP JOIN [CROSS JOIN, BROADCAST]                 |
| |  row-size=4B cardinality=8                                |
| |                                                           |
| |--06:EXCHANGE [UNPARTITIONED]                              |
| |  |                                                        |
| |  00:SCAN HDFS [functional.alltypestiny]                   |
| |     HDFS partitions=4/4 files=4 size=460B                 |
| |     row-size=0B cardinality=8                             |
| |                                                           |
| 05:AGGREGATE [FINALIZE]                                     |
| |  output: min:merge(int_col)                               |
| |  having: min(int_col) < 0                                 |
| |  row-size=4B cardinality=1                                |
| |                                                           |
| 04:EXCHANGE [UNPARTITIONED]                                 |
| |                                                           |
| 02:AGGREGATE                                                |
| |  output: min(int_col)                                     |
| |  row-size=4B cardinality=1                                |
| |                                                           |
| 01:SCAN HDFS [functional.alltypes]                          |
|    HDFS partitions=24/24 files=24 size=478.45KB             |
|    row-size=4B cardinality=7.30K                            |
+-------------------------------------------------------------+
Fetched 29 row(s) in 0.04s
{noformat}
We need to detect cases where the subquery can return 0 rows and insert a left outer join instead of the cross join.
I did this in a patch and it fixed the issue.
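
The difference between the two join types can be reproduced outside Impala. The following is a minimal sketch using Python's built-in sqlite3 module as a stand-in engine; it is not the Impala patch. The tables outer_t and inner_t and their contents are hypothetical substitutes for functional.alltypestiny and functional.alltypes, and the HAVING clause is emulated with a WHERE over a derived table. It only illustrates the join semantics described above.

```python
import sqlite3

# Hypothetical stand-ins for functional.alltypestiny (outer_t) and
# functional.alltypes (inner_t); names and data are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE outer_t (id INTEGER);
    CREATE TABLE inner_t (int_col INTEGER);
    INSERT INTO outer_t VALUES (1), (2), (3);
    INSERT INTO inner_t VALUES (0), (5), (9);
""")

# A subquery that yields 0 rows: min(int_col) is 0, so the filter m < 0
# rejects the single aggregate row (emulates HAVING min(int_col) < 0).
EMPTY_SUBQUERY = (
    "(SELECT m FROM (SELECT min(int_col) AS m FROM inner_t) WHERE m < 0)"
)

# Reference semantics: a scalar subquery with no rows evaluates to NULL,
# so every outer row is returned.
scalar_rows = conn.execute(
    f"SELECT {EMPTY_SUBQUERY} FROM outer_t"
).fetchall()

# Cross-join rewrite: an empty subquery side eliminates every outer row.
cross_rows = conn.execute(
    f"SELECT s.m FROM outer_t CROSS JOIN {EMPTY_SUBQUERY} AS s"
).fetchall()

# Left-outer-join rewrite: every outer row survives, padded with NULL.
left_rows = conn.execute(
    f"SELECT s.m FROM outer_t LEFT JOIN {EMPTY_SUBQUERY} AS s ON 1 = 1"
).fetchall()

print("scalar subquery:", len(scalar_rows), "row(s)")
print("cross join:     ", len(cross_rows), "row(s)")
print("left outer join:", len(left_rows), "row(s)")
```

In this sketch the left outer join returns the same rows as the scalar subquery, while the cross join returns none, which matches the behavior reported above.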