You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues-all@impala.apache.org by "Yuchen Fan (Jira)" <ji...@apache.org> on 2022/11/15 12:54:00 UTC
[jira] [Created] (IMPALA-11725) Query result incorrect when querying and filtering NULL values of sub-query
Yuchen Fan created IMPALA-11725:
-----------------------------------
Summary: Query result incorrect when querying and filtering NULL values of sub-query
Key: IMPALA-11725
URL: https://issues.apache.org/jira/browse/IMPALA-11725
Project: IMPALA
Issue Type: Bug
Components: Distributed Exec, Frontend
Affects Versions: Impala 4.1.0
Reporter: Yuchen Fan
We found that Impala can't filter NULL values from sub-query. For example, prepare two test tables err_tbl1 and err_tbl2(id INT, dt STRING):
+----+------------+
| id | dt |
+----+------------+
| 14 | 2022-11-13 |
| 15 | 2022-11-13 |
| 13 | 2022-11-13 |
+----+------------+
+----+------------+
| id | dt |
+----+------------+
| 14 | 2022-11-13 |
| 16 | 2022-11-13 |
| 13 | 2022-11-13 |
+----+------------+
And submit query below:
{code:java}
SELECT *
FROM (
SELECT aid, bid, COUNT(*) AS c
FROM (
SELECT id AS aid
FROM err_tbl1
WHERE dt = '2022-11-13'
) a
FULL JOIN (
SELECT id AS bid
FROM err_tbl2
WHERE dt = '2022-11-13'
) b
ON a.aid = b.bid
GROUP BY aid, bid
) t1
WHERE aid = bid;{code}
Out result includes 4 rows:
+------+------+---+
| aid | bid | c |
+------+------+---+
| 13 | 13 | 1 |
| 14 | 14 | 1 |
| NULL | 15 | 1 |
| 16 | NULL | 1 |
+------+------+—+
Obviously, condition of 'aid=bid' is invalid. 'NULL' value should be filtered out. But if we use condition of 'aid!=bid', there will be empty result set, which means '!=' can filter out 'NULL' value while '=' cannot. What's more, if we create a table as select * from sub-query and execute 'SELECT * FROM sub_table WHERE aid=bid'. The result is correct. If we surround 'aid=bid' with 'trim()' like 'trim(cast(aid as string))=trim(cast(bid as string))', result is also correct. In Spark, result of the same query doesn't contain 'NULL' values.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscribe@impala.apache.org
For additional commands, e-mail: issues-all-help@impala.apache.org