You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@spark.apache.org by "Andy Lam (Jira)" <ji...@apache.org> on 2023/10/12 00:42:00 UTC
[jira] [Created] (SPARK-45507) Correctness bug in correlated scalar subqueries with COUNT aggregates
Andy Lam created SPARK-45507:
--------------------------------
Summary: Correctness bug in correlated scalar subqueries with COUNT aggregates
Key: SPARK-45507
URL: https://issues.apache.org/jira/browse/SPARK-45507
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 4.0.0
Reporter: Andy Lam
{code:java}
create view if not exists t1(a1, a2) as values (0, 1), (1, 2);
create view if not exists t2(b1, b2) as values (0, 2), (0, 3);
create view if not exists t3(c1, c2) as values (0, 2), (0, 3);
-- Example 1 (has having clause)
select (
select SUM(l.cnt + r.cnt)
from (select count(*) cnt from t2 where t1.a1 = t2.b1 having cnt = 0) l
join (select count(*) cnt from t3 where t1.a1 = t3.c1 having cnt = 0) r
on l.cnt = r.cnt
) from t1
-- Correct answer: (null, 0)
+----------------------+
|scalarsubquery(c1, c1)|
+----------------------+
|null |
|null |
+----------------------+
-- Example 2
select ( select sum(cnt) from (select count(*) cnt from t2 where t1.c1 = t2.c1) ) from t1
-- Correct answer: (2, 0)
+------------------+
|scalarsubquery(c1)|
+------------------+
|2 |
|null |
+------------------+
-- Example 3
select ( select count(*) from (select count(*) cnt from t2 where t1.c1 = t2.c1) ) from t1
-- Correct answer: (1, 1)
+------------------+
|scalarsubquery(c1)|
+------------------+
|1 |
|0 |
+------------------+ {code}
DB fiddle for correctness check:[https://www.db-fiddle.com/f/4jyoMCicNSZpjMt4jFYoz5/10403#]
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org