You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@spark.apache.org by "Wenchen Fan (Jira)" <ji...@apache.org> on 2023/10/19 02:37:00 UTC

[jira] [Resolved] (SPARK-45507) Correctness bug in correlated scalar subqueries with COUNT aggregates

     [ https://issues.apache.org/jira/browse/SPARK-45507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wenchen Fan resolved SPARK-45507.
---------------------------------
    Fix Version/s: 4.0.0
       Resolution: Fixed

Issue resolved by pull request 43341
[https://github.com/apache/spark/pull/43341]

> Correctness bug in correlated scalar subqueries with COUNT aggregates
> ---------------------------------------------------------------------
>
>                 Key: SPARK-45507
>                 URL: https://issues.apache.org/jira/browse/SPARK-45507
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 4.0.0
>            Reporter: Andy Lam
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
>
> {code:java}
>  
> create view if not exists t1(a1, a2) as values (0, 1), (1, 2);
> create view if not exists t2(b1, b2) as values (0, 2), (0, 3);
> create view if not exists t3(c1, c2) as values (0, 2), (0, 3);
> -- Example 1
> select (
>   select SUM(l.cnt + r.cnt)
>   from (select count(*) cnt from t2 where t1.a1 = t2.b1 having cnt = 0) l
>   join (select count(*) cnt from t3 where t1.a1 = t3.c1 having cnt = 0) r
>   on l.cnt = r.cnt
> ) from t1
> -- Correct answer: (null, 0)
> +----------------------+
> |scalarsubquery(c1, c1)|
> +----------------------+
> |null                  |
> |null                  |
> +----------------------+
> -- Example 2
> select ( select sum(cnt) from (select count(*) cnt from t2 where t1.c1 = t2.c1) ) from t1
> -- Correct answer: (2, 0)
> +------------------+
> |scalarsubquery(c1)|
> +------------------+
> |2                 |
> |null              |
> +------------------+
> -- Example 3
> select ( select count(*) from (select count(*) cnt from t2 where t1.c1 = t2.c1) ) from t1
> -- Correct answer: (1, 1)
> +------------------+
> |scalarsubquery(c1)|
> +------------------+
> |1                 |
> |0                 |
> +------------------+ {code}
>  
>  
> DB fiddle for correctness check:[https://www.db-fiddle.com/f/4jyoMCicNSZpjMt4jFYoz5/10403#]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org