Posted to issues@spark.apache.org by "Takeshi Yamamuro (Jira)" <ji...@apache.org> on 2021/03/31 22:45:00 UTC

[jira] [Resolved] (SPARK-34882) RewriteDistinctAggregates can cause a bug if the aggregator does not ignore NULLs

     [ https://issues.apache.org/jira/browse/SPARK-34882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Takeshi Yamamuro resolved SPARK-34882.
--------------------------------------
    Fix Version/s: 3.2.0
         Assignee: Tanel Kiis
       Resolution: Fixed

Resolved by https://github.com/apache/spark/pull/31983

> RewriteDistinctAggregates can cause a bug if the aggregator does not ignore NULLs
> ---------------------------------------------------------------------------------
>
>                 Key: SPARK-34882
>                 URL: https://issues.apache.org/jira/browse/SPARK-34882
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.4.8, 3.2.0, 3.1.2, 3.0.3
>            Reporter: Tanel Kiis
>            Assignee: Tanel Kiis
>            Priority: Major
>              Labels: correctness
>             Fix For: 3.2.0
>
>
> {code:title=group-by.sql}
> SELECT
>     first(DISTINCT a), last(DISTINCT a),
>     first(a), last(a),
>     first(DISTINCT b), last(DISTINCT b),
>     first(b), last(b)
> FROM testData WHERE a IS NOT NULL AND b IS NOT NULL;{code}
> {code:title=group-by.sql.out}
> -- !query schema
> struct<first(DISTINCT a):int,last(DISTINCT a):int,first(a):int,last(a):int,first(DISTINCT b):int,last(DISTINCT b):int,first(b):int,last(b):int>
> -- !query output
> NULL	1	1	3	1	NULL	1	2
> {code}
> Neither first(DISTINCT a) nor last(DISTINCT b) should be NULL here, because the WHERE clause filters out every row where a or b is NULL.
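
For readers unfamiliar with how a distinct-aggregate rewrite can surface NULLs that are not in the input, the following is a toy model in plain Python. It is only a sketch under assumptions: it supposes the rewrite expands each input row into one copy per distinct group, pads the columns of the other groups with NULL, and tags each copy with a group id. The names `expand`, `first`, `last`, `gid`, and the sample rows are hypothetical and are not taken from Spark's source or from the test data in this issue.

```python
# Toy model of a distinct-aggregate rewrite; NOT Spark's actual implementation.
# Hypothetical (a, b) input rows; none of them contains a NULL (None).
rows = [(1, 1), (1, 2), (2, 1), (3, 2)]


def expand(rows):
    """Emit one copy of each row per distinct group, padding the other column with None."""
    out = []
    for a, b in rows:
        out.append({"gid": 1, "a": a, "b": None})  # copy used for DISTINCT a
        out.append({"gid": 2, "a": None, "b": b})  # copy used for DISTINCT b
    return out


def first(values):
    """first() that does not ignore NULLs: returns the first value, even if it is None."""
    values = list(values)
    return values[0] if values else None


def last(values):
    """last() that does not ignore NULLs: returns the last value, even if it is None."""
    values = list(values)
    return values[-1] if values else None


expanded = expand(rows)

# Unguarded aggregation sees the padding rows of the other group.
naive_first_a = first(r["a"] for r in expanded)
naive_last_a = last(r["a"] for r in expanded)
naive_first_b = first(r["b"] for r in expanded)

# Guarded aggregation restricts each aggregate to the rows of its own group.
guarded_last_a = last(r["a"] for r in expanded if r["gid"] == 1)
guarded_first_b = first(r["b"] for r in expanded if r["gid"] == 2)

print(naive_first_a, naive_last_a, naive_first_b)
print(guarded_last_a, guarded_first_b)
```

In this model, an aggregator that ignores NULLs (such as count or sum) would skip the padding rows on its own, which is why the problem only shows up for aggregators like first and last that do not. Which result ends up as NULL depends on row order, so the toy output is not meant to match the positions in the query output above.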


