Posted to issues@spark.apache.org by "Wenchen Fan (Jira)" <ji...@apache.org> on 2021/08/12 07:24:00 UTC

[jira] [Resolved] (SPARK-36489) Aggregate functions over no grouping keys, on tables with a single bucket, return multiple rows

     [ https://issues.apache.org/jira/browse/SPARK-36489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wenchen Fan resolved SPARK-36489.
---------------------------------
    Fix Version/s: 3.1.3
                   3.2.0
       Resolution: Fixed

Issue resolved by pull request 33711
[https://github.com/apache/spark/pull/33711]

> Aggregate functions over no grouping keys, on tables with a single bucket, return multiple rows
> -----------------------------------------------------------------------------------------------
>
>                 Key: SPARK-36489
>                 URL: https://issues.apache.org/jira/browse/SPARK-36489
>             Project: Spark
>          Issue Type: Bug
>          Components: Optimizer
>    Affects Versions: 3.1.0, 3.1.1, 3.1.2, 3.2.0, 3.1.3
>            Reporter: Ionut Boicu
>            Priority: Major
>             Fix For: 3.2.0, 3.1.3
>
>
> Running any aggregate function with no grouping keys on a table with a single bucket returns multiple rows.
> This happens because the single-bucket output partitioning satisfies the aggregate's required `AllTuples` distribution, so no `Exchange` is planned; the bucketed scan is then disabled, leaving the data in multiple partitions and producing one aggregate row per partition.
>  
> Reproduction:
>  
> {code:java}
> sql(
>    """
>    |CREATE TABLE t1 (`id` BIGINT, `event_date` DATE)
>    |USING PARQUET
>    |CLUSTERED BY (id)
>    |INTO 1 BUCKETS
>    |""".stripMargin)
> sql(
>    """
>    |INSERT INTO TABLE t1 VALUES(1.23, cast("2021-07-07" as date))
>    |""".stripMargin)
> sql(
>    """
>    |INSERT INTO TABLE t1 VALUES(2.28, cast("2021-08-08" as date))
>    |""".stripMargin)
> assert(sql("select sum(id) from t1 where id is not null").count == 1){code}
>  
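
A possible workaround (not part of the original report; it assumes the extra rows come from the auto bucketed scan optimization disabling the single-bucket scan, as described above) is to switch that optimization off so the table is still read as one bucket and therefore one partition:

{code:java}
// Sketch only: spark.sql.sources.bucketing.autoBucketedScan.enabled controls the rule
// that may disable an "unnecessary" bucketed scan. Keeping the bucketed scan preserves
// the single partition, so the aggregate with no grouping keys returns a single row.
spark.conf.set("spark.sql.sources.bucketing.autoBucketedScan.enabled", "false")
assert(sql("select sum(id) from t1 where id is not null").count() == 1)
{code}

The proper fix is the planner change in pull request 33711; the configuration flag above only sidesteps the issue on affected releases.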



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org