You are viewing a plain text version of this content. The canonical link for it is here.

Posted to issues@spark.apache.org by "Wenchen Fan (Jira)" <ji...@apache.org> on 2021/04/23 05:25:00 UTC

[jira] [Resolved] (SPARK-35141) Support two level map for final hash aggregation

     [ https://issues.apache.org/jira/browse/SPARK-35141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wenchen Fan resolved SPARK-35141.
---------------------------------
    Fix Version/s: 3.2.0
       Resolution: Fixed

Issue resolved by pull request 32242
[https://github.com/apache/spark/pull/32242]

> Support two level map for final hash aggregation
> ------------------------------------------------
>
>                 Key: SPARK-35141
>                 URL: https://issues.apache.org/jira/browse/SPARK-35141
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.2.0
>            Reporter: Cheng Su
>            Assignee: Cheng Su
>            Priority: Minor
>             Fix For: 3.2.0
>
>
> For partial hash aggregation (code-gen path), we have two level of hash map for aggregation. First level is from `RowBasedHashMapGenerator`, which is computation faster compared to the second level from `UnsafeFixedWidthAggregationMap`. The introducing of two level hash map can help improve CPU performance of query as the first level hash map normally fits in hardware cache and has cheaper hash function for key lookup.
> For final hash aggregation, we can also support two level of hash map, to improve query performance further.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org