Posted to issues@spark.apache.org by "Cheng Su (Jira)" <ji...@apache.org> on 2021/04/19 23:30:00 UTC

[jira] [Created] (SPARK-35141) Support two level map for final hash aggregation

Cheng Su created SPARK-35141:
--------------------------------

             Summary: Support two level map for final hash aggregation
                 Key: SPARK-35141
                 URL: https://issues.apache.org/jira/browse/SPARK-35141
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 3.2.0
            Reporter: Cheng Su


For partial hash aggregation (code-gen path), we have two levels of hash map for aggregation. The first level comes from `RowBasedHashMapGenerator` and is computationally faster than the second level, `UnsafeFixedWidthAggregationMap`. Introducing the two-level hash map helps improve the CPU performance of a query, as the first-level hash map normally fits in the hardware cache and uses a cheaper hash function for key lookup.
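
To illustrate the general idea only, here is a minimal sketch of a two-level aggregation map in plain Java. It is not Spark's implementation: the class `TwoLevelAggMap`, the constant `MAX_PROBES`, the int-key/long-sum layout, and the mask-based hash are all hypothetical simplifications chosen for this example. The first level is a small fixed-size array with a cheap hash and bounded probing; any key that cannot be placed there falls back to a general-purpose second-level map.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a two-level hash map for SUM aggregation.
// Not Spark code; names and layout are assumptions for illustration.
public class TwoLevelAggMap {
    // Bounded probing keeps first-level lookups cheap and predictable.
    private static final int MAX_PROBES = 2;

    // First level: small fixed-size arrays, intended to stay cache-resident.
    private final int[] keys;
    private final long[] sums;
    private final boolean[] used;
    private final int mask;

    // Second level: general-purpose map used when the first level has no room.
    private final Map<Integer, Long> fallback = new HashMap<>();

    public TwoLevelAggMap(int capacityPowerOfTwo) {
        if (capacityPowerOfTwo <= 0 || Integer.bitCount(capacityPowerOfTwo) != 1) {
            throw new IllegalArgumentException("capacity must be a positive power of two");
        }
        keys = new int[capacityPowerOfTwo];
        sums = new long[capacityPowerOfTwo];
        used = new boolean[capacityPowerOfTwo];
        mask = capacityPowerOfTwo - 1;
    }

    public void add(int key, long value) {
        int slot = key & mask; // cheap "hash": low bits of the key
        for (int probe = 0; probe < MAX_PROBES; probe++) {
            int i = (slot + probe) & mask;
            if (!used[i]) {
                // Empty slot: claim it for this key.
                used[i] = true;
                keys[i] = key;
                sums[i] = value;
                return;
            }
            if (keys[i] == key) {
                // Key already lives in the first level: update in place.
                sums[i] += value;
                return;
            }
        }
        // First level could not place the key: fall back to the second level.
        // Slots are never freed, so this key will always land here again.
        fallback.merge(key, value, Long::sum);
    }

    // Combine both levels into the final aggregation result.
    public Map<Integer, Long> result() {
        Map<Integer, Long> out = new HashMap<>(fallback);
        for (int i = 0; i < keys.length; i++) {
            if (used[i]) {
                out.put(keys[i], sums[i]);
            }
        }
        return out;
    }

    // Number of distinct keys that spilled to the second level.
    public int fallbackSize() {
        return fallback.size();
    }
}
```

With a capacity of 2, adding keys 0 and 1 fills the first level, so a third distinct key such as 2 is aggregated in the fallback map, and `result()` returns the sums from both levels together.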

For final hash aggregation, we can also support a two-level hash map to improve query performance further.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
