Posted to issues@spark.apache.org by "Apache Spark (JIRA)" <ji...@apache.org> on 2015/05/29 08:19:17 UTC

[jira] [Commented] (SPARK-7936) Add configuration for initial size of hash for aggregation and limit

    [ https://issues.apache.org/jira/browse/SPARK-7936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14564270#comment-14564270 ] 

Apache Spark commented on SPARK-7936:
-------------------------------------

User 'navis' has created a pull request for this issue:
https://github.com/apache/spark/pull/6488

> Add configuration for initial size of hash for aggregation and limit
> --------------------------------------------------------------------
>
>                 Key: SPARK-7936
>                 URL: https://issues.apache.org/jira/browse/SPARK-7936
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>            Reporter: Navis
>            Priority: Minor
>
> Partial aggregation uses a lot of memory and often cannot complete unless the input is sliced into a large number of very small partitions. This patch adds a configurable limit on the number of entries held for partial aggregation. A configurable initial size for the hash is just a bonus.
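
For readers unfamiliar with the idea, here is a minimal sketch of what an entry limit on partial aggregation could look like in general. It is not the code from the pull request and not Spark's implementation. The names `partial_sum`, `final_sum`, and `max_entries` are hypothetical, and the initial hash size is not modeled because Python dicts do not expose a capacity hint.

```python
from collections import defaultdict


def partial_sum(rows, max_entries):
    """Yield (key, partial_sum) pairs while holding at most max_entries keys.

    Hypothetical illustration: when a new key would exceed the limit, the
    current partial results are emitted downstream and the buffer is reset.
    """
    buffer = {}
    for key, value in rows:
        if key not in buffer and len(buffer) >= max_entries:
            # Limit reached: flush the partial results and start a new buffer.
            yield from buffer.items()
            buffer = {}
        buffer[key] = buffer.get(key, 0) + value
    # Emit whatever is left once the input is exhausted.
    yield from buffer.items()


def final_sum(partials):
    """Merge partial results; the same key may appear more than once."""
    totals = defaultdict(int)
    for key, value in partials:
        totals[key] += value
    return dict(totals)


rows = [("a", 1), ("b", 2), ("c", 3), ("a", 4), ("b", 5)]
partials = list(partial_sum(rows, max_entries=2))
totals = final_sum(partials)
```

The trade-off in this sketch is that a bounded buffer caps memory per task but may emit the same key several times, so the final aggregation has more rows to merge.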



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org