Posted to issues@spark.apache.org by "Wenchen Fan (JIRA)" <ji...@apache.org> on 2017/05/22 14:13:04 UTC

[jira] [Resolved] (SPARK-20801) Store accurate size of blocks in MapStatus when it's above threshold.

     [ https://issues.apache.org/jira/browse/SPARK-20801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wenchen Fan resolved SPARK-20801.
---------------------------------
       Resolution: Fixed
    Fix Version/s: 2.2.0

Issue resolved by pull request 18031
[https://github.com/apache/spark/pull/18031]

> Store accurate size of blocks in MapStatus when it's above threshold.
> ---------------------------------------------------------------------
>
>                 Key: SPARK-20801
>                 URL: https://issues.apache.org/jira/browse/SPARK-20801
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Spark Core
>    Affects Versions: 2.1.1
>            Reporter: jin xing
>             Fix For: 2.2.0
>
>
> Currently, when the number of reducers is above 2000, HighlyCompressedMapStatus is used to store the sizes of shuffle blocks. HighlyCompressedMapStatus records only the average size of the non-empty blocks, which is bad for memory control when fetching shuffle blocks: a very large block is reported as merely average-sized, so a reducer may try to pull it into memory and run out. It makes sense to store the accurate size of a block when it is above a threshold.
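The gist of the fix can be sketched as follows. This is a simplified Scala illustration of the idea, not the actual patch from pull request 18031; the threshold value and all class, field, and method names below are hypothetical. Blocks above the threshold keep their accurate sizes in a small map, everything else falls back to the average.

    import scala.collection.mutable

    object MapStatusSketch {
      // Hypothetical threshold; in Spark the real value is configurable.
      val accurateSizeThreshold: Long = 100L * 1024 * 1024 // assume 100 MB

      case class CompressedStatus(
          avgSize: Long,                  // average size of the "normal" non-empty blocks
          hugeBlockSizes: Map[Int, Long], // accurate sizes for blocks above the threshold
          emptyBlocks: Set[Int])          // indices of empty blocks

      def compress(blockSizes: Array[Long]): CompressedStatus = {
        val huge = mutable.Map.empty[Int, Long]
        val empty = mutable.Set.empty[Int]
        var totalSmall = 0L
        var numSmall = 0
        blockSizes.zipWithIndex.foreach { case (size, i) =>
          if (size == 0) empty += i
          else if (size >= accurateSizeThreshold) huge += i -> size
          else { totalSmall += size; numSmall += 1 }
        }
        val avg = if (numSmall > 0) totalSmall / numSmall else 0L
        CompressedStatus(avg, huge.toMap, empty.toSet)
      }

      // Lookup: accurate for huge blocks, zero for empty ones, average otherwise.
      def sizeOf(status: CompressedStatus, blockId: Int): Long =
        if (status.emptyBlocks(blockId)) 0L
        else status.hugeBlockSizes.getOrElse(blockId, status.avgSize)
    }

The point of this shape is that the status stays compact and lookups stay O(1): only the outlier blocks cost extra space, while the common case is still covered by a single average, so reducers get realistic size estimates for exactly the blocks that threaten their memory budget.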



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org