Posted to issues@spark.apache.org by "yucai (JIRA)" <ji...@apache.org> on 2017/11/16 08:31:00 UTC

[jira] [Updated] (SPARK-22540) HighlyCompressedMapStatus's avgSize is incorrect

     [ https://issues.apache.org/jira/browse/SPARK-22540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

yucai updated SPARK-22540:
--------------------------
    Description: 
The calculation of HighlyCompressedMapStatus's avgSize is incorrect.
Currently it is computed as "sum of small blocks' sizes / count of all non-empty blocks", but the count of all non-empty blocks includes the huge blocks as well; the divisor should be the count of small blocks only.
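A minimal Scala sketch of the arithmetic described above (not the actual Spark code; the threshold and block sizes are made-up values for illustration). Dividing the small-block sum by the count of all non-empty blocks understates the average whenever huge blocks are present:

```scala
object AvgSizeDemo {
  // Hypothetical cutoff above which a block is tracked individually ("huge")
  // rather than folded into the average.
  val hugeBlockThreshold = 100L

  // Buggy formula: sum of small blocks / count of ALL non-empty blocks.
  def buggyAvgSize(blockSizes: Array[Long]): Long = {
    val nonEmpty = blockSizes.filter(_ > 0)
    val small = nonEmpty.filter(_ < hugeBlockThreshold)
    small.sum / nonEmpty.length
  }

  // Fixed formula: sum of small blocks / count of small blocks only.
  def fixedAvgSize(blockSizes: Array[Long]): Long = {
    val small = blockSizes.filter(s => s > 0 && s < hugeBlockThreshold)
    small.sum / small.length
  }

  def main(args: Array[String]): Unit = {
    // Three small blocks (sum 60), one empty block, two huge blocks.
    val sizes = Array(10L, 20L, 30L, 0L, 5000L, 8000L)
    println(s"buggy=${buggyAvgSize(sizes)} fixed=${fixedAvgSize(sizes)}")
    // buggy = 60 / 5 = 12, fixed = 60 / 3 = 20
  }
}
```

With two huge blocks in the map output, the buggy divisor of 5 drags the average of the three small blocks down from 20 to 12, so small-block fetch sizes are underestimated.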

  was:
The calculation of HighlyCompressedMapStatus's avgSize is incorrect.
Currently it is computed as "sum of small blocks' sizes / count of all non-empty blocks", but the non-empty blocks include huge blocks as well; actually we need the count of small blocks only.


> HighlyCompressedMapStatus's avgSize is incorrect
> ------------------------------------------------
>
>                 Key: SPARK-22540
>                 URL: https://issues.apache.org/jira/browse/SPARK-22540
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.3.0
>            Reporter: yucai
>
> The calculation of HighlyCompressedMapStatus's avgSize is incorrect.
> Currently it is computed as "sum of small blocks' sizes / count of all non-empty blocks", but the count of all non-empty blocks includes the huge blocks as well; the divisor should be the count of small blocks only.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org