Posted to issues@spark.apache.org by "Marcelo Vanzin (Jira)" <ji...@apache.org> on 2019/08/26 17:41:00 UTC

[jira] [Resolved] (SPARK-28607) Don't hold a reference to two partitionLengths arrays

     [ https://issues.apache.org/jira/browse/SPARK-28607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcelo Vanzin resolved SPARK-28607.
------------------------------------
    Fix Version/s: 3.0.0
       Resolution: Fixed

Issue resolved by pull request 25341
[https://github.com/apache/spark/pull/25341]

> Don't hold a reference to two partitionLengths arrays
> -----------------------------------------------------
>
>                 Key: SPARK-28607
>                 URL: https://issues.apache.org/jira/browse/SPARK-28607
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Shuffle
>    Affects Versions: 3.0.0
>            Reporter: Matt Cheah
>            Assignee: Matt Cheah
>            Priority: Major
>             Fix For: 3.0.0
>
>
> SPARK-28209 introduced the new shuffle writer API and its usage in BypassMergeSortShuffleWriter. However, the API's design forces the partition lengths to be tracked twice: once in the plugin implementation and again in the higher-level writer, so two partitionLengths arrays are held in memory. We should track the partition lengths only in the plugin implementation and propagate them back up to the writer as the return value of {{commitAllPartitions}}.
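
The proposal in the description can be sketched roughly as follows. This is a minimal illustration, not Spark's actual code: the names MapOutputSketch, InMemoryMapOutput, PartitionLengthsSketch, writePartition and writeAll are hypothetical stand-ins, and only commitAllPartitions is taken from the issue text. The point is that the plugin side owns the single lengths array and the higher-level writer receives it from the commit call instead of keeping its own copy.

```java
import java.util.Arrays;

// Hypothetical, simplified stand-in for the plugin-side map output writer.
// It is NOT Spark's real interface; only the method name commitAllPartitions
// comes from the issue description.
interface MapOutputSketch {
    void writePartition(int partitionId, byte[] data);

    // Commits the output and returns the per-partition byte counts
    // that the plugin tracked while writing.
    long[] commitAllPartitions();
}

// Assumed toy implementation: it only counts bytes per partition.
final class InMemoryMapOutput implements MapOutputSketch {
    // The single partitionLengths array, owned by the plugin side.
    private final long[] partitionLengths;

    InMemoryMapOutput(int numPartitions) {
        this.partitionLengths = new long[numPartitions];
    }

    @Override
    public void writePartition(int partitionId, byte[] data) {
        partitionLengths[partitionId] += data.length;
    }

    @Override
    public long[] commitAllPartitions() {
        return partitionLengths;
    }
}

// Hypothetical higher-level writer: it keeps no lengths array of its own
// and relies on the value returned by commitAllPartitions().
final class PartitionLengthsSketch {
    static long[] writeAll(MapOutputSketch output, byte[][] partitions) {
        for (int i = 0; i < partitions.length; i++) {
            output.writePartition(i, partitions[i]);
        }
        return output.commitAllPartitions();
    }

    public static void main(String[] args) {
        byte[][] partitions = { {1, 2, 3}, {}, {4, 5} };
        long[] lengths = writeAll(new InMemoryMapOutput(partitions.length), partitions);
        System.out.println(Arrays.toString(lengths)); // prints [3, 0, 2]
    }
}
```

In this sketch the writer would pass the returned array on to whatever needs the lengths (for example, a map status), so no second array is ever allocated on the writer side.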


