You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@spark.apache.org by "Matt Cheah (JIRA)" <ji...@apache.org> on 2019/08/02 23:45:00 UTC

[jira] [Updated] (SPARK-28607) Don't hold a reference to two partitionLengths arrays

     [ https://issues.apache.org/jira/browse/SPARK-28607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matt Cheah updated SPARK-28607:
-------------------------------
    Description: SPARK-28209 introduced the new shuffle writer API and its usage in BypassMergeSortShuffleWriter. However, the design of the API forces the partition lengths to be tracked both in the implementation of the plugin and also by the higher-level writer. This leads to redundant memory usage. We should only track the lengths of the partitions in the implementation of the plugin and propagate this information back up to the writer as the return value of {{commitAllPartitions}}.

> Don't hold a reference to two partitionLengths arrays
> -----------------------------------------------------
>
>                 Key: SPARK-28607
>                 URL: https://issues.apache.org/jira/browse/SPARK-28607
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Shuffle
>    Affects Versions: 3.0.0
>            Reporter: Matt Cheah
>            Priority: Major
>
> SPARK-28209 introduced the new shuffle writer API and its usage in BypassMergeSortShuffleWriter. However, the design of the API forces the partition lengths to be tracked both in the implementation of the plugin and also by the higher-level writer. This leads to redundant memory usage. We should only track the lengths of the partitions in the implementation of the plugin and propagate this information back up to the writer as the return value of {{commitAllPartitions}}.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org