Posted to issues@spark.apache.org by "Saisai Shao (JIRA)" <ji...@apache.org> on 2017/06/01 13:53:04 UTC
[jira] [Commented] (SPARK-20943) Correct BypassMergeSortShuffleWriter's comment
[ https://issues.apache.org/jira/browse/SPARK-20943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16033016#comment-16033016 ]
Saisai Shao commented on SPARK-20943:
-------------------------------------
I don't think the existing comment is incorrect. The shuffle write path has no concept of map-side combine; that concept is generalized into aggregator and ordering, which means map-side combine is just one case of having an aggregator.
> Correct BypassMergeSortShuffleWriter's comment
> ----------------------------------------------
>
> Key: SPARK-20943
> URL: https://issues.apache.org/jira/browse/SPARK-20943
> Project: Spark
> Issue Type: Improvement
> Components: Documentation, Shuffle
> Affects Versions: 2.1.1
> Reporter: CanBin Zheng
> Priority: Trivial
> Labels: starter
>
> BypassMergeSortShuffleWriter.java contains comments about when this write path is selected. The three required conditions are described as follows:
> 1. no Ordering is specified, and
> 2. no Aggregator is specified, and
> 3. the number of partitions is less than spark.shuffle.sort.bypassMergeThreshold
> Obviously, the conditions as written are partially wrong and misleading; the right conditions should be:
> 1. map-side combine is false, and
> 2. the number of partitions is less than spark.shuffle.sort.bypassMergeThreshold
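
To make the disagreement concrete, the two condition sets quoted above can be written as predicates. This is only a simplified sketch, not Spark's actual selection code: the class name, method names, and boolean parameters are hypothetical, the threshold value is illustrative, and the strict "less than" comparison simply follows the wording in the issue description.

```java
// Simplified model of the two condition sets discussed in this issue.
// All names are hypothetical; this is not Spark's actual code.
final class BypassConditions {

    /** Conditions as stated in the existing code comment. */
    static boolean asDocumented(boolean hasOrdering, boolean hasAggregator,
                                int numPartitions, int bypassMergeThreshold) {
        return !hasOrdering && !hasAggregator
                && numPartitions < bypassMergeThreshold;
    }

    /** Conditions as proposed by the reporter. */
    static boolean asProposed(boolean mapSideCombine,
                              int numPartitions, int bypassMergeThreshold) {
        return !mapSideCombine && numPartitions < bypassMergeThreshold;
    }

    public static void main(String[] args) {
        // Illustrative value for spark.shuffle.sort.bypassMergeThreshold.
        int threshold = 200;

        // An aggregator is defined, but map-side combine is off:
        // this is the case where the two descriptions differ.
        System.out.println(asDocumented(false, true, 10, threshold)); // prints "false"
        System.out.println(asProposed(false, 10, threshold));         // prints "true"
    }
}
```

Under this model the two descriptions only disagree when an aggregator or ordering is present without map-side combine, which is the case the comment above describes as "just one case" of having an aggregator.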