Posted to issues@spark.apache.org by "Josh Rosen (JIRA)" <ji...@apache.org> on 2015/06/12 03:21:01 UTC

[jira] [Updated] (SPARK-8319) Update logic related to key ordering in shuffle dependencies

     [ https://issues.apache.org/jira/browse/SPARK-8319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Josh Rosen updated SPARK-8319:
------------------------------
    Summary: Update logic related to key ordering in shuffle dependencies  (was: Update several pieces of shuffle logic related to key orderings)

> Update logic related to key ordering in shuffle dependencies
> ------------------------------------------------------------
>
>                 Key: SPARK-8319
>                 URL: https://issues.apache.org/jira/browse/SPARK-8319
>             Project: Spark
>          Issue Type: Improvement
>          Components: Shuffle, SQL
>            Reporter: Josh Rosen
>            Assignee: Josh Rosen
>
> The Tungsten ShuffleManager falls back to the regular SortShuffleManager whenever the shuffle dependency specifies a key ordering, but technically we only need to fall back when an aggregator is also specified. We should update the fallback logic to handle this case so that the Tungsten optimizations can apply to more workloads.
> I also noticed that the SQL Exchange operator performs defensive copying of shuffle inputs when a key ordering is specified. This copying is unnecessary: the only shuffle manager that sorts on the map side is SortShuffleManager, and it sorts only if an aggregator is specified. Since SQL never uses Spark's shuffle to perform aggregation, that condition never holds for SQL shuffles.
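
The two conditions described above can be written out as small predicates. This is a minimal sketch for illustration only, not Spark's actual code: ShuffleDep and every function name below are hypothetical, and the model covers only the key-ordering and aggregator parts of the decision. Any other reasons a shuffle manager might fall back, or an operator might copy its inputs, are deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShuffleDep:
    """Hypothetical stand-in for a shuffle dependency (not a Spark class)."""
    has_key_ordering: bool = False
    has_aggregator: bool = False


def ordering_forces_fallback_current(dep: ShuffleDep) -> bool:
    # Behavior described in the issue: any key ordering triggers the fallback.
    return dep.has_key_ordering


def ordering_forces_fallback_proposed(dep: ShuffleDep) -> bool:
    # Proposed: a key ordering matters only when an aggregator is also set.
    return dep.has_key_ordering and dep.has_aggregator


def map_side_sort_happens(dep: ShuffleDep) -> bool:
    # Per the issue, map-side sorting requires both an ordering and an aggregator.
    return dep.has_key_ordering and dep.has_aggregator


def copy_needed_current(dep: ShuffleDep) -> bool:
    # Behavior described in the issue: copy whenever a key ordering is specified.
    return dep.has_key_ordering


def copy_needed_proposed(dep: ShuffleDep) -> bool:
    # Proposed: copy only if records would actually be sorted on the map side.
    return map_side_sort_happens(dep)


if __name__ == "__main__":
    # A SQL-style shuffle: ordered keys, but never an aggregator.
    sql_dep = ShuffleDep(has_key_ordering=True, has_aggregator=False)
    print("fallback (current): ", ordering_forces_fallback_current(sql_dep))
    print("fallback (proposed):", ordering_forces_fallback_proposed(sql_dep))
    print("copy (current):     ", copy_needed_current(sql_dep))
    print("copy (proposed):    ", copy_needed_proposed(sql_dep))
```

In this model, the current and proposed predicates disagree only for a dependency that has a key ordering and no aggregator, which is the case the issue argues is handled too conservatively.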



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
