Posted to issues@spark.apache.org by "Apache Spark (JIRA)" <ji...@apache.org> on 2015/08/07 11:08:45 UTC

[jira] [Commented] (SPARK-9702) Repartition operator should use Exchange to perform its shuffle

    [ https://issues.apache.org/jira/browse/SPARK-9702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14661532#comment-14661532 ] 

Apache Spark commented on SPARK-9702:
-------------------------------------

User 'viirya' has created a pull request for this issue:
https://github.com/apache/spark/pull/8030

> Repartition operator should use Exchange to perform its shuffle
> ---------------------------------------------------------------
>
>                 Key: SPARK-9702
>                 URL: https://issues.apache.org/jira/browse/SPARK-9702
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>            Reporter: Josh Rosen
>
> Spark SQL's {{Repartition}} operator is implemented in terms of Spark Core's repartition operator, which forces it to perform unnecessary row copying and inefficient row serialization. It would be better to implement it using some of Exchange's internals, so that it can avoid row format conversions and generic getters and hashcodes.
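
The row copying mentioned in the description can be pictured with a toy model. This is plain Python, not Spark code, and every name in it is hypothetical. It assumes (the issue does not say so) that the copies are needed because an upstream operator reuses one mutable row object while the shuffle buffers the rows it receives.

```python
# Illustrative only: a toy model in plain Python, not Spark code.
# All function names here are hypothetical.

def reusing_scan(values):
    """Yield the same mutable row object each time, overwriting it in place."""
    row = [None]
    for v in values:
        row[0] = v
        yield row

def round_robin(rows, num_partitions, copy_rows):
    """Buffer rows into partitions round-robin, optionally copying each row."""
    partitions = [[] for _ in range(num_partitions)]
    for i, row in enumerate(rows):
        # Without a copy, every buffered entry aliases the one reused object.
        partitions[i % num_partitions].append(list(row) if copy_rows else row)
    return partitions

without_copy = round_robin(reusing_scan([1, 2, 3, 4]), 2, copy_rows=False)
with_copy = round_robin(reusing_scan([1, 2, 3, 4]), 2, copy_rows=True)

print(without_copy)  # every entry shows the last value written
print(with_copy)     # each entry keeps its own value
```

In this sketch the copy is what keeps buffered rows correct, but it costs one allocation per row. A shuffle path that knows the row format could, in principle, decide when that copy is actually required.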



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)