Posted to issues@spark.apache.org by "Davies Liu (JIRA)" <ji...@apache.org> on 2014/08/15 19:25:19 UTC

[jira] [Commented] (SPARK-3073) improve large sort (external sort) for PySpark

    [ https://issues.apache.org/jira/browse/SPARK-3073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14098785#comment-14098785 ] 

Davies Liu commented on SPARK-3073:
-----------------------------------

This is for PySpark. Currently, sortBy() and sortByKey() do not support data sets that are too large to fit in memory during the reduce stage.

This will also be useful for groupByKey() with hot keys, where the values for a single hot key cannot fit in memory.
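
For illustration only, the general external-sort technique could be sketched as below. This is not the PySpark implementation and not the patch for this issue; the names external_sort, _spill, _read_run and the chunk_size parameter are hypothetical. The idea is to sort fixed-size chunks in memory, spill each sorted chunk to a temporary file, and then lazily merge the sorted runs.

```python
import heapq
import pickle
import tempfile


def _spill(sorted_chunk):
    """Write one already-sorted chunk to a temporary file and rewind it."""
    f = tempfile.TemporaryFile()
    for item in sorted_chunk:
        pickle.dump(item, f)
    f.seek(0)
    return f


def _read_run(f):
    """Lazily yield the items of one spilled run, closing the file when done."""
    try:
        while True:
            try:
                yield pickle.load(f)
            except EOFError:
                return
    finally:
        f.close()


def external_sort(iterator, key=None, chunk_size=100000):
    """Return a lazy, sorted iterator over data that may not fit in memory.

    chunk_size is a hypothetical knob: the number of items sorted in memory
    before a run is spilled to disk. A real implementation would more likely
    track memory usage than count items.
    """
    runs = []
    chunk = []
    for item in iterator:
        chunk.append(item)
        if len(chunk) >= chunk_size:
            chunk.sort(key=key)
            runs.append(_read_run(_spill(chunk)))
            chunk = []
    # The last (possibly partial) chunk is small enough to stay in memory.
    chunk.sort(key=key)
    runs.append(iter(chunk))
    # heapq.merge holds only one item per run in memory at a time.
    return heapq.merge(*runs, key=key)
```

For example, list(external_sort(iter([5, 3, 9, 1]), chunk_size=2)) spills two sorted runs and merges them back in order. Because the merge is lazy, the same approach could in principle let a consumer walk the sorted values of a hot key without holding them all in memory at once.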

> improve large sort (external sort) for PySpark
> ----------------------------------------------
>
>                 Key: SPARK-3073
>                 URL: https://issues.apache.org/jira/browse/SPARK-3073
>             Project: Spark
>          Issue Type: Improvement
>            Reporter: Davies Liu
>            Assignee: Davies Liu
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org