Posted to issues@spark.apache.org by "Apache Spark (JIRA)" <ji...@apache.org> on 2015/08/15 04:31:46 UTC

[jira] [Assigned] (SPARK-10008) Shuffle locality can take precedence over narrow dependencies for RDDs with both

     [ https://issues.apache.org/jira/browse/SPARK-10008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-10008:
------------------------------------

    Assignee: Apache Spark

> Shuffle locality can take precedence over narrow dependencies for RDDs with both
> --------------------------------------------------------------------------------
>
>                 Key: SPARK-10008
>                 URL: https://issues.apache.org/jira/browse/SPARK-10008
>             Project: Spark
>          Issue Type: Bug
>          Components: Scheduler
>            Reporter: Matei Zaharia
>            Assignee: Apache Spark
>
> The shuffle locality patch made the DAGScheduler aware of shuffle data, but for RDDs that have both narrow and shuffle dependencies, it can cause the scheduler to place their tasks based on the shuffle dependency instead of the narrow one. This case is common in iterative join-based algorithms like PageRank and ALS, where one RDD is hash-partitioned and one isn't.
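
The precedence described above can be pictured with a small toy model. This is only a sketch under stated assumptions, not Spark's DAGScheduler code: the function name `preferred_hosts`, its parameters, and the host names are all hypothetical. It assumes that a task whose partition has a narrow dependency should run where that parent partition lives, and that shuffle output sizes are consulted only when no narrow location is known.

```python
from typing import Dict, List


def preferred_hosts(narrow_hosts: List[str],
                    shuffle_bytes_by_host: Dict[str, int]) -> List[str]:
    """Toy placement rule (hypothetical, not Spark's actual implementation).

    narrow_hosts: hosts holding the parent partition reached through a
        narrow dependency (for example, the hash-partitioned side of a join).
    shuffle_bytes_by_host: bytes of shuffle output available on each host
        for the shuffle dependency (the side that is not partitioned).
    """
    # Narrow dependencies take precedence: the parent partition is already
    # sitting on these hosts, so the task should run next to it.
    if narrow_hosts:
        return list(narrow_hosts)

    # Only without any narrow locality do we fall back to shuffle locality,
    # here simplified to "the host with the most shuffle output".
    if not shuffle_bytes_by_host:
        return []
    best = max(shuffle_bytes_by_host, key=shuffle_bytes_by_host.get)
    return [best]


# The partitioned side lives on host-a; most shuffle data is on host-b.
# Under the assumed rule the narrow location still wins.
print(preferred_hosts(["host-a"], {"host-b": 900, "host-c": 100}))
# With no narrow location, shuffle locality decides.
print(preferred_hosts([], {"host-b": 900, "host-c": 100}))
```

The bug described in this issue corresponds to the opposite ordering, where the shuffle-based choice is consulted first even though a narrow parent location exists.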



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
