Posted to issues@spark.apache.org by "Dongjoon Hyun (JIRA)" <ji...@apache.org> on 2017/06/01 23:24:04 UTC

[jira] [Commented] (SPARK-15352) Topology aware block replication

    [ https://issues.apache.org/jira/browse/SPARK-15352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16033899#comment-16033899 ] 

Dongjoon Hyun commented on SPARK-15352:
---------------------------------------

Hi, [~shubhamc].
Can we resolve this issue as fixed for 2.2.0 now?
The remaining open PR seems to be documentation-only.

> Topology aware block replication
> --------------------------------
>
>                 Key: SPARK-15352
>                 URL: https://issues.apache.org/jira/browse/SPARK-15352
>             Project: Spark
>          Issue Type: New Feature
>          Components: Block Manager, Mesos, Spark Core, YARN
>            Reporter: Shubham Chopra
>            Assignee: Shubham Chopra
>
> With cached RDDs, Spark can be used for online analytics, responding to queries interactively. However, loss of RDD partitions due to node/executor failures can cause large delays in such use cases, since the lost data has to be regenerated.
> Cached RDDs, even when using multiple replicas per block, are not currently resilient to node failures when multiple executors are started on the same node. Block replication currently chooses a peer at random, and this peer may be on the same host, in which case a single node failure can lose every replica of a block.
> This effort would add topology-aware replication to Spark, enabled through pluggable strategies. For ease of development/review, it is being broken down into three major work efforts (a rough sketch of item 1 follows the list):
> 1.	Making peer selection for replication pluggable
> 2.	Providing pluggable implementations for providing topology and topology aware replication
> 3.	Pro-active replenishment of lost blocks
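>
> The pluggable peer-selection piece (item 1) essentially boils down to ranking candidate peers by topology before replicating. A minimal, self-contained Scala sketch of that idea, using hypothetical names (Peer, prioritize) rather than Spark's actual BlockManager interfaces:
>
>     // A minimal, self-contained sketch (hypothetical names; NOT Spark's actual
>     // BlockManager API) of rack-aware peer prioritization: replicas go to peers
>     // on other racks first, then other hosts, then anything else.
>     import scala.util.Random
>
>     // Hypothetical description of a candidate peer executor.
>     case class Peer(executorId: String, host: String, rack: Option[String])
>
>     object TopologyAwareSelection {
>
>       // Prefer peers on a different rack, then peers on a different host,
>       // then same-host peers; shuffle within each tier so load still spreads.
>       def prioritize(self: Peer, peers: Seq[Peer], numReplicas: Int,
>           seed: Long = 42L): Seq[Peer] = {
>         val rnd = new Random(seed)
>         val (sameHost, otherHost) = peers.partition(_.host == self.host)
>         val (otherRack, sameRack) =
>           otherHost.partition(p => p.rack.isDefined && p.rack != self.rack)
>         (rnd.shuffle(otherRack) ++ rnd.shuffle(sameRack) ++ rnd.shuffle(sameHost))
>           .take(numReplicas)
>       }
>
>       def main(args: Array[String]): Unit = {
>         val self = Peer("exec-1", "host-a", Some("rack-1"))
>         val peers = Seq(
>           Peer("exec-2", "host-a", Some("rack-1")),  // same host: least preferred
>           Peer("exec-3", "host-b", Some("rack-1")),  // same rack, different host
>           Peer("exec-4", "host-c", Some("rack-2")))  // different rack: most preferred
>         println(prioritize(self, peers, numReplicas = 2))
>         // exec-4 is ranked first, then exec-3; exec-2 only if more replicas are needed.
>       }
>     }
>
> In Spark itself, the ranking strategy and the source of rack information (item 2) would both be pluggable, and item 3 would re-run a similar selection to replenish replicas lost to failures.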



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org