Posted to reviews@spark.apache.org by ericl <gi...@git.apache.org> on 2016/07/20 21:27:53 UTC

[GitHub] spark issue #13152: [SPARK-15353] [CORE] Making peer selection for block rep...

Github user ericl commented on the issue:

    https://github.com/apache/spark/pull/13152
  
    A couple of high-level questions:
    - Rather than send an RPC to the master asking for a worker's topology info, is it possible for this to be provided at initialization time or determined based on the environment?
    
    - Is it possible to narrow the interface of the prioritizer to just choose a single next peer? If caching the prioritization order is desired, that can be done internally within the prioritizer. For example, the interface could look something like the following, so that the default prioritizer does not need to randomly shuffle the entire peer list to choose its target.
    
    ```
    trait BlockReplicationStrategy {
    
      trait ReplicationTargetSelector {
        def getNextPeer(
          candidatePeers: Set[BlockManagerId],
          successfulReplications: Set[BlockManagerId],
          failedReplications: Set[BlockManagerId]): Option[BlockManagerId]
      }
    
      def getTargetSelector(
        localId: BlockManagerId,
        blockId: BlockId,
        level: StorageLevel): ReplicationTargetSelector
    }
    ```
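
    To make the idea concrete, a default random selector under this narrower interface might look roughly like the sketch below. This is only an illustration under stated assumptions: `PeerId` is a hypothetical stand-in for `BlockManagerId` so that the snippet compiles without Spark on the classpath, and `RandomTargetSelector` is a made-up name, not something in the patch.

    ```scala
    import scala.util.Random

    // Hypothetical stand-in for Spark's BlockManagerId (assumption, not the real class).
    final case class PeerId(executorId: String, host: String)

    // Same shape as the selector proposed above, with the stand-in type.
    trait ReplicationTargetSelector {
      def getNextPeer(
          candidatePeers: Set[PeerId],
          successfulReplications: Set[PeerId],
          failedReplications: Set[PeerId]): Option[PeerId]
    }

    // Hypothetical default: pick one untried peer uniformly at random.
    // It filters out already-tried peers but never shuffles the full list.
    class RandomTargetSelector(rng: Random = new Random())
        extends ReplicationTargetSelector {

      override def getNextPeer(
          candidatePeers: Set[PeerId],
          successfulReplications: Set[PeerId],
          failedReplications: Set[PeerId]): Option[PeerId] = {
        // Peers that have neither succeeded nor failed so far.
        val remaining =
          (candidatePeers -- successfulReplications -- failedReplications).toIndexedSeq
        if (remaining.isEmpty) None
        else Some(remaining(rng.nextInt(remaining.size)))
      }
    }
    ```

    The caller would keep invoking `getNextPeer` with the updated success and failure sets until enough replicas are placed or `None` comes back. A topology-aware selector could implement the same method and keep any cached ordering as private state.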
    
    Also, the patch would be smaller if only the `getRandomPeer()` call were changed.

