Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2019/05/05 08:33:04 UTC

[GitHub] [spark] gczsjdy opened a new pull request #24526: [SPARK-27603][CORE]Make the BlockTransferService for shuffle fetch pluggable

URL: https://github.com/apache/spark/pull/24526
 
 
   ## What changes were proposed in this pull request?
   
   The shuffle manager is pluggable in Spark; however, some services closely related to shuffle functionality are limited to one or two implementations. One example is `NettyBlockTransferService`, which `BlockManager` uses both to fetch remote bytes and to fetch shuffle data when the external shuffle service is not in use. These two functions are coupled together, although the latter, fetching shuffle data, should be pluggable and extensible.
   
   A custom Spark shuffle manager may need the set of services that `NettyBlockTransferService` has already constructed, including the RPC servers, clients, and context (constructing a new set of connections between executors would be redundant), while also needing a `NettyBlockTransferService` with custom behavior. For example, a remote shuffle manager in a disaggregated compute and storage architecture may only need the service to transfer index files from other executors (for caching purposes) through Netty, while reading data files directly from globally accessible storage.
   
   We propose to make this transfer service for shuffle pluggable, and to make some fields in `NettyBlockTransferService` more widely accessible so that developers can extend it.
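
To illustrate the general idea of a pluggable shuffle fetch path, here is a minimal, self-contained sketch. It is not the code in this pull request and does not use Spark's classes: `ShuffleBlockFetcher`, `NettyLikeFetcher`, `RemoteStorageFetcher`, `PluggableFetcherSketch`, and the configuration key `example.shuffle.fetcher.class` are all hypothetical names chosen for illustration. It assumes the implementation is selected by class name from configuration and instantiated reflectively, and that the custom implementation sends index files over the network while reading data files from shared storage.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the shuffle-fetch part of a block transfer service. */
interface ShuffleBlockFetcher {
    /** Returns a label describing where the given block would be read from. */
    String fetch(String blockId);
}

/** Default behavior: every block goes over the executor-to-executor connection. */
class NettyLikeFetcher implements ShuffleBlockFetcher {
    public NettyLikeFetcher() {}

    @Override
    public String fetch(String blockId) {
        return "netty:" + blockId;
    }
}

/** Custom behavior: index files over the network, data files from shared storage. */
class RemoteStorageFetcher implements ShuffleBlockFetcher {
    public RemoteStorageFetcher() {}

    @Override
    public String fetch(String blockId) {
        return blockId.endsWith(".index") ? "netty:" + blockId : "storage:" + blockId;
    }
}

final class PluggableFetcherSketch {
    // Hypothetical configuration key; this is not a real Spark setting.
    static final String CONF_KEY = "example.shuffle.fetcher.class";

    /** Instantiates the configured fetcher, falling back to the default one. */
    static ShuffleBlockFetcher load(Map<String, String> conf) {
        String className = conf.getOrDefault(CONF_KEY, NettyLikeFetcher.class.getName());
        try {
            return (ShuffleBlockFetcher) Class.forName(className)
                .getDeclaredConstructor()
                .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot instantiate " + className, e);
        }
    }
}
```

With an empty configuration map, `PluggableFetcherSketch.load` returns the default fetcher; setting the hypothetical key to `RemoteStorageFetcher.class.getName()` switches the data-file path to shared storage while index files still go through the network path.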
   
   ## How was this patch tested?
   
   Existing tests.
