Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/09/23 19:57:49 UTC

[GitHub] [spark] Victsm commented on pull request #29855: SPARK-32915 Network-layer and shuffle RPC layer changes to support push shuffle blocks

Victsm commented on pull request #29855:
URL: https://github.com/apache/spark/pull/29855#issuecomment-697939630


   A few clarifications on this PR:
   The entire Netty RPC layer change for push-based shuffle is ~4000 LOC in our current implementation. We plan to break it down into 3 PRs for easier review:
   
   - The first one, this PR, focuses on the foundation for supporting block push functionality
   - The second PR will provide the actual implementation for the MergedShuffleFileManager, as well as the integration with YARNShuffleService
   - The third PR will provide the read path implementation supporting fetching a merged shuffle file as a sequence of chunks
   
   In addition, there is some further refactoring we could do with this PR.
   For example, we reuse RetryingBlockFetcher and BlockFetchingListener for block push as well,
   so their names are no longer accurate.
   We didn't make that change in this PR, to reduce the number of files touched and keep it easier to review.
   We can either send out a separate PR just for this refactoring or update this PR, depending on the reviewers' preferences.
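
   To make the naming concern concrete, here is a minimal sketch of what a direction-neutral shape could look like. It is an illustration only: BlockTransferListener, TransferStarter, RetryingBlockTransferor and RecordingListener are hypothetical names that do not appear in this PR, and the retry loop is a simplified synchronous stand-in rather than the behavior of RetryingBlockFetcher.

```java
import java.util.ArrayList;
import java.util.List;

// All names below are hypothetical; none of them come from the PR.

/** Direction-neutral callback, so one retry helper can serve fetch and push. */
interface BlockTransferListener {
    void onTransferSuccess(String blockId);
    void onTransferFailure(String blockId, Throwable cause);
}

/** Starts a single transfer attempt (a fetch or a push) for one block. */
interface TransferStarter {
    void start(String blockId) throws Exception;
}

/** Simplified, synchronous stand-in for a retrying fetcher/pusher. */
final class RetryingBlockTransferor {
    private final int maxRetries;
    private final TransferStarter starter;
    private final BlockTransferListener listener;

    RetryingBlockTransferor(int maxRetries, TransferStarter starter,
                            BlockTransferListener listener) {
        this.maxRetries = maxRetries;
        this.starter = starter;
        this.listener = listener;
    }

    /** Tries each block up to maxRetries + 1 times, then reports the outcome. */
    void transfer(List<String> blockIds) {
        for (String blockId : blockIds) {
            Exception lastError = null;
            boolean done = false;
            for (int attempt = 0; attempt <= maxRetries && !done; attempt++) {
                try {
                    starter.start(blockId);
                    done = true;
                } catch (Exception e) {
                    lastError = e; // remember the failure and try again
                }
            }
            if (done) {
                listener.onTransferSuccess(blockId);
            } else {
                listener.onTransferFailure(blockId, lastError);
            }
        }
    }
}

/** Listener that records outcomes, labelled with the transfer direction. */
final class RecordingListener implements BlockTransferListener {
    final List<String> events = new ArrayList<>();
    private final String direction;

    RecordingListener(String direction) {
        this.direction = direction;
    }

    @Override
    public void onTransferSuccess(String blockId) {
        events.add(direction + " ok " + blockId);
    }

    @Override
    public void onTransferFailure(String blockId, Throwable cause) {
        events.add(direction + " failed " + blockId);
    }
}
```

   With a shape like this, the same retry helper could be handed a fetch starter or a push starter, and only the listener label would differ. Whether the rename is worth the extra touched files is the open question for reviewers.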


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


