Posted to reviews@spark.apache.org by squito <gi...@git.apache.org> on 2018/01/31 20:35:22 UTC

[GitHub] spark pull request #16989: [SPARK-19659] Fetch big blocks to disk when shuff...

Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16989#discussion_r165178844
  
    --- Diff: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/OneForOneBlockFetcher.java ---
    @@ -126,4 +150,38 @@ private void failRemainingBlocks(String[] failedBlockIds, Throwable e) {
           }
         }
       }
    +
    +  private class DownloadCallback implements StreamCallback {
    +
    +    private WritableByteChannel channel = null;
    +    private File targetFile = null;
    +    private int chunkIndex;
    +
    +    public DownloadCallback(File targetFile, int chunkIndex) throws IOException {
    +      this.targetFile = targetFile;
    +      this.channel = Channels.newChannel(new FileOutputStream(targetFile));
    +      this.chunkIndex = chunkIndex;
    +    }
    +
    +    @Override
    +    public void onData(String streamId, ByteBuffer buf) throws IOException {
    +      channel.write(buf);
    --- End diff --
    
    Apologies for reviewing this so late. I'm asking mostly for my own understanding, and to consider possible future improvements: this won't do a zero-copy transfer, will it? That `ByteBuffer` is still in user space, right?
    
    As I understand it, we'd need special handling to use Netty's `spliceTo` where possible:
    https://stackoverflow.com/questions/30322957/is-there-transferfrom-like-functionality-in-netty-for-zero-copy
    
    That said, I'm still putting all the pieces together here, and admittedly this is outside my area of expertise.
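
To make the question concrete, here is a minimal sketch in plain `java.nio` that contrasts the two ways of getting bytes into a file. It is not the PR's code and it does not use Netty's `spliceTo`. The class name `CopyPaths` and both method names are hypothetical, chosen only for illustration. `copyViaBuffer` follows the same pattern as `channel.write(buf)` in the diff: the data sits in a `ByteBuffer` in user space before it is written. `copyViaTransfer` hands the copy to `FileChannel.transferFrom`, which lets the JDK and OS pick a more direct path when one exists. Whether any copy is actually avoided depends on the channel types, the JDK, and the platform, so treat the second method as "potentially cheaper", not as guaranteed zero-copy.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.WritableByteChannel;

// Hypothetical helper class, for illustration only.
final class CopyPaths {

  // User-space path: every byte passes through a ByteBuffer owned by the JVM
  // before being written out, as with channel.write(buf) in onData().
  static long copyViaBuffer(ReadableByteChannel src, WritableByteChannel dst)
      throws IOException {
    ByteBuffer buf = ByteBuffer.allocate(64 * 1024);
    long total = 0;
    while (src.read(buf) != -1) {
      buf.flip();
      // write() may be partial, so drain the buffer fully before reusing it.
      while (buf.hasRemaining()) {
        total += dst.write(buf);
      }
      buf.clear();
    }
    return total;
  }

  // Delegated path: transferFrom may let the OS move the bytes without an
  // intermediate user-space buffer. This is an optimization the platform
  // may or may not apply; the result on disk is the same either way.
  static long copyViaTransfer(ReadableByteChannel src, FileChannel dst, long count)
      throws IOException {
    long pos = 0;
    while (pos < count) {
      long n = dst.transferFrom(src, pos, count - pos);
      if (n <= 0) {
        break; // source exhausted (or nothing available right now)
      }
      pos += n;
    }
    return pos;
  }
}
```

Both methods produce identical file contents; the only difference is where the copying happens. Applying the second style to the shuffle fetch path would need the kind of special handling mentioned above, because the callback is handed an already-filled `ByteBuffer`, not the underlying source channel.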

