You are viewing a plain text version of this content. The canonical link for it is here.

Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/10/06 17:56:39 UTC

[GitHub] [spark] jiangxb1987 commented on a change in pull request #29855: [SPARK-32915][CORE] Network-layer and shuffle RPC layer changes to support push shuffle blocks

jiangxb1987 commented on a change in pull request #29855:
URL: https://github.com/apache/spark/pull/29855#discussion_r500489854



##########
File path: common/network-common/src/main/java/org/apache/spark/network/server/TransportRequestHandler.java
##########
@@ -181,6 +182,17 @@ public void onFailure(Throwable e) {
   private void processStreamUpload(final UploadStream req) {
     assert (req.body() == null);
     try {
+      // Retain the original metadata buffer, since it will be used during the invocation of
+      // this method. Will be released later.
+      req.meta.retain();
+      // Make a copy of the original metadata buffer. In benchmark, we noticed that
+      // we cannot respond the original metadata buffer back to the client, otherwise
+      // in cases where multiple concurrent shuffles are present, a wrong metadata might
+      // be sent back to client. This is related to the eager release of the metadata buffer,
+      // i.e., we always release the original buffer by the time the invocation of this
+      // method ends, instead of by the time we respond it to the client. This is necessary,
+      // otherwise we start seeing memory issues very quickly in benchmarks.
+      ByteBuffer meta = cloneBuffer(req.meta.nioByteBuffer());

Review comment:
       I think the major issue here is that, we still need the metadata on the client side the handle push callbacks. It sounds like a good idea to keep the metadata inside the `BlockPushCallback`, so we don't need to worry about the lifecycle of the metadata.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org