You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by "otterc (via GitHub)" <gi...@apache.org> on 2023/05/19 03:03:34 UTC

[GitHub] [spark] otterc opened a new pull request, #41225: [SPARK-43583] get MergedBlockedMetaReqHandler from the delegate instead of the SaslRpcHandler instance

otterc opened a new pull request, #41225:
URL: https://github.com/apache/spark/pull/41225

   ### What changes were proposed in this pull request?
   This change fixes a bug with push-based shuffle when encryption is enabled on the server. The meta requests for push-merged blocks fail with NPE:
   ```
   java.lang.RuntimeException: java.lang.NullPointerException
   	at org.apache.spark.network.server.AbstractAuthRpcHandler.getMergedBlockMetaReqHandler(AbstractAuthRpcHandler.java:110)
   	at org.apache.spark.network.crypto.AuthRpcHandler.getMergedBlockMetaReqHandler(AuthRpcHandler.java:144)
   	at org.apache.spark.network.server.TransportRequestHandler.processMergedBlockMetaRequest(TransportRequestHandler.java:275)
   	at org.apache.spark.network.server.TransportRequestHandler.handle(TransportRequestHandler.java:117)
   	at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:140)
   	at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:53)
   	at org.sparkproject.io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:99)
   	at org.sparkproject.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
   	at org.sparkproject.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
   	at org.sparkproject.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
   	at org.sparkproject.io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:286) 
   ```
   The change here fixes this issue.
   
   ### Why are the changes needed?
   It is a bug fix for push-based shuffle.
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   ### How was this patch tested?
   Added a simple UT. Verified manually.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] otterc commented on pull request #41225: [SPARK-43583][CORE] get MergedBlockedMetaReqHandler from the delegate instead of the SaslRpcHandler instance

Posted by "otterc (via GitHub)" <gi...@apache.org>.
otterc commented on PR #41225:
URL: https://github.com/apache/spark/pull/41225#issuecomment-1557929717

   @dongjoon-hyun @mridulm @tgravescs @Ngone51 Could you please help review


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] otterc commented on pull request #41225: [SPARK-43583][CORE] get MergedBlockedMetaReqHandler from the delegate instead of the SaslRpcHandler instance

Posted by "otterc (via GitHub)" <gi...@apache.org>.
otterc commented on PR #41225:
URL: https://github.com/apache/spark/pull/41225#issuecomment-1558024638

   Thank you @dongjoon-hyun and @shuwang21!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] otterc commented on pull request #41225: [SPARK-43583][CORE] get MergedBlockedMetaReqHandler from the delegate instead of the SaslRpcHandler instance

Posted by "otterc (via GitHub)" <gi...@apache.org>.
otterc commented on PR #41225:
URL: https://github.com/apache/spark/pull/41225#issuecomment-1553952178

   > I'm wondering if this can unblock [SPARK-36744](https://issues.apache.org/jira/browse/SPARK-36744) (Support IO encryption for push-based shuffle). If not, do you happen to know what are the remaining blockers?
   
   This doesn't unblock [SPARK-36744](https://issues.apache.org/jira/browse/SPARK-36744). To be able to read merged data, clients only know the map outputs belonging to that merged blocks. If the map outputs are encrypted, the client will need to know the size as well to decrypt it which is a protocol change. 
   
   Here, the server supports encryption but the application which is doing push-based shuffle doesn't have encryption enabled. I will add this to the summary.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] otterc commented on pull request #41225: [SPARK-43583] get MergedBlockedMetaReqHandler from the delegate instead of the SaslRpcHandler instance

Posted by "otterc (via GitHub)" <gi...@apache.org>.
otterc commented on PR #41225:
URL: https://github.com/apache/spark/pull/41225#issuecomment-1553937512

   @mridulm @shuwang21 @zhouyejoe Please review


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun closed pull request #41225: [SPARK-43583][CORE] get MergedBlockedMetaReqHandler from the delegate instead of the SaslRpcHandler instance

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun closed pull request #41225: [SPARK-43583][CORE] get MergedBlockedMetaReqHandler from the delegate instead of the SaslRpcHandler instance
URL: https://github.com/apache/spark/pull/41225


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #41225: [SPARK-43583][CORE] get MergedBlockedMetaReqHandler from the delegate instead of the SaslRpcHandler instance

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun commented on code in PR #41225:
URL: https://github.com/apache/spark/pull/41225#discussion_r1198502341


##########
common/network-common/src/main/java/org/apache/spark/network/crypto/AuthRpcHandler.java:
##########
@@ -139,8 +139,4 @@ protected boolean doAuthChallenge(
     return true;
   }
 
-  @Override
-  public MergedBlockMetaReqHandler getMergedBlockMetaReqHandler() {
-    return saslHandler.getMergedBlockMetaReqHandler();
-  }

Review Comment:
   This seems to be added at Apache Spark 3.2.0 via SPARK-35671, right?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] otterc commented on a diff in pull request #41225: [SPARK-43583][CORE] get MergedBlockedMetaReqHandler from the delegate instead of the SaslRpcHandler instance

Posted by "otterc (via GitHub)" <gi...@apache.org>.
otterc commented on code in PR #41225:
URL: https://github.com/apache/spark/pull/41225#discussion_r1198505545


##########
common/network-common/src/main/java/org/apache/spark/network/crypto/AuthRpcHandler.java:
##########
@@ -139,8 +139,4 @@ protected boolean doAuthChallenge(
     return true;
   }
 
-  @Override
-  public MergedBlockMetaReqHandler getMergedBlockMetaReqHandler() {
-    return saslHandler.getMergedBlockMetaReqHandler();
-  }

Review Comment:
   Yes, we didn't test with encryption enabled on the server and this was missed. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] otterc commented on pull request #41225: [SPARK-43583][CORE] get MergedBlockedMetaReqHandler from the delegate instead of the SaslRpcHandler instance

Posted by "otterc (via GitHub)" <gi...@apache.org>.
otterc commented on PR #41225:
URL: https://github.com/apache/spark/pull/41225#issuecomment-1554906573

   > Do you think when `spark.network.crypto.saslFallback=true` and L95 from `AuthRpcHandler.java`.
   > 
   > ```
   > saslHandler = new SaslRpcHandler(conf, channel, null, secretKeyHolder);
   > ```
   > 
   > which will set `delegate=null`. Will this cause potential NPE?
   
   No, with this modification, we obtain the reference to the `MergedBlockMetaReqHandler` from the `delegate (ExternalBlockHandler)` that the `AuthRpcHandler` is initialized with. The `saslHandler` instance within the `AuthRpcHandler` is specifically created for fallback to SASL, and we should not have used it to retrieve the MergedBlockMetaReqHandler. It is insignificant whether the saslHandler is initialized with a null delegate or not.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] shuwang21 commented on pull request #41225: [SPARK-43583][CORE] get MergedBlockedMetaReqHandler from the delegate instead of the SaslRpcHandler instance

Posted by "shuwang21 (via GitHub)" <gi...@apache.org>.
shuwang21 commented on PR #41225:
URL: https://github.com/apache/spark/pull/41225#issuecomment-1555014603

   > > Do you think when `spark.network.crypto.saslFallback=true` and L95 from `AuthRpcHandler.java`.
   > > ```
   > > saslHandler = new SaslRpcHandler(conf, channel, null, secretKeyHolder);
   > > ```
   > > 
   > > 
   > >     
   > >       
   > >     
   > > 
   > >       
   > >     
   > > 
   > >     
   > >   
   > > which will set `delegate=null`. Will this cause potential NPE?
   > 
   > No, with this modification, we obtain the reference to the `MergedBlockMetaReqHandler` from the `delegate (ExternalBlockHandler)` that the `AuthRpcHandler` is initialized with. The `saslHandler` instance within the `AuthRpcHandler` is specifically created for fallback to SASL, and we should not have used it to retrieve the MergedBlockMetaReqHandler. It is insignificant whether the saslHandler is initialized with a null delegate or not.
   
   I see. Make sense to me. Thanks!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on pull request #41225: [SPARK-43583][CORE] get MergedBlockedMetaReqHandler from the delegate instead of the SaslRpcHandler instance

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun commented on PR #41225:
URL: https://github.com/apache/spark/pull/41225#issuecomment-1554001410

   Thank you for sharing the rich context, @otterc .


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] shuwang21 commented on pull request #41225: [SPARK-43583][CORE] get MergedBlockedMetaReqHandler from the delegate instead of the SaslRpcHandler instance

Posted by "shuwang21 (via GitHub)" <gi...@apache.org>.
shuwang21 commented on PR #41225:
URL: https://github.com/apache/spark/pull/41225#issuecomment-1554049148

   LGTM. Thanks for your efforts!
   
   Do you think when `spark.network.crypto.saslFallback=true` and L95 from `AuthRpcHandler.java`. 
   ```
   saslHandler = new SaslRpcHandler(conf, channel, null, secretKeyHolder);
   ```
   which will set `delegate=null`. Will this cause potential NPE? 
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] otterc commented on a diff in pull request #41225: [SPARK-43583][CORE] get MergedBlockedMetaReqHandler from the delegate instead of the SaslRpcHandler instance

Posted by "otterc (via GitHub)" <gi...@apache.org>.
otterc commented on code in PR #41225:
URL: https://github.com/apache/spark/pull/41225#discussion_r1198505545


##########
common/network-common/src/main/java/org/apache/spark/network/crypto/AuthRpcHandler.java:
##########
@@ -139,8 +139,4 @@ protected boolean doAuthChallenge(
     return true;
   }
 
-  @Override
-  public MergedBlockMetaReqHandler getMergedBlockMetaReqHandler() {
-    return saslHandler.getMergedBlockMetaReqHandler();
-  }

Review Comment:
   Yes, we didn't test with encryption enabled and this was missed. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org