Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/06/28 10:35:33 UTC

[GitHub] [spark] cxzl25 opened a new pull request, #37015: [SPARK-39628][CORE] Fix a race condition when handling of IdleStateEvent again

cxzl25 opened a new pull request, #37015:
URL: https://github.com/apache/spark/pull/37015

   ### What changes were proposed in this pull request?
   Fix a race condition in the handling of `IdleStateEvent` again. The original fix was made in:
   https://github.com/apache/spark/pull/23989
   
   
   ### Why are the changes needed?
   [SPARK-27073](https://issues.apache.org/jira/browse/SPARK-27073) fixed a race condition in the handling of `IdleStateEvent`, but [SPARK-37462](https://issues.apache.org/jira/browse/SPARK-37462) changed the call order, which may have reintroduced the race.
   
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   
   ### How was this patch tested?
   We encountered this problem in our production environment. The race condition is difficult to reproduce in a test.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #37015: [SPARK-39628][CORE] Fix a race condition when handling of IdleStateEvent again

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on PR #37015:
URL: https://github.com/apache/spark/pull/37015#issuecomment-1169011846

   Can one of the admins verify this patch?




[GitHub] [spark] weixiuli commented on a diff in pull request #37015: [SPARK-39628][CORE] Fix a race condition when handling of IdleStateEvent again

Posted by GitBox <gi...@apache.org>.
weixiuli commented on code in PR #37015:
URL: https://github.com/apache/spark/pull/37015#discussion_r908345107


##########
common/network-common/src/main/java/org/apache/spark/network/server/TransportChannelHandler.java:
##########
@@ -158,10 +158,12 @@ public void userEventTriggered(ChannelHandlerContext ctx, Object evt) throws Exc
       // To avoid a race between TransportClientFactory.createClient() and this code which could
       // result in an inactive client being returned, this needs to run in a synchronized block.
       synchronized (this) {
+        // Do not modify the order of hasInFlightRequests and isActuallyOverdue (see SPARK-27073)
+        boolean hasInFlightRequests = responseHandler.hasOutstandingRequests();
         boolean isActuallyOverdue =
           System.nanoTime() - responseHandler.getTimeOfLastRequestNs() > requestTimeoutNs;
         if (e.state() == IdleState.ALL_IDLE && isActuallyOverdue) {
-          if (responseHandler.hasOutstandingRequests()) {
+          if (hasInFlightRequests) {

Review Comment:
   Could you please explain why the order of `hasInFlightRequests` and `isActuallyOverdue` must not be modified?





[GitHub] [spark] github-actions[bot] closed pull request #37015: [SPARK-39628][CORE] Fix a race condition when handling of IdleStateEvent again

Posted by GitBox <gi...@apache.org>.
github-actions[bot] closed pull request #37015: [SPARK-39628][CORE] Fix a race condition when handling of IdleStateEvent again
URL: https://github.com/apache/spark/pull/37015




[GitHub] [spark] github-actions[bot] commented on pull request #37015: [SPARK-39628][CORE] Fix a race condition when handling of IdleStateEvent again

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #37015:
URL: https://github.com/apache/spark/pull/37015#issuecomment-1272175950

   We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
   If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!




[GitHub] [spark] cxzl25 commented on a diff in pull request #37015: [SPARK-39628][CORE] Fix a race condition when handling of IdleStateEvent again

Posted by GitBox <gi...@apache.org>.
cxzl25 commented on code in PR #37015:
URL: https://github.com/apache/spark/pull/37015#discussion_r908351409


##########
common/network-common/src/main/java/org/apache/spark/network/server/TransportChannelHandler.java:
##########
@@ -158,10 +158,12 @@ public void userEventTriggered(ChannelHandlerContext ctx, Object evt) throws Exc
       // To avoid a race between TransportClientFactory.createClient() and this code which could
       // result in an inactive client being returned, this needs to run in a synchronized block.
       synchronized (this) {
+        // Do not modify the order of hasInFlightRequests and isActuallyOverdue (see SPARK-27073)
+        boolean hasInFlightRequests = responseHandler.hasOutstandingRequests();
         boolean isActuallyOverdue =
           System.nanoTime() - responseHandler.getTimeOfLastRequestNs() > requestTimeoutNs;
         if (e.state() == IdleState.ALL_IDLE && isActuallyOverdue) {
-          if (responseHandler.hasOutstandingRequests()) {
+          if (hasInFlightRequests) {

Review Comment:
   Are you asking for more comments in the code, or for an explanation in this PR of why this happens?
   
   The original PR #23989 explains it:
   > When TransportChannelHandler processes an IdleStateEvent, it first calculates whether the last request time has timed out.
   > At this point, TransportClient.sendRpc initiates a request.
   > TransportChannelHandler then gets responseHandler.numOutstandingRequests() > 0, causing a normal connection to be closed.
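
The interleaving described in that quote can be reproduced deterministically with a small sketch. This is not Spark code: `IdleCheckOrderDemo`, `FakeResponseHandler`, the two `wouldClose...` helpers, and the `between` hook are hypothetical names introduced only for illustration, and a fixed `nowNs` value stands in for `System.nanoTime()`. The hook simulates a request being sent by another thread between the two reads.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

public class IdleCheckOrderDemo {

  // Hypothetical stand-in for the response handler; it only tracks the two
  // values the idle check reads.
  static final class FakeResponseHandler {
    private final AtomicInteger outstanding = new AtomicInteger();
    private final AtomicLong timeOfLastRequestNs = new AtomicLong();

    // Simulates sending a request: refresh the timestamp, then count it.
    void addRequest(long nowNs) {
      timeOfLastRequestNs.set(nowNs);
      outstanding.incrementAndGet();
    }

    boolean hasOutstandingRequests() {
      return outstanding.get() > 0;
    }

    long getTimeOfLastRequestNs() {
      return timeOfLastRequestNs.get();
    }
  }

  // Racy order: the timeout is evaluated first, the in-flight check second.
  // A request sent in between is counted as in flight but judged against
  // the stale timestamp.
  static boolean wouldCloseOverdueFirst(
      FakeResponseHandler h, long nowNs, long timeoutNs, Runnable between) {
    boolean isActuallyOverdue = nowNs - h.getTimeOfLastRequestNs() > timeoutNs;
    between.run();
    boolean hasInFlightRequests = h.hasOutstandingRequests();
    return isActuallyOverdue && hasInFlightRequests;
  }

  // Order used in the diff: the in-flight check is read first, so a request
  // sent in between also refreshes the timestamp before the timeout is
  // evaluated.
  static boolean wouldCloseInFlightFirst(
      FakeResponseHandler h, long nowNs, long timeoutNs, Runnable between) {
    boolean hasInFlightRequests = h.hasOutstandingRequests();
    between.run();
    boolean isActuallyOverdue = nowNs - h.getTimeOfLastRequestNs() > timeoutNs;
    return isActuallyOverdue && hasInFlightRequests;
  }

  public static void main(String[] args) {
    long timeoutNs = 100L;
    long nowNs = 1_000L;

    FakeResponseHandler a = new FakeResponseHandler();
    boolean racy = wouldCloseOverdueFirst(a, nowNs, timeoutNs, () -> a.addRequest(nowNs));

    FakeResponseHandler b = new FakeResponseHandler();
    boolean safe = wouldCloseInFlightFirst(b, nowNs, timeoutNs, () -> b.addRequest(nowNs));

    System.out.println("overdue checked first, closes connection: " + racy);
    System.out.println("in-flight checked first, closes connection: " + safe);
  }
}
```

With the timeout evaluated first, the freshly sent request makes the connection look both overdue and busy, so it would be closed as timed out. With the in-flight check read first, the same interleaving leaves the connection open.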


