You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by "warrenzhu25 (via GitHub)" <gi...@apache.org> on 2023/05/06 00:23:34 UTC

[GitHub] [spark] warrenzhu25 opened a new pull request, #41071: [SPARK-43391][CORE] Idle connection should be kept when closeIdleConnection is disabled

warrenzhu25 opened a new pull request, #41071:
URL: https://github.com/apache/spark/pull/41071

   ### What changes were proposed in this pull request?
   Not close idle connection when closeIdleConnection is disabled
   
   ### Why are the changes needed?
   Spark will close idle connection when there're outstanding requests but no traffic for at least `requestTimeoutMs` even with closeIdleConnection is disabled.
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   ### How was this patch tested?
   Manually tested.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] warrenzhu25 commented on a diff in pull request #41071: [SPARK-43391][CORE] Idle connection should be kept when closeIdleConnection is disabled

Posted by "warrenzhu25 (via GitHub)" <gi...@apache.org>.
warrenzhu25 commented on code in PR #41071:
URL: https://github.com/apache/spark/pull/41071#discussion_r1186767910


##########
common/network-common/src/test/java/org/apache/spark/network/RequestTimeoutIntegrationSuite.java:
##########
@@ -110,7 +110,7 @@ public StreamManager getStreamManager() {
       }
     };
 
-    context = new TransportContext(conf, handler);
+    context = new TransportContext(conf, handler, true);

Review Comment:
   Spark has both usage pattern with the later has `closeIdleConnection` as `false`.  But the previous behavior ignored `closeIdleConnection`, so I think it's safe to update this test to match the correct behavior.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] warrenzhu25 commented on pull request #41071: [SPARK-43391][CORE] Idle connection should be kept when closeIdleConnection is disabled

Posted by "warrenzhu25 (via GitHub)" <gi...@apache.org>.
warrenzhu25 commented on PR #41071:
URL: https://github.com/apache/spark/pull/41071#issuecomment-1536935143

   @dongjoon-hyun @mridulm help take a look?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] otterc commented on a diff in pull request #41071: [SPARK-43391][CORE] Idle connection should be kept when closeIdleConnection is disabled

Posted by "otterc (via GitHub)" <gi...@apache.org>.
otterc commented on code in PR #41071:
URL: https://github.com/apache/spark/pull/41071#discussion_r1197131313


##########
common/network-common/src/main/java/org/apache/spark/network/server/TransportChannelHandler.java:
##########
@@ -163,14 +163,11 @@ public void userEventTriggered(ChannelHandlerContext ctx, Object evt) throws Exc
         if (e.state() == IdleState.ALL_IDLE && isActuallyOverdue) {
           if (responseHandler.hasOutstandingRequests()) {
             String address = getRemoteAddress(ctx.channel());
-            logger.error("Connection to {} has been quiet for {} ms while there are outstanding " +
-              "requests. Assuming connection is dead; please adjust" +
-              " spark.{}.io.connectionTimeout if this is wrong.",
-              address, requestTimeoutNs / 1000 / 1000, transportContext.getConf().getModuleName());
-            client.timeOut();
-            ctx.close();

Review Comment:
   The closeIdleConnection flag determines whether to close the connection, regardless of whether there are any pending requests. When this flag is enabled, the connection is closed even if there are no outstanding requests. However, if there are pending requests, the client adheres to the requestTimeoutNs duration, which is a sensible approach because waiting indefinitely is not recommended. Modifying this behavior would result in a change to long-established semantics.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] warrenzhu25 commented on pull request #41071: [SPARK-43391][CORE] Idle connection should be kept when closeIdleConnection is disabled

Posted by "warrenzhu25 (via GitHub)" <gi...@apache.org>.
warrenzhu25 commented on PR #41071:
URL: https://github.com/apache/spark/pull/41071#issuecomment-1545903094

   @dongjoon-hyun any more comments on this?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] warrenzhu25 commented on a diff in pull request #41071: [SPARK-43391][CORE] Idle connection should be kept when closeIdleConnection is disabled

Posted by "warrenzhu25 (via GitHub)" <gi...@apache.org>.
warrenzhu25 commented on code in PR #41071:
URL: https://github.com/apache/spark/pull/41071#discussion_r1186768979


##########
common/network-common/src/test/java/org/apache/spark/network/RequestTimeoutIntegrationSuite.java:
##########
@@ -110,7 +110,7 @@ public StreamManager getStreamManager() {
       }
     };
 
-    context = new TransportContext(conf, handler);
+    context = new TransportContext(conf, handler, true);

Review Comment:
   Sure



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on pull request #41071: [SPARK-43391][CORE] Idle connection should be kept when closeIdleConnection is disabled

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun commented on PR #41071:
URL: https://github.com/apache/spark/pull/41071#issuecomment-1537251921

   Thank you for the fix.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] github-actions[bot] closed pull request #41071: [SPARK-43391][CORE] Idle connection should be kept when closeIdleConnection is disabled

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] closed pull request #41071: [SPARK-43391][CORE] Idle connection should be kept when closeIdleConnection is disabled
URL: https://github.com/apache/spark/pull/41071


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] mridulm commented on pull request #41071: [SPARK-43391][CORE] Idle connection should be kept when closeIdleConnection is disabled

Posted by "mridulm (via GitHub)" <gi...@apache.org>.
mridulm commented on PR #41071:
URL: https://github.com/apache/spark/pull/41071#issuecomment-1537104070

   +CC @otterc


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] warrenzhu25 commented on a diff in pull request #41071: [SPARK-43391][CORE] Idle connection should be kept when closeIdleConnection is disabled

Posted by "warrenzhu25 (via GitHub)" <gi...@apache.org>.
warrenzhu25 commented on code in PR #41071:
URL: https://github.com/apache/spark/pull/41071#discussion_r1197093878


##########
common/network-common/src/main/java/org/apache/spark/network/server/TransportChannelHandler.java:
##########
@@ -163,14 +163,11 @@ public void userEventTriggered(ChannelHandlerContext ctx, Object evt) throws Exc
         if (e.state() == IdleState.ALL_IDLE && isActuallyOverdue) {
           if (responseHandler.hasOutstandingRequests()) {
             String address = getRemoteAddress(ctx.channel());
-            logger.error("Connection to {} has been quiet for {} ms while there are outstanding " +
-              "requests. Assuming connection is dead; please adjust" +
-              " spark.{}.io.connectionTimeout if this is wrong.",
-              address, requestTimeoutNs / 1000 / 1000, transportContext.getConf().getModuleName());
-            client.timeOut();
-            ctx.close();

Review Comment:
   if `closeIdleConnection` is `true`, idle connection still be closed. If false, the client will wait forever. I have followup change to allow user to config `closeIdleConnection` and `requestTimeout`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] github-actions[bot] commented on pull request #41071: [SPARK-43391][CORE] Idle connection should be kept when closeIdleConnection is disabled

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #41071:
URL: https://github.com/apache/spark/pull/41071#issuecomment-1694529849

   We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
   If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] otterc commented on a diff in pull request #41071: [SPARK-43391][CORE] Idle connection should be kept when closeIdleConnection is disabled

Posted by "otterc (via GitHub)" <gi...@apache.org>.
otterc commented on code in PR #41071:
URL: https://github.com/apache/spark/pull/41071#discussion_r1194381617


##########
common/network-common/src/main/java/org/apache/spark/network/server/TransportChannelHandler.java:
##########
@@ -163,14 +163,11 @@ public void userEventTriggered(ChannelHandlerContext ctx, Object evt) throws Exc
         if (e.state() == IdleState.ALL_IDLE && isActuallyOverdue) {
           if (responseHandler.hasOutstandingRequests()) {
             String address = getRemoteAddress(ctx.channel());
-            logger.error("Connection to {} has been quiet for {} ms while there are outstanding " +
-              "requests. Assuming connection is dead; please adjust" +
-              " spark.{}.io.connectionTimeout if this is wrong.",
-              address, requestTimeoutNs / 1000 / 1000, transportContext.getConf().getModuleName());
-            client.timeOut();
-            ctx.close();

Review Comment:
   
   I believe this change alters the semantics of the system. Previously, when a client had pending requests but had not received any response from the server within the last requestTimeoutNs, we inferred that the channel was inactive and closed it. The client would then initiate a new fetch request. If we do not close the channel, what would happen? Would the client wait indefinitely for the shuffle response?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #41071: [SPARK-43391][CORE] Idle connection should be kept when closeIdleConnection is disabled

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun commented on code in PR #41071:
URL: https://github.com/apache/spark/pull/41071#discussion_r1186768885


##########
common/network-common/src/test/java/org/apache/spark/network/RequestTimeoutIntegrationSuite.java:
##########
@@ -110,7 +110,7 @@ public StreamManager getStreamManager() {
       }
     };
 
-    context = new TransportContext(conf, handler);
+    context = new TransportContext(conf, handler, true);

Review Comment:
   Could you keep this comment `open` for a while?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #41071: [SPARK-43391][CORE] Idle connection should be kept when closeIdleConnection is disabled

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun commented on code in PR #41071:
URL: https://github.com/apache/spark/pull/41071#discussion_r1186762877


##########
common/network-common/src/test/java/org/apache/spark/network/RequestTimeoutIntegrationSuite.java:
##########
@@ -110,7 +110,7 @@ public StreamManager getStreamManager() {
       }
     };
 
-    context = new TransportContext(conf, handler);
+    context = new TransportContext(conf, handler, true);

Review Comment:
   Which patter does Apache Spark use; (1) `new TransportContext(conf, handler, true)`, (2) `new TransportContext(conf, handler)`. Is this test coverage change valid?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] otterc commented on a diff in pull request #41071: [SPARK-43391][CORE] Idle connection should be kept when closeIdleConnection is disabled

Posted by "otterc (via GitHub)" <gi...@apache.org>.
otterc commented on code in PR #41071:
URL: https://github.com/apache/spark/pull/41071#discussion_r1197131313


##########
common/network-common/src/main/java/org/apache/spark/network/server/TransportChannelHandler.java:
##########
@@ -163,14 +163,11 @@ public void userEventTriggered(ChannelHandlerContext ctx, Object evt) throws Exc
         if (e.state() == IdleState.ALL_IDLE && isActuallyOverdue) {
           if (responseHandler.hasOutstandingRequests()) {
             String address = getRemoteAddress(ctx.channel());
-            logger.error("Connection to {} has been quiet for {} ms while there are outstanding " +
-              "requests. Assuming connection is dead; please adjust" +
-              " spark.{}.io.connectionTimeout if this is wrong.",
-              address, requestTimeoutNs / 1000 / 1000, transportContext.getConf().getModuleName());
-            client.timeOut();
-            ctx.close();

Review Comment:
   The closeIdleConnection flag determines to close the connection, if there are no pending requests. When this flag is enabled, the connection is closed even if there are no outstanding requests. However, if there are pending requests, the client adheres to the requestTimeoutNs duration, which is a sensible approach because waiting indefinitely is not recommended. Modifying this behavior would result in a change to long-established semantics.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org