Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2022/08/10 17:13:21 UTC

[GitHub] [pulsar] liangyepianzhou opened a new pull request, #17051: [Fix][Client] Fixed cnx channel Inactive causing the request fail to time out and fail to return

liangyepianzhou opened a new pull request, #17051:
URL: https://github.com/apache/pulsar/pull/17051

   ## Motivation
   A request can be added to `pendingRequests` and `timeoutQueue` after `channelInactive` has started running.
   If it is added after `pendingRequests.forEach((key, future) -> future.completeExceptionally(e));`
   but before `pendingRequests.clear();`, it is dropped from the map without ever being completed,
   so the caller never receives a response, not even a timeout.
   ## Modification
   Use the `connectionFuture` to control concurrency.
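
The interleaving described under Motivation can be replayed deterministically in a small model. This is only a sketch under stated assumptions: it uses `java.util.concurrent.ConcurrentHashMap` as a stand-in for the map type in `ClientCnx`, the class and method names (`PendingRequestRaceSketch`, `timeoutSweep`, and so on) are hypothetical, and it models the remove-then-fail variant from the diff under review rather than the `connectionFuture` mechanism itself.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeoutException;

// Hypothetical model of the race; not the actual ClientCnx code.
class PendingRequestRaceSketch {
    // Stand-in for ClientCnx#pendingRequests (assumption: the real field uses a different map type).
    final ConcurrentHashMap<Long, CompletableFuture<String>> pendingRequests = new ConcurrentHashMap<>();

    CompletableFuture<String> register(long requestId) {
        CompletableFuture<String> future = new CompletableFuture<>();
        pendingRequests.put(requestId, future);
        return future;
    }

    // Old behaviour, step 1: fail whatever is pending right now.
    void failAllPending(Exception e) {
        pendingRequests.forEach((id, future) -> future.completeExceptionally(e));
    }

    // Old behaviour, step 2: drop every entry, including ones added after step 1.
    void clearPending() {
        pendingRequests.clear();
    }

    // Variant from the diff: remove each entry and fail only what was removed; no blanket clear.
    void failAndRemovePending(Exception e) {
        pendingRequests.forEach((id, future) -> {
            if (pendingRequests.remove(id, future) && !future.isDone()) {
                future.completeExceptionally(e);
            }
        });
    }

    // Hypothetical timeout sweep: it can only fail requests that are still in the map.
    void timeoutSweep() {
        pendingRequests.forEach((id, future) -> {
            if (pendingRequests.remove(id, future)) {
                future.completeExceptionally(new TimeoutException("request " + id + " timed out"));
            }
        });
    }

    // Returns whether a request registered "late" is ever completed.
    static boolean lateRequestCompletes(boolean patched) {
        PendingRequestRaceSketch cnx = new PendingRequestRaceSketch();
        cnx.register(1L);
        Exception closed = new IllegalStateException("Disconnected from server");
        CompletableFuture<String> late;
        if (patched) {
            cnx.failAndRemovePending(closed);
            late = cnx.register(2L); // arrives after the sweep; stays in the map
        } else {
            cnx.failAllPending(closed);
            late = cnx.register(2L); // arrives between forEach and clear
            cnx.clearPending();      // silently drops the late request
        }
        cnx.timeoutSweep();
        return late.isDone();
    }

    public static void main(String[] args) {
        System.out.println("old: late request completes = " + lateRequestCompletes(false));     // prints false
        System.out.println("patched: late request completes = " + lateRequestCompletes(true)); // prints true
    }
}
```

In the old path the late future is neither failed by the `forEach` nor visible to the timeout sweep, so it stays incomplete forever. In the patched path it remains in the map, so a later timeout check can still fail it.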
   
   
   ### Verifying this change
   
   - [ ] Make sure that the change passes the CI checks.
   
   *(Please pick either of the following options)*
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   *(or)*
   
   This change is already covered by existing tests, such as *(please describe tests)*.
   
   *(or)*
   
   This change added tests and can be verified as follows:
   
   *(example:)*
     - *Added integration tests for end-to-end deployment with large payloads (10MB)*
     - *Extended integration test for recovery after broker failure*
   
   ### Does this pull request potentially affect one of the following parts:
   
   *If `yes` was chosen, please highlight the changes*
   
     - Dependencies (does it add or upgrade a dependency): (yes / no)
     - The public API: (yes / no)
     - The schema: (yes / no / don't know)
     - The default values of configurations: (yes / no)
     - The wire protocol: (yes / no)
     - The rest endpoints: (yes / no)
     - The admin cli options: (yes / no)
     - Anything that affects deployment: (yes / no / don't know)
   
   ### Documentation
   
   Check the box below or label this PR directly.
   
   Need to update docs? 
   
   - [ ] `doc-required` 
   (Your PR needs to update docs and you will update later)
     
   - [x] `doc-not-needed` 
   (Please explain why)
     
   - [ ] `doc` 
   (Your PR contains doc changes)
   
   - [ ] `doc-complete`
   (Docs have been already added)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] eolivelli commented on a diff in pull request #17051: [Fix][Client] Fixed cnx channel Inactive causing the request fail to time out and fail to return

Posted by GitBox <gi...@apache.org>.
eolivelli commented on code in PR #17051:
URL: https://github.com/apache/pulsar/pull/17051#discussion_r946515040


##########
pulsar-client/src/main/java/org/apache/pulsar/client/impl/ClientCnx.java:
##########
@@ -290,7 +290,11 @@ public void channelInactive(ChannelHandlerContext ctx) throws Exception {
                 "Disconnected from server at " + ctx.channel().remoteAddress());
 
         // Fail out all the pending ops
-        pendingRequests.forEach((key, future) -> future.completeExceptionally(e));
+        pendingRequests.forEach((key, future) -> {
+            if (pendingRequests.remove(key, future) && !future.isDone()) {

Review Comment:
   Is it really allowed and officially supported to mutate the collection inside a `forEach` loop?
   Are there any side effects?





[GitHub] [pulsar] liangyepianzhou commented on a diff in pull request #17051: [Fix][Client] Fixed cnx channel Inactive causing the request fail to time out and fail to return

Posted by GitBox <gi...@apache.org>.
liangyepianzhou commented on code in PR #17051:
URL: https://github.com/apache/pulsar/pull/17051#discussion_r946563753


##########
pulsar-client/src/main/java/org/apache/pulsar/client/impl/ClientCnx.java:
##########
@@ -290,7 +290,11 @@ public void channelInactive(ChannelHandlerContext ctx) throws Exception {
                 "Disconnected from server at " + ctx.channel().remoteAddress());
 
         // Fail out all the pending ops
-        pendingRequests.forEach((key, future) -> future.completeExceptionally(e));
+        pendingRequests.forEach((key, future) -> {
+            if (pendingRequests.remove(key, future) && !future.isDone()) {

Review Comment:
   Great suggestion, I'll check if it works.





[GitHub] [pulsar] liangyepianzhou commented on pull request #17051: [Fix][Client] Fixed cnx channel Inactive causing the request fail to time out and fail to return

Posted by GitBox <gi...@apache.org>.
liangyepianzhou commented on PR #17051:
URL: https://github.com/apache/pulsar/pull/17051#issuecomment-1216513603

   > LGTM, please also help check if there are other cases also need the same fix, such as lookup, tc handler, etc.
   
   The TC handler already checks the handler state. I did not find anything like `pendingRequests.clear()` anywhere else.




[GitHub] [pulsar] codelipenghui commented on a diff in pull request #17051: [Fix][Client] Fixed cnx channel Inactive causing the request fail to time out and fail to return

Posted by GitBox <gi...@apache.org>.
codelipenghui commented on code in PR #17051:
URL: https://github.com/apache/pulsar/pull/17051#discussion_r946456657


##########
pulsar-broker/src/test/java/org/apache/pulsar/client/impl/ClientCnxTest.java:
##########
@@ -0,0 +1,99 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pulsar.client.impl;
+
+import com.google.common.collect.Sets;
+import io.netty.channel.ChannelHandlerContext;
+import java.util.concurrent.CountDownLatch;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+import org.apache.pulsar.broker.auth.MockedPulsarServiceBaseTest;
+import org.apache.pulsar.client.api.Consumer;
+import org.apache.pulsar.common.policies.data.ClusterData;
+import org.apache.pulsar.common.policies.data.TenantInfoImpl;
+import org.awaitility.Awaitility;
+import org.testng.annotations.AfterClass;
+import org.testng.annotations.BeforeClass;
+import org.testng.annotations.Test;
+
+public class ClientCnxTest extends MockedPulsarServiceBaseTest {

Review Comment:
   Please add test group.





[GitHub] [pulsar] codelipenghui commented on a diff in pull request #17051: [Fix][Client] Fixed cnx channel Inactive causing the request fail to time out and fail to return

Posted by GitBox <gi...@apache.org>.
codelipenghui commented on code in PR #17051:
URL: https://github.com/apache/pulsar/pull/17051#discussion_r945713240


##########
pulsar-client/src/main/java/org/apache/pulsar/client/impl/ClientCnx.java:
##########
@@ -241,6 +244,7 @@ public void channelActive(ChannelHandlerContext ctx) throws Exception {
         super.channelActive(ctx);
         this.localAddress = ctx.channel().localAddress();
         this.remoteAddress = ctx.channel().remoteAddress();
+        internalPinnedExecutor = ctx.channel().eventLoop();

Review Comment:
   Why not use the internal thread pool of the Pulsar client?
   
   This looks like it switches to another IO thread. Either avoid introducing a thread switch here, or switch to the client's internal thread to reduce the burden on the IO threads.





[GitHub] [pulsar] codelipenghui merged pull request #17051: [Fix][Client] Fixed cnx channel Inactive causing the request fail to time out and fail to return

Posted by GitBox <gi...@apache.org>.
codelipenghui merged PR #17051:
URL: https://github.com/apache/pulsar/pull/17051




[GitHub] [pulsar] syhily commented on pull request #17051: [Fix][Client] Fixed cnx channel Inactive causing the request fail to time out and fail to return

Posted by GitBox <gi...@apache.org>.
syhily commented on PR #17051:
URL: https://github.com/apache/pulsar/pull/17051#issuecomment-1241474972

   When will this fix get released?




[GitHub] [pulsar] liangyepianzhou commented on a diff in pull request #17051: [Fix][Client] Fixed cnx channel Inactive causing the request fail to time out and fail to return

Posted by GitBox <gi...@apache.org>.
liangyepianzhou commented on code in PR #17051:
URL: https://github.com/apache/pulsar/pull/17051#discussion_r946574050


##########
pulsar-client/src/main/java/org/apache/pulsar/client/impl/ClientCnx.java:
##########
@@ -290,7 +290,11 @@ public void channelInactive(ChannelHandlerContext ctx) throws Exception {
                 "Disconnected from server at " + ctx.channel().remoteAddress());
 
         // Fail out all the pending ops
-        pendingRequests.forEach((key, future) -> future.completeExceptionally(e));
+        pendingRequests.forEach((key, future) -> {
+            if (pendingRequests.remove(key, future) && !future.isDone()) {

Review Comment:
   I just tested it: elements can be removed inside `forEach` as expected. Also, once the channel is inactive, this `ClientCnx` is not reused and will be garbage-collected by the JVM, so it does not matter whether every entry is actually removed. Therefore there are no related side effects.
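
The kind of check described above can be sketched as follows. It assumes `java.util.concurrent.ConcurrentHashMap` as a stand-in, whose traversal is documented as weakly consistent and does not throw `ConcurrentModificationException`; the map type actually used for `pendingRequests` in `ClientCnx` may behave differently. The class and method names are hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical check: remove entries from a concurrent map while iterating it with forEach.
class RemoveInsideForEachSketch {
    // Removes every entry and fails the futures that were still pending.
    // Returns how many futures were failed by this call.
    static int failAndDrain(ConcurrentHashMap<Long, CompletableFuture<String>> pending, Exception e) {
        int[] failed = {0};
        pending.forEach((id, future) -> {
            // remove(key, value) only succeeds if this exact mapping is still present.
            if (pending.remove(id, future) && !future.isDone()) {
                future.completeExceptionally(e);
                failed[0]++;
            }
        });
        return failed[0];
    }

    public static void main(String[] args) {
        ConcurrentHashMap<Long, CompletableFuture<String>> pending = new ConcurrentHashMap<>();
        for (long id = 0; id < 1000; id++) {
            pending.put(id, new CompletableFuture<>());
        }
        pending.get(7L).complete("already answered"); // one request finished before the disconnect

        int failed = failAndDrain(pending, new IllegalStateException("Disconnected"));
        System.out.println("failed=" + failed + " remaining=" + pending.size()); // prints failed=999 remaining=0
    }
}
```

The same loop over a plain `java.util.HashMap` would typically throw `ConcurrentModificationException`, so the answer to the review question depends on the concrete map implementation.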


