You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2021/09/02 03:57:15 UTC

[GitHub] [pulsar] BewareMyPower opened a new pull request #11889: [C++] Fix incorrect connect timeout implementation

BewareMyPower opened a new pull request #11889:
URL: https://github.com/apache/pulsar/pull/11889


   Fixes #10721 
   
   ### Motivation
   
   Currently the connection timer stops after the TCP connection is established immediately (the state becomes `TcpConnected`). However, the connect phase should also include sending the `CommandConnect` request and receiving the `CommandConnected` response from broker successfully. For example, in a case like #10721 described, if the Pulsar broker received a SIGSTOP signal and became pending, the TCP connection can be established but no response could be received from a stopped broker. Then the client would hang forever.
   
   ### Modifications
   
   Stop the connection timer only after the `ClientConnection`'s state becomes `Ready`, which means the client has received the `CommandConnected` response successfully. It's also consistent with Java client's implementation, see https://github.com/apache/pulsar/blob/235e678a56d0284e68b45e46706b6237d7c6d5f9/pulsar-client/src/main/java/org/apache/pulsar/client/impl/ClientCnx.java#L329-L330
   
   ### Verifying this change
   
   - [ ] Make sure that the change passes the CI checks.
   
   It's hard to simulate the scenario in unit test. But we can simply reproduce it in local env.
   1. Run a pulsar standalone and send `SIGSTOP` signal by `pkill -SIGSTOP -f pulsar`.
   2. Run any C++/Python client with this fix to connect to the standalone.
   
   ```
   >>> import pulsar
   >>> c = pulsar.Client('pulsar://localhost:6650')
   >>> c.create_producer('xxx')
   2021-09-02 11:54:38.828 INFO  [0x10d72ae00] ClientConnection:181 | [<none> -> pulsar://localhost:6650] Create ClientConnection, timeout=10000
   2021-09-02 11:54:38.828 INFO  [0x10d72ae00] ConnectionPool:96 | Created connection for pulsar://localhost:6650
   2021-09-02 11:54:38.969 INFO  [0x700003136000] ClientConnection:367 | [127.0.0.1:62784 -> 127.0.0.1:6650] Connected to broker
   2021-09-02 11:54:48.973 ERROR [0x700003136000] ClientConnection:532 | [127.0.0.1:62784 -> 127.0.0.1:6650] Connection was not established in 10000 ms, close the socket
   2021-09-02 11:54:48.973 ERROR [0x700003136000] ClientConnection:572 | [127.0.0.1:62784 -> 127.0.0.1:6650] Read failed: Operation canceled
   2021-09-02 11:54:48.973 INFO  [0x700003136000] ClientConnection:1495 | [127.0.0.1:62784 -> 127.0.0.1:6650] Connection closed
   2021-09-02 11:54:48.973 ERROR [0x700003136000] ClientImpl:188 | Error Checking/Getting Partition Metadata while creating producer on persistent://public/default/xxx -- ConnectError
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
     File "/Users/xuyunze/github.com/BewareMyPower/pulsar/pulsar-client-cpp/python/pulsar/__init__.py", line 603, in create_producer
       p._producer = self._client.create_producer(topic, conf)
   _pulsar.ConnectError: Pulsar error: ConnectError
   ```
   
   we can see after 10 seconds (the default connect timeout), it failed.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] BewareMyPower merged pull request #11889: [C++] Fix incorrect connect timeout implementation

Posted by GitBox <gi...@apache.org>.
BewareMyPower merged pull request #11889:
URL: https://github.com/apache/pulsar/pull/11889


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org