You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2022/05/09 11:04:48 UTC

[GitHub] [pulsar] Matt-Esch opened a new pull request, #15508: [fix][c++ client] avoid race condition causing double callback on close

Matt-Esch opened a new pull request, #15508:
URL: https://github.com/apache/pulsar/pull/15508

   ### Motivation
   
   The pulsar-client-node library uses the c api to interact with the pulsar client. The model for the c api is to pass in a `void *ctx` to async functions with a callback. Typically, these callbacks free the context and rely heavily on the fact that the callback is called once. I was able to reproducibly trigger the close callback to call twice, resulting in a double free error in pulsar-client-node.
   
   ### Modifications
   
   My running theory is that the cause for this is a race condition in the decrementing of `numberOfOpenHandlers` shared pointer. It is possible for two threads to decrement the number in parallel, and to both end up with `*numberOfOpenHandlers == 0`. This fix assumes this can happen and synchronizes on the state, so even if this does happen, only one thread will be able to set the state to `Closed` and create the subsequent shutdown task.
   
   ### Verifying this change
   
   - [ ] Make sure that the change passes the CI checks.
   - [ ] Run reproduction steps with pulsar-client-node using the new version
   
   This change is already covered by existing tests, such as all of the tests in ClientTest.cc. Given that this is a race condition it's not possible to reproduce consistently for tests. I will run reproduction steps with pulsar-client-node.
   
   
   ### Does this pull request potentially affect one of the following parts:
   
     - Dependencies (does it add or upgrade a dependency): no
     - The public API: no
     - The schema: no
     - The default values of configurations: no
     - The wire protocol: no
     - The rest endpoints: no
     - The admin cli options: no
     - Anything that affects deployment: no
   
   ### Documentation
   
   - [ ] `doc-required`
     
   - [x] `no-need-doc` 
     
   - [ ] `doc`
   
   - [ ] `doc-added`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] merlimat commented on pull request #15508: [fix][c++ client] avoid race condition causing double callback on close

Posted by GitBox <gi...@apache.org>.
merlimat commented on PR #15508:
URL: https://github.com/apache/pulsar/pull/15508#issuecomment-1121321076

   You're right, I was convinced we were using `std::atomic<int>` but we're not.. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] BewareMyPower commented on a diff in pull request #15508: [fix][c++ client] avoid race condition causing double callback on close

Posted by GitBox <gi...@apache.org>.
BewareMyPower commented on code in PR #15508:
URL: https://github.com/apache/pulsar/pull/15508#discussion_r867936174


##########
pulsar-client-cpp/lib/ClientImpl.cc:
##########
@@ -536,8 +536,13 @@ void ClientImpl::handleClose(Result result, SharedInt numberOfOpenHandlers, Resu
     }
     if (*numberOfOpenHandlers == 0) {
         Lock lock(mutex_);
-        state_ = Closed;
-        lock.unlock();
+        if (state_ == Closed) {
+            LOG_DEBUG("Client is already shutting down, possible race condition in handleClose");
+            return lock.unlock();

Review Comment:
   ```suggestion
               return;
   ```
   The `unlock` method could be invoked automatically.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] merlimat commented on pull request #15508: [fix][c++ client] avoid race condition causing double callback on close

Posted by GitBox <gi...@apache.org>.
merlimat commented on PR #15508:
URL: https://github.com/apache/pulsar/pull/15508#issuecomment-1121258950

   The problem is that the decrement and checking of the counter are not done in atomic fashion:
   
   ```cpp
       if (*numberOfOpenHandlers > 0) {
           --(*numberOfOpenHandlers);
       }
       if (*numberOfOpenHandlers == 0) {
           // .....
       }
   ```
   
   We should instead convert into: 
   
   ```cpp
       if (--(*numberOfOpenHandlers) == 0) {
           // .....
       }
   ```
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] Matt-Esch commented on pull request #15508: [fix][c++ client] avoid race condition causing double callback on close

Posted by GitBox <gi...@apache.org>.
Matt-Esch commented on PR #15508:
URL: https://github.com/apache/pulsar/pull/15508#issuecomment-1121303435

   Are we guaranteed that `--(*numberOfOpenHandlers) == 0` is atomic if the underlying data type isn't atomic? The fix I have put in place allows the decrement race but still only one thread can win due to the lock.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] merlimat merged pull request #15508: [fix][c++ client] avoid race condition causing double callback on close

Posted by GitBox <gi...@apache.org>.
merlimat merged PR #15508:
URL: https://github.com/apache/pulsar/pull/15508


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org