You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2021/04/20 08:15:32 UTC

[GitHub] [pulsar] lhotari opened a new pull request #10278: [Java Client] Fix concurrency issue in client's producer epoch handling

lhotari opened a new pull request #10278:
URL: https://github.com/apache/pulsar/pull/10278


   ### Motivation
   
   Client's producer epoch handling was added #5571 . An issue has been reported as #9792 where this solution hasn't worked.
   When looking at the code, it seems that there's a concurrency issue. 
   The epoc field is incremented in the timer thread, but read in another thread. There is no synchronization / volatile field in place.
   In Java, updating double or long values isn't thread safe.
   Explained in [JLS 17.7](https://docs.oracle.com/javase/specs/jls/se8/html/jls-17.html#jls-17.7)
   > For the purposes of the Java programming language memory model, a single write to a non-volatile long or double value is treated as two separate writes: one to each 32-bit half. This can result in a situation where a thread sees the first 32 bits of a 64-bit value from one write, and the second 32 bits from another write.
   
   ### Modifications
   
   - use volatile field for epoch
   - use AtomicLongFieldUpdater for incrementing the value (although it would be fine to omit it when a single timer thread is updating the value. However this method isn't a final method and subclasses can override the reconnectLater method.)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] sijie commented on pull request #10278: [Java Client] Fix concurrency issue in client's producer epoch handling

Posted by GitBox <gi...@apache.org>.
sijie commented on pull request #10278:
URL: https://github.com/apache/pulsar/pull/10278#issuecomment-849758664


   @eolivelli any reason that you didn't add the `release/2.7.3` label when you removed `release/2.7.2`?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] codelipenghui commented on pull request #10278: [Java Client] Fix concurrency issue in client's producer epoch handling

Posted by GitBox <gi...@apache.org>.
codelipenghui commented on pull request #10278:
URL: https://github.com/apache/pulsar/pull/10278#issuecomment-849786977


   https://github.com/apache/pulsar/pull/10436 merged into branch-2.7, so we don't need to cherry-pick this one.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] lhotari commented on pull request #10278: [Java Client] Fix concurrency issue in client's producer epoch handling

Posted by GitBox <gi...@apache.org>.
lhotari commented on pull request #10278:
URL: https://github.com/apache/pulsar/pull/10278#issuecomment-823167061


   After doing some investigation about atomicity of long & double on 64 bit JVMs, it is unlikely that this PR fixes the issue on 64bit Hotspot JVMs. More details in #10280 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] merlimat commented on a change in pull request #10278: [Java Client] Fix concurrency issue in client's producer epoch handling

Posted by GitBox <gi...@apache.org>.
merlimat commented on a change in pull request #10278:
URL: https://github.com/apache/pulsar/pull/10278#discussion_r616823280



##########
File path: pulsar-client/src/main/java/org/apache/pulsar/client/impl/ConnectionHandler.java
##########
@@ -104,11 +108,15 @@ protected void reconnectLater(Throwable exception) {
         state.setState(State.Connecting);
         state.client.timer().newTimeout(timeout -> {
             log.info("[{}] [{}] Reconnecting after connection was closed", state.topic, state.getHandlerName());
-            ++epoch;
+            incrementEpoch();
             grabCnx();
         }, delayMs, TimeUnit.MILLISECONDS);
     }
 
+    protected long incrementEpoch() {

Review comment:
       Maybe we can keep this `private`, unless we need to expose.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] lhotari commented on a change in pull request #10278: [Java Client] Fix concurrency issue in client's producer epoch handling

Posted by GitBox <gi...@apache.org>.
lhotari commented on a change in pull request #10278:
URL: https://github.com/apache/pulsar/pull/10278#discussion_r616828913



##########
File path: pulsar-client/src/main/java/org/apache/pulsar/client/impl/ConnectionHandler.java
##########
@@ -104,11 +108,15 @@ protected void reconnectLater(Throwable exception) {
         state.setState(State.Connecting);
         state.client.timer().newTimeout(timeout -> {
             log.info("[{}] [{}] Reconnecting after connection was closed", state.topic, state.getHandlerName());
-            ++epoch;
+            incrementEpoch();
             grabCnx();
         }, delayMs, TimeUnit.MILLISECONDS);
     }
 
+    protected long incrementEpoch() {

Review comment:
       that's true. I exposed it as `protected` since the `reconnectLater` method is `protected` and I was following a similar style.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] merlimat merged pull request #10278: [Java Client] Fix concurrency issue in client's producer epoch handling

Posted by GitBox <gi...@apache.org>.
merlimat merged pull request #10278:
URL: https://github.com/apache/pulsar/pull/10278


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] lhotari edited a comment on pull request #10278: [Java Client] Fix concurrency issue in client's producer epoch handling

Posted by GitBox <gi...@apache.org>.
lhotari edited a comment on pull request #10278:
URL: https://github.com/apache/pulsar/pull/10278#issuecomment-823167061


   After doing some investigation about atomicity of long & double on 64 bit JVMs, it is unlikely that this PR fixes the issue on 64bit Hotspot JVMs. More details in #10280 
   
   On 32 bit JVMs, this fix is relevant.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] eolivelli commented on pull request #10278: [Java Client] Fix concurrency issue in client's producer epoch handling

Posted by GitBox <gi...@apache.org>.
eolivelli commented on pull request #10278:
URL: https://github.com/apache/pulsar/pull/10278#issuecomment-849774134


   Probably it was a mistake


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] sijie commented on pull request #10278: [Java Client] Fix concurrency issue in client's producer epoch handling

Posted by GitBox <gi...@apache.org>.
sijie commented on pull request #10278:
URL: https://github.com/apache/pulsar/pull/10278#issuecomment-849860116


   Ok


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org