You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2021/04/20 09:23:46 UTC

[GitHub] [pulsar] lhotari opened a new pull request #10280: Fix thread safety issue in ManagedLedgerImpl

lhotari opened a new pull request #10280:
URL: https://github.com/apache/pulsar/pull/10280


   ### Motivation
   
   In Java, updating double or long values isn't atomic and thread safe.
   Explained in [JLS 17.7](https://docs.oracle.com/javase/specs/jls/se8/html/jls-17.html#jls-17.7)
   > For the purposes of the Java programming language memory model, a single write to a non-volatile long or double value is treated as two separate writes: one to each 32-bit half. This can result in a situation where a thread sees the first 32 bits of a 64-bit value from one write, and the second 32 bits from another write.
   
   There are several mutable non-volatile `long` fields in `ManagedLedgerImpl` which are accessed from multiple threads.
   
   ### Modifications
   
   Make the fields volatile to fix the thread safety issues.
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] lhotari commented on pull request #10280: Fix thread safety issue in ManagedLedgerImpl

Posted by GitBox <gi...@apache.org>.
lhotari commented on pull request #10280:
URL: https://github.com/apache/pulsar/pull/10280#issuecomment-823135530


   Although it's a clear violation of JLS 17.7, it might not be an issue in reality when running on 64-bit HotSpot based JVM: https://stackoverflow.com/a/25176428 . However it's unclear if the problem could occur when the JVM is running in interpreted mode (before JIT).  I guess it's risky to violate JLS 17.7 in any case?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] lhotari commented on pull request #10280: Fix thread safety issue in ManagedLedgerImpl

Posted by GitBox <gi...@apache.org>.
lhotari commented on pull request #10280:
URL: https://github.com/apache/pulsar/pull/10280#issuecomment-824170568


   Closing this PR since the resolution will be to document it that Pulsar Broker requires a 64 bit JVM. There's a separate issue about adding the documentation, #10311 .


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] lhotari closed pull request #10280: Fix thread safety issue in ManagedLedgerImpl

Posted by GitBox <gi...@apache.org>.
lhotari closed pull request #10280:
URL: https://github.com/apache/pulsar/pull/10280


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] lhotari edited a comment on pull request #10280: Fix thread safety issue in ManagedLedgerImpl

Posted by GitBox <gi...@apache.org>.
lhotari edited a comment on pull request #10280:
URL: https://github.com/apache/pulsar/pull/10280#issuecomment-823164231


   I did some more research:
   
   There's a blog post from 2014 "All accesses are atomic" https://shipilev.net/blog/2014/all-accesses-are-atomic/ .
   
   There is also [JEP 188: Java Memory Model Update](http://openjdk.java.net/jeps/188) and an InfoQ article ["The OpenJDK Revised Java Memory Model"](https://www.infoq.com/articles/The-OpenJDK9-Revised-Java-Memory-Model/#lowerFullwidthVCR:~:text=JMM9%20%2D%20Read%20And%20Write%20Atomicity%20Problem%20And%20The%20Word%2DTearing%20Problem.). It seems that the Java Memory Model has been revisited for JDK9+, called JEP 188 / JMM9 . It's a bit hard to find the consistent documentation since the JLS for JDK 16 hasn't been updated with any changes [in 17.7 of JLS](https://docs.oracle.com/javase/specs/jls/se16/html/jls-17.html#jls-17.7). 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] lhotari edited a comment on pull request #10280: Fix thread safety issue in ManagedLedgerImpl

Posted by GitBox <gi...@apache.org>.
lhotari edited a comment on pull request #10280:
URL: https://github.com/apache/pulsar/pull/10280#issuecomment-823164231


   I did some more research:
   
   There's a blog post from 2014 "All accesses are atomic" https://shipilev.net/blog/2014/all-accesses-are-atomic/ .
   
   There is also [JEP 188: Java Memory Model Update](http://openjdk.java.net/jeps/188) and an InfoQ article ["The OpenJDK Revised Java Memory Model"](https://www.infoq.com/articles/The-OpenJDK9-Revised-Java-Memory-Model/#lowerFullwidthVCR:~:text=JMM9%20%2D%20Read%20And%20Write%20Atomicity%20Problem%20And%20The%20Word%2DTearing%20Problem.). It seems that the Java Memory Model has been revisited for JDK9+, called JMM9 . It's a bit hard to find the consistent documentation since the JLS for JDK 16 hasn't been updated with any changes [in 17.7 of JLS](https://docs.oracle.com/javase/specs/jls/se16/html/jls-17.html#jls-17.7). 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] merlimat commented on a change in pull request #10280: Fix thread safety issue in ManagedLedgerImpl

Posted by GitBox <gi...@apache.org>.
merlimat commented on a change in pull request #10280:
URL: https://github.com/apache/pulsar/pull/10280#discussion_r616891925



##########
File path: managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedLedgerImpl.java
##########
@@ -195,11 +195,11 @@
     private final static CompletableFuture<PositionImpl> NULL_OFFLOAD_PROMISE = CompletableFuture
             .completedFuture(PositionImpl.latest);
     private volatile LedgerHandle currentLedger;
-    private long currentLedgerEntries = 0;
-    private long currentLedgerSize = 0;
-    private long lastLedgerCreatedTimestamp = 0;
-    private long lastLedgerCreationFailureTimestamp = 0;
-    private long lastLedgerCreationInitiationTimestamp = 0;
+    private volatile long currentLedgerEntries = 0;

Review comment:
       > Perhaps running the Broker on non-64 bit JVMs should simply be unsupported and possibly prevented at startup?
   
   That's a good point. I think it would a good option to just preventing it to start. Right now we don't have any check that it actually works on 32bit. And, while it's true that there are still many pieces of hardware out there on 32bits CPUs, I'd say it's unlikely that you'd want to run brokers on them. Even small devices like RPi are running Arm64. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] lhotari commented on pull request #10280: Fix thread safety issue in ManagedLedgerImpl

Posted by GitBox <gi...@apache.org>.
lhotari commented on pull request #10280:
URL: https://github.com/apache/pulsar/pull/10280#issuecomment-823164231


   I did some more research:
   
   There's a blog post from 2014 "All accesses are atomic" https://shipilev.net/blog/2014/all-accesses-are-atomic/ .
   
   There is also [JEP 188: Java Memory Model Update](http://openjdk.java.net/jeps/188) and an InfoQ article ["The OpenJDK Revised Java Memory Model"](https://www.infoq.com/articles/The-OpenJDK9-Revised-Java-Memory-Model/#lowerFullwidthVCR:~:text=JMM9%20%2D%20Read%20And%20Write%20Atomicity%20Problem%20And%20The%20Word%2DTearing%20Problem.). 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] lhotari commented on a change in pull request #10280: Fix thread safety issue in ManagedLedgerImpl

Posted by GitBox <gi...@apache.org>.
lhotari commented on a change in pull request #10280:
URL: https://github.com/apache/pulsar/pull/10280#discussion_r616823604



##########
File path: managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedLedgerImpl.java
##########
@@ -195,11 +195,11 @@
     private final static CompletableFuture<PositionImpl> NULL_OFFLOAD_PROMISE = CompletableFuture
             .completedFuture(PositionImpl.latest);
     private volatile LedgerHandle currentLedger;
-    private long currentLedgerEntries = 0;
-    private long currentLedgerSize = 0;
-    private long lastLedgerCreatedTimestamp = 0;
-    private long lastLedgerCreationFailureTimestamp = 0;
-    private long lastLedgerCreationInitiationTimestamp = 0;
+    private volatile long currentLedgerEntries = 0;

Review comment:
       That's true for updating, but reading doesn't seem to hold the mutex in all cases? That could lead to "word tearing" problem at read time on 32-bit JVMs? As mentioned in the other comments, this isn't an issue on 64-bit JVMs. Perhaps running the Broker on non-64 bit JVMs should simply be unsupported and possibly prevented at startup?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] lhotari commented on pull request #10280: Fix thread safety issue in ManagedLedgerImpl

Posted by GitBox <gi...@apache.org>.
lhotari commented on pull request #10280:
URL: https://github.com/apache/pulsar/pull/10280#issuecomment-823207861


   /pulsarbot run-failure-checks


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] merlimat commented on a change in pull request #10280: Fix thread safety issue in ManagedLedgerImpl

Posted by GitBox <gi...@apache.org>.
merlimat commented on a change in pull request #10280:
URL: https://github.com/apache/pulsar/pull/10280#discussion_r616817217



##########
File path: managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedLedgerImpl.java
##########
@@ -195,11 +195,11 @@
     private final static CompletableFuture<PositionImpl> NULL_OFFLOAD_PROMISE = CompletableFuture
             .completedFuture(PositionImpl.latest);
     private volatile LedgerHandle currentLedger;
-    private long currentLedgerEntries = 0;
-    private long currentLedgerSize = 0;
-    private long lastLedgerCreatedTimestamp = 0;
-    private long lastLedgerCreationFailureTimestamp = 0;
-    private long lastLedgerCreationInitiationTimestamp = 0;
+    private volatile long currentLedgerEntries = 0;

Review comment:
       All of these variable should be getting update while holding the mutex on the `ManagedLedgerImpl` instance, so we shouldn't anyway requiring atomicity here




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] lhotari commented on pull request #10280: Fix thread safety issue in ManagedLedgerImpl

Posted by GitBox <gi...@apache.org>.
lhotari commented on pull request #10280:
URL: https://github.com/apache/pulsar/pull/10280#issuecomment-823187454


   I asked a question on Twitter 
   > Is it ok to ignore JLS 17.7 advice  "Programmers are encouraged to declare shared 64-bit values as volatile or synchronize their programs correctly to avoid possible complications." when running on 64 bit JVMs ? What is the status of JEP 188 / JMM9 ?
   
   and received this answer
   
   > There is still plenty of 32-bit hardware in the wild, so I would not ignore it. If you are dead sure you only run with 64-bit VMs, ever and forever, then yes, this rule can be side-stepped.
   
   https://twitter.com/shipilev/status/1384461505161089024


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org