You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2021/04/27 10:15:40 UTC

[GitHub] [pulsar] lhotari opened a new pull request #10400: [Broker] Fix issue in reusing EntryBatchIndexesAcks instances

lhotari opened a new pull request #10400:
URL: https://github.com/apache/pulsar/pull/10400


   ### Motivation
   
   The broker has a bug in calculating permits, #6054 .
   
   https://github.com/apache/pulsar/blob/c4f154e79c03cff9055aa4e2ede7748c5952f2bc/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/persistent/PersistentStickyKeyDispatcherMultipleConsumers.java#L227-L228
   
   https://github.com/apache/pulsar/blob/875262a0ae2fba82a6dd7d46dd2467d675b0d9c3/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/persistent/PersistentDispatcherMultipleConsumers.java#L527-L528
   
   https://github.com/apache/pulsar/blob/8535dee5ca4d56a5944c068f116d9a32bb6d85f6/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/Consumer.java#L255-L256
   
   As it can be seen from the code locations for calculating permits, it is important that `batchIndexesAcks.getTotalAckedIndexCount()` returns the correct value. 
   When reviewing the code, it was noticed that `EntryBatchIndexesAcks` instances are reused. The problem is, that state isn't properly cleared before reusing. `getTotalAckedIndexCount()` will include state from the previous usage in the calculation. This causes the permit calculations to get corrupted. There are bugs such as #6054 which have been reported about the issue.
   
   The only location where the previous state of the `EntryBatchIndexesAcks` instance is cleared is this line of code:
   https://github.com/apache/pulsar/blob/4709f3aeaed1bc6a68a9a683c95e0398c940cd54/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/AbstractBaseDispatcher.java#L137
   
   This doesn't take the case into account where the previous batch was larger than the next one. This is a unit test case that reproduces the issue:
   ```
           BitSetRecyclable bitSet = BitSetRecyclable.create();
           bitSet.set(0, 95);
           long[] nintyFiveBitsSet = bitSet.toLongArray();
   
           EntryBatchIndexesAcks acks = EntryBatchIndexesAcks.get(10);
           acks.setIndexesAcks(8, Pair.of(100, nintyFiveBitsSet));
           acks.setIndexesAcks(9, Pair.of(100, nintyFiveBitsSet));
   
           assertEquals(acks.getTotalAckedIndexCount(), 10);
   
           acks.recycle();
   
           acks = EntryBatchIndexesAcks.get(2);
           // there should be no previous state
           assertEquals(acks.getTotalAckedIndexCount(), 0);
   ```
   
   ### Modifications
   
   - add a field to EntryBatchIndexesAcks for tracking the maximum size since as the result of reuse, the complete array might not be used
   - properly reset state between usages
   - add unit test that reproduced the issue


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] lhotari commented on a change in pull request #10400: [Broker] Fix issue in reusing EntryBatchIndexesAcks instances

Posted by GitBox <gi...@apache.org>.
lhotari commented on a change in pull request #10400:
URL: https://github.com/apache/pulsar/pull/10400#discussion_r621087107



##########
File path: pulsar-broker/src/main/java/org/apache/pulsar/broker/service/EntryBatchIndexesAcks.java
##########
@@ -24,8 +24,8 @@
 import org.apache.commons.lang3.tuple.Pair;
 
 public class EntryBatchIndexesAcks {
-
-    Pair<Integer, long[]>[] indexesAcks = new Pair[100];
+    int maxSize = 100;

Review comment:
       Done




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] lhotari commented on a change in pull request #10400: [Broker] Fix issue in reusing EntryBatchIndexesAcks instances

Posted by GitBox <gi...@apache.org>.
lhotari commented on a change in pull request #10400:
URL: https://github.com/apache/pulsar/pull/10400#discussion_r621511057



##########
File path: pulsar-broker/src/main/java/org/apache/pulsar/broker/service/EntryBatchIndexesAcks.java
##########
@@ -50,12 +51,20 @@ public void recycle() {
         handle.recycle(this);
     }
 
+    private void ensureCapacityAndReset(int entriesListSize) {
+        size = entriesListSize;
+        if (indexesAcks.length < size) {
+            indexesAcks = new Pair[size];
+        } else {
+            for (int i = 0; i < size; i++) {
+                indexesAcks[i] = null;

Review comment:
       Done. PTAL @merlimat 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] lhotari commented on a change in pull request #10400: [Broker] Fix issue in reusing EntryBatchIndexesAcks instances

Posted by GitBox <gi...@apache.org>.
lhotari commented on a change in pull request #10400:
URL: https://github.com/apache/pulsar/pull/10400#discussion_r621509138



##########
File path: pulsar-broker/src/main/java/org/apache/pulsar/broker/service/EntryBatchIndexesAcks.java
##########
@@ -50,12 +51,20 @@ public void recycle() {
         handle.recycle(this);
     }
 
+    private void ensureCapacityAndReset(int entriesListSize) {
+        size = entriesListSize;
+        if (indexesAcks.length < size) {
+            indexesAcks = new Pair[size];
+        } else {
+            for (int i = 0; i < size; i++) {
+                indexesAcks[i] = null;

Review comment:
       Good point. I'll make this change.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] eolivelli commented on a change in pull request #10400: [Broker] Fix issue in reusing EntryBatchIndexesAcks instances

Posted by GitBox <gi...@apache.org>.
eolivelli commented on a change in pull request #10400:
URL: https://github.com/apache/pulsar/pull/10400#discussion_r621082530



##########
File path: pulsar-broker/src/main/java/org/apache/pulsar/broker/service/EntryBatchIndexesAcks.java
##########
@@ -24,8 +24,8 @@
 import org.apache.commons.lang3.tuple.Pair;
 
 public class EntryBatchIndexesAcks {
-
-    Pair<Integer, long[]>[] indexesAcks = new Pair[100];
+    int maxSize = 100;

Review comment:
       can we make this "private" ?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] lhotari commented on a change in pull request #10400: [Broker] Fix issue in reusing EntryBatchIndexesAcks instances

Posted by GitBox <gi...@apache.org>.
lhotari commented on a change in pull request #10400:
URL: https://github.com/apache/pulsar/pull/10400#discussion_r621084228



##########
File path: pulsar-broker/src/main/java/org/apache/pulsar/broker/service/EntryBatchIndexesAcks.java
##########
@@ -24,8 +24,8 @@
 import org.apache.commons.lang3.tuple.Pair;
 
 public class EntryBatchIndexesAcks {
-
-    Pair<Integer, long[]>[] indexesAcks = new Pair[100];
+    int maxSize = 100;

Review comment:
       yes, I'll revisit it




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] devinbost commented on pull request #10400: [Broker] Fix issue in reusing EntryBatchIndexesAcks instances

Posted by GitBox <gi...@apache.org>.
devinbost commented on pull request #10400:
URL: https://github.com/apache/pulsar/pull/10400#issuecomment-827757205


   @lhotari Good find! Thanks for pinging me on it.
   I think there's still another issue lurking that's causing the subscriptions to get stuck because we see the stuck subscriptions even when `acknowledgmentAtBatchIndexLevelEnabled` is disabled (which is the default.) However, when `acknowledgmentAtBatchIndexLevelEnabled` is true, this PR will prevent issues as you discovered. I hope we can merge this fix before many people start enabling this feature. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] merlimat commented on a change in pull request #10400: [Broker] Fix issue in reusing EntryBatchIndexesAcks instances

Posted by GitBox <gi...@apache.org>.
merlimat commented on a change in pull request #10400:
URL: https://github.com/apache/pulsar/pull/10400#discussion_r621420500



##########
File path: pulsar-broker/src/main/java/org/apache/pulsar/broker/service/EntryBatchIndexesAcks.java
##########
@@ -50,12 +51,20 @@ public void recycle() {
         handle.recycle(this);
     }
 
+    private void ensureCapacityAndReset(int entriesListSize) {
+        size = entriesListSize;
+        if (indexesAcks.length < size) {
+            indexesAcks = new Pair[size];
+        } else {
+            for (int i = 0; i < size; i++) {
+                indexesAcks[i] = null;

Review comment:
       Instead of setting to null here, it would be better to do it in the recycle so the instances of `Pair` are staying valid for a smaller amount of time.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] lhotari commented on pull request #10400: [Broker] Fix issue in reusing EntryBatchIndexesAcks instances

Posted by GitBox <gi...@apache.org>.
lhotari commented on pull request #10400:
URL: https://github.com/apache/pulsar/pull/10400#issuecomment-827495705


   @codelipenghui @rdhabalia @merlimat @eolivelli @massakam @devinbost Please review this PR that fixes some issues with consumer permit calculation on the broker side.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] merlimat merged pull request #10400: [Broker] Fix issue in reusing EntryBatchIndexesAcks instances

Posted by GitBox <gi...@apache.org>.
merlimat merged pull request #10400:
URL: https://github.com/apache/pulsar/pull/10400


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org