Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2020/07/16 07:04:13 UTC

[GitHub] [pulsar] massakam opened a new issue #7554: Messages that have already been acked are redelivered when upgrading Pulsar version

massakam opened a new issue #7554:
URL: https://github.com/apache/pulsar/issues/7554


   The other day, we upgraded the Pulsar version of our broker servers from 2.3.2 to 2.4.2.
   
   At that time, a large number of messages were redelivered on some topics. These messages should have already been acked in the past. The topics where this happened had some unacked messages in their backlog. I can understand that the unacked messages were redelivered. The strange thing is that some or all of the acked messages that followed the unacked messages were also redelivered.
   
   We have found that this happens when the Pulsar 2.3.2 broker that owns the topic in question is shut down and the topic is moved to a Pulsar 2.4.2 broker. The cause seems to be a difference in the behavior of `individualDeletedMessages`, which is held by the `ManagedCursorImpl` instance.
   
   In Pulsar 2.3.2, `individualDeletedMessages` is an instance of the `TreeRangeSet` class. Instances of `TreeRangeSet` can contain ranges that span different ledgers.
   
   Verification code 1:
   ```java
   RangeSet<PositionImpl> rangeSet = TreeRangeSet.create();
   rangeSet.add(Range.openClosed(new PositionImpl(1, 100), new PositionImpl(2, 200)));
   System.out.println(rangeSet);
   ```
   
   Result 1:
   ```
   [(1:100..2:200]]
   ```
   
   On the other hand, in Pulsar 2.4.2, `individualDeletedMessages` is an instance of `ConcurrentOpenLongPairRangeSet` by default. If a range that spans multiple ledgers is added to this instance, the part of the range that lies before the last ledger is lost and only the part inside the last ledger is kept. It seems that adding such a range to `ConcurrentOpenLongPairRangeSet` is not supported.
   
   Verification code 2:
   ```java
   LongPairConsumer<PositionImpl> positionRangeConverter = (key, value) -> new PositionImpl(key, value);
   LongPairRangeSet<PositionImpl> rangeSet = new ConcurrentOpenLongPairRangeSet<PositionImpl>(4096, positionRangeConverter);
   rangeSet.addOpenClosed(1, 100, 2, 200);
   System.out.println(rangeSet);
   ```
   
   Result 2:
   ```
   [(2:-1..2:200]]
   ```
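   
   Result 2 is what one would expect from a structure that keeps acked entries per ledger and does not know how many entries each ledger holds. The following is only a toy model of that idea, not the actual `ConcurrentOpenLongPairRangeSet` code; the class `PerLedgerRangeSketch` and its field names are hypothetical, and the assumption that the lower ledger's part is simply dropped is taken from the observed output above.
   
   ```java
   import java.util.BitSet;
   import java.util.Map;
   import java.util.StringJoiner;
   import java.util.TreeMap;
   
   // Toy model (hypothetical): one BitSet of acked entry IDs per ledger ID.
   public class PerLedgerRangeSketch {
       private final Map<Long, BitSet> ackedByLedger = new TreeMap<>();
   
       // Adds the range (lowerLedger:lowerEntry .. upperLedger:upperEntry].
       public void addOpenClosed(long lowerLedger, long lowerEntry, long upperLedger, long upperEntry) {
           // Assumption: the length of the lower ledger is unknown, so when the
           // range crosses ledgers only the part inside the upper ledger is stored.
           long firstEntry = (lowerLedger == upperLedger) ? lowerEntry + 1 : 0;
           if (firstEntry > upperEntry) {
               return; // empty range, nothing to record
           }
           ackedByLedger.computeIfAbsent(upperLedger, k -> new BitSet())
                   .set((int) firstEntry, (int) upperEntry + 1);
       }
   
       @Override
       public String toString() {
           StringJoiner out = new StringJoiner(", ", "[", "]");
           for (Map.Entry<Long, BitSet> e : ackedByLedger.entrySet()) {
               BitSet bits = e.getValue();
               int start = bits.nextSetBit(0);
               while (start >= 0) {
                   int end = bits.nextClearBit(start); // first entry NOT in this run
                   out.add("(" + e.getKey() + ":" + (start - 1) + ".." + e.getKey() + ":" + (end - 1) + "]");
                   start = bits.nextSetBit(end);
               }
           }
           return out.toString();
       }
   
       public static void main(String[] args) {
           PerLedgerRangeSketch set = new PerLedgerRangeSketch();
           set.addOpenClosed(1, 100, 2, 200);
           System.out.println(set); // prints [(2:-1..2:200]]
       }
   }
   ```
   
   In this model the entries of ledger 1 after entry 100 cannot be recorded at all, which matches the shape of Result 2.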
   
   If we set `managedLedgerUnackedRangesOpenCacheSetEnabled` to false on a Pulsar 2.4.2 broker, `individualDeletedMessages` will instead be an instance of `LongPairRangeSet.DefaultRangeSet`. A range that spans multiple ledgers can be added to `LongPairRangeSet.DefaultRangeSet` without losing information.
   
   Verification code 3:
   ```java
   LongPairConsumer<PositionImpl> positionRangeConverter = (key, value) -> new PositionImpl(key, value);
   LongPairRangeSet<PositionImpl> rangeSet = new LongPairRangeSet.DefaultRangeSet<>(positionRangeConverter);
   rangeSet.addOpenClosed(1, 100, 2, 200);
   System.out.println(rangeSet);
   ```
   
   Result 3:
   ```
   [(1:100..2:200]]
   ```
   
   This difference in behavior causes redelivery of acked messages. For example, suppose a topic owned by a Pulsar 2.3.2 broker has the following `individuallyDeletedMessages`:
   ```json
   "individuallyDeletedMessages" : "[(2625703:-1..2625703:9], (2625719:-1..2625727:9]]",
   ```
   
   When this Pulsar 2.3.2 broker is shut down and the topic moves to a Pulsar 2.4.2 broker, the `individuallyDeletedMessages` changes as follows:
   ```json
   "individuallyDeletedMessages" : "[(2625703:-1..2625703:9],(2625727:-1..2625727:9]]",
   ```
   
   The messages with ledger ID 2625719 have already been acked, but after the topic moves to the Pulsar 2.4.2 broker, that information is lost and these messages are redelivered to the consumers.
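   
   To see which cursors could be affected before an upgrade, one option is to scan the `individuallyDeletedMessages` string for ranges whose lower and upper ledger IDs differ. The following is a minimal sketch under the assumption that every range is printed in the `(ledger:entry..ledger:entry]` form shown above; the class and method names are hypothetical and not part of Pulsar.
   
   ```java
   import java.util.ArrayList;
   import java.util.List;
   import java.util.regex.Matcher;
   import java.util.regex.Pattern;
   
   // Hypothetical helper: finds ranges that span more than one ledger.
   public class CrossLedgerRangeFinder {
       // Matches one open-closed range such as (2625719:-1..2625727:9]
       private static final Pattern RANGE =
               Pattern.compile("\\((\\d+):(-?\\d+)\\.\\.(\\d+):(-?\\d+)\\]");
   
       public static List<String> findCrossLedgerRanges(String individuallyDeletedMessages) {
           List<String> result = new ArrayList<>();
           Matcher m = RANGE.matcher(individuallyDeletedMessages);
           while (m.find()) {
               long lowerLedger = Long.parseLong(m.group(1));
               long upperLedger = Long.parseLong(m.group(3));
               if (lowerLedger != upperLedger) {
                   result.add(m.group()); // keep the range text exactly as it appeared
               }
           }
           return result;
       }
   
       public static void main(String[] args) {
           String stats = "[(2625703:-1..2625703:9], (2625719:-1..2625727:9]]";
           System.out.println(findCrossLedgerRanges(stats)); // prints [(2625719:-1..2625727:9]]
       }
   }
   ```
   
   Any range reported by this check is one whose acked state would be partly lost in the scenario described above.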


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] sijie commented on issue #7554: Messages that have already been acked are redelivered when upgrading Pulsar version

Posted by GitBox <gi...@apache.org>.
sijie commented on issue #7554:
URL: https://github.com/apache/pulsar/issues/7554#issuecomment-659783624


   @codelipenghui Can you check this issue?





[GitHub] [pulsar] massakam commented on issue #7554: Messages that have already been acked are redelivered when upgrading Pulsar version

Posted by GitBox <gi...@apache.org>.
massakam commented on issue #7554:
URL: https://github.com/apache/pulsar/issues/7554#issuecomment-659206287


   @rdhabalia Is the difference in behavior between `ConcurrentOpenLongPairRangeSet` and `TreeRangeSet/LongPairRangeSet.DefaultRangeSet` intentional? If so, what is the reason?





[GitHub] [pulsar] massakam commented on issue #7554: Messages that have already been acked are redelivered when upgrading Pulsar version

Posted by GitBox <gi...@apache.org>.
massakam commented on issue #7554:
URL: https://github.com/apache/pulsar/issues/7554#issuecomment-659204094


   The class of `individualDeletedMessages` has been changed by the following two PRs:
   - https://github.com/apache/pulsar/pull/3818
   - https://github.com/apache/pulsar/pull/3819





[GitHub] [pulsar] sijie closed issue #7554: Messages that have already been acked are redelivered when upgrading Pulsar version

Posted by GitBox <gi...@apache.org>.
sijie closed issue #7554:
URL: https://github.com/apache/pulsar/issues/7554


   

