You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2022/09/26 13:37:57 UTC

[GitHub] [pulsar] AnonHxy opened a new pull request, #17842: [improve][broker]Improve PersistentMessageExpiryMonitor expire speed if meet NonRecoveā€¦

AnonHxy opened a new pull request, #17842:
URL: https://github.com/apache/pulsar/pull/17842

   
   ### Motivation
   
   Without consumer acking messages, the `PersistentMessageExpiryMonitor` expires messages process could be very slow if the ledger `PersistentMessageExpiryMonitor`  be read has been deleted from the bookkeeper.
   
   The root cause is that,  when `autoSkipNonRecoverableData` is true and the ledger to be read has been deleted, the `expireMessages` will read entry failed and set the markDeletePosition as the failed position, see Line194.  And then waiting to be scheduled(defalut value is 5min)
   https://github.com/apache/pulsar/blob/49eaa548b97eca06a6aff5d79391d49f085adb9e/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/persistent/PersistentMessageExpiryMonitor.java#L186-L196
   
   In the next scheduled, the `PersistentMessageExpiryMonitor` will read the next position of markDeletePosition , which is also not existed because they may belong to the same ledger.
   
   So the markDeletePosition just go forward 1 every 5min. And if there are  many deleted ledgers, it will lead to the ttl expire lose efficacy
   
   ### Modifications
   
   If read entry failed with ledger not existed exceptions,  we will set the markDeletePosition as the last position of the deleted ledger.
   
   ### Verifying this change
   
   - [x] Make sure that the change passes the CI checks.
   - `org.apache.pulsar.broker.service.PersistentMessageFinderTest#testMessageExpiryWithTimestampNonRecoverableException` has coverd this change
   
   ### Does this pull request potentially affect one of the following parts:
   
   
   ### Documentation
   
   - [x] `doc-not-needed` 
   (Please explain why)
   
   ### Matching PR in forked repository
   
   PR in forked repository: <!-- ENTER URL HERE 
   
   After opening this PR, the build in apache/pulsar will fail and instructions will
   be provided for opening a PR in the PR author's forked repository.
   
   apache/pulsar pull requests should be first tested in your own fork since the 
   apache/pulsar CI based on GitHub Actions has constrained resources and quota.
   GitHub Actions provides separate quota for pull requests that are executed in 
   a forked repository.
   
   The tests will be run in the forked repository until all PR review comments have
   been handled, the tests pass and the PR is approved by a reviewer.
   
   -->
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] Jason918 commented on a diff in pull request #17842: [improve][broker]Improve PersistentMessageExpiryMonitor expire speed when ledger not existed

Posted by GitBox <gi...@apache.org>.
Jason918 commented on code in PR #17842:
URL: https://github.com/apache/pulsar/pull/17842#discussion_r980824852


##########
pulsar-broker/src/main/java/org/apache/pulsar/broker/service/persistent/PersistentMessageExpiryMonitor.java:
##########
@@ -191,9 +192,38 @@ public void findEntryFailed(ManagedLedgerException exception, Optional<Position>
                 && (exception instanceof NonRecoverableLedgerException)) {
             log.warn("[{}][{}] read failed from ledger at position:{} : {}", topicName, subName, failedReadPosition,
                     exception.getMessage());
-            findEntryComplete(failedReadPosition.get(), ctx);
+            if (isLedgerNotExistException(((NonRecoverableLedgerException) exception).getBkErrorCode())) {

Review Comment:
   It's better to put `isLedgerNotExistException` into `NonRecoverableLedgerException`.



##########
pulsar-broker/src/main/java/org/apache/pulsar/broker/service/persistent/PersistentMessageExpiryMonitor.java:
##########
@@ -191,9 +192,38 @@ public void findEntryFailed(ManagedLedgerException exception, Optional<Position>
                 && (exception instanceof NonRecoverableLedgerException)) {
             log.warn("[{}][{}] read failed from ledger at position:{} : {}", topicName, subName, failedReadPosition,
                     exception.getMessage());
-            findEntryComplete(failedReadPosition.get(), ctx);
+            if (isLedgerNotExistException(((NonRecoverableLedgerException) exception).getBkErrorCode())) {
+                try {
+                    long failedLedgerId = failedReadPosition.get().getLedgerId();
+                    Position lastPositionInLedger = PositionImpl.get(failedLedgerId,
+                            cursor.getManagedLedger().getLedgerInfo(failedLedgerId).get().getEntries() - 1);
+                    log.info("[{}][{}] ledger not existed, will complete the last position of the non-existed"
+                                    + " ledger:{}", topicName, subName, lastPositionInLedger);
+                    findEntryComplete(lastPositionInLedger, ctx);
+                } catch (Exception e) {

Review Comment:
   Any known exception goes here? 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] AnonHxy commented on pull request #17842: [improve][broker]Improve PersistentMessageExpiryMonitor expire speed when ledger not existed

Posted by GitBox <gi...@apache.org>.
AnonHxy commented on PR #17842:
URL: https://github.com/apache/pulsar/pull/17842#issuecomment-1264876030

   /pulsarbot run-failure-checks


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] AnonHxy commented on pull request #17842: [improve][broker]Improve PersistentMessageExpiryMonitor expire speed when ledger not existed

Posted by GitBox <gi...@apache.org>.
AnonHxy commented on PR #17842:
URL: https://github.com/apache/pulsar/pull/17842#issuecomment-1258938742

   /pulsarbot ready-to-test


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] AnonHxy commented on a diff in pull request #17842: [improve][broker]Improve PersistentMessageExpiryMonitor expire speed when ledger not existed

Posted by GitBox <gi...@apache.org>.
AnonHxy commented on code in PR #17842:
URL: https://github.com/apache/pulsar/pull/17842#discussion_r980935534


##########
pulsar-broker/src/main/java/org/apache/pulsar/broker/service/persistent/PersistentMessageExpiryMonitor.java:
##########
@@ -191,9 +192,38 @@ public void findEntryFailed(ManagedLedgerException exception, Optional<Position>
                 && (exception instanceof NonRecoverableLedgerException)) {
             log.warn("[{}][{}] read failed from ledger at position:{} : {}", topicName, subName, failedReadPosition,
                     exception.getMessage());
-            findEntryComplete(failedReadPosition.get(), ctx);
+            if (isLedgerNotExistException(((NonRecoverableLedgerException) exception).getBkErrorCode())) {
+                try {
+                    long failedLedgerId = failedReadPosition.get().getLedgerId();
+                    Position lastPositionInLedger = PositionImpl.get(failedLedgerId,
+                            cursor.getManagedLedger().getLedgerInfo(failedLedgerId).get().getEntries() - 1);
+                    log.info("[{}][{}] ledger not existed, will complete the last position of the non-existed"
+                                    + " ledger:{}", topicName, subName, lastPositionInLedger);
+                    findEntryComplete(lastPositionInLedger, ctx);
+                } catch (Exception e) {

Review Comment:
   Yes.  `getLedgerInfo(failedLedgerId).get()` at line199 will throw exception and we must catch it



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] AnonHxy commented on a diff in pull request #17842: [improve][broker]Improve PersistentMessageExpiryMonitor expire speed when ledger not existed

Posted by GitBox <gi...@apache.org>.
AnonHxy commented on code in PR #17842:
URL: https://github.com/apache/pulsar/pull/17842#discussion_r981958672


##########
pulsar-broker/src/main/java/org/apache/pulsar/broker/service/persistent/PersistentMessageExpiryMonitor.java:
##########
@@ -191,9 +192,38 @@ public void findEntryFailed(ManagedLedgerException exception, Optional<Position>
                 && (exception instanceof NonRecoverableLedgerException)) {
             log.warn("[{}][{}] read failed from ledger at position:{} : {}", topicName, subName, failedReadPosition,
                     exception.getMessage());
-            findEntryComplete(failedReadPosition.get(), ctx);
+            if (isLedgerNotExistException(((NonRecoverableLedgerException) exception).getBkErrorCode())) {
+                try {
+                    long failedLedgerId = failedReadPosition.get().getLedgerId();
+                    Position lastPositionInLedger = PositionImpl.get(failedLedgerId,
+                            cursor.getManagedLedger().getLedgerInfo(failedLedgerId).get().getEntries() - 1);
+                    log.info("[{}][{}] ledger not existed, will complete the last position of the non-existed"
+                                    + " ledger:{}", topicName, subName, lastPositionInLedger);
+                    findEntryComplete(lastPositionInLedger, ctx);
+                } catch (Exception e) {

Review Comment:
   Maybe we could use `ManagedLedgerImpl#getLedgersInfo`, which will also avoid this catch exception



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] AnonHxy commented on a diff in pull request #17842: [improve][broker]Improve PersistentMessageExpiryMonitor expire speed when ledger not existed

Posted by GitBox <gi...@apache.org>.
AnonHxy commented on code in PR #17842:
URL: https://github.com/apache/pulsar/pull/17842#discussion_r980941095


##########
pulsar-broker/src/main/java/org/apache/pulsar/broker/service/persistent/PersistentMessageExpiryMonitor.java:
##########
@@ -191,9 +192,38 @@ public void findEntryFailed(ManagedLedgerException exception, Optional<Position>
                 && (exception instanceof NonRecoverableLedgerException)) {
             log.warn("[{}][{}] read failed from ledger at position:{} : {}", topicName, subName, failedReadPosition,
                     exception.getMessage());
-            findEntryComplete(failedReadPosition.get(), ctx);
+            if (isLedgerNotExistException(((NonRecoverableLedgerException) exception).getBkErrorCode())) {

Review Comment:
   Nice suggestion. I have updated. PTAL @Jason918 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] Pomelongan commented on a diff in pull request #17842: [improve][broker]Improve PersistentMessageExpiryMonitor expire speed when ledger not existed

Posted by GitBox <gi...@apache.org>.
Pomelongan commented on code in PR #17842:
URL: https://github.com/apache/pulsar/pull/17842#discussion_r984180073


##########
pulsar-broker/src/main/java/org/apache/pulsar/broker/service/persistent/PersistentMessageExpiryMonitor.java:
##########
@@ -191,7 +193,28 @@ public void findEntryFailed(ManagedLedgerException exception, Optional<Position>
                 && (exception instanceof NonRecoverableLedgerException)) {
             log.warn("[{}][{}] read failed from ledger at position:{} : {}", topicName, subName, failedReadPosition,
                     exception.getMessage());
-            findEntryComplete(failedReadPosition.get(), ctx);
+            if (exception instanceof LedgerNotExistException) {
+                long failedLedgerId = failedReadPosition.get().getLedgerId();

Review Comment:
   I think it would be better to add a value to judge whether the result of `failedReadPosition. get()` is null



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] Pomelongan commented on a diff in pull request #17842: [improve][broker]Improve PersistentMessageExpiryMonitor expire speed when ledger not existed

Posted by GitBox <gi...@apache.org>.
Pomelongan commented on code in PR #17842:
URL: https://github.com/apache/pulsar/pull/17842#discussion_r984180073


##########
pulsar-broker/src/main/java/org/apache/pulsar/broker/service/persistent/PersistentMessageExpiryMonitor.java:
##########
@@ -191,7 +193,28 @@ public void findEntryFailed(ManagedLedgerException exception, Optional<Position>
                 && (exception instanceof NonRecoverableLedgerException)) {
             log.warn("[{}][{}] read failed from ledger at position:{} : {}", topicName, subName, failedReadPosition,
                     exception.getMessage());
-            findEntryComplete(failedReadPosition.get(), ctx);
+            if (exception instanceof LedgerNotExistException) {
+                long failedLedgerId = failedReadPosition.get().getLedgerId();

Review Comment:
   I think it would be better to add a value to judge whether the result of `failedReadPosition. get()` is null



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] AnonHxy commented on pull request #17842: [improve][broker]Improve PersistentMessageExpiryMonitor expire speed when ledger not existed

Posted by GitBox <gi...@apache.org>.
AnonHxy commented on PR #17842:
URL: https://github.com/apache/pulsar/pull/17842#issuecomment-1263178678

   @codelipenghui @Technoboy-  PTAL


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] Jason918 commented on a diff in pull request #17842: [improve][broker]Improve PersistentMessageExpiryMonitor expire speed when ledger not existed

Posted by GitBox <gi...@apache.org>.
Jason918 commented on code in PR #17842:
URL: https://github.com/apache/pulsar/pull/17842#discussion_r981223776


##########
pulsar-broker/src/main/java/org/apache/pulsar/broker/service/persistent/PersistentMessageExpiryMonitor.java:
##########
@@ -191,9 +192,38 @@ public void findEntryFailed(ManagedLedgerException exception, Optional<Position>
                 && (exception instanceof NonRecoverableLedgerException)) {
             log.warn("[{}][{}] read failed from ledger at position:{} : {}", topicName, subName, failedReadPosition,
                     exception.getMessage());
-            findEntryComplete(failedReadPosition.get(), ctx);
+            if (isLedgerNotExistException(((NonRecoverableLedgerException) exception).getBkErrorCode())) {
+                try {
+                    long failedLedgerId = failedReadPosition.get().getLedgerId();
+                    Position lastPositionInLedger = PositionImpl.get(failedLedgerId,
+                            cursor.getManagedLedger().getLedgerInfo(failedLedgerId).get().getEntries() - 1);
+                    log.info("[{}][{}] ledger not existed, will complete the last position of the non-existed"
+                                    + " ledger:{}", topicName, subName, lastPositionInLedger);
+                    findEntryComplete(lastPositionInLedger, ctx);
+                } catch (Exception e) {

Review Comment:
   It's a fake async operation.
   We can add a `getOptionalLedgerInfo(ledgerId)` in `ManagedLedger` to avoid this exception.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] AnonHxy commented on a diff in pull request #17842: [improve][broker]Improve PersistentMessageExpiryMonitor expire speed when ledger not existed

Posted by GitBox <gi...@apache.org>.
AnonHxy commented on code in PR #17842:
URL: https://github.com/apache/pulsar/pull/17842#discussion_r983118055


##########
pulsar-broker/src/main/java/org/apache/pulsar/broker/service/persistent/PersistentMessageExpiryMonitor.java:
##########
@@ -191,9 +192,38 @@ public void findEntryFailed(ManagedLedgerException exception, Optional<Position>
                 && (exception instanceof NonRecoverableLedgerException)) {
             log.warn("[{}][{}] read failed from ledger at position:{} : {}", topicName, subName, failedReadPosition,
                     exception.getMessage());
-            findEntryComplete(failedReadPosition.get(), ctx);
+            if (isLedgerNotExistException(((NonRecoverableLedgerException) exception).getBkErrorCode())) {
+                try {
+                    long failedLedgerId = failedReadPosition.get().getLedgerId();
+                    Position lastPositionInLedger = PositionImpl.get(failedLedgerId,
+                            cursor.getManagedLedger().getLedgerInfo(failedLedgerId).get().getEntries() - 1);
+                    log.info("[{}][{}] ledger not existed, will complete the last position of the non-existed"
+                                    + " ledger:{}", topicName, subName, lastPositionInLedger);
+                    findEntryComplete(lastPositionInLedger, ctx);
+                } catch (Exception e) {

Review Comment:
   > We can add a getOptionalLedgerInfo(ledgerId) in ManagedLedger to avoid this exception
   
   It makes sense to me.  Adding `getOptionalLedgerInfo` will make the code more  clear.  Updated. PTAL @Jason918 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [pulsar] AnonHxy merged pull request #17842: [improve][broker]Improve PersistentMessageExpiryMonitor expire speed when ledger not existed

Posted by GitBox <gi...@apache.org>.
AnonHxy merged PR #17842:
URL: https://github.com/apache/pulsar/pull/17842


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org