You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2021/09/16 23:04:33 UTC

[GitHub] [pulsar] dlg99 opened a new pull request #12071: Fix: Debezium connector hung and failed to restart, "Failed to get batch size for entry .. No such ledger exists on Metadata Server" logged for the offset topic

dlg99 opened a new pull request #12071:
URL: https://github.com/apache/pulsar/pull/12071


   <!--
   ### Contribution Checklist
     
     - Name the pull request in the form "[Issue XYZ][component] Title of the pull request", where *XYZ* should be replaced by the actual issue number.
       Skip *Issue XYZ* if there is no associated github issue for this pull request.
       Skip *component* if you are unsure about which is the best component. E.g. `[docs] Fix typo in produce method`.
   
     - Fill out the template below to describe the changes contributed by the pull request. That will give reviewers the context they need to do the review.
     
     - Each pull request should address only one issue, not mix up code from multiple issues.
     
     - Each commit in the pull request has a meaningful commit message
   
     - Once all items of the checklist are addressed, remove the above text and this checklist, leaving only the filled out template below.
   
   **(The sections below can be removed for hotfixes of typos)**
   -->
   
   Fixes #12070 
   
   ### Motivation
   
   Bug fix
   
   ### Modifications
   
   Check if the ledger exists, do not isse read if it does not.
   Added extra logging.
   
   ### Verifying this change
   
   Could not repro the problem (neither e2e nor in the tests). I am open for suggestions.
   See details in the https://github.com/apache/pulsar/issues/12070
   
   ### Does this pull request potentially affect one of the following parts:
   
   *If `yes` was chosen, please highlight the changes*
   
   No
   
   ### Documentation
   
   Check the box below and label this PR (if you have committer privilege).
   
   Need to update docs? 
     
   - [x] no-need-doc 
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] eolivelli commented on a change in pull request #12071: Fail managed ledger initialization if LCE belongs to non-existing ledger

Posted by GitBox <gi...@apache.org>.
eolivelli commented on a change in pull request #12071:
URL: https://github.com/apache/pulsar/pull/12071#discussion_r718249013



##########
File path: managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedLedgerImpl.java
##########
@@ -320,9 +319,44 @@ public ManagedLedgerImpl(ManagedLedgerFactoryImpl factory, BookKeeper bookKeeper
         }
     }
 
-    synchronized void initialize(final ManagedLedgerInitializeLedgerCallback callback, final Object ctx) {
+    synchronized void initialize(final ManagedLedgerInitializeLedgerCallback originalCb, final Object ctx) {
         log.info("Opening managed ledger {}", name);
 
+        ManagedLedgerInitializeLedgerCallback callback = new ManagedLedgerInitializeLedgerCallback() {
+            @Override
+            public void initializeComplete() {
+                if (lastConfirmedEntry != null && lastConfirmedEntry.entryId != -1) {
+                    // check that ledger used by lastConfirmedEntry actually exists
+                    try {
+                        getLedgerMetadata(lastConfirmedEntry.getLedgerId()).get();

Review comment:
       should we chain this CompletableFuture and call` originalCb.initializeComplete();` on completion ?
   otherwise we are introducing a blocking call in the chain, also getLedgerMetadata will be a blocking call related to ZK client
   
   ```
   getLedgerMetadata(lastConfirmedEntry.getLedgerId()).thenRun( () -> {
         originalCb.initializeComplete();
   }).exceptionally((err) -> {
        originalCb.initializeFailed(err);
        ....
   });
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] lhotari edited a comment on pull request #12071: Fail managed ledger initialization if LCE belongs to non-existing ledger

Posted by GitBox <gi...@apache.org>.
lhotari edited a comment on pull request #12071:
URL: https://github.com/apache/pulsar/pull/12071#issuecomment-1057219357


   Please rebase once again


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] lhotari commented on pull request #12071: Fail managed ledger initialization if LCE belongs to non-existing ledger

Posted by GitBox <gi...@apache.org>.
lhotari commented on pull request #12071:
URL: https://github.com/apache/pulsar/pull/12071#issuecomment-1057219357


   Please rebased once again


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] eolivelli commented on pull request #12071: Fix: Debezium connector hung and failed to restart, "Failed to get batch size for entry .. No such ledger exists on Metadata Server" logged for the offset topic

Posted by GitBox <gi...@apache.org>.
eolivelli commented on pull request #12071:
URL: https://github.com/apache/pulsar/pull/12071#issuecomment-921518028


   In order to fall into the case that you are fixing here the topic lastPosition should point to a ledger that does not exist anymore.
   You found evidence that the ledger has been "trimmed".
   
   From the logs you reported it looks like we are falling into a case in which we trim the ledger and we know that this will impact lastMessageId.
   See here
   https://github.com/apache/pulsar/blob/d41973454964ef4466df5e192c0d806496473293/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedLedgerImpl.java#L2435
   
   Maybe you can look into the tests relevant to the commit that introduced that comment and take inspiration to reproduce the problem
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] eolivelli commented on a change in pull request #12071: Fail managed ledger initialization if LCE belongs to non-existing ledger

Posted by GitBox <gi...@apache.org>.
eolivelli commented on a change in pull request #12071:
URL: https://github.com/apache/pulsar/pull/12071#discussion_r718249013



##########
File path: managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedLedgerImpl.java
##########
@@ -320,9 +319,44 @@ public ManagedLedgerImpl(ManagedLedgerFactoryImpl factory, BookKeeper bookKeeper
         }
     }
 
-    synchronized void initialize(final ManagedLedgerInitializeLedgerCallback callback, final Object ctx) {
+    synchronized void initialize(final ManagedLedgerInitializeLedgerCallback originalCb, final Object ctx) {
         log.info("Opening managed ledger {}", name);
 
+        ManagedLedgerInitializeLedgerCallback callback = new ManagedLedgerInitializeLedgerCallback() {
+            @Override
+            public void initializeComplete() {
+                if (lastConfirmedEntry != null && lastConfirmedEntry.entryId != -1) {
+                    // check that ledger used by lastConfirmedEntry actually exists
+                    try {
+                        getLedgerMetadata(lastConfirmedEntry.getLedgerId()).get();

Review comment:
       should we chain this CompletableFuture and call` originalCb.initializeComplete();` on completion ?
   otherwise we are introducing a blocking call in the chain, also getLedgerMetadata will be a blocking call related to ZK client
   
   ```
   getLedgerMetadata(lastConfirmedEntry.getLedgerId()).thenRun( () -> {
         originalCb.initializeComplete();
   }).exceptionally((err) -> {
        originalCb.initializeFailed(err);
        ....
   });
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] dlg99 closed pull request #12071: Fail managed ledger initialization if LCE belongs to non-existing ledger

Posted by GitBox <gi...@apache.org>.
dlg99 closed pull request #12071:
URL: https://github.com/apache/pulsar/pull/12071


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] sijie commented on pull request #12071: Fail managed ledger initialization if LCE belongs to non-existing ledger

Posted by GitBox <gi...@apache.org>.
sijie commented on pull request #12071:
URL: https://github.com/apache/pulsar/pull/12071#issuecomment-929690531


   @codelipenghui @merlimat Can you review this?
   
   This might be related to a PR (https://github.com/apache/pulsar/pull/12161) that @codelipenghui fixed a couple of days ago.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] sijie commented on pull request #12071: Fail managed ledger initialization if LCE belongs to non-existing ledger

Posted by GitBox <gi...@apache.org>.
sijie commented on pull request #12071:
URL: https://github.com/apache/pulsar/pull/12071#issuecomment-929690531


   @codelipenghui @merlimat Can you review this?
   
   This might be related to a PR (https://github.com/apache/pulsar/pull/12161) that @codelipenghui fixed a couple of days ago.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] dlg99 commented on a change in pull request #12071: Fail managed ledger initialization if LCE belongs to non-existing ledger

Posted by GitBox <gi...@apache.org>.
dlg99 commented on a change in pull request #12071:
URL: https://github.com/apache/pulsar/pull/12071#discussion_r797222317



##########
File path: managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedLedgerImpl.java
##########
@@ -320,9 +319,44 @@ public ManagedLedgerImpl(ManagedLedgerFactoryImpl factory, BookKeeper bookKeeper
         }
     }
 
-    synchronized void initialize(final ManagedLedgerInitializeLedgerCallback callback, final Object ctx) {
+    synchronized void initialize(final ManagedLedgerInitializeLedgerCallback originalCb, final Object ctx) {
         log.info("Opening managed ledger {}", name);
 
+        ManagedLedgerInitializeLedgerCallback callback = new ManagedLedgerInitializeLedgerCallback() {
+            @Override
+            public void initializeComplete() {
+                if (lastConfirmedEntry != null && lastConfirmedEntry.entryId != -1) {
+                    // check that ledger used by lastConfirmedEntry actually exists
+                    try {
+                        getLedgerMetadata(lastConfirmedEntry.getLedgerId()).get();

Review comment:
       fixed




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] eolivelli commented on pull request #12071: Fail managed ledger initialization if LCE belongs to non-existing ledger

Posted by GitBox <gi...@apache.org>.
eolivelli commented on pull request #12071:
URL: https://github.com/apache/pulsar/pull/12071#issuecomment-1026581147


   @dlg99 we probably forgot about this patch.
   do you have updates ?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org