Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2020/11/18 14:51:38 UTC

[GitHub] [pulsar] milos-matijasevic opened a new issue #8615: Failed to read ledger from s3, NPE

milos-matijasevic opened a new issue #8615:
URL: https://github.com/apache/pulsar/issues/8615


   **Describe the bug**
   Consumers are stuck, and the broker logs show:
   ```
   ERROR org.apache.bookkeeper.mledger.offload.jcloud.impl.BlobStoreManagedLedgerOffloader - Failed readOffloaded:
   java.lang.NullPointerException: null
   	at org.apache.bookkeeper.mledger.offload.jcloud.impl.BlobStoreManagedLedgerOffloader.lambda$new$1(BlobStoreManagedLedgerOffloader.java:153) ~[?:?]
   	at org.apache.bookkeeper.mledger.offload.jcloud.impl.BlobStoreBackedReadHandleImpl.open(BlobStoreBackedReadHandleImpl.java:196) ~[?:?]
   	at org.apache.bookkeeper.mledger.offload.jcloud.impl.BlobStoreManagedLedgerOffloader.lambda$readOffloaded$5(BlobStoreManagedLedgerOffloader.java:556) ~[?:?]
   	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_252]
   	at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125) [com.google.guava-guava-25.1-jre.jar:?]
   	at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:57) [com.google.guava-guava-25.1-jre.jar:?]
   	at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78) [com.google.guava-guava-25.1-jre.jar:?]
   	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_252]
   	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_252]
   	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_252]
   	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [?:1.8.0_252]
   	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_252]
   	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_252]
   	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.48.Final.jar:4.1.48.Final]
   	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_252]
   11:42:55.978 [bookkeeper-ml-workers-OrderedExecutor-3-0] ERROR org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl - [public/default/persistent/xxx] Error opening ledger for reading at position 45043:0 - org.apache.bookkeeper.mledger.ManagedLedgerException: Unknown exception
   ```
   
   This may be related to an error we hit after the cluster auto-upgraded to v2.6.2 (https://gist.github.com/milos-matijasevic/1502d90293bb89ce4fb16b4c61bb81a0). At the same time, the ZooKeeper nodes got into a state where they could not see each other and the brokers were crashing, so we downgraded to 2.6.1, which we used before. (This should probably be a separate issue.)
   
   When I search the S3 bucket for this ledger with
   ```bash
   aws s3api list-objects --bucket bucketname --query "Contents[?contains(Key, 'ledger-45043')]"
   ```
   nothing is found, yet stats-internal for this topic reports the ledger as if everything were fine:
   ```
   {
       "ledgerId" : 45043,
       "entries" : 53640,
       "size" : 26299313,
       "offloaded" : true
     }
   ```
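
A minimal sketch of how this cross-check could be scripted for every ledger of a topic, rather than one ledger at a time. It assumes that the stats-internal JSON holds its per-ledger entries in a `ledgers` array, and that offloaded objects carry `ledger-<id>` in their key (as the `list-objects` query above implies). The function name, the second ledger, and the sample keys are hypothetical and only serve as illustration.

```python
import re

# Assumption: every offloaded object has "ledger-<id>" somewhere in its key,
# as implied by the list-objects query in the report.
LEDGER_KEY = re.compile(r"ledger-(\d+)")


def missing_offloaded_ledgers(internal_stats, object_keys):
    """Return IDs of ledgers marked offloaded that have no matching object key."""
    present = set()
    for key in object_keys:
        match = LEDGER_KEY.search(key)
        if match:
            # Matching the full number avoids treating ledger-450431 as ledger-45043.
            present.add(int(match.group(1)))
    return [
        ledger["ledgerId"]
        for ledger in internal_stats.get("ledgers", [])
        if ledger.get("offloaded") and ledger["ledgerId"] not in present
    ]


if __name__ == "__main__":
    stats = {
        "ledgers": [
            {"ledgerId": 45043, "entries": 53640, "size": 26299313, "offloaded": True},
            # Hypothetical second ledger, added only for contrast.
            {"ledgerId": 45044, "entries": 100, "size": 1000, "offloaded": True},
        ]
    }
    # Hypothetical keys, standing in for the "Key" fields returned by list-objects.
    keys = ["some-uuid-ledger-45044", "some-uuid-ledger-45044-index"]
    print(missing_offloaded_ledgers(stats, keys))
```

With the sample data, only ledger 45043 is reported, since it is marked as offloaded but no key mentions it. In practice the two inputs would come from the stats-internal output of the topic and the bucket listing.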
   
   We found one more ledger in a different topic with the same problem.
   
   As a workaround, we made our consumers skip the messages in that ledger and continue reading from the next ledger (which is present). However, any future read of the topic that starts from a position before this ledger will hit the same problem. Is there a way to fix this, or at least a way to delete these corrupted ledgers?
   
   While I was writing this, the same thing happened to another, newer ledger.
   Nothing looks unusual about these ledgers; their sizes and entry counts are in line with those of other ledgers.
   
   **Desktop (please complete the following information):**
    - OS: Linux
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org