Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2020/05/06 14:07:36 UTC

[GitHub] [pulsar] bback99 opened a new issue #6894: Still having issues of Failed to restore rockdb

bback99 opened a new issue #6894:
URL: https://github.com/apache/pulsar/issues/6894


   **Describe the bug**
   We unexpectedly ran into the "Failed to restore rocksdb" error
   when running Pulsar in standalone mode.
   We did not change any BookKeeper configuration.
   
   This might be related to
   https://github.com/apache/pulsar/issues/5668
   
   With **-nss**, it looks fine now.
   
   So, should we run with **-nss** until a fix is available?
   
   **To Reproduce**
   Logs
   13:29:28.310 [io-write-scheduler-OrderedScheduler-0-0] WARN  org.apache.bookkeeper.stream.storage.impl.sc.ZkStorageContainerManager - Failed to start storage container (0)
   java.util.concurrent.CompletionException: org.apache.bookkeeper.statelib.api.exceptions.StateStoreException: Failed to restore rocksdb 000000000000000000/000000000000000000/000000000000000000
   	at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292) ~[?:1.8.0_242]
   	at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308) ~[?:1.8.0_242]
   	at java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:957) ~[?:1.8.0_242]
   	at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:940) ~[?:1.8.0_242]
   	at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) ~[?:1.8.0_242]
   	at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990) ~[?:1.8.0_242]
   	at org.apache.bookkeeper.statelib.impl.journal.AbstractStateStoreWithJournal.lambda$executeIO$16(AbstractStateStoreWithJournal.java:474) ~[org.apache.bookkeeper-statelib-4.10.0.jar:4.10.0]
   	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_242]
   	at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125) [com.google.guava-guava-25.1-jre.jar:?]
   	at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:57) [com.google.guava-guava-25.1-jre.jar:?]
   	at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78) [com.google.guava-guava-25.1-jre.jar:?]
   	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_242]
   	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_242]
   	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_242]
   	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [?:1.8.0_242]
   	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_242]
   	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_242]
   	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.45.Final.jar:4.1.45.Final]
   	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_242]
   Caused by: org.apache.bookkeeper.statelib.api.exceptions.StateStoreException: Failed to restore rocksdb 000000000000000000/000000000000000000/000000000000000000
   	at org.apache.bookkeeper.statelib.impl.rocksdb.checkpoint.RocksCheckpointer.restore(RocksCheckpointer.java:84) ~[org.apache.bookkeeper-statelib-4.10.0.jar:4.10.0]
   	at org.apache.bookkeeper.statelib.impl.kv.RocksdbKVStore.loadRocksdbFromCheckpointStore(RocksdbKVStore.java:161) ~[org.apache.bookkeeper-statelib-4.10.0.jar:4.10.0]
   	at org.apache.bookkeeper.statelib.impl.kv.RocksdbKVStore.init(RocksdbKVStore.java:223) ~[org.apache.bookkeeper-statelib-4.10.0.jar:4.10.0]
   	at org.apache.bookkeeper.statelib.impl.journal.AbstractStateStoreWithJournal.lambda$initializeLocalStore$5(AbstractStateStoreWithJournal.java:202) ~[org.apache.bookkeeper-statelib-4.10.0.jar:4.10.0]
   	at org.apache.bookkeeper.statelib.impl.journal.AbstractStateStoreWithJournal.lambda$executeIO$16(AbstractStateStoreWithJournal.java:471) ~[org.apache.bookkeeper-statelib-4.10.0.jar:4.10.0]
   	... 12 more
   Caused by: org.apache.distributedlog.exceptions.LogEmptyException: Log 000000000000000000/000000000000000000/000000000000000000/checkpoints/e6ac48ab-1045-472e-89d0-95686a71ee8d/metadata:<default> has no records
   	at org.apache.distributedlog.BKLogHandler$2$1.onSuccess(BKLogHandler.java:245) ~[org.apache.distributedlog-distributedlog-core-4.10.0.jar:4.10.0]
   	at org.apache.distributedlog.BKLogHandler$2$1.onSuccess(BKLogHandler.java:239) ~[org.apache.distributedlog-distributedlog-core-4.10.0.jar:4.10.0]
   	at org.apache.bookkeeper.common.concurrent.FutureEventListener.accept(FutureEventListener.java:42) ~[org.apache.bookkeeper-bookkeeper-common-4.10.0.jar:4.10.0]
   	at org.apache.bookkeeper.common.concurrent.FutureEventListener.accept(FutureEventListener.java:26) ~[org.apache.bookkeeper-bookkeeper-common-4.10.0.jar:4.10.0]
   	at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774) ~[?:1.8.0_242]
   	at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750) ~[?:1.8.0_242]
   	at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) ~[?:1.8.0_242]
   	at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975) ~[?:1.8.0_242]
   	at org.apache.distributedlog.BKLogHandler.readLogSegmentsFromStore(BKLogHandler.java:636) ~[org.apache.distributedlog-distributedlog-core-4.10.0.jar:4.10.0]
   	at org.apache.distributedlog.BKLogHandler$6.onSuccess(BKLogHandler.java:600) ~[org.apache.distributedlog-distributedlog-core-4.10.0.jar:4.10.0]
   	at org.apache.distributedlog.BKLogHandler$6.onSuccess(BKLogHandler.java:592) ~[org.apache.distributedlog-distributedlog-core-4.10.0.jar:4.10.0]
   	at org.apache.bookkeeper.common.concurrent.FutureEventListener.accept(FutureEventListener.java:42) ~[org.apache.bookkeeper-bookkeeper-common-4.10.0.jar:4.10.0]
   	at org.apache.bookkeeper.common.concurrent.FutureEventListener.accept(FutureEventListener.java:26) ~[org.apache.bookkeeper-bookkeeper-common-4.10.0.jar:4.10.0]
   	at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774) ~[?:1.8.0_242]
   	at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750) ~[?:1.8.0_242]
   	at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) ~[?:1.8.0_242]
   	at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975) ~[?:1.8.0_242]
   	at org.apache.distributedlog.impl.ZKLogSegmentMetadataStore.processResult(ZKLogSegmentMetadataStore.java:377) ~[org.apache.distributedlog-distributedlog-core-4.10.0.jar:4.10.0]
   	at org.apache.bookkeeper.zookeeper.ZooKeeperClient$25$1.processResult(ZooKeeperClient.java:1174) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
   	at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:627) ~[org.apache.pulsar-pulsar-zookeeper-2.5.1.jar:2.5.1]
   	at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:510) ~[org.apache.pulsar-pulsar-zookeeper-2.5.1.jar:2.5.1]
   
   
   **Additional context**
   This issue occurs on Pulsar 2.5.1.
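
When reading a trace like the one above, the last `Caused by:` entry is usually the most specific failure (here, a `LogEmptyException` on the checkpoint metadata log). As a minimal sketch, not part of Pulsar or BookKeeper, the following Python helper pulls the cause chain out of a pasted log. The names `cause_chain`, `root_cause`, and `SAMPLE` are hypothetical; the sample lines are copied from the log in this issue with most stack frames omitted.

```python
import re
from typing import List, Optional, Tuple

# Matches a chained-cause line in a Java stack trace, e.g.
# "Caused by: com.example.SomeException: message".
CAUSED_BY = re.compile(r"^\s*Caused by:\s*([\w.$]+)(?::\s*(.*))?$")

# Abridged copy of the trace from this issue (stack frames omitted).
SAMPLE = """\
java.util.concurrent.CompletionException: org.apache.bookkeeper.statelib.api.exceptions.StateStoreException: Failed to restore rocksdb 000000000000000000/000000000000000000/000000000000000000
\tat java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292) ~[?:1.8.0_242]
Caused by: org.apache.bookkeeper.statelib.api.exceptions.StateStoreException: Failed to restore rocksdb 000000000000000000/000000000000000000/000000000000000000
\tat org.apache.bookkeeper.statelib.impl.rocksdb.checkpoint.RocksCheckpointer.restore(RocksCheckpointer.java:84) ~[org.apache.bookkeeper-statelib-4.10.0.jar:4.10.0]
Caused by: org.apache.distributedlog.exceptions.LogEmptyException: Log 000000000000000000/000000000000000000/000000000000000000/checkpoints/e6ac48ab-1045-472e-89d0-95686a71ee8d/metadata:<default> has no records
\tat org.apache.distributedlog.BKLogHandler$2$1.onSuccess(BKLogHandler.java:245) ~[org.apache.distributedlog-distributedlog-core-4.10.0.jar:4.10.0]
"""


def cause_chain(trace: str) -> List[Tuple[str, str]]:
    """Return (exception class, message) for each 'Caused by:' line, in order."""
    chain = []
    for line in trace.splitlines():
        match = CAUSED_BY.match(line)
        if match:
            # The message group is absent when the exception has no message.
            chain.append((match.group(1), match.group(2) or ""))
    return chain


def root_cause(trace: str) -> Optional[Tuple[str, str]]:
    """Return the innermost cause, or None if the trace has no 'Caused by:' line."""
    chain = cause_chain(trace)
    return chain[-1] if chain else None


if __name__ == "__main__":
    cause = root_cause(SAMPLE)
    if cause is not None:
        cls, msg = cause
        print(f"{cls}: {msg}")
```

Running the script against a full broker log, rather than the abridged sample, should work the same way as long as the `Caused by:` lines are intact; it only inspects those lines and ignores the stack frames.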
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] yitian108 commented on issue #6894: Still having issues of Failed to restore rockdb

Posted by GitBox <gi...@apache.org>.
yitian108 commented on issue #6894:
URL: https://github.com/apache/pulsar/issues/6894#issuecomment-813970547


   The issue still persists today, on v2.7.





[GitHub] [pulsar] ta1meng commented on issue #6894: Still having issues of Failed to restore rockdb

Posted by GitBox <gi...@apache.org>.
ta1meng commented on issue #6894:
URL: https://github.com/apache/pulsar/issues/6894#issuecomment-858149448


   Still happening in v2.7.2. Both my co-worker and I have been running into this rocksdb error intermittently when running Pulsar standalone.
   
   Can someone explain what the argument -nss (--no-stream-storage) does?
   
   The command line help states:
   
   ```
       -nss, --no-stream-storage
         Disable stream storage
         Default: false
   ```
   
   What functionality is disabled when we disable stream storage?





[GitHub] [pulsar] nicolo-paganin commented on issue #6894: Still having issues of Failed to restore rockdb

Posted by GitBox <gi...@apache.org>.
nicolo-paganin commented on issue #6894:
URL: https://github.com/apache/pulsar/issues/6894#issuecomment-682432305


   I still have this error in Pulsar 2.6.0. Any news?





[GitHub] [pulsar] ta1meng edited a comment on issue #6894: Still having issues of Failed to restore rockdb

Posted by GitBox <gi...@apache.org>.
ta1meng edited a comment on issue #6894:
URL: https://github.com/apache/pulsar/issues/6894#issuecomment-858149448


   Still happening in v2.7.1. Both my co-worker and I have been running into this rocksdb error intermittently when running Pulsar standalone.
   
   Can someone explain what the argument -nss (--no-stream-storage) does?
   
   The command line help states:
   
   ```
       -nss, --no-stream-storage
         Disable stream storage
         Default: false
   ```
   
   What functionality is disabled when we disable stream storage?





[GitHub] [pulsar] qq459673705 commented on issue #6894: Still having issues of Failed to restore rockdb

Posted by GitBox <gi...@apache.org>.
qq459673705 commented on issue #6894:
URL: https://github.com/apache/pulsar/issues/6894#issuecomment-865863936


   This problem is caused by https://github.com/apache/bookkeeper/issues/2357
   First, the Pulsar server log showed: Not enough non-faulty bookies available.
   Minutes later, the log showed: io.netty.util.internal.OutOfDirectMemoryError: ....
   After 5 retries, the Pulsar server shut down.
   When I noticed this, I tried to restart the Pulsar server, but the log reported: Failed to restore rocksdb...





[GitHub] [pulsar] narzach commented on issue #6894: Still having issues of Failed to restore rockdb

Posted by GitBox <gi...@apache.org>.
narzach commented on issue #6894:
URL: https://github.com/apache/pulsar/issues/6894#issuecomment-783407511


   Any updates on this? I see this failure very frequently when running a standalone Pulsar cluster locally in docker.





[GitHub] [pulsar] darkredz commented on issue #6894: Still having issues of Failed to restore rockdb

Posted by GitBox <gi...@apache.org>.
darkredz commented on issue #6894:
URL: https://github.com/apache/pulsar/issues/6894#issuecomment-745810521


   I am also seeing this on 2.6.2





[GitHub] [pulsar] devinbost commented on issue #6894: Still having issues of Failed to restore rockdb

Posted by GitBox <gi...@apache.org>.
devinbost commented on issue #6894:
URL: https://github.com/apache/pulsar/issues/6894#issuecomment-702573378


   I'm also seeing it here on 2.6.1: https://github.com/apache/pulsar/issues/8184





[GitHub] [pulsar] ta1meng commented on issue #6894: Still having issues of Failed to restore rockdb

Posted by GitBox <gi...@apache.org>.
ta1meng commented on issue #6894:
URL: https://github.com/apache/pulsar/issues/6894#issuecomment-858150327


   Also, I learned today from the Pulsar monthly update hosted by StreamNative that Pulsar 2.8 will have some RocksDB fixes, so this problem may be resolved in Pulsar 2.8.

