You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2020/12/17 16:21:45 UTC

[GitHub] [pulsar] milos-matijasevic opened a new issue #8991: NPE while reading entries

milos-matijasevic opened a new issue #8991:
URL: https://github.com/apache/pulsar/issues/8991


   **Describe the bug**
   Null pointer exception while reading entries
   ```
   00:14:43.672 [pulsar-io-25-1] INFO  org.apache.bookkeeper.mledger.impl.ManagedCursorImpl - [public/default/persistent/xx] Rewind from 85116:72902 to 85116:72902
   00:14:43.672 [bookkeeper-ml-workers-OrderedExecutor-0-0] WARN  org.apache.bookkeeper.mledger.impl.OpReadEntry - [public/default/persistent/x][x] read failed from ledger at position:85116:72902 : Unknown exception
   00:14:43.672 [broker-topic-workers-OrderedScheduler-5-0] ERROR org.apache.pulsar.broker.service.persistent.PersistentDispatcherSingleActiveConsumer - [persistent://public/default/x/ x-Consumer{subscription=PersistentSubscription{topic=persistent://public/default/x, name=x}, consumerId=1, consumerName=guvgf, address=/172.16.37.221:41704}] Error reading entries at 85116:72902 : Unknown exception - Retrying to read in 15.0 seconds
   ```
   
   ```
   00:20:06.485 [bookkeeper-ml-workers-OrderedExecutor-5-0] ERROR org.apache.bookkeeper.common.util.SafeRunnable - Unexpected throwable caught
   java.lang.NullPointerException: null
   	at org.apache.bookkeeper.mledger.impl.OpReadEntry.lambda$readEntriesFailed$0(OpReadEntry.java:94) ~[org.apache.pulsar-managed-ledger-2.6.1.jar:2.6.1]
   	at org.apache.bookkeeper.mledger.util.SafeRun$1.safeRun(SafeRun.java:32) ~[org.apache.pulsar-managed-ledger-2.6.1.jar:2.6.1]
   	at org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36) [org.apache.bookkeeper-bookkeeper-common-4.10.0.jar:4.10.0]
   	at org.apache.bookkeeper.common.util.OrderedExecutor$TimedRunnable.run(OrderedExecutor.java:203) [org.apache.bookkeeper-bookkeeper-common-4.10.0.jar:4.10.0]
   	at org.apache.bookkeeper.common.util.OrderedExecutor$TimedRunnable.run(OrderedExecutor.java:203) [org.apache.bookkeeper-bookkeeper-common-4.10.0.jar:4.10.0]
   	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_252]
   	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_252]
   	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.48.Final.jar:4.1.48.Final]
   	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_252]
   ```
   **To Reproduce**
   Pulsar go client is used, and consumer which tries to read entries.
   
   It fails here https://github.com/apache/pulsar/blob/v2.6.1/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/OpReadEntry.java#L94
   in line `callback.readEntriesComplete(entries, ctx);`
   ```java
   if (!entries.isEmpty()) {
       // There were already some entries that were read before, we can return them
       cursor.ledger.getExecutor().execute(safeRun(() -> {
           callback.readEntriesComplete(entries, ctx);
           recycle();
       }));
   }
   ```
   The pipeline was that cursors are set to entry by the `SeekByTime` function, after that first message is consumed just to print it and it wasn't acked. So this probably happened after that.
   
   Also, another exception occurred during resetting cursors
   ```
   10:36:19.518 [bookkeeper-ml-workers-OrderedExecutor-4-0] WARN  org.apache.pulsar.broker.service.ServerCnx - [/172.16.41.184:38110][PersistentSubscription{topic=persistent://public/default/z, name=z}] Failed to reset subscription: org.apache.pulsar.broker.service.BrokerServiceException: org.apache.bookkeeper.mledger.ManagedLedgerException: Unknown exception
   java.util.concurrent.CompletionException: org.apache.pulsar.broker.service.BrokerServiceException: org.apache.bookkeeper.mledger.ManagedLedgerException: Unknown exception
   	at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292) ~[?:1.8.0_252]
   	at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308) ~[?:1.8.0_252]
   	at java.util.concurrent.CompletableFuture.uniRun(CompletableFuture.java:714) ~[?:1.8.0_252]
   	at java.util.concurrent.CompletableFuture$UniRun.tryFire(CompletableFuture.java:701) ~[?:1.8.0_252]
   	at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) ~[?:1.8.0_252]
   	at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990) ~[?:1.8.0_252]
   	at org.apache.pulsar.broker.service.persistent.PersistentSubscription$5.findEntryFailed(PersistentSubscription.java:656) ~[org.apache.pulsar-pulsar-broker-2.6.1.jar:2.6.1]
   	at org.apache.pulsar.broker.service.persistent.PersistentMessageFinder.findEntryFailed(PersistentMessageFinder.java:118) ~[org.apache.pulsar-pulsar-broker-2.6.1.jar:2.6.1]
   	at org.apache.bookkeeper.mledger.impl.OpFindNewest.readEntryFailed(OpFindNewest.java:116) ~[org.apache.pulsar-managed-ledger-2.6.1.jar:2.6.1]
   	at org.apache.bookkeeper.mledger.impl.EntryCacheImpl.lambda$asyncReadEntry0$0(EntryCacheImpl.java:215) ~[org.apache.pulsar-managed-ledger-2.6.1.jar:2.6.1]
   	at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774) [?:1.8.0_252]
   	at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750) [?:1.8.0_252]
   	at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:456) [?:1.8.0_252]
   	at org.apache.bookkeeper.common.util.OrderedExecutor$TimedRunnable.run(OrderedExecutor.java:203) [org.apache.bookkeeper-bookkeeper-common-4.10.0.jar:4.10.0]
   	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_252]
   	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_252]
   	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.48.Final.jar:4.1.48.Final]
   	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_252]
   Caused by: org.apache.pulsar.broker.service.BrokerServiceException: org.apache.bookkeeper.mledger.ManagedLedgerException: Unknown exception
   	... 12 more
   Caused by: org.apache.bookkeeper.mledger.ManagedLedgerException: Unknown exception
   10:36:19.570 [pulsar-io-25-2] INFO  org.apache.pulsar.broker.service.ServerCnx - Closed connection from /172.16.41.184:38110
   ```
   
   But after trying multiple times thay are reset to successfully.
   
   There were multiple subscriptions for the same topic, but this didn't happen for all subscriptions.
   
   Cursors somehow recover and continue but this is cursor stats.
   ```
   "cursors" : {
         "markDeletePosition" : "87378:46711",
         "readPosition" : "87378:47273",
         "waitingReadOp" : false,
         "pendingReadOps" : -20,
         "messagesConsumedCounter" : -18228941,
         "cursorLedger" : 88621,
         "cursorLedgerLastEntry" : 781,
         "individuallyDeletedMessages" : "[]",
         "lastLedgerSwitchTimestamp" : "2020-12-17T15:36:44.377Z",
         "state" : "Open",
         "numberOfEntriesSinceFirstNotAckedMessage" : 562,
         "totalNonContiguousDeletedMessagesRange" : 0,
         "properties" : { }
       },
   ```
   
   I think every time this happens, `pendingReadOps`  decreases. While i am writing this `pendingReadOps` goes to -25
   
   Also, this happens only on topics that have multiple subscriptions.
   
   **Desktop (please complete the following information):**
    - OS: Linux
    - Pulsar: v2.6.1
    - Client: Go v0.3.0
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] codelipenghui commented on issue #8991: NPE while reading entries

Posted by GitBox <gi...@apache.org>.
codelipenghui commented on issue #8991:
URL: https://github.com/apache/pulsar/issues/8991#issuecomment-1058894607


   The issue had no activity for 30 days, mark with Stale label.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org