Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2020/05/04 16:32:20 UTC

[GitHub] [pulsar] rueian opened a new issue #6871: Broker stop serving a topic after a "Bookie handle is not available" error

rueian opened a new issue #6871:
URL: https://github.com/apache/pulsar/issues/6871


   **Describe the bug**
   I had a topic `persistent://public/default/random` that had been receiving messages published by the Pulsar function `public/default/RandomFunction` for days.
   
   But the function kept crashing this morning with the error `Failed to create consumer: TimeOut`:
   ![image](https://user-images.githubusercontent.com/2727535/80987730-0bdb8b80-8e65-11ea-9239-7a59305d3c46.png)
   
   `pulsar-admin` also returned `HTTP 500` for any operation on the topic `persistent://public/default/random`:
   
   ![image](https://user-images.githubusercontent.com/2727535/80987962-65dc5100-8e65-11ea-9771-0b0cf87ad113.png)
   
   Other topics kept working normally, with no timeouts and no errors.
   The brokers stopped serving the `persistent://public/default/random` topic for hours, until I restarted all of them manually.
   
   The related log line I found on the broker is:
   `[BookKeeperClientWorker-OrderedExecutor-0-0] INFO  org.apache.bookkeeper.client.PendingReadOp - Error: Bookie handle is not available while reading L388 E23224 from bookie: pulsar-bookie-1.pulsar-bookie.default.svc.cluster.local:3181`
   
   **To Reproduce**
   I currently have no idea how to reproduce this. I will update the procedure if I manage to reproduce the error.
   
   **Expected behavior**
   Maybe the broker should keep trying to serve the topic without restarting the broker process?
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] rueian commented on issue #6871: Broker stop serving a topic after a "Bookie handle is not available" error

Posted by GitBox <gi...@apache.org>.
rueian commented on issue #6871:
URL: https://github.com/apache/pulsar/issues/6871#issuecomment-683877264


   Thanks! I will try the new version later this week.





[GitHub] [pulsar] sijie commented on issue #6871: Broker stop serving a topic after a "Bookie handle is not available" error

Posted by GitBox <gi...@apache.org>.
sijie commented on issue #6871:
URL: https://github.com/apache/pulsar/issues/6871#issuecomment-673323794


   @rueian The bug has been fixed in bookkeeper 4.11 and we are going to bump the version in Pulsar.





[GitHub] [pulsar] rueian commented on issue #6871: Broker stop serving a topic after a "Bookie handle is not available" error

Posted by GitBox <gi...@apache.org>.
rueian commented on issue #6871:
URL: https://github.com/apache/pulsar/issues/6871#issuecomment-623849407


   I forgot to mention that I am using the 2.5.1 pulsar-all Docker image.
   
   It seems that when one bookie is restarted, the brokers stop serving the related topics until the brokers themselves are restarted.
   
   I have 3 bookies in total, and 3 brokers with the following settings:
   ```
   managedLedgerDefaultAckQuorum: '2'
   managedLedgerDefaultEnsembleSize: '2'
   managedLedgerDefaultWriteQuorum: '2'
   ```
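   
   As background on why a single bookie restart matters with these values: assuming the usual BookKeeper quorum semantics, an entry can be acknowledged while up to `writeQuorum - ackQuorum` of its bookies are unresponsive, and beyond that the write has to wait for a replacement bookie to join the ensemble. The sketch below only does that arithmetic for the settings above; the class and method names are made up for illustration and are not Pulsar or BookKeeper APIs.
   
   ```java
   public class QuorumSketch {

       /** Bookies that may be unresponsive while an entry can still be acknowledged. */
       public static int toleratedWithoutReplacement(int writeQuorum, int ackQuorum) {
           if (ackQuorum > writeQuorum) {
               throw new IllegalArgumentException("ackQuorum must not exceed writeQuorum");
           }
           return writeQuorum - ackQuorum;
       }

       /** Bookies outside the current ensemble that could be picked as a replacement. */
       public static int spareBookies(int totalBookies, int ensembleSize) {
           if (ensembleSize > totalBookies) {
               throw new IllegalArgumentException("ensembleSize must not exceed totalBookies");
           }
           return totalBookies - ensembleSize;
       }

       public static void main(String[] args) {
           // Values from this comment: 3 bookies, ensemble 2, write quorum 2, ack quorum 2.
           System.out.println("tolerated without replacement: " + toleratedWithoutReplacement(2, 2));
           System.out.println("spare bookies: " + spareBookies(3, 2));
       }
   }
   ```
   
   With 2/2/2 nothing is tolerated without a replacement, but one spare bookie exists, so under these assumptions the bookie replacement itself is the step that has to succeed.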
   
   more logs:
   ```
   03:42:02.374 [BookKeeperClientWorker-OrderedExecutor-0-0] INFO  org.apache.bookkeeper.client.PendingReadOp - Error: Bookie handle is not available while reading L535 E20000 from bookie: pulsar-bookie-2.pulsar-bookie.default.svc.cluster.local:3181
   03:42:02.403 [bookkeeper-io-14-2] WARN  org.apache.bookkeeper.proto.PerChannelBookieClient - Could not connect to bookie: [id: 0x5a7463c4]/pulsar-bookie-2.pulsar-bookie.default.svc.cluster.local:3181, current state CONNECTING : pulsar-bookie-2.pulsar-bookie.default.svc.cluster.local
   03:42:02.415 [bookkeeper-io-14-1] WARN  org.apache.bookkeeper.proto.PerChannelBookieClient - Could not connect to bookie: [id: 0x574931fa]/pulsar-bookie-2.pulsar-bookie.default.svc.cluster.local:3181, current state CONNECTING : pulsar-bookie-2.pulsar-bookie.default.svc.cluster.local
   03:42:02.416 [BookKeeperClientWorker-OrderedExecutor-0-0] WARN  org.apache.bookkeeper.client.PendingAddOp - Failed to write entry (535, 20000): Bookie handle is not available
   03:42:02.449 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.common.util.SafeRunnable - Unexpected throwable caught 
   java.lang.NullPointerException: null
   	at org.apache.bookkeeper.net.NetUtils.resolveNetworkLocation(NetUtils.java:77) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
   	at org.apache.bookkeeper.client.TopologyAwareEnsemblePlacementPolicy.resolveNetworkLocation(TopologyAwareEnsemblePlacementPolicy.java:779) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
   	at org.apache.bookkeeper.client.TopologyAwareEnsemblePlacementPolicy.createBookieNode(TopologyAwareEnsemblePlacementPolicy.java:775) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
   	at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl.replaceBookie(RackawareEnsemblePlacementPolicyImpl.java:450) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
   	at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicy.replaceBookie(RackawareEnsemblePlacementPolicy.java:117) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
   	at org.apache.bookkeeper.client.BookieWatcherImpl.replaceBookie(BookieWatcherImpl.java:289) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
   	at org.apache.bookkeeper.client.EnsembleUtils.replaceBookiesInEnsemble(EnsembleUtils.java:71) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
   	at org.apache.bookkeeper.client.ReadOnlyLedgerHandle.handleBookieFailure(ReadOnlyLedgerHandle.java:226) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
   	at org.apache.bookkeeper.client.PendingAddOp.writeComplete(PendingAddOp.java:377) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
   	at org.apache.bookkeeper.proto.BookieClientImpl$1.safeRun(BookieClientImpl.java:292) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
   	at org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36) [org.apache.bookkeeper-bookkeeper-common-4.10.0.jar:4.10.0]
   	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_242]
   	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_242]
   	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.45.Final.jar:4.1.45.Final]
   	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_242]
   03:42:02.487 [bookkeeper-io-14-2] WARN  org.apache.bookkeeper.proto.PerChannelBookieClient - Could not connect to bookie: [id: 0x29f028a9]/pulsar-bookie-2.pulsar-bookie.default.svc.cluster.local:3181, current state CONNECTING : pulsar-bookie-2.pulsar-bookie.default.svc.cluster.local
   03:42:02.488 [BookKeeperClientWorker-OrderedExecutor-0-0] INFO  org.apache.bookkeeper.client.PendingReadOp - Error: Bookie handle is not available while reading L535 E20002 from bookie: pulsar-bookie-2.pulsar-bookie.default.svc.cluster.local:3181
   03:42:02.490 [BookKeeperClientWorker-OrderedExecutor-0-0] WARN  org.apache.bookkeeper.client.PendingAddOp - Failed to write entry (535, 20001): Bookie handle is not available
   03:42:02.491 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.common.util.SafeRunnable - Unexpected throwable caught 
   java.lang.NullPointerException: null
   	at org.apache.bookkeeper.net.NetUtils.resolveNetworkLocation(NetUtils.java:77) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
   	at org.apache.bookkeeper.client.TopologyAwareEnsemblePlacementPolicy.resolveNetworkLocation(TopologyAwareEnsemblePlacementPolicy.java:779) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
   	at org.apache.bookkeeper.client.TopologyAwareEnsemblePlacementPolicy.createBookieNode(TopologyAwareEnsemblePlacementPolicy.java:775) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
   	at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl.replaceBookie(RackawareEnsemblePlacementPolicyImpl.java:450) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
   	at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicy.replaceBookie(RackawareEnsemblePlacementPolicy.java:117) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
   	at org.apache.bookkeeper.client.BookieWatcherImpl.replaceBookie(BookieWatcherImpl.java:289) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
   	at org.apache.bookkeeper.client.EnsembleUtils.replaceBookiesInEnsemble(EnsembleUtils.java:71) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
   	at org.apache.bookkeeper.client.ReadOnlyLedgerHandle.handleBookieFailure(ReadOnlyLedgerHandle.java:226) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
   	at org.apache.bookkeeper.client.PendingAddOp.writeComplete(PendingAddOp.java:377) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
   	at org.apache.bookkeeper.proto.BookieClientImpl$1.safeRun(BookieClientImpl.java:292) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
   	at org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36) [org.apache.bookkeeper-bookkeeper-common-4.10.0.jar:4.10.0]
   	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_242]
   	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_242]
   	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.45.Final.jar:4.1.45.Final]
   	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_242]
   03:42:02.499 [bookkeeper-io-14-1] WARN  org.apache.bookkeeper.proto.PerChannelBookieClient - Could not connect to bookie: [id: 0x7601fba1]/pulsar-bookie-2.pulsar-bookie.default.svc.cluster.local:3181, current state CONNECTING : pulsar-bookie-2.pulsar-bookie.default.svc.cluster.local
   03:42:02.507 [bookkeeper-io-14-2] WARN  org.apache.bookkeeper.proto.PerChannelBookieClient - Could not connect to bookie: [id: 0x31edef11]/pulsar-bookie-2.pulsar-bookie.default.svc.cluster.local:3181, current state CONNECTING : pulsar-bookie-2.pulsar-bookie.default.svc.cluster.local
   03:42:02.508 [BookKeeperClientWorker-OrderedExecutor-0-0] WARN  org.apache.bookkeeper.client.PendingAddOp - Failed to write entry (535, 20002): Bookie handle is not available
   03:42:02.510 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.common.util.SafeRunnable - Unexpected throwable caught 
   java.lang.NullPointerException: null
   	at org.apache.bookkeeper.net.NetUtils.resolveNetworkLocation(NetUtils.java:77) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
   	at org.apache.bookkeeper.client.TopologyAwareEnsemblePlacementPolicy.resolveNetworkLocation(TopologyAwareEnsemblePlacementPolicy.java:779) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
   	at org.apache.bookkeeper.client.TopologyAwareEnsemblePlacementPolicy.createBookieNode(TopologyAwareEnsemblePlacementPolicy.java:775) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
   	at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl.replaceBookie(RackawareEnsemblePlacementPolicyImpl.java:450) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
   	at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicy.replaceBookie(RackawareEnsemblePlacementPolicy.java:117) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
   	at org.apache.bookkeeper.client.BookieWatcherImpl.replaceBookie(BookieWatcherImpl.java:289) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
   	at org.apache.bookkeeper.client.EnsembleUtils.replaceBookiesInEnsemble(EnsembleUtils.java:71) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
   	at org.apache.bookkeeper.client.ReadOnlyLedgerHandle.handleBookieFailure(ReadOnlyLedgerHandle.java:226) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
   	at org.apache.bookkeeper.client.PendingAddOp.writeComplete(PendingAddOp.java:377) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
   	at org.apache.bookkeeper.proto.BookieClientImpl$1.safeRun(BookieClientImpl.java:292) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
   	at org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36) [org.apache.bookkeeper-bookkeeper-common-4.10.0.jar:4.10.0]
   	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_242]
   	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_242]
   	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.45.Final.jar:4.1.45.Final]
   	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_242]
   03:42:02.526 [bookkeeper-io-14-1] WARN  org.apache.bookkeeper.proto.PerChannelBookieClient - Could not connect to bookie: [id: 0x3d328dcc]/pulsar-bookie-2.pulsar-bookie.default.svc.cluster.local:3181, current state CONNECTING : pulsar-bookie-2.pulsar-bookie.default.svc.cluster.local
   03:42:02.526 [BookKeeperClientWorker-OrderedExecutor-0-0] INFO  org.apache.bookkeeper.client.PendingReadOp - Error: Bookie handle is not available while reading L535 E20004 from bookie: pulsar-bookie-2.pulsar-bookie.default.svc.cluster.local:3181
   03:42:02.527 [BookKeeperClientWorker-OrderedExecutor-0-0] WARN  org.apache.bookkeeper.client.PendingAddOp - Failed to write entry (535, 20003): Bookie handle is not available
   03:42:02.531 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.common.util.SafeRunnable - Unexpected throwable caught 
   java.lang.NullPointerException: null
   	at org.apache.bookkeeper.net.NetUtils.resolveNetworkLocation(NetUtils.java:77) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
   	at org.apache.bookkeeper.client.TopologyAwareEnsemblePlacementPolicy.resolveNetworkLocation(TopologyAwareEnsemblePlacementPolicy.java:779) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
   	at org.apache.bookkeeper.client.TopologyAwareEnsemblePlacementPolicy.createBookieNode(TopologyAwareEnsemblePlacementPolicy.java:775) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
   	at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl.replaceBookie(RackawareEnsemblePlacementPolicyImpl.java:450) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
   	at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicy.replaceBookie(RackawareEnsemblePlacementPolicy.java:117) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
   	at org.apache.bookkeeper.client.BookieWatcherImpl.replaceBookie(BookieWatcherImpl.java:289) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
   	at org.apache.bookkeeper.client.EnsembleUtils.replaceBookiesInEnsemble(EnsembleUtils.java:71) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
   	at org.apache.bookkeeper.client.ReadOnlyLedgerHandle.handleBookieFailure(ReadOnlyLedgerHandle.java:226) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
   	at org.apache.bookkeeper.client.PendingAddOp.writeComplete(PendingAddOp.java:377) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
   	at org.apache.bookkeeper.proto.BookieClientImpl$1.safeRun(BookieClientImpl.java:292) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
   	at org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36) [org.apache.bookkeeper-bookkeeper-common-4.10.0.jar:4.10.0]
   	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_242]
   	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_242]
   	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.45.Final.jar:4.1.45.Final]
   	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_242]
   ```
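   
   The traces above all end in `NetUtils.resolveNetworkLocation`, reached from `replaceBookie`. One possible reading, which is an assumption and not confirmed anywhere in this thread, is that the restarted bookie's hostname could not be resolved at that moment: the JDK's `InetSocketAddress.getAddress()` returns `null` for an unresolved address, so dereferencing it throws a `NullPointerException`. The standalone sketch below shows only that JDK behavior; `hostAddressOrName` is a hypothetical null-safe helper, not BookKeeper code and not the actual fix.
   
   ```java
   import java.net.InetAddress;
   import java.net.InetSocketAddress;

   public class UnresolvedAddressSketch {

       /** Hypothetical helper: falls back to the host name when no IP address is available. */
       public static String hostAddressOrName(InetSocketAddress addr) {
           InetAddress ip = addr.getAddress(); // null when the address is unresolved
           return ip != null ? ip.getHostAddress() : addr.getHostString();
       }

       public static void main(String[] args) {
           // createUnresolved never performs a DNS lookup, so this runs offline.
           InetSocketAddress bookie = InetSocketAddress.createUnresolved(
                   "pulsar-bookie-2.pulsar-bookie.default.svc.cluster.local", 3181);

           System.out.println("unresolved: " + bookie.isUnresolved());
           System.out.println("getAddress(): " + bookie.getAddress());
           // Calling bookie.getAddress().getHostAddress() here would throw a NullPointerException.
           System.out.println("null-safe: " + hostAddressOrName(bookie));
       }
   }
   ```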





[GitHub] [pulsar] michaeljmarshall commented on issue #6871: Broker stop serving a topic after a "Bookie handle is not available" error

Posted by GitBox <gi...@apache.org>.
michaeljmarshall commented on issue #6871:
URL: https://github.com/apache/pulsar/issues/6871#issuecomment-726288940


   > @rueian The bug has been fixed in bookkeeper 4.11 and we are going to bump the version in Pulsar.
   
   @sijie - I noticed that BookKeeper is still on 4.10.0 in this week's Pulsar release (2.6.2), but it is on 4.11.1 in master. Is there a plan to cut a new release with the updated BookKeeper version any time soon? Thanks.
   
   https://github.com/apache/pulsar/blob/v2.6.2/pom.xml#L158
   https://github.com/apache/pulsar/blob/master/pom.xml#L101
   
   

