Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2021/09/23 13:17:37 UTC

[GitHub] [pulsar] otmanel31 opened a new issue #12160: [question] pulsar broker reboot

otmanel31 opened a new issue #12160:
URL: https://github.com/apache/pulsar/issues/12160

Hi,

To begin, I'm not able to label this issue as a question. Could you change the label?

We have been in production for almost a year now.

Today our Pulsar platform manages 26,000+ topics with 4 brokers, 4 proxies, 3 bookies and 3 ZooKeeper nodes.
The number of connections is below our forecast because our customer has had difficulty selling its safe-home services.
We run version 2.6.1 with load balancing enabled, on Kubernetes server version 1.18.20 (Rancher).

We are seeing RAM usage increase steadily on bookies and brokers.
We reboot the bookies one by one, and this does not cause any issues.

For brokers it is a different story. Two months ago we rebooted one broker (by deleting its pod at the Kubernetes/Rancher level, which automatically creates a new pod), and it remained without any connections for those 2 months despite load balancing.

Now we are back to a stable situation where each of the 4 brokers has about one fourth of the connections.

So, what is the procedure on a load-balanced Pulsar platform to reboot a single broker with almost no disturbance?

More broadly, is there any documentation available on operating a Pulsar platform?

Thank you for your advice :)

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [pulsar] otmanel31 commented on issue #12160: [question] pulsar broker reboot

Posted by GitBox <gi...@apache.org>.

otmanel31 commented on issue #12160:
URL: https://github.com/apache/pulsar/issues/12160#issuecomment-926136086


   Is there another way to get the ZooKeeper data? From the ZooKeeper logs, for example?



[GitHub] [pulsar] Shoothzj commented on issue #12160: [question] pulsar broker reboot

Posted by GitBox <gi...@apache.org>.

Shoothzj commented on issue #12160:
URL: https://github.com/apache/pulsar/issues/12160#issuecomment-925904433


   Via the ZooKeeper command-line client.



[GitHub] [pulsar] otmanel31 commented on issue #12160: [question] pulsar broker reboot

Posted by GitBox <gi...@apache.org>.

otmanel31 commented on issue #12160:
URL: https://github.com/apache/pulsar/issues/12160#issuecomment-925898900


   @Shoothzj, is the data in the ZooKeeper filesystem under /data? Or is it available via an HTTP request? Or is it in the log files under /data/zookeeper/version-2?



[GitHub] [pulsar] Shoothzj commented on issue #12160: [question] pulsar broker reboot

Posted by GitBox <gi...@apache.org>.

Shoothzj commented on issue #12160:
URL: https://github.com/apache/pulsar/issues/12160#issuecomment-925847417


   Could you please provide the ZooKeeper data at path `/loadbalance/brokers/{broker-num}`,
   like
   ![image](https://user-images.githubusercontent.com/12933197/134521857-a9076238-8687-404d-b01b-f019158297c2.png)
   



[GitHub] [pulsar] otmanel31 commented on issue #12160: [question] pulsar broker reboot

Posted by GitBox <gi...@apache.org>.

otmanel31 commented on issue #12160:
URL: https://github.com/apache/pulsar/issues/12160#issuecomment-933274190


   Another error, later:
   10:26:36.465 [ForkJoinPool.commonPool-worker-176] WARN org.apache.pulsar.broker.lookup.TopicLookupBase - Failed to lookup 44a6de43-4e73-42a8-9262-0cc4891c42fd for topic persistent://my_tenant/my_ns/data_down-bf558c9c-a2e6-4961-a627-87229c619abf with error java.util.concurrent.RejectedExecutionException: Thread limit exceeded replacing blocked worker
   
   Maybe, since there is no response from ZooKeeper, the number of threads keeps increasing on the broker side?
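
One way to check whether these rejections arrive in bursts is to count them per minute in the broker log. This is only a minimal sketch: it assumes every log line starts with an `HH:MM:SS.mmm` timestamp as in the excerpt above, and the function name and log path are hypothetical, not part of Pulsar.

```bash
# Count "Thread limit exceeded" rejections per minute in a broker log.
# Assumes each log line starts with an HH:MM:SS.mmm timestamp.
# The function name is illustrative only.
rejections_per_minute() {
  local logfile="$1"
  # Keep only the rejection lines, bucket them by HH:MM, then sort by time.
  grep -F 'Thread limit exceeded replacing blocked worker' "$logfile" \
    | awk '{ count[substr($1, 1, 5)]++ } END { for (m in count) print m, count[m] }' \
    | sort
}

# Usage (hypothetical path):
#   rejections_per_minute /pulsar/logs/broker.log
```

Each output line holds a minute and the number of rejected lookups seen in that minute, which can then be compared with the timestamps of the ZooKeeper timeouts.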



[GitHub] [pulsar] otmanel31 commented on issue #12160: [question] pulsar broker reboot

Posted by GitBox <gi...@apache.org>.

otmanel31 commented on issue #12160:
URL: https://github.com/apache/pulsar/issues/12160#issuecomment-927898317


   Hi, please find the requested data in the files below, one file per broker. We currently have 4 brokers:
   - https://github.com/otmanel31/share/blob/master/pulsar_zookeeper_3_request.json
   - https://github.com/otmanel31/share/blob/master/requet_broker_0.json
   - https://github.com/otmanel31/share/blob/master/requet_broker_1.json
   - https://github.com/otmanel31/share/blob/master/requet_broker_2.json



[GitHub] [pulsar] github-actions[bot] commented on issue #12160: [question] pulsar broker reboot

Posted by GitBox <gi...@apache.org>.

github-actions[bot] commented on issue #12160:
URL: https://github.com/apache/pulsar/issues/12160#issuecomment-1054902865


   The issue had no activity for 30 days, mark with Stale label.



[GitHub] [pulsar] otmanel31 commented on issue #12160: [question] pulsar broker reboot

Posted by GitBox <gi...@apache.org>.

otmanel31 commented on issue #12160:
URL: https://github.com/apache/pulsar/issues/12160#issuecomment-925934160


   Sorry, but I didn't find any ZooKeeper client in my Pulsar toolset deployment, only pulsar-admin, the client CLI, etc. Let me check whether I can install zkCli.sh.




[GitHub] [pulsar] otmanel31 edited a comment on issue #12160: [question] pulsar broker reboot

Posted by GitBox <gi...@apache.org>.

otmanel31 edited a comment on issue #12160:
URL: https://github.com/apache/pulsar/issues/12160#issuecomment-933259609


   1) Also, two days ago we faced another issue (which triggered a shutdown of our deployment), where brokers timed out on ZooKeeper calls. Before the first exception, I caught a lot of INFO logs within a few seconds, like the ones below, for each topic on all brokers:
   - 09:49:31.186 [ForkJoinPool.commonPool-worker-1] INFO  org.apache.pulsar.broker.service.AbstractTopic - Disabling publish throttling for persistent://my_tenant/my_ns/action_down-f640fa62-3245-41af-81f5-edf868118a9a
   09:49:31.186 [ForkJoinPool.commonPool-worker-1] INFO  org.apache.pulsar.broker.service.persistent.PersistentTopic - [persistent://my_tenant/my_ns/action_down-f640fa62-3245-41af-81f5-edf868118a9a] Policies updated successfully
   For your information, we manage more than 25,000 topics, so we get these 2 lines for each active topic.
   Is any request sent to ZooKeeper when these 2 log lines appear?
   
   2) Then the first exception thrown is:
   
   09:50:06.167 [pulsar-ordered-OrderedExecutor-1-0] WARN  org.apache.pulsar.broker.service.BrokerService - Got exception when reading persistence policy for persistent://my_tenant/my_ns/data_down-5034936e-c9bb-4720-874d-e7e6e5e6d897: null
   java.util.concurrent.TimeoutException: null
   at java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1784) ~[?:1.8.0_252]
   at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1928) ~[?:1.8.0_252]
   at org.apache.pulsar.zookeeper.ZooKeeperDataCache.get(ZooKeeperDataCache.java:97) ~[org.apache.pulsar-pulsar-zookeeper-utils-2.6.1.jar:2.6.1]
   at org.apache.pulsar.broker.service.BrokerService.lambda$getManagedLedgerConfig$34(BrokerService.java:1074) ~[org.apache.pulsar-pulsar-broker-2.6.1.jar:2.6.1]
   at org.apache.bookkeeper.mledger.util.SafeRun$2.safeRun(SafeRun.java:49) [org.apache.pulsar-managed-ledger-2.6.1.jar:2.6.1]
   at org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36) [org.apache.bookkeeper-bookkeeper-common-4.10.0.jar:4.10.0]
   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_252]
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_252]
   at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.48.Final.jar:4.1.48.Final]
   at java.lang.Thread.run(Thread.java:748) [?:1.8.0_252]
   09:50:14.820 [prometheus-stats-43-1] ERROR org.apache.pulsar.broker.service.BacklogQuotaManager - Failed to read policies data, will apply the default backlog quota: namespace=my_tenant/my_ns
   java.util.concurrent.TimeoutException: null
   at java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1784) ~[?:1.8.0_252]
   at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1928) ~[?:1.8.0_252]
   at org.apache.pulsar.zookeeper.ZooKeeperDataCache.get(ZooKeeperDataCache.java:97) ~[org.apache.pulsar-pulsar-zookeeper-utils-2.6.1.jar:2.6.1]
   at org.apache.pulsar.broker.service.BacklogQuotaManager.getBacklogQuota(BacklogQuotaManager.java:64) ~[org.apache.pulsar-pulsar-broker-2.6.1.jar:2.6.1]
   at org.apache.pulsar.broker.service.persistent.PersistentTopic.getBacklogQuota(PersistentTopic.java:1859) ~[org.apache.pulsar-pulsar-broker-2.6.1.jar:2.6.1]
   at org.apache.pulsar.broker.stats.prometheus.NamespaceStatsAggregator.getTopicStats(NamespaceStatsAggregator.java:97) ~[org.apache.pulsar-pulsar-broker-2.6.1.jar:2.6.1]
   at org.apache.pulsar.broker.stats.prometheus.NamespaceStatsAggregator.lambda$null$0(NamespaceStatsAggregator.java:65) ~[org.apache.pulsar-pulsar-broker-2.6.1.jar:2.6.1]
   at org.apache.pulsar.common.util.collections.ConcurrentOpenHashMap$Section.forEach(ConcurrentOpenHashMap.java:388) ~[org.apache.pulsar-pulsar-common-2.6.1.jar:2.6.1]
   at org.apache.pulsar.common.util.collections.ConcurrentOpenHashMap.forEach(ConcurrentOpenHashMap.java:160) ~[org.apache.pulsar-pulsar-common-2.6.1.jar:2.6.1]
   at org.apache.pulsar.broker.stats.prometheus.NamespaceStatsAggregator.lambda$null$1(NamespaceStatsAggregator.java:64) ~[org.apache.pulsar-pulsar-broker-2.6.1.jar:2.6.1]
   at org.apache.pulsar.common.util.collections.ConcurrentOpenHashMap$Section.forEach(ConcurrentOpenHashMap.java:388) ~[org.apache.pulsar-pulsar-common-2.6.1.jar:2.6.1]
   at org.apache.pulsar.common.util.collections.ConcurrentOpenHashMap.forEach(ConcurrentOpenHashMap.java:160) ~[org.apache.pulsar-pulsar-common-2.6.1.jar:2.6.1]
   at org.apache.pulsar.broker.stats.prometheus.NamespaceStatsAggregator.lambda$generate$2(NamespaceStatsAggregator.java:63) ~[org.apache.pulsar-pulsar-broker-2.6.1.jar:2.6.1]
   at org.apache.pulsar.common.util.collections.ConcurrentOpenHashMap$Section.forEach(ConcurrentOpenHashMap.java:388) ~[org.apache.pulsar-pulsar-common-2.6.1.jar:2.6.1]
   at org.apache.pulsar.common.util.collections.ConcurrentOpenHashMap.forEach(ConcurrentOpenHashMap.java:160) ~[org.apache.pulsar-pulsar-common-2.6.1.jar:2.6.1]
   at org.apache.pulsar.broker.stats.prometheus.NamespaceStatsAggregator.generate(NamespaceStatsAggregator.java:60) ~[org.apache.pulsar-pulsar-broker-2.6.1.jar:2.6.1]
   at org.apache.pulsar.broker.stats.prometheus.PrometheusMetricsGenerator.generate(PrometheusMetricsGenerator.java:85) ~[org.apache.pulsar-pulsar-broker-2.6.1.jar:2.6.1]
   at org.apache.pulsar.broker.stats.prometheus.PrometheusMetricsServlet.lambda$doGet$0(PrometheusMetricsServlet.java:70) ~[org.apache.pulsar-pulsar-broker-2.6.1.jar:2.6.1]
   at org.apache.bookkeeper.mledger.util.SafeRun$1.safeRun(SafeRun.java:32) [org.apache.pulsar-managed-ledger-2.6.1.jar:2.6.1]
   at org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36) [org.apache.bookkeeper-bookkeeper-common-4.10.0.jar:4.10.0]
   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_252]
   at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_252]
   at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_252]
   at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [?:1.8.0_252]
   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_252]
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_252]
   at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.48.Final.jar:4.1.48.Final]
   at java.lang.Thread.run(Thread.java:748) [?:1.8.0_252] at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_252]
   at org.apache.pulsar.broker.stats.prometheus.NamespaceStatsAggregator.generate(NamespaceStatsAggregator.java:60) ~[org.apache.pulsar-pulsar-broker-2.6.1.jar:2.6.1]
   at org.apache.pulsar.broker.stats.prometheus.NamespaceStatsAggregator.lambda$generate$2(NamespaceStatsAggregator.java:63) ~[org.apache.pulsar-pulsar-broker-2.6.1.jar:2.6.1]
   java.util.concurrent.TimeoutException: null
   at org.apache.pulsar.broker.service.BrokerService.lambda$getManagedLedgerConfig$34(BrokerService.java:1074) ~[org.apache.pulsar-pulsar-broker-2.6.1.jar:2.6.1]
   java.util.concurrent.TimeoutException: null
   at org.apache.bookkeeper.mledger.util.SafeRun$1.safeRun(SafeRun.java:32) [org.apache.pulsar-managed-ledger-2.6.1.jar:2.6.1]
   at org.apache.pulsar.broker.stats.prometheus.PrometheusMetricsServlet.lambda$doGet$0(PrometheusMetricsServlet.java:70) ~[org.apache.pulsar-pulsar-broker-2.6.1.jar:2.6.1]
   
   This exception was thrown only on my broker-0 (we have 4 running brokers and 4 ZooKeeper nodes).
   Then, as broker-0 was down, it seems load balancing did not work correctly and the load was not redistributed to the other brokers. It seems we need to restart all brokers to get load balancing working again.
   
   



[GitHub] [pulsar] Shoothzj commented on issue #12160: [question] pulsar broker reboot

Posted by GitBox <gi...@apache.org>.

Shoothzj commented on issue #12160:
URL: https://github.com/apache/pulsar/issues/12160#issuecomment-925819043


   Which `LoadSheddingStrategy` are you using? It seems the load is not balanced across the brokers.



[GitHub] [pulsar] otmanel31 commented on issue #12160: [question] pulsar broker reboot

Posted by GitBox <gi...@apache.org>.

otmanel31 commented on issue #12160:
URL: https://github.com/apache/pulsar/issues/12160#issuecomment-925829471


   I found loadBalancerLoadSheddingStrategy=org.apache.pulsar.broker.loadbalance.impl.OverloadShedder, but no loadSheddingStrategy.
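
To see every load-balancer setting at once rather than searching for a single key, the broker configuration can be filtered by prefix. This is a small sketch that assumes the settings live in a properties-style file such as `conf/broker.conf`; on Kubernetes they may instead come from a ConfigMap or environment variables. The function name is hypothetical.

```bash
# Print all active (non-commented) loadBalancer* settings from a
# properties-style broker configuration file.
# The default path conf/broker.conf is an assumption about the deployment.
show_load_balancer_settings() {
  local conf="${1:-conf/broker.conf}"
  grep -E '^[[:space:]]*loadBalancer[A-Za-z0-9_.]*=' "$conf" \
    | sed 's/^[[:space:]]*//'
}

# Usage:
#   show_load_balancer_settings conf/broker.conf
```

Commented-out lines are skipped on purpose, so the output reflects only the values set in the file.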
   



[GitHub] [pulsar] Shoothzj commented on issue #12160: [question] pulsar broker reboot

Posted by GitBox <gi...@apache.org>.

Shoothzj commented on issue #12160:
URL: https://github.com/apache/pulsar/issues/12160#issuecomment-926280863


   This command should work, but until #12102 is released you need to copy a jline jar into the `pulsar/lib` directory; it can easily be found in the ZooKeeper binary package or elsewhere.
   ```bash
   bin/pulsar zookeeper-shell
   ```
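
With the shell available, the load data requested earlier in the thread can also be read non-interactively. This is a sketch under stated assumptions: that `bin/pulsar zookeeper-shell` hands its arguments to the standard ZooKeeper CLI (so `-server` followed by a single command works), and that ZooKeeper is reachable at the placeholder address `localhost:2181`. The helper names and the `ZK_SHELL` and `ZK_SERVER` variables are hypothetical.

```bash
# List the brokers currently registered under the load-balancer path.
list_lb_brokers() {
  # ZK_SHELL is deliberately unquoted so the default splits into two words.
  ${ZK_SHELL:-bin/pulsar zookeeper-shell} -server "${ZK_SERVER:-localhost:2181}" \
    ls /loadbalance/brokers
}

# Print the load report stored for one broker.
# $1 is a child name as returned by list_lb_brokers.
get_lb_broker_data() {
  ${ZK_SHELL:-bin/pulsar zookeeper-shell} -server "${ZK_SERVER:-localhost:2181}" \
    get "/loadbalance/brokers/$1"
}

# Usage (placeholder address):
#   ZK_SERVER=my-zookeeper:2181 list_lb_brokers
```

Running `get_lb_broker_data` once per name printed by `list_lb_brokers` gives one report per broker.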

