Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2021/09/23 13:17:37 UTC
[GitHub] [pulsar] otmanel31 opened a new issue #12160: [question] pulsar broker reboot
otmanel31 opened a new issue #12160:
URL: https://github.com/apache/pulsar/issues/12160
Hi,
To begin, I'm not able to label this issue as a question. Could you add the label?
We have been in production for almost a year now.
Today our Pulsar platform manages 26,000+ topics with 4 brokers, 4 proxies, 3 bookies and 3 ZooKeeper nodes.
The number of connections is below our forecast because our customer has had issues selling its safe home services.
We run version 2.6.1 in load-balancing mode on Kubernetes server version 1.18.20 (Rancher).
We are encountering RAM usage issues: memory usage grows steadily on bookies and brokers.
We reboot the bookies one by one, and this does not lead to any issues.
For brokers it is a different story. Two months ago we rebooted one broker (by deleting its pod at the Kubernetes/Rancher level, which automatically creates a new pod), and for those 2 months it remained without any connections despite the load-balancing mode.
Now we are back to a stable situation where each of the 4 brokers has one fourth of the connections.
So, what is the procedure on a load-balanced Pulsar platform to reboot a single broker with almost no disturbance?
More broadly, is there any documentation available on operating a Pulsar platform?
Thank you for your advice :)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [pulsar] otmanel31 commented on issue #12160: [question] pulsar broker reboot
Posted by GitBox <gi...@apache.org>.
otmanel31 commented on issue #12160:
URL: https://github.com/apache/pulsar/issues/12160#issuecomment-926136086
Is there another way to get the ZooKeeper data? From the ZooKeeper logs, for example?
[GitHub] [pulsar] Shoothzj commented on issue #12160: [question] pulsar broker reboot
Posted by GitBox <gi...@apache.org>.
Shoothzj commented on issue #12160:
URL: https://github.com/apache/pulsar/issues/12160#issuecomment-925904433
Via the ZooKeeper command-line client.
[GitHub] [pulsar] otmanel31 commented on issue #12160: [question] pulsar broker reboot
Posted by GitBox <gi...@apache.org>.
otmanel31 commented on issue #12160:
URL: https://github.com/apache/pulsar/issues/12160#issuecomment-925898900
@Shoothzj, is the data in the ZooKeeper filesystem under /data? Or is it available via an HTTP request? Or is it the content of the log files under /data/zookeeper/version-2?
[GitHub] [pulsar] Shoothzj commented on issue #12160: [question] pulsar broker reboot
Posted by GitBox <gi...@apache.org>.
Shoothzj commented on issue #12160:
URL: https://github.com/apache/pulsar/issues/12160#issuecomment-925847417
Could you please provide the ZooKeeper data at path `/loadbalance/brokers/{broker-num}`, like this:
![image](https://user-images.githubusercontent.com/12933197/134521857-a9076238-8687-404d-b01b-f019158297c2.png)
[GitHub] [pulsar] otmanel31 commented on issue #12160: [question] pulsar broker reboot
Posted by GitBox <gi...@apache.org>.
otmanel31 commented on issue #12160:
URL: https://github.com/apache/pulsar/issues/12160#issuecomment-933274190
Another error, later:
10:26:36.465 [ForkJoinPool.commonPool-worker-176] WARN org.apache.pulsar.broker.lookup.TopicLookupBase - Failed to lookup 44a6de43-4e73-42a8-9262-0cc4891c42fd for topic persistent://my_tenant/my_ns/data_down-bf558c9c-a2e6-4961-a627-87229c619abf with error java.util.concurrent.RejectedExecutionException: Thread limit exceeded replacing blocked worker
Maybe, since ZooKeeper does not respond, the number of threads keeps increasing on the broker side?
[GitHub] [pulsar] otmanel31 commented on issue #12160: [question] pulsar broker reboot
Posted by GitBox <gi...@apache.org>.
otmanel31 commented on issue #12160:
URL: https://github.com/apache/pulsar/issues/12160#issuecomment-927898317
Hi, please find the requested data in the files below, one file per broker. We currently have 4 brokers:
- https://github.com/otmanel31/share/blob/master/pulsar_zookeeper_3_request.json
- https://github.com/otmanel31/share/blob/master/requet_broker_0.json
- https://github.com/otmanel31/share/blob/master/requet_broker_1.json
- https://github.com/otmanel31/share/blob/master/requet_broker_2.json
[GitHub] [pulsar] github-actions[bot] commented on issue #12160: [question] pulsar broker reboot
Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on issue #12160:
URL: https://github.com/apache/pulsar/issues/12160#issuecomment-1054902865
The issue had no activity for 30 days, mark with Stale label.
[GitHub] [pulsar] otmanel31 commented on issue #12160: [question] pulsar broker reboot
Posted by GitBox <gi...@apache.org>.
otmanel31 commented on issue #12160:
URL: https://github.com/apache/pulsar/issues/12160#issuecomment-925934160
Sorry, but I didn't find any ZooKeeper client in my Pulsar toolset deployment, only pulsar-admin, the client CLI, etc. Let me check whether I can install zkCli.sh.
[GitHub] [pulsar] otmanel31 edited a comment on issue #12160: [question] pulsar broker reboot
Posted by GitBox <gi...@apache.org>.
otmanel31 edited a comment on issue #12160:
URL: https://github.com/apache/pulsar/issues/12160#issuecomment-933259609
1) Also, two days ago we faced another issue (which triggered a shutdown of our deployment), where brokers timed out on ZooKeeper calls. Before the first exception, I caught a lot of INFO logs within a few seconds, like the ones below, for each topic on all brokers:
- 09:49:31.186 [ForkJoinPool.commonPool-worker-1] INFO org.apache.pulsar.broker.service.AbstractTopic - Disabling publish throttling for persistent://my_tenant/my_ns/action_down-f640fa62-3245-41af-81f5-edf868118a9a
09:49:31.186 [ForkJoinPool.commonPool-worker-1] INFO org.apache.pulsar.broker.service.persistent.PersistentTopic - [persistent://my_tenant/my_ns/action_down-f640fa62-3245-41af-81f5-edf868118a9a] Policies updated successfully
For your information, we manage more than 25,000 topics, so we get these 2 lines for each active topic.
Is any request sent to ZooKeeper when these 2 log lines appear?
2) Then the first exception thrown is:
09:50:06.167 [pulsar-ordered-OrderedExecutor-1-0] WARN org.apache.pulsar.broker.service.BrokerService - Got exception when reading persistence policy for persistent://my_tenant/my_ns/data_down-5034936e-c9bb-4720-874d-e7e6e5e6d897: null
java.util.concurrent.TimeoutException: null
at java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1784) ~[?:1.8.0_252]
at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1928) ~[?:1.8.0_252]
at org.apache.pulsar.zookeeper.ZooKeeperDataCache.get(ZooKeeperDataCache.java:97) ~[org.apache.pulsar-pulsar-zookeeper-utils-2.6.1.jar:2.6.1]
at org.apache.pulsar.broker.service.BrokerService.lambda$getManagedLedgerConfig$34(BrokerService.java:1074) ~[org.apache.pulsar-pulsar-broker-2.6.1.jar:2.6.1]
at org.apache.bookkeeper.mledger.util.SafeRun$2.safeRun(SafeRun.java:49) [org.apache.pulsar-managed-ledger-2.6.1.jar:2.6.1]
at org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36) [org.apache.bookkeeper-bookkeeper-common-4.10.0.jar:4.10.0]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_252]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_252]
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.48.Final.jar:4.1.48.Final]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_252]
09:50:14.820 [prometheus-stats-43-1] ERROR org.apache.pulsar.broker.service.BacklogQuotaManager - Failed to read policies data, will apply the default backlog quota: namespace=my_tenant/my_ns
java.util.concurrent.TimeoutException: null
at java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1784) ~[?:1.8.0_252]
at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1928) ~[?:1.8.0_252]
at org.apache.pulsar.zookeeper.ZooKeeperDataCache.get(ZooKeeperDataCache.java:97) ~[org.apache.pulsar-pulsar-zookeeper-utils-2.6.1.jar:2.6.1]
at org.apache.pulsar.broker.service.BacklogQuotaManager.getBacklogQuota(BacklogQuotaManager.java:64) ~[org.apache.pulsar-pulsar-broker-2.6.1.jar:2.6.1]
at org.apache.pulsar.broker.service.persistent.PersistentTopic.getBacklogQuota(PersistentTopic.java:1859) ~[org.apache.pulsar-pulsar-broker-2.6.1.jar:2.6.1]
at org.apache.pulsar.broker.stats.prometheus.NamespaceStatsAggregator.getTopicStats(NamespaceStatsAggregator.java:97) ~[org.apache.pulsar-pulsar-broker-2.6.1.jar:2.6.1]
at org.apache.pulsar.broker.stats.prometheus.NamespaceStatsAggregator.lambda$null$0(NamespaceStatsAggregator.java:65) ~[org.apache.pulsar-pulsar-broker-2.6.1.jar:2.6.1]
at org.apache.pulsar.common.util.collections.ConcurrentOpenHashMap$Section.forEach(ConcurrentOpenHashMap.java:388) ~[org.apache.pulsar-pulsar-common-2.6.1.jar:2.6.1]
at org.apache.pulsar.common.util.collections.ConcurrentOpenHashMap.forEach(ConcurrentOpenHashMap.java:160) ~[org.apache.pulsar-pulsar-common-2.6.1.jar:2.6.1]
at org.apache.pulsar.broker.stats.prometheus.NamespaceStatsAggregator.lambda$null$1(NamespaceStatsAggregator.java:64) ~[org.apache.pulsar-pulsar-broker-2.6.1.jar:2.6.1]
at org.apache.pulsar.common.util.collections.ConcurrentOpenHashMap$Section.forEach(ConcurrentOpenHashMap.java:388) ~[org.apache.pulsar-pulsar-common-2.6.1.jar:2.6.1]
at org.apache.pulsar.common.util.collections.ConcurrentOpenHashMap.forEach(ConcurrentOpenHashMap.java:160) ~[org.apache.pulsar-pulsar-common-2.6.1.jar:2.6.1]
at org.apache.pulsar.broker.stats.prometheus.NamespaceStatsAggregator.lambda$generate$2(NamespaceStatsAggregator.java:63) ~[org.apache.pulsar-pulsar-broker-2.6.1.jar:2.6.1]
at org.apache.pulsar.common.util.collections.ConcurrentOpenHashMap$Section.forEach(ConcurrentOpenHashMap.java:388) ~[org.apache.pulsar-pulsar-common-2.6.1.jar:2.6.1]
at org.apache.pulsar.common.util.collections.ConcurrentOpenHashMap.forEach(ConcurrentOpenHashMap.java:160) ~[org.apache.pulsar-pulsar-common-2.6.1.jar:2.6.1]
at org.apache.pulsar.broker.stats.prometheus.NamespaceStatsAggregator.generate(NamespaceStatsAggregator.java:60) ~[org.apache.pulsar-pulsar-broker-2.6.1.jar:2.6.1]
at org.apache.pulsar.broker.stats.prometheus.PrometheusMetricsGenerator.generate(PrometheusMetricsGenerator.java:85) ~[org.apache.pulsar-pulsar-broker-2.6.1.jar:2.6.1]
at org.apache.pulsar.broker.stats.prometheus.PrometheusMetricsServlet.lambda$doGet$0(PrometheusMetricsServlet.java:70) ~[org.apache.pulsar-pulsar-broker-2.6.1.jar:2.6.1]
at org.apache.bookkeeper.mledger.util.SafeRun$1.safeRun(SafeRun.java:32) [org.apache.pulsar-managed-ledger-2.6.1.jar:2.6.1]
at org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36) [org.apache.bookkeeper-bookkeeper-common-4.10.0.jar:4.10.0]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_252]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_252]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_252]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [?:1.8.0_252]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_252]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_252]
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.48.Final.jar:4.1.48.Final]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_252]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_252]
at org.apache.pulsar.broker.stats.prometheus.NamespaceStatsAggregator.generate(NamespaceStatsAggregator.java:60) ~[org.apache.pulsar-pulsar-broker-2.6.1.jar:2.6.1]
at org.apache.pulsar.broker.stats.prometheus.NamespaceStatsAggregator.lambda$generate$2(NamespaceStatsAggregator.java:63) ~[org.apache.pulsar-pulsar-broker-2.6.1.jar:2.6.1]
java.util.concurrent.TimeoutException: null
at org.apache.pulsar.broker.service.BrokerService.lambda$getManagedLedgerConfig$34(BrokerService.java:1074) ~[org.apache.pulsar-pulsar-broker-2.6.1.jar:2.6.1]
java.util.concurrent.TimeoutException: null
at org.apache.bookkeeper.mledger.util.SafeRun$1.safeRun(SafeRun.java:32) [org.apache.pulsar-managed-ledger-2.6.1.jar:2.6.1]
at org.apache.pulsar.broker.stats.prometheus.PrometheusMetricsServlet.lambda$doGet$0(PrometheusMetricsServlet.java:70) ~[org.apache.pulsar-pulsar-broker-2.6.1.jar:2.6.1]
This exception was thrown only on my broker-0 (we have 4 running brokers and 4 ZooKeeper nodes).
Then, as my broker-0 was down, it seems load balancing did not work correctly and the load was not dispatched to the other brokers. It seems we need to restart all brokers to get load balancing working again.
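To check whether load really stays off a broker, one option is to list the namespace bundles each broker owns. The sketch below assumes `pulsar-admin` is already configured for the cluster and that `brokers list` prints one broker address per line; the function name and the `PULSAR_ADMIN` variable are hypothetical helpers, not something from this thread.

```bash
# Sketch: show which namespace bundles each active broker currently owns.
# Assumptions (not from the thread): pulsar-admin is configured for the
# cluster, and `brokers list` prints one broker address per line.
PULSAR_ADMIN="${PULSAR_ADMIN:-pulsar-admin}"

print_broker_ownership() {
  local cluster="$1" broker
  # Strip quotes, commas, brackets and spaces so only the address remains.
  "$PULSAR_ADMIN" brokers list "$cluster" | tr -d '" ,[]' | while read -r broker; do
    [ -n "$broker" ] || continue
    echo "== $broker"
    # Bundles owned by this broker; stdin is detached so the inner call
    # cannot consume the remaining broker list.
    "$PULSAR_ADMIN" brokers namespaces "$cluster" --url "$broker" </dev/null
  done
}
```

Calling `print_broker_ownership <cluster-name>` with the real cluster name prints one section per broker. A broker whose section is empty owns no bundles, which would match a broker that stays without connections.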
[GitHub] [pulsar] Shoothzj commented on issue #12160: [question] pulsar broker reboot
Posted by GitBox <gi...@apache.org>.
Shoothzj commented on issue #12160:
URL: https://github.com/apache/pulsar/issues/12160#issuecomment-925819043
Which `LoadSheddingStrategy` are you using? It seems the load is not balanced across the five brokers.
[GitHub] [pulsar] otmanel31 commented on issue #12160: [question] pulsar broker reboot
Posted by GitBox <gi...@apache.org>.
otmanel31 commented on issue #12160:
URL: https://github.com/apache/pulsar/issues/12160#issuecomment-925829471
I found loadBalancerLoadSheddingStrategy=org.apache.pulsar.broker.loadbalance.impl.OverloadShedder, but no loadSheddingStrategy.
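For reference, a minimal sketch of how the active load balancer settings could be listed from a broker configuration file. The default path `conf/broker.conf` and the function name are assumptions; in a Kubernetes deployment the file may live elsewhere or be generated from environment variables.

```bash
# List the active (non-commented) load balancer settings in a broker config file.
# The default path is an assumption; pass the real location as the first argument.
show_lb_settings() {
  local conf="${1:-conf/broker.conf}"
  grep -E '^[[:space:]]*loadBalancer[A-Za-z]*=' "$conf"
}
```

Running `show_lb_settings /path/to/broker.conf` prints every uncommented `loadBalancer...=` line, including the shedding strategy.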
[GitHub] [pulsar] Shoothzj commented on issue #12160: [question] pulsar broker reboot
Posted by GitBox <gi...@apache.org>.
Shoothzj commented on issue #12160:
URL: https://github.com/apache/pulsar/issues/12160#issuecomment-926280863
This command should work, but until #12102 is released you need to copy a jline jar into the `pulsar/lib` directory; it can easily be found in the ZooKeeper binary package or elsewhere.
```bash
bin/pulsar zookeeper-shell
```
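Once the shell works, the load data requested earlier in the thread can be read from `/loadbalance/brokers`. The following is a sketch under assumptions: the `ZK_SERVER` address, the `PULSAR_HOME` location, and the helper function names are placeholders, not values from this thread.

```bash
# Sketch: read broker load reports from ZooKeeper through the Pulsar wrapper.
# Assumptions (not from the thread): ZK_SERVER address and PULSAR_HOME location.
ZK_SERVER="${ZK_SERVER:-localhost:2181}"
PULSAR_HOME="${PULSAR_HOME:-/pulsar}"

# Build the znode path that holds one broker's load report.
broker_znode() {
  printf '/loadbalance/brokers/%s\n' "$1"
}

# Run a single ZooKeeper CLI command non-interactively.
zk() {
  "$PULSAR_HOME/bin/pulsar" zookeeper-shell -server "$ZK_SERVER" "$@"
}

# Print the load report stored for one broker znode.
dump_broker_load() {
  zk get "$(broker_znode "$1")"
}
```

`zk ls /loadbalance/brokers` lists the registered broker znodes; passing one of the printed names to `dump_broker_load` prints that broker's report.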