Posted to commits@pulsar.apache.org by zh...@apache.org on 2020/10/16 12:49:35 UTC
[pulsar] branch master updated: Fix stuck lookup operations when the broker is starting up (#8273)
This is an automated email from the ASF dual-hosted git repository.
zhaijia pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/pulsar.git
The following commit(s) were added to refs/heads/master by this push:
new b57c163 Fix stuck lookup operations when the broker is starting up (#8273)
b57c163 is described below
commit b57c1630e2478755c05a147bfaf11d9a723cd28e
Author: Matteo Merli <mm...@apache.org>
AuthorDate: Fri Oct 16 05:49:02 2020 -0700
Fix stuck lookup operations when the broker is starting up (#8273)
Motivation
When the broker is starting up, it might start receiving lookup requests before all the components of the service are fully initialized. In this particular case a lookup fails with a NullPointerException (NPE) because the leader election service is not ready yet (it is instantiated after the broker service).
This NPE causes a series of rippling effects:
1. The futures for the requests that hit the NPE are never completed
2. They stay stale in the findingBundlesNotAuthoritative cache map forever
3. All other lookup requests piggy-back on these first futures (which will never complete)
4. We reach the max number of pending lookup requests, after which the broker rejects new lookups
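
The chain of effects described above can be modeled with a small, self-contained sketch. This is not the NamespaceService code: the class PendingLookupSketch, the Leader interface, the pending map, the guard flag, and the returned strings are all hypothetical names chosen for illustration, and the swallowed exception only approximates how the NPE was lost in the broker. It shows how a future that is never completed keeps later lookups for the same bundle stuck, and how completing it with Optional.empty() (the approach taken in this commit) lets the entry be cleaned up.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

public class PendingLookupSketch {
    // Hypothetical stand-in for the leader election service.
    interface Leader {
        boolean isLeader();
    }

    // Hypothetical stand-in for the map of in-flight lookups, keyed by bundle name.
    static final Map<String, CompletableFuture<Optional<String>>> pending = new ConcurrentHashMap<>();

    // Returns the in-flight future for the bundle if one exists, otherwise starts a new search.
    // The 'guard' flag toggles the null check so both behaviors can be compared.
    static CompletableFuture<Optional<String>> lookup(String bundle, Leader les, boolean guard) {
        CompletableFuture<Optional<String>> created = new CompletableFuture<>();
        CompletableFuture<Optional<String>> existing = pending.putIfAbsent(bundle, created);
        if (existing != null) {
            return existing; // piggy-back on the lookup that is already in flight
        }
        // The map entry is removed only when the future completes.
        created.whenComplete((result, error) -> pending.remove(bundle, created));
        try {
            search(les, created, guard);
        } catch (RuntimeException e) {
            // Assumption for this model: the exception is lost and the future is left incomplete.
        }
        return created;
    }

    static void search(Leader les, CompletableFuture<Optional<String>> future, boolean guard) {
        if (guard && les == null) {
            // Service not ready yet: complete with an empty result instead of failing.
            future.complete(Optional.empty());
            return;
        }
        boolean authoritative = les.isLeader(); // throws NPE when les is null and unguarded
        future.complete(Optional.of(authoritative ? "serve-locally" : "redirect-to-leader"));
    }

    public static void main(String[] args) {
        CompletableFuture<Optional<String>> stuck = lookup("bundle-1", null, false);
        System.out.println("unguarded, done: " + stuck.isDone());
        // Even with a ready service, the second request reuses the stuck future.
        System.out.println("same future reused: " + (lookup("bundle-1", () -> true, false) == stuck));

        pending.clear();
        CompletableFuture<Optional<String>> guarded = lookup("bundle-1", null, true);
        System.out.println("guarded, done: " + guarded.isDone() + ", pending entries: " + pending.size());
    }
}
```

In the unguarded case the entry never leaves the map, so every later request for the same bundle receives the same incomplete future. In the guarded case the future completes immediately with an empty result and the entry is removed, so a later request can start a fresh search once the service is ready.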
---
.../apache/pulsar/broker/namespace/NamespaceService.java | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/pulsar-broker/src/main/java/org/apache/pulsar/broker/namespace/NamespaceService.java b/pulsar-broker/src/main/java/org/apache/pulsar/broker/namespace/NamespaceService.java
index c00e802..511ea12 100644
--- a/pulsar-broker/src/main/java/org/apache/pulsar/broker/namespace/NamespaceService.java
+++ b/pulsar-broker/src/main/java/org/apache/pulsar/broker/namespace/NamespaceService.java
@@ -30,6 +30,7 @@ import org.apache.pulsar.broker.PulsarServerException;
import org.apache.pulsar.broker.PulsarService;
import org.apache.pulsar.broker.ServiceConfiguration;
import org.apache.pulsar.broker.admin.AdminResource;
+import org.apache.pulsar.broker.loadbalance.LeaderElectionService;
import org.apache.pulsar.broker.loadbalance.LoadManager;
import org.apache.pulsar.broker.loadbalance.ResourceUnit;
import org.apache.pulsar.broker.lookup.LookupResult;
@@ -404,7 +405,17 @@ public class NamespaceService {
return;
}
String candidateBroker = null;
- boolean authoritativeRedirect = pulsar.getLeaderElectionService().isLeader();
+
+ LeaderElectionService les = pulsar.getLeaderElectionService();
+ if (les == null) {
+ // The leader election service was not initialized yet. This can happen because the broker service is
+ // initialized first and it might start receiving lookup requests before the leader election service is
+ // fully initialized.
+ lookupFuture.complete(Optional.empty());
+ return;
+ }
+
+ boolean authoritativeRedirect = les.isLeader();
try {
// check if this is Heartbeat or SLAMonitor namespace