You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@pulsar.apache.org by zhangao <ga...@qq.com.INVALID> on 2021/12/31 03:41:45 UTC
[Bug] Other brokers failed to acquire bundle ownership after unloading a namespace bundle
Hi,
Currently, I found a problem about bundle ownership acquire, After I unloaded a namespace bundle, I found these error log on other brokers:
```
2021-12-29 14:37:37.641 [metadata-store-6-1] WARN org.apache.pulsar.broker.lookup.TopicLookupBase - Failed to lookup null for topic persistent://public/data-channel/tet-partition-30 with error org.apache.pulsar.broker.PulsarServerException: Failed to acquire ownership for namespace bundle public/data-channel/0xebf3b108_0xf0000000 java.util.concurrent.CompletionException: org.apache.pulsar.broker.PulsarServerException: Failed to acquire ownership for namespace bundle public/data-channel/0xebf3b108_0xf0000000 at java.util.concurrent.CompletableFuture.encodeRelay(CompletableFuture.java:326) ~[?:1.8.0_102] at java.util.concurrent.CompletableFuture.completeRelay(CompletableFuture.java:338) ~[?:1.8.0_102] at java.util.concurrent.CompletableFuture.uniRelay(CompletableFuture.java:911) ~[?:1.8.0_102] at java.util.concurrent.CompletableFuture$UniRelay.tryFire(CompletableFuture.java:899) ~[?:1.8.0_102] at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474) ~[?:1.8.0_102] at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977) ~[?:1.8.0_102] at org.apache.pulsar.broker.namespace.NamespaceService.lambda$searchForCandidateBroker$15(NamespaceService.java:577) ~[org.apache.pulsar-pulsar-broker-2.9.1.jar:2.9.1] at java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:870) ~[?:1.8.0_102] at java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:852) ~[?:1.8.0_102] at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474) ~[?:1.8.0_102] at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977) ~[?:1.8.0_102] at org.apache.pulsar.metadata.coordination.impl.LockManagerImpl.lambda$acquireLock$2(LockManagerImpl.java:111) ~[org.apache.pulsar-pulsar-metadata-2.9.1.jar:2.9.1] at java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:870) ~[?:1.8.0_102] at java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:852) ~[?:1.8.0_102] at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474) ~[?:1.8.0_102] at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977) ~[?:1.8.0_102] at org.apache.pulsar.metadata.coordination.impl.ResourceLockImpl.lambda$acquire$4(ResourceLockImpl.java:134) ~[org.apache.pulsar-pulsar-metadata-2.9.1.jar:2.9.1] at java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:870) ~[?:1.8.0_102] at java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:852) ~[?:1.8.0_102] at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474) ~[?:1.8.0_102] at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962) ~[?:1.8.0_102] at org.apache.pulsar.metadata.impl.ZKMetadataStore.lambda$get$7(ZKMetadataStore.java:139) ~[org.apache.pulsar-pulsar-metadata-2.9.1.jar:2.9.1] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_102] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_102] at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.72.Final.jar:4.1.72.Final] at java.lang.Thread.run(Thread.java:745) [?:1.8.0_102] Caused by: org.apache.pulsar.broker.PulsarServerException: Failed to acquire ownership for namespace bundle public/data-channel/0xebf3b108_0xf0000000 ... 20 more Caused by: java.util.concurrent.CompletionException: org.apache.pulsar.metadata.api.MetadataStoreException$LockBusyException: Resource at /namespace/public/data-channel/0xebf3b108_0xf0000000 is already locked at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292) ~[?:1.8.0_102] at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308) ~[?:1.8.0_102] at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:593) ~[?:1.8.0_102] at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577) ~[?:1.8.0_102] ... 17 more Caused by: org.apache.pulsar.metadata.api.MetadataStoreException$LockBusyException: Resource at /namespace/public/data-channel/0xebf3b108_0xf0000000 is already locked at org.apache.pulsar.metadata.coordination.impl.ResourceLockImpl.lambda$doRevalidate$20(ResourceLockImpl.java:297) ~[org.apache.pulsar-pulsar-metadata-2.9.1.jar:2.9.1] at java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:952) ~[?:1.8.0_102] at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:926) ~[?:1.8.0_102] ... 7 more
```
The error logs said the resource is already locked, this means, the ZNODE of this namespace bundle created by previous broker was not removed yet.
Normally, broker would firstly query the bundle owner before tried to acquire it. the partial codes:
```
String path = ServiceUnitUtils.path(suName);
return lockManager.readLock(path);
```
If the ZNODE created by previous broker was not removed yet. Why broker got a result that no broker owned this bundle.
Re: [Bug] Other brokers failed to acquire bundle ownership after unloading a namespace bundle
Posted by Haiting Jiang <ji...@apache.org>.
Hi,
Please provide more information about this issue, like pulsar version, steps to steady reproduce this, the influence of this error.
Thanks,
Haiting Jiang
On 2021/12/31 03:41:45 zhangao wrote:
> Hi,
> Currently, I found a problem about bundle ownership acquire, After I unloaded a namespace bundle, I found these error log on other brokers:
>
>
> ```
> 2021-12-29 14:37:37.641 [metadata-store-6-1] WARN org.apache.pulsar.broker.lookup.TopicLookupBase - Failed to lookup null for topic persistent://public/data-channel/tet-partition-30 with error org.apache.pulsar.broker.PulsarServerException: Failed to acquire ownership for namespace bundle public/data-channel/0xebf3b108_0xf0000000 java.util.concurrent.CompletionException: org.apache.pulsar.broker.PulsarServerException: Failed to acquire ownership for namespace bundle public/data-channel/0xebf3b108_0xf0000000 at java.util.concurrent.CompletableFuture.encodeRelay(CompletableFuture.java:326) ~[?:1.8.0_102] at java.util.concurrent.CompletableFuture.completeRelay(CompletableFuture.java:338) ~[?:1.8.0_102] at java.util.concurrent.CompletableFuture.uniRelay(CompletableFuture.java:911) ~[?:1.8.0_102] at java.util.concurrent.CompletableFuture$UniRelay.tryFire(CompletableFuture.java:899) ~[?:1.8.0_102] at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474) ~[?:1.8.0_102] at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977) ~[?:1.8.0_102] at org.apache.pulsar.broker.namespace.NamespaceService.lambda$searchForCandidateBroker$15(NamespaceService.java:577) ~[org.apache.pulsar-pulsar-broker-2.9.1.jar:2.9.1] at java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:870) ~[?:1.8.0_102] at java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:852) ~[?:1.8.0_102] at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474) ~[?:1.8.0_102] at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977) ~[?:1.8.0_102] at org.apache.pulsar.metadata.coordination.impl.LockManagerImpl.lambda$acquireLock$2(LockManagerImpl.java:111) ~[org.apache.pulsar-pulsar-metadata-2.9.1.jar:2.9.1] at java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:870) ~[?:1.8.0_102] at java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:852) ~[?:1.8.0_102] at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474) ~[?:1.8.0_102] at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977) ~[?:1.8.0_102] at org.apache.pulsar.metadata.coordination.impl.ResourceLockImpl.lambda$acquire$4(ResourceLockImpl.java:134) ~[org.apache.pulsar-pulsar-metadata-2.9.1.jar:2.9.1] at java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:870) ~[?:1.8.0_102] at java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:852) ~[?:1.8.0_102] at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474) ~[?:1.8.0_102] at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962) ~[?:1.8.0_102] at org.apache.pulsar.metadata.impl.ZKMetadataStore.lambda$get$7(ZKMetadataStore.java:139) ~[org.apache.pulsar-pulsar-metadata-2.9.1.jar:2.9.1] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_102] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_102] at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.72.Final.jar:4.1.72.Final] at java.lang.Thread.run(Thread.java:745) [?:1.8.0_102] Caused by: org.apache.pulsar.broker.PulsarServerException: Failed to acquire ownership for namespace bundle public/data-channel/0xebf3b108_0xf0000000 ... 20 more Caused by: java.util.concurrent.CompletionException: org.apache.pulsar.metadata.api.MetadataStoreException$LockBusyException: Resource at /namespace/public/data-channel/0xebf3b108_0xf0000000 is already locked at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292) ~[?:1.8.0_102] at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308) ~[?:1.8.0_102] at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:593) ~[?:1.8.0_102] at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577) ~[?:1.8.0_102] ... 17 more Caused by: org.apache.pulsar.metadata.api.MetadataStoreException$LockBusyException: Resource at /namespace/public/data-channel/0xebf3b108_0xf0000000 is already locked at org.apache.pulsar.metadata.coordination.impl.ResourceLockImpl.lambda$doRevalidate$20(ResourceLockImpl.java:297) ~[org.apache.pulsar-pulsar-metadata-2.9.1.jar:2.9.1] at java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:952) ~[?:1.8.0_102] at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:926) ~[?:1.8.0_102] ... 7 more
> ```
> The error logs said the resource is already locked, this means, the ZNODE of this namespace bundle created by previous broker was not removed yet.
>
> Normally, broker would firstly query the bundle owner before tried to acquire it. the partial codes:
>
>
> ```
> String path = ServiceUnitUtils.path(suName);
> return lockManager.readLock(path);
> ```
>
>
>
>
> If the ZNODE created by previous broker was not removed yet. Why broker got a result that no broker owned this bundle.