You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2022/03/29 14:38:55 UTC
[GitHub] [pinot] walterddr opened a new issue #8431: Potential controller start/stop race condition
walterddr opened a new issue #8431:
URL: https://github.com/apache/pinot/issues/8431
When a test controller is restarted (such as in `PinotTaskManagerStatelessTest`), the controller will be come unavailable for ~1sec until it returns back to normal.
see code block:
<details>
```
Connected to the target VM, address: '127.0.0.1:54163', transport: 'socket'
Mar 28, 2022 2:03:04 PM org.glassfish.grizzly.http.server.NetworkListener start
INFO: Started listener bound to [0.0.0.0:18998]
Mar 28, 2022 2:03:04 PM org.glassfish.grizzly.http.server.HttpServer start
INFO: [HttpServer] Started.
Mar 28, 2022 2:03:10 PM org.glassfish.grizzly.http.server.NetworkListener shutdownNow
INFO: Stopped listener bound to [0.0.0.0:18998]
14:03:10.409 [HelixController-pipeline-task-PinotTaskManagerStatelessTest-(b0992338_TASK)] ERROR org.apache.helix.controller.GenericHelixController - Cluster manager: localhost_18998 is not leader for PinotTaskManagerStatelessTest. Pipeline will not be invoked
Mar 28, 2022 2:03:11 PM org.glassfish.grizzly.http.server.NetworkListener start
INFO: Started listener bound to [0.0.0.0:18998]
Mar 28, 2022 2:03:11 PM org.glassfish.grizzly.http.server.HttpServer start
INFO: [HttpServer-1] Started.
14:03:36.069 [ZkClient-EventThread-113-localhost:2191] ERROR org.apache.pinot.controller.helix.core.realtime.PinotRealtimeSegmentManager - Caught exception while processing change for path /PinotTaskManagerStatelessTest/PROPERTYSTORE/CONFIGS/TABLE
java.lang.IllegalStateException: ZkClient already closed!
at org.apache.helix.manager.zk.zookeeper.ZkClient.retryUntilConnected(ZkClient.java:1171) ~[helix-core-0.9.8.jar:0.9.8]
at org.apache.helix.manager.zk.zookeeper.ZkClient.getChildren(ZkClient.java:725) ~[helix-core-0.9.8.jar:0.9.8]
at org.apache.helix.manager.zk.zookeeper.ZkClient.getChildren(ZkClient.java:718) ~[helix-core-0.9.8.jar:0.9.8]
at org.apache.helix.manager.zk.ZkBaseDataAccessor.getChildNames(ZkBaseDataAccessor.java:511) ~[helix-core-0.9.8.jar:0.9.8]
at org.apache.helix.manager.zk.ZkCacheBaseDataAccessor.getChildNames(ZkCacheBaseDataAccessor.java:675) ~[helix-core-0.9.8.jar:0.9.8]
at org.apache.helix.store.zk.AutoFallbackPropertyStore.getChildNames(AutoFallbackPropertyStore.java:283) ~[helix-core-0.9.8.jar:0.9.8]
at org.apache.helix.manager.zk.ZkCacheBaseDataAccessor.getChildren(ZkCacheBaseDataAccessor.java:692) ~[helix-core-0.9.8.jar:0.9.8]
at org.apache.helix.manager.zk.ZkCacheBaseDataAccessor.getChildren(ZkCacheBaseDataAccessor.java:681) ~[helix-core-0.9.8.jar:0.9.8]
at org.apache.pinot.controller.helix.core.realtime.PinotRealtimeSegmentManager.refreshWatchers(PinotRealtimeSegmentManager.java:318) ~[classes/:?]
at org.apache.pinot.controller.helix.core.realtime.PinotRealtimeSegmentManager.processPropertyStoreChange(PinotRealtimeSegmentManager.java:295) [classes/:?]
at org.apache.pinot.controller.helix.core.realtime.PinotRealtimeSegmentManager.handleChildChange(PinotRealtimeSegmentManager.java:383) [classes/:?]
at org.apache.helix.manager.zk.zookeeper.ZkClient$8.run(ZkClient.java:1070) [helix-core-0.9.8.jar:0.9.8]
at org.apache.helix.manager.zk.zookeeper.ZkEventThread.run(ZkEventThread.java:69) [helix-core-0.9.8.jar:0.9.8]
14:03:36.076 [ZkClient-EventThread-113-localhost:2191] ERROR org.apache.helix.manager.zk.zookeeper.ZkClient - Error handling event ZkEvent[Children of /PinotTaskManagerStatelessTest/PROPERTYSTORE/CONFIGS/TABLE changed sent to org.apache.pinot.controller.helix.core.realtime.PinotRealtimeSegmentManager@4ebfdbb4]
java.lang.IllegalStateException: ZkClient already closed!
at org.apache.helix.manager.zk.zookeeper.ZkClient.retryUntilConnected(ZkClient.java:1171) ~[helix-core-0.9.8.jar:0.9.8]
at org.apache.helix.manager.zk.zookeeper.ZkClient.getChildren(ZkClient.java:725) ~[helix-core-0.9.8.jar:0.9.8]
at org.apache.helix.manager.zk.zookeeper.ZkClient.getChildren(ZkClient.java:718) ~[helix-core-0.9.8.jar:0.9.8]
at org.apache.helix.manager.zk.ZkBaseDataAccessor.getChildNames(ZkBaseDataAccessor.java:511) ~[helix-core-0.9.8.jar:0.9.8]
at org.apache.helix.manager.zk.ZkCacheBaseDataAccessor.getChildNames(ZkCacheBaseDataAccessor.java:675) ~[helix-core-0.9.8.jar:0.9.8]
at org.apache.helix.store.zk.AutoFallbackPropertyStore.getChildNames(AutoFallbackPropertyStore.java:283) ~[helix-core-0.9.8.jar:0.9.8]
at org.apache.helix.manager.zk.ZkCacheBaseDataAccessor.getChildren(ZkCacheBaseDataAccessor.java:692) ~[helix-core-0.9.8.jar:0.9.8]
at org.apache.helix.manager.zk.ZkCacheBaseDataAccessor.getChildren(ZkCacheBaseDataAccessor.java:681) ~[helix-core-0.9.8.jar:0.9.8]
at org.apache.pinot.controller.helix.core.realtime.PinotRealtimeSegmentManager.refreshWatchers(PinotRealtimeSegmentManager.java:318) ~[classes/:?]
at org.apache.pinot.controller.helix.core.realtime.PinotRealtimeSegmentManager.processPropertyStoreChange(PinotRealtimeSegmentManager.java:295) ~[classes/:?]
at org.apache.pinot.controller.helix.core.realtime.PinotRealtimeSegmentManager.handleChildChange(PinotRealtimeSegmentManager.java:383) ~[classes/:?]
at org.apache.helix.manager.zk.zookeeper.ZkClient$8.run(ZkClient.java:1070) ~[helix-core-0.9.8.jar:0.9.8]
at org.apache.helix.manager.zk.zookeeper.ZkEventThread.run(ZkEventThread.java:69) [helix-core-0.9.8.jar:0.9.8]
Mar 28, 2022 2:03:36 PM org.glassfish.grizzly.http.server.NetworkListener shutdownNow
INFO: Stopped listener bound to [0.0.0.0:18998]
14:03:36.416 [HelixController-pipeline-task-PinotTaskManagerStatelessTest-(767f1028_TASK)] ERROR org.apache.helix.controller.GenericHelixController - Cluster manager: localhost_18998 is not leader for PinotTaskManagerStatelessTest. Pipeline will not be invoked
===============================================
Default Suite
Total tests run: 1, Failures: 0, Skips: 0
===============================================
Disconnected from the target VM, address: '127.0.0.1:54163', transport: 'socket'
Process finished with exit code 0
```
</details>
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org