Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2022/03/29 14:38:55 UTC

[GitHub] [pinot] walterddr opened a new issue #8431: Potential controller start/stop race condition

walterddr opened a new issue #8431:
URL: https://github.com/apache/pinot/issues/8431


   When a test controller is restarted (for example in `PinotTaskManagerStatelessTest`), the controller becomes unavailable for about 1 second before it returns to normal.
   
   See the log output below:
   <details>
   ```
   Connected to the target VM, address: '127.0.0.1:54163', transport: 'socket'
   Mar 28, 2022 2:03:04 PM org.glassfish.grizzly.http.server.NetworkListener start
   INFO: Started listener bound to [0.0.0.0:18998]
   Mar 28, 2022 2:03:04 PM org.glassfish.grizzly.http.server.HttpServer start
   INFO: [HttpServer] Started.
   Mar 28, 2022 2:03:10 PM org.glassfish.grizzly.http.server.NetworkListener shutdownNow
   INFO: Stopped listener bound to [0.0.0.0:18998]
   14:03:10.409 [HelixController-pipeline-task-PinotTaskManagerStatelessTest-(b0992338_TASK)] ERROR org.apache.helix.controller.GenericHelixController - Cluster manager: localhost_18998 is not leader for PinotTaskManagerStatelessTest. Pipeline will not be invoked
   Mar 28, 2022 2:03:11 PM org.glassfish.grizzly.http.server.NetworkListener start
   INFO: Started listener bound to [0.0.0.0:18998]
   Mar 28, 2022 2:03:11 PM org.glassfish.grizzly.http.server.HttpServer start
   INFO: [HttpServer-1] Started.
   14:03:36.069 [ZkClient-EventThread-113-localhost:2191] ERROR org.apache.pinot.controller.helix.core.realtime.PinotRealtimeSegmentManager - Caught exception while processing change for path /PinotTaskManagerStatelessTest/PROPERTYSTORE/CONFIGS/TABLE
   java.lang.IllegalStateException: ZkClient already closed!
   	at org.apache.helix.manager.zk.zookeeper.ZkClient.retryUntilConnected(ZkClient.java:1171) ~[helix-core-0.9.8.jar:0.9.8]
   	at org.apache.helix.manager.zk.zookeeper.ZkClient.getChildren(ZkClient.java:725) ~[helix-core-0.9.8.jar:0.9.8]
   	at org.apache.helix.manager.zk.zookeeper.ZkClient.getChildren(ZkClient.java:718) ~[helix-core-0.9.8.jar:0.9.8]
   	at org.apache.helix.manager.zk.ZkBaseDataAccessor.getChildNames(ZkBaseDataAccessor.java:511) ~[helix-core-0.9.8.jar:0.9.8]
   	at org.apache.helix.manager.zk.ZkCacheBaseDataAccessor.getChildNames(ZkCacheBaseDataAccessor.java:675) ~[helix-core-0.9.8.jar:0.9.8]
   	at org.apache.helix.store.zk.AutoFallbackPropertyStore.getChildNames(AutoFallbackPropertyStore.java:283) ~[helix-core-0.9.8.jar:0.9.8]
   	at org.apache.helix.manager.zk.ZkCacheBaseDataAccessor.getChildren(ZkCacheBaseDataAccessor.java:692) ~[helix-core-0.9.8.jar:0.9.8]
   	at org.apache.helix.manager.zk.ZkCacheBaseDataAccessor.getChildren(ZkCacheBaseDataAccessor.java:681) ~[helix-core-0.9.8.jar:0.9.8]
   	at org.apache.pinot.controller.helix.core.realtime.PinotRealtimeSegmentManager.refreshWatchers(PinotRealtimeSegmentManager.java:318) ~[classes/:?]
   	at org.apache.pinot.controller.helix.core.realtime.PinotRealtimeSegmentManager.processPropertyStoreChange(PinotRealtimeSegmentManager.java:295) [classes/:?]
   	at org.apache.pinot.controller.helix.core.realtime.PinotRealtimeSegmentManager.handleChildChange(PinotRealtimeSegmentManager.java:383) [classes/:?]
   	at org.apache.helix.manager.zk.zookeeper.ZkClient$8.run(ZkClient.java:1070) [helix-core-0.9.8.jar:0.9.8]
   	at org.apache.helix.manager.zk.zookeeper.ZkEventThread.run(ZkEventThread.java:69) [helix-core-0.9.8.jar:0.9.8]
   14:03:36.076 [ZkClient-EventThread-113-localhost:2191] ERROR org.apache.helix.manager.zk.zookeeper.ZkClient - Error handling event ZkEvent[Children of /PinotTaskManagerStatelessTest/PROPERTYSTORE/CONFIGS/TABLE changed sent to org.apache.pinot.controller.helix.core.realtime.PinotRealtimeSegmentManager@4ebfdbb4]
   java.lang.IllegalStateException: ZkClient already closed!
   	at org.apache.helix.manager.zk.zookeeper.ZkClient.retryUntilConnected(ZkClient.java:1171) ~[helix-core-0.9.8.jar:0.9.8]
   	at org.apache.helix.manager.zk.zookeeper.ZkClient.getChildren(ZkClient.java:725) ~[helix-core-0.9.8.jar:0.9.8]
   	at org.apache.helix.manager.zk.zookeeper.ZkClient.getChildren(ZkClient.java:718) ~[helix-core-0.9.8.jar:0.9.8]
   	at org.apache.helix.manager.zk.ZkBaseDataAccessor.getChildNames(ZkBaseDataAccessor.java:511) ~[helix-core-0.9.8.jar:0.9.8]
   	at org.apache.helix.manager.zk.ZkCacheBaseDataAccessor.getChildNames(ZkCacheBaseDataAccessor.java:675) ~[helix-core-0.9.8.jar:0.9.8]
   	at org.apache.helix.store.zk.AutoFallbackPropertyStore.getChildNames(AutoFallbackPropertyStore.java:283) ~[helix-core-0.9.8.jar:0.9.8]
   	at org.apache.helix.manager.zk.ZkCacheBaseDataAccessor.getChildren(ZkCacheBaseDataAccessor.java:692) ~[helix-core-0.9.8.jar:0.9.8]
   	at org.apache.helix.manager.zk.ZkCacheBaseDataAccessor.getChildren(ZkCacheBaseDataAccessor.java:681) ~[helix-core-0.9.8.jar:0.9.8]
   	at org.apache.pinot.controller.helix.core.realtime.PinotRealtimeSegmentManager.refreshWatchers(PinotRealtimeSegmentManager.java:318) ~[classes/:?]
   	at org.apache.pinot.controller.helix.core.realtime.PinotRealtimeSegmentManager.processPropertyStoreChange(PinotRealtimeSegmentManager.java:295) ~[classes/:?]
   	at org.apache.pinot.controller.helix.core.realtime.PinotRealtimeSegmentManager.handleChildChange(PinotRealtimeSegmentManager.java:383) ~[classes/:?]
   	at org.apache.helix.manager.zk.zookeeper.ZkClient$8.run(ZkClient.java:1070) ~[helix-core-0.9.8.jar:0.9.8]
   	at org.apache.helix.manager.zk.zookeeper.ZkEventThread.run(ZkEventThread.java:69) [helix-core-0.9.8.jar:0.9.8]
   Mar 28, 2022 2:03:36 PM org.glassfish.grizzly.http.server.NetworkListener shutdownNow
   INFO: Stopped listener bound to [0.0.0.0:18998]
   14:03:36.416 [HelixController-pipeline-task-PinotTaskManagerStatelessTest-(767f1028_TASK)] ERROR org.apache.helix.controller.GenericHelixController - Cluster manager: localhost_18998 is not leader for PinotTaskManagerStatelessTest. Pipeline will not be invoked
   
   ===============================================
   Default Suite
   Total tests run: 1, Failures: 0, Skips: 0
   ===============================================
   
   Disconnected from the target VM, address: '127.0.0.1:54163', transport: 'socket'
   
   Process finished with exit code 0
   ```
   </details>
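
One possible test-side mitigation, sketched here only as an illustration and not as a proposed fix for the race itself, is to poll until the restarted controller accepts connections again before the test continues. The class `RestartWait` and its methods are hypothetical names, not part of Pinot or Helix. The probe only checks that the TCP port is open (18998 in the log above); it does not confirm that the controller has regained Helix leadership.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of Pinot: waits for a condition with a timeout.
public final class RestartWait {
  private RestartWait() {
  }

  /**
   * Polls the condition until it returns true or timeoutMs elapses.
   * Returns true if the condition was met, false on timeout or interruption.
   */
  public static boolean waitUntil(BooleanSupplier condition, long timeoutMs, long intervalMs) {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (true) {
      if (condition.getAsBoolean()) {
        return true;
      }
      if (System.nanoTime() - deadline >= 0) {
        return false;
      }
      try {
        Thread.sleep(intervalMs);
      } catch (InterruptedException e) {
        // Preserve the interrupt flag and give up waiting.
        Thread.currentThread().interrupt();
        return false;
      }
    }
  }

  /** Returns true if a TCP connection to host:port succeeds within connectTimeoutMs. */
  public static boolean isPortOpen(String host, int port, int connectTimeoutMs) {
    try (Socket socket = new Socket()) {
      socket.connect(new InetSocketAddress(host, port), connectTimeoutMs);
      return true;
    } catch (IOException e) {
      return false;
    }
  }
}
```

A test could call `RestartWait.waitUntil(() -> RestartWait.isPortOpen("localhost", 18998, 200), 5000, 100)` right after restarting the controller and fail fast if it returns false. The timeout and interval values are arbitrary assumptions.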


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org
