Posted to notifications@dubbo.apache.org by GitBox <gi...@apache.org> on 2021/10/08 10:57:01 UTC

[GitHub] [dubbo] zrlw opened a new issue #8993: Inconsistent default timeouts set by ConfigCenterConfig, ConfigCenterBuilder, and AbstractZookeeperClient

zrlw opened a new issue #8993:
URL: https://github.com/apache/dubbo/issues/8993


   * Dubbo version: 3.0 / master
   Searching for "zookeeper not connected" turns up many issues, and GitHub builds still hit it:
   ```
   2021-10-08T07:35:43.7080332Z [ERROR] integrate  Time elapsed: 4.058 s  <<< ERROR!
   2021-10-08T07:35:43.7081792Z java.lang.IllegalStateException: java.lang.IllegalStateException: zookeeper not connected
   2021-10-08T07:35:43.7087481Z 	at org.apache.dubbo.integration.single.SingleRegistryCenterDubboProtocolIntegrationTest.integrate(SingleRegistryCenterDubboProtocolIntegrationTest.java:138)
   ```
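   The cause is simple: connecting to zk is slow and the configured timeout is short, so the attempt times out before the connection is established.
   AbstractZookeeperClient's DEFAULT_CONNECTION_TIMEOUT_MS is 5000 ms, but in practice this default is never used: if the url does not set timeout, ConfigCenterConfig's checkDefault method sets it to 3000 ms.
   Below is the call stack from debugging SingleRegistryCenterDubboProtocolIntegrationTest on the 3.0 branch: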
   ```
   	ConfigCenterConfig.checkDefault() line: 128   <=== sets timeout to 3000 when it is null
   	ConfigCenterConfig(AbstractConfig).postProcessRefresh() line: 714	
   	ConfigCenterConfig(AbstractConfig).refresh() line: 620	
   	DefaultApplicationDeployer.startConfigCenter() line: 253	
   	DefaultApplicationDeployer.initialize() line: 195	
   	DefaultApplicationDeployer.start() line: 538	
   	DubboBootstrap.start(boolean) line: 230	
   	DubboBootstrap.start() line: 220
   ```
   Code snippet of checkDefault:
   ```
       protected void checkDefault() {
           super.checkDefault();
   
           if (namespace == null) {
               namespace = CommonConstants.DUBBO;
           }
           if (group == null) {
               group = CommonConstants.DUBBO;
           }
           if (timeout == null) {
               timeout = 3000L; 
           }
          ...
       }
   ```
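   
   To make the mismatch concrete, here is a minimal, self-contained sketch. It does not use any Dubbo classes: `TimeoutDefaultSketch`, its `checkDefault`, `resolveClientTimeout`, and `effectiveTimeout` methods are hypothetical stand-ins, and the assumption that the config layer writes its timeout into the URL parameters is only a simplification of what the issue describes. Only the 3000 and 5000 values come from the text above.
   
   ```java
   import java.util.HashMap;
   import java.util.Map;
   
   public class TimeoutDefaultSketch {
       // Client-side default quoted in the issue (AbstractZookeeperClient).
       static final int DEFAULT_CONNECTION_TIMEOUT_MS = 5000;
   
       // Hypothetical stand-in for ConfigCenterConfig.checkDefault(): fills in 3000 when unset.
       static Long checkDefault(Long timeout) {
           return timeout == null ? Long.valueOf(3000L) : timeout;
       }
   
       // Hypothetical stand-in for the client reading "timeout" from the URL parameters.
       static int resolveClientTimeout(Map<String, String> urlParams) {
           String value = urlParams.get("timeout");
           return value == null ? DEFAULT_CONNECTION_TIMEOUT_MS : Integer.parseInt(value);
       }
   
       // Assumption: the config layer always writes its (possibly defaulted) timeout into the URL.
       static int effectiveTimeout(Long configuredTimeout) {
           Map<String, String> urlParams = new HashMap<>();
           urlParams.put("timeout", String.valueOf(checkDefault(configuredTimeout)));
           return resolveClientTimeout(urlParams);
       }
   
       public static void main(String[] args) {
           System.out.println(effectiveTimeout(null));                // 3000: the 5000 default is shadowed
           System.out.println(effectiveTimeout(10000L));              // 10000: an explicit value wins
           System.out.println(resolveClientTimeout(new HashMap<>())); // 5000: only when no layer sets it
       }
   }
   ```
   
   In this sketch the client default is reachable only when nothing upstream fills in a value, which mirrors the observation that the 5000 ms constant is effectively unused.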


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@dubbo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@dubbo.apache.org
For additional commands, e-mail: notifications-help@dubbo.apache.org


[GitHub] [dubbo] zrlw edited a comment on issue #8993: Inconsistent default timeouts set by ConfigCenterConfig, ConfigCenterBuilder, and AbstractZookeeperClient

Posted by GitBox <gi...@apache.org>.
zrlw edited a comment on issue #8993:
URL: https://github.com/apache/dubbo/issues/8993#issuecomment-939205361


   The log of SingleRegistryCenterDubboProtocolIntegrationTest, which failed with "zookeeper not connected", contains a zk client session timeout warning from the preceding test class, SingleRegistryCenterInjvmIntegrationTest.
   (Build log: https://github.com/apache/dubbo/runs/3835835450?check_suite_focus=true)
   The key excerpts are below:
   ```
   [INFO] Running org.apache.dubbo.integration.single.injvm.SingleRegistryCenterInjvmIntegrationTest  
   <== preceding test class SingleRegistryCenterInjvmIntegrationTest
   [08/10/21 07:35:39:199 UTC] Curator-ConnectionStateManager-0  INFO curator.CuratorZookeeperClient:  [DUBBO] Curator zookeeper client instance initiated successfully, session id is 100001a38dd0000, dubbo version: 3.0.4-SNAPSHOT, current host: 172.19.112.1  
   <== zk client session of the preceding test class (id: 100001a38dd0000)
   [08/10/21 07:35:39:624 UTC] main  INFO support.RegistryManager:  [DUBBO] Close all registries [], dubbo version: 3.0.4-SNAPSHOT, current host: 172.19.112.1
   <== preceding test class closes all registries
   [08/10/21 07:35:39:624 UTC] main  INFO deploy.DefaultApplicationDeployer:  [DUBBO] Dubbo Application[243.1] has stopped., dubbo version: 3.0.4-SNAPSHOT, current host: 172.19.112.1
   <== preceding test class: the dubbo application has stopped
   [08/10/21 07:35:39:632 UTC] main  INFO registrycenter.ZookeeperRegistryCenter:  [DUBBO] The ZookeeperRegistryCenter close successfully., dubbo version: 3.0.4-SNAPSHOT, current host: 172.19.112.1
   <== preceding test class closes the zk registry center
   [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.493 s - in org.apache.dubbo.integration.single.injvm.SingleRegistryCenterInjvmIntegrationTest
   <== preceding test class finishes
   [INFO] Running org.apache.dubbo.integration.single.SingleRegistryCenterDubboProtocolIntegrationTest
   
   <== start of SingleRegistryCenterDubboProtocolIntegrationTest, the test class that fails to connect to zk
   [08/10/21 07:35:39:634 UTC] main  INFO registrycenter.ZookeeperRegistryCenter:  [DUBBO] The ZookeeperRegistryCenter is starting..., dubbo version: 3.0.4-SNAPSHOT, current host: 172.19.112.1
   <== zk registry center starting
   [08/10/21 07:35:39:727 UTC] Curator-ConnectionStateManager-0  WARN curator.CuratorZookeeperClient:  [DUBBO] Curator zookeeper connection of session 100001a38dd0000 timed out. connection timeout value is 3000, session expire timeout value is 60000, dubbo version: 3.0.4-SNAPSHOT, current host: 172.19.112.1
   <== reports a timeout for the preceding test class's session (id: 100001a38dd0000)
   [08/10/21 07:35:40:655 UTC] main  INFO registrycenter.ZookeeperRegistryCenter:  [DUBBO] The ZookeeperRegistryCenter is started successfully, dubbo version: 3.0.4-SNAPSHOT, current host: 172.19.112.1
   ```
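   Problem:
   The tearDown of the preceding test class, SingleRegistryCenterInjvmIntegrationTest, calls DubboBootstrap.reset(), so all zk clients should have been closed. However, stepping through with a debugger shows that CuratorZookeeperClient's doClose method is never called, and the registries map in RegistryManager is empty.
   This test class uses zk only as the config center, so destroying registries cannot close the zk client; closing the config center, though, ought to close the zk client.
   
   I read a post on troubleshooting curator connection problems. It says curator's event loop is an endless loop that invokes each watcher in turn; if one watcher hangs, none of the later events get processed. In other words, if the zk connection event is queued behind the hung handler, curator never gets a chance to process the connected event and set currentConnectionState to connected, so the application's connection attempt times out and fails.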
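   
   The blocking effect described in that post can be reproduced without curator. The sketch below is only an illustration of the general mechanism, assuming events are dispatched one at a time on a single thread; `HungWatcherSketch` and `connectedInTime` are hypothetical names, and nothing here is curator's actual implementation.
   
   ```java
   import java.util.concurrent.CountDownLatch;
   import java.util.concurrent.ExecutorService;
   import java.util.concurrent.Executors;
   import java.util.concurrent.TimeUnit;
   
   public class HungWatcherSketch {
       // Returns true if the "connected" event was handled within timeoutMs.
       static boolean connectedInTime(boolean firstWatcherHangs, long timeoutMs) throws InterruptedException {
           // Assumption: a single thread handles events strictly in arrival order.
           ExecutorService eventLoop = Executors.newSingleThreadExecutor();
           CountDownLatch release = new CountDownLatch(1);
           CountDownLatch connected = new CountDownLatch(1);
           try {
               if (firstWatcherHangs) {
                   // A watcher that never returns on its own: it occupies the only dispatch thread.
                   eventLoop.execute(() -> {
                       try {
                           release.await();
                       } catch (InterruptedException e) {
                           Thread.currentThread().interrupt();
                       }
                   });
               }
               // The "connected" event is queued behind whatever is already running.
               eventLoop.execute(connected::countDown);
               // The caller waits with a timeout, like a blocking connect call would.
               return connected.await(timeoutMs, TimeUnit.MILLISECONDS);
           } finally {
               release.countDown();
               eventLoop.shutdownNow();
           }
       }
   
       public static void main(String[] args) throws InterruptedException {
           System.out.println(connectedInTime(true, 200));   // false: the hung watcher starves the event
           System.out.println(connectedInTime(false, 2000)); // expected true: nothing blocks the loop
       }
   }
   ```
   
   Note that in the hung case the server side may be perfectly healthy; the caller still sees a connection timeout because the state change is never delivered.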


[GitHub] [dubbo] zrlw commented on issue #8993: Inconsistent default timeouts set by ConfigCenterConfig, ConfigCenterBuilder, and AbstractZookeeperClient

Posted by GitBox <gi...@apache.org>.
zrlw commented on issue #8993:
URL: https://github.com/apache/dubbo/issues/8993#issuecomment-938568861


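   I went through the logs of the test classes that ran before the zk connection timeout. Using "use registry as config-center" as the keyword, I checked the connection times of the other 25 zk connections in the same log: apart from the first one, which took a relatively long 470 ms, the remaining 24 all connected within 40 ms. The one below, for example, took only about 20 ms: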
   ```
   2021-10-08T07:35:39.1812856Z [08/10/21 07:35:39:179 UTC] main  INFO deploy.DefaultApplicationDeployer:  [DUBBO] use registry as config-center: <dubbo:config-center highestPriority="false" id="config-center-zookeeper-127.0.0.1-36605" address="zookeeper://127.0.0.1:36605" protocol="zookeeper" port="36605" />, dubbo version: 3.0.4-SNAPSHOT, current host: 172.19.112.1
   <== DefaultApplicationDeployer then runs getDynamicConfiguration and starts connecting to zk
   2021-10-08T07:35:39.2028731Z [08/10/21 07:35:39:199 UTC] Curator-ConnectionStateManager-0  INFO curator.CuratorZookeeperClient:  [DUBBO] Curator zookeeper client instance initiated successfully, session id is 100001a38dd0000, dubbo version: 3.0.4-SNAPSHOT, current host: 172.19.112.1
   <== zk connection established
   2021-10-08T07:35:39.2034461Z [08/10/21 07:35:39:199 UTC] main  INFO zookeeper.ZookeeperTransporter:  [DUBBO] No valid zookeeper client found from cache, therefore create a new client for url. 
   ```
   So under normal conditions a 3-second connection timeout is sufficient.
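   
   The elapsed time can be read straight off the two bracketed log timestamps in the excerpt. As a small worked example, the sketch below parses them and prints the difference; `LogElapsedSketch` and `elapsedMillis` are hypothetical helper names, and the `HH:mm:ss:SSS` pattern is an assumption based on how the timestamps appear in this log.
   
   ```java
   import java.time.Duration;
   import java.time.LocalTime;
   import java.time.format.DateTimeFormatter;
   
   public class LogElapsedSketch {
       // Matches the time part of entries such as "[08/10/21 07:35:39:179 UTC]".
       private static final DateTimeFormatter LOG_TIME = DateTimeFormatter.ofPattern("HH:mm:ss:SSS");
   
       // Milliseconds between two log times taken from the same day.
       static long elapsedMillis(String start, String end) {
           LocalTime from = LocalTime.parse(start, LOG_TIME);
           LocalTime to = LocalTime.parse(end, LOG_TIME);
           return Duration.between(from, to).toMillis();
       }
   
       public static void main(String[] args) {
           // "use registry as config-center" entry vs. "client instance initiated successfully" entry.
           System.out.println(elapsedMillis("07:35:39:179", "07:35:39:199")); // 20
       }
   }
   ```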
   
   According to the Curator project, compatibility with zk 3.4.13 calls for curator 4.2.0, while dubbo currently uses 4.1.0.
   
   ZooKeeper Version 3.4.x Compatibility
   https://curator.apache.org/zk-compatibility-34.html
   
   ZooKeeper 3.4.x is now at end-of-life. Consequently, the latest versions of Curator have removed support for it. If you wish to use Curator with ZooKeeper 3.4.x you should pin to version 4.2.x of Curator. Curator 4.2.x supports ZooKeeper 3.4.x ensembles in a soft-compatibility mode. To use this mode you must exclude ZooKeeper when adding Curator to your dependency management tool.
   
   Maven
   ```
   <dependency>
       <groupId>org.apache.curator</groupId>
       <artifactId>curator-recipes</artifactId>
       <version>4.2.0</version>
       <exclusions>
           <exclusion>
               <groupId>org.apache.zookeeper</groupId>
               <artifactId>zookeeper</artifactId>
           </exclusion>
       </exclusions>
   </dependency>
   ```


[GitHub] [dubbo] zrlw edited a comment on issue #8993: Inconsistent default timeouts set by ConfigCenterConfig, ConfigCenterBuilder, and AbstractZookeeperClient

Posted by GitBox <gi...@apache.org>.
zrlw edited a comment on issue #8993:
URL: https://github.com/apache/dubbo/issues/8993#issuecomment-939219910


   After running the SingleRegistryCenterInjvmIntegrationTest and SingleRegistryCenterDubboProtocolIntegrationTest unit tests alternately several times in Eclipse, both test classes suddenly started failing: DubboBootstrap.getInstance().start() ends in "zookeeper not connected" every time:
   ```
   java.lang.IllegalStateException: java.lang.IllegalStateException: zookeeper not connected
   	at org.apache.dubbo.config.deploy.DefaultApplicationDeployer.prepareEnvironment(DefaultApplicationDeployer.java:637)
   	at org.apache.dubbo.config.deploy.DefaultApplicationDeployer.startConfigCenter(DefaultApplicationDeployer.java:266)
   	at org.apache.dubbo.config.deploy.DefaultApplicationDeployer.initialize(DefaultApplicationDeployer.java:195)
   	at org.apache.dubbo.config.deploy.DefaultApplicationDeployer.start(DefaultApplicationDeployer.java:538)
   	at org.apache.dubbo.config.bootstrap.DubboBootstrap.start(DubboBootstrap.java:230)
   ```
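   Stepping into CuratorFrameworkImpl's blockUntilConnected method shows that currentConnectionState stays null: curator's event dispatching has stopped running. Also, before I started testing, the C: drive on this machine had 27 GB free; now only 45 MB is left, and pagefile.sys on C: has grown by more than 20 GB, so something is clearly wrong.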
   
   dubbo: 3.0  7c2f52d  [3.0-Triple] support streamObserver cancel (#8946)
   Versions configured in the pom: zookeeper 3.4.13, curator 4.1.0, curator-test 2.12.0


[GitHub] [dubbo] zrlw commented on issue #8993: [3.0] CuratorZookeeperClient "zookeeper not connected" may be caused by an earlier, already finished test class not closing its zkClient

Posted by GitBox <gi...@apache.org>.
zrlw commented on issue #8993:
URL: https://github.com/apache/dubbo/issues/8993#issuecomment-944886717


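   Stepping through the code shows that even though dubbo creates only one CuratorZookeeperClient, some UTs open two sessions and close only one of them. After the zk server shuts down, the session that was never closed keeps trying to reconnect. After a lot of digging I found that ZookeeperServiceDiscovery sets up its own separate connection and does not close it on destroy.
   I originally wanted to route ZookeeperServiceDiscovery's zk connection through ZookeeperTransporter, but once I started I found that ZookeeperServiceDiscovery has not only its own set of connection parameters but also its own state management. That change would be too invasive, so I'll leave the refactoring to a committer familiar with ZookeeperServiceDiscovery; for now I'll just add a zk connection close to its destroy method.
   #9015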
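   
   The shape of the leak and of the stop-gap fix can be sketched as follows. This is not the code from #9015 and uses no Dubbo or curator classes: `DiscoveryCloseSketch`, `FakeZkClient`, `FakeServiceDiscovery`, and `leakedSessions` are all hypothetical, and a counter stands in for open zk sessions.
   
   ```java
   import java.util.concurrent.atomic.AtomicInteger;
   
   public class DiscoveryCloseSketch {
       // Counts sessions that are currently open (stand-in for live zk sessions).
       static final AtomicInteger OPEN_SESSIONS = new AtomicInteger();
   
       // Hypothetical client: opening it creates one session, closing it ends that session.
       static class FakeZkClient implements AutoCloseable {
           private boolean closed;
   
           FakeZkClient() {
               OPEN_SESSIONS.incrementAndGet();
           }
   
           @Override
           public void close() {
               if (!closed) {
                   closed = true;
                   OPEN_SESSIONS.decrementAndGet();
               }
           }
       }
   
       // Hypothetical discovery component that builds its own client instead of sharing one.
       static class FakeServiceDiscovery {
           private final FakeZkClient ownClient = new FakeZkClient();
           private final boolean closeOnDestroy;
   
           FakeServiceDiscovery(boolean closeOnDestroy) {
               this.closeOnDestroy = closeOnDestroy;
           }
   
           void destroy() {
               // The stop-gap fix: release the privately owned connection on destroy.
               if (closeOnDestroy) {
                   ownClient.close();
               }
           }
       }
   
       // Runs one "test class" lifecycle and reports how many sessions are still open afterwards.
       static int leakedSessions(boolean closeOnDestroy) {
           OPEN_SESSIONS.set(0);
           FakeZkClient shared = new FakeZkClient();                 // the client the framework manages
           FakeServiceDiscovery discovery = new FakeServiceDiscovery(closeOnDestroy);
           discovery.destroy();
           shared.close();                                           // the framework closes only its own client
           return OPEN_SESSIONS.get();
       }
   
       public static void main(String[] args) {
           System.out.println(leakedSessions(false)); // 1: the second session outlives the test
           System.out.println(leakedSessions(true));  // 0: destroy() now closes it
       }
   }
   ```
   
   Closing in destroy() only treats the symptom; sharing a single managed client, as considered above, would remove the second session altogether.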


[GitHub] [dubbo] zrlw edited a comment on issue #8993: ConfigCenterConfig,ConfigCenterBuilder与AbstractZookeeperClient的设置的默认超时不一致

Posted by GitBox <gi...@apache.org>.
zrlw edited a comment on issue #8993:
URL: https://github.com/apache/dubbo/issues/8993#issuecomment-939219910


   eclipse交叉执行若干次SingleRegistryCenterInjvmIntegrationTest和SingleRegistryCenterDubboProtocolIntegrationTest单元测试之后,突然发现这两个测试类无论哪一个都报错,DubboBootstrap.getInstance().start()的结果都是zookeeper not connected:
   ```
   java.lang.IllegalStateException: java.lang.IllegalStateException: zookeeper not connected
   	at org.apache.dubbo.config.deploy.DefaultApplicationDeployer.prepareEnvironment(DefaultApplicationDeployer.java:637)
   	at org.apache.dubbo.config.deploy.DefaultApplicationDeployer.startConfigCenter(DefaultApplicationDeployer.java:266)
   	at org.apache.dubbo.config.deploy.DefaultApplicationDeployer.initialize(DefaultApplicationDeployer.java:195)
   	at org.apache.dubbo.config.deploy.DefaultApplicationDeployer.start(DefaultApplicationDeployer.java:538)
   	at org.apache.dubbo.config.bootstrap.DubboBootstrap.start(DubboBootstrap.java:230)
   ```
   跟踪进去CuratorFrameworkImpl的blockUntilConnected方法,发现currentConnectionState的值一直是null,curator的事件驱动不运转了。另外开始测试前,本机C盘有27G空闲空间,现在只剩了45M空闲了,c盘下的pagefile.sys涨了20多G,肯定是有问题。
   
   dubbo: 3.0  7c2f52d  [3.0-Triple] support streamObserver cancel (#8946)
   pom里配置的zookeeper: 3.4.13, curator: 4.1.0,  curator-test: 2.12.0


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@dubbo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@dubbo.apache.org
For additional commands, e-mail: notifications-help@dubbo.apache.org


[GitHub] [dubbo] zrlw commented on issue #8993: ConfigCenterConfig,ConfigCenterBuilder与AbstractZookeeperClient的设置的默认超时不一致

Posted by GitBox <gi...@apache.org>.
zrlw commented on issue #8993:
URL: https://github.com/apache/dubbo/issues/8993#issuecomment-939219910


   After alternately running the SingleRegistryCenterInjvmIntegrationTest and SingleRegistryCenterDubboProtocolIntegrationTest unit tests a number of times in Eclipse, I suddenly found that DubboBootstrap.getInstance().start() now ends with zookeeper not connected in both test classes:
   ```
   java.lang.IllegalStateException: java.lang.IllegalStateException: zookeeper not connected
   	at org.apache.dubbo.config.deploy.DefaultApplicationDeployer.prepareEnvironment(DefaultApplicationDeployer.java:637)
   	at org.apache.dubbo.config.deploy.DefaultApplicationDeployer.startConfigCenter(DefaultApplicationDeployer.java:266)
   	at org.apache.dubbo.config.deploy.DefaultApplicationDeployer.initialize(DefaultApplicationDeployer.java:195)
   	at org.apache.dubbo.config.deploy.DefaultApplicationDeployer.start(DefaultApplicationDeployer.java:538)
   	at org.apache.dubbo.config.bootstrap.DubboBootstrap.start(DubboBootstrap.java:230)
   ```
   Stepping into the blockUntilConnected method of CuratorFrameworkImpl, I found that currentConnectionState stays null the whole time, so Curator's event dispatching has stopped running.
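
The observation above can be modelled with a small sketch. It is not Curator's code: the class `ConnectionWaitSketch`, its `State` enum and both method signatures are hypothetical, written only to show why a state field that nobody updates makes a timed wait fail. The waiting thread depends entirely on another thread calling `onStateChange`; if that never happens, the state stays null and the wait returns false once the deadline passes.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical model of a "block until connected" wait; not Curator's actual implementation. */
class ConnectionWaitSketch {
    enum State { CONNECTED, LOST }

    // Stays null until an event thread reports a state (assumption of this sketch).
    private State currentState;

    /** Called by the event thread once it has processed a connection event. */
    synchronized void onStateChange(State newState) {
        currentState = newState;
        notifyAll();
    }

    /** Waits up to maxWaitMillis for CONNECTED; returns false if the deadline passes first. */
    synchronized boolean blockUntilConnected(long maxWaitMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(maxWaitMillis);
        try {
            while (currentState != State.CONNECTED) {
                long remainingNanos = deadline - System.nanoTime();
                if (remainingNanos <= 0) {
                    return false; // nobody delivered the connected event in time
                }
                TimeUnit.NANOSECONDS.timedWait(this, remainingNanos);
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        ConnectionWaitSketch stalled = new ConnectionWaitSketch();
        // No thread ever calls onStateChange, so this wait can only time out.
        System.out.println("stalled: " + stalled.blockUntilConnected(200));

        ConnectionWaitSketch live = new ConnectionWaitSketch();
        new Thread(() -> live.onStateChange(State.CONNECTED)).start();
        System.out.println("live: " + live.blockUntilConnected(2000));
    }
}
```

In this model, a caller that turns a false result into an IllegalStateException would fail in the same way as the stack trace above, regardless of how long the timeout is.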
   
   dubbo: 3.0  7c2f52d  [3.0-Triple] support streamObserver cancel (#8946)
   Versions configured in the pom: zookeeper: 3.4.13, curator: 4.1.0, curator-test: 2.12.0




[GitHub] [dubbo] zrlw edited a comment on issue #8993: Default timeouts set by ConfigCenterConfig, ConfigCenterBuilder and AbstractZookeeperClient are inconsistent

Posted by GitBox <gi...@apache.org>.
zrlw edited a comment on issue #8993:
URL: https://github.com/apache/dubbo/issues/8993#issuecomment-939205361


   The log of SingleRegistryCenterDubboProtocolIntegrationTest, the test that failed with zookeeper not connected, contains a zk client session timeout warning that belongs to the previous test class, SingleRegistryCenterInjvmIntegrationTest.
   (Build log: https://github.com/apache/dubbo/runs/3835835450?check_suite_focus=true)
   The main excerpts are as follows:
   ```
   [INFO] Running org.apache.dubbo.integration.single.injvm.SingleRegistryCenterInjvmIntegrationTest  
   <== the previous test class, SingleRegistryCenterInjvmIntegrationTest
   [08/10/21 07:35:39:199 UTC] Curator-ConnectionStateManager-0  INFO curator.CuratorZookeeperClient:  [DUBBO] Curator zookeeper client instance initiated successfully, session id is 100001a38dd0000, dubbo version: 3.0.4-SNAPSHOT, current host: 172.19.112.1  
   <== zk client session of the previous test class (id: 100001a38dd0000)
   [08/10/21 07:35:39:624 UTC] main  INFO support.RegistryManager:  [DUBBO] Close all registries [], dubbo version: 3.0.4-SNAPSHOT, current host: 172.19.112.1
   <== the previous test class closes all registries
   [08/10/21 07:35:39:624 UTC] main  INFO deploy.DefaultApplicationDeployer:  [DUBBO] Dubbo Application[243.1] has stopped., dubbo version: 3.0.4-SNAPSHOT, current host: 172.19.112.1
   <== the Dubbo application of the previous test class has stopped
   [08/10/21 07:35:39:632 UTC] main  INFO registrycenter.ZookeeperRegistryCenter:  [DUBBO] The ZookeeperRegistryCenter close successfully., dubbo version: 3.0.4-SNAPSHOT, current host: 172.19.112.1
   <== the previous test class closes the zk registry center
   [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.493 s - in org.apache.dubbo.integration.single.injvm.SingleRegistryCenterInjvmIntegrationTest
   <== the previous test class finishes
   [INFO] Running org.apache.dubbo.integration.single.SingleRegistryCenterDubboProtocolIntegrationTest
   
   <== start of the test class that hits the zk connection failure, SingleRegistryCenterDubboProtocolIntegrationTest
   [08/10/21 07:35:39:634 UTC] main  INFO registrycenter.ZookeeperRegistryCenter:  [DUBBO] The ZookeeperRegistryCenter is starting..., dubbo version: 3.0.4-SNAPSHOT, current host: 172.19.112.1
   <== the zk registry center is started
   [08/10/21 07:35:39:727 UTC] Curator-ConnectionStateManager-0  WARN curator.CuratorZookeeperClient:  [DUBBO] Curator zookeeper connection of session 100001a38dd0000 timed out. connection timeout value is 3000, session expire timeout value is 60000, dubbo version: 3.0.4-SNAPSHOT, current host: 172.19.112.1
   <== reports that the session of the previous test class timed out (id: 100001a38dd0000)
   [08/10/21 07:35:40:655 UTC] main  INFO registrycenter.ZookeeperRegistryCenter:  [DUBBO] The ZookeeperRegistryCenter is started successfully, dubbo version: 3.0.4-SNAPSHOT, current host: 172.19.112.1
   ```
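   Problem:
   The tearDown of the previous test class, SingleRegistryCenterInjvmIntegrationTest, calls DubboBootstrap.reset(), so all zk clients should have been closed. Debugging shows, however, that the doClose method of CuratorZookeeperClient is never called, and that registries in RegistryManager is an empty map.
   
   I read a post about troubleshooting Curator connection problems. It says that Curator's event loop is an endless loop that calls each watcher in turn; if one watcher hangs, none of the events after it get processed. In other words, if the zk connection event is queued behind the hung handler, Curator never gets a chance to process the connected event and set currentConnectionState to connected, so the application's connection attempt times out and fails.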
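
The failure mode described in that post can be sketched as follows, assuming the post's description of a single dispatching thread holds. Nothing here is Curator's real code: `HungWatcherSketch`, `connectedWithin` and the queue of `Runnable` events are hypothetical stand-ins. One thread drains the queue in order; the first event is a watcher callback that may block forever, and the second is the connected notification.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical single-threaded event loop; illustrates the failure mode only. */
class HungWatcherSketch {

    /**
     * Queues one watcher callback followed by a "connected" event and reports
     * whether the connected event was handled within waitMillis.
     */
    static boolean connectedWithin(boolean firstWatcherHangs, long waitMillis) {
        BlockingQueue<Runnable> events = new LinkedBlockingQueue<>();
        CountDownLatch connected = new CountDownLatch(1);
        CountDownLatch neverReleased = new CountDownLatch(1);

        // The loop thread handles events strictly one after another.
        Thread loop = new Thread(() -> {
            try {
                while (true) {
                    events.take().run();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // leave the loop
            }
        }, "event-loop");
        loop.setDaemon(true);
        loop.start();

        // First event: a watcher that either returns at once or blocks indefinitely.
        events.add(() -> {
            if (firstWatcherHangs) {
                try {
                    neverReleased.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        // Second event: the connection notification, queued behind the watcher.
        events.add(connected::countDown);

        try {
            return connected.await(waitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            loop.interrupt(); // clean up the sketch's loop thread
        }
    }

    public static void main(String[] args) {
        System.out.println("healthy watcher: " + connectedWithin(false, 3000));
        System.out.println("hung watcher: " + connectedWithin(true, 300));
    }
}
```

With a hung watcher at the head of the queue, raising the wait time does not help in this model, which would be consistent with the connection failing even though the zk server itself started successfully.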

