Posted to dev@hbase.apache.org by "Duo Zhang (Jira)" <ji...@apache.org> on 2022/07/14 00:58:00 UTC

[jira] [Resolved] (HBASE-27192) The retry number for TestSeparateClientZKCluster is too small

     [ https://issues.apache.org/jira/browse/HBASE-27192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Duo Zhang resolved HBASE-27192.
-------------------------------
    Fix Version/s: 2.5.0
                   3.0.0-alpha-4
                   2.4.14
     Hadoop Flags: Reviewed
       Resolution: Fixed

Pushed to branch-2.4+.

Thanks [~GeorryHuang] for reviewing!

> The retry number for TestSeparateClientZKCluster is too small
> -------------------------------------------------------------
>
>                 Key: HBASE-27192
>                 URL: https://issues.apache.org/jira/browse/HBASE-27192
>             Project: HBase
>          Issue Type: Bug
>          Components: test, Zookeeper
>            Reporter: Duo Zhang
>            Assignee: Duo Zhang
>            Priority: Major
>             Fix For: 2.5.0, 3.0.0-alpha-4, 2.4.14
>
>
> The retry number is only 2. According to the log output, a request fails within 600ms, which is too short for testMetaMoveDuringClientZkClusterRestart: this method shuts down the client ZooKeeper, and the retry interval when updating ZooKeeper is much longer, usually several seconds. For example:
> {noformat}
> 2022-07-11T00:51:09,998 DEBUG [ClientZKUpdater-/hbase/meta-region-server] zookeeper.RecoverableZooKeeper(303): Retry, connectivity issue (JVM Pause?); quorum=localhost:21828,exceptionorg.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/meta-region-server=
> 2022-07-11T00:51:11,187 DEBUG [ClientZKUpdater-/hbase/meta-region-server] zookeeper.RecoverableZooKeeper(303): Retry, connectivity issue (JVM Pause?); quorum=localhost:21828,exceptionorg.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/meta-region-server=
> 2022-07-11T00:51:13,617 WARN  [HBase-Metrics2-1] impl.MetricsConfig(136): Cannot locate configuration: tried hadoop-metrics2-hbase.properties,hadoop-metrics2.properties
> 2022-07-11T00:51:13,852 DEBUG [HBase-Metrics2-1] regionserver.MetricsTableSourceImpl(130): Creating new MetricsTableSourceImpl for table 'hbase:meta'
> 2022-07-11T00:51:13,853 DEBUG [HBase-Metrics2-1] regionserver.MetricsTableSourceImpl(130): Creating new MetricsTableSourceImpl for table 'testAsyncTable'
> 2022-07-11T00:51:13,854 DEBUG [HBase-Metrics2-1] regionserver.MetricsTableSourceImpl(130): Creating new MetricsTableSourceImpl for table 'testMetaMoveDuringClientZkClusterRestart'
> 2022-07-11T00:51:14,124 ERROR [ClientZKUpdater-/hbase/meta-region-server] zookeeper.RecoverableZooKeeper(300): ZooKeeper setData failed after 2 attempts
> 2022-07-11T00:51:14,124 DEBUG [ClientZKUpdater-/hbase/meta-region-server] zksyncer.ClientZKSyncer(179): Failed to set data for /hbase/meta-region-server to client ZK, will retry later
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/meta-region-server
> 	at org.apache.zookeeper.KeeperException.create(KeeperException.java:102) ~[zookeeper-3.5.7.jar:3.5.7]
> 	at org.apache.zookeeper.KeeperException.create(KeeperException.java:54) ~[zookeeper-3.5.7.jar:3.5.7]
> 	at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:2384) ~[zookeeper-3.5.7.jar:3.5.7]
> 	at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.setData(RecoverableZooKeeper.java:428) ~[hbase-zookeeper-3.0.0-alpha-4-SNAPSHOT.jar:3.0.0-alpha-4-SNAPSHOT]
> 	at org.apache.hadoop.hbase.zookeeper.ZKUtil.setData(ZKUtil.java:558) ~[hbase-zookeeper-3.0.0-alpha-4-SNAPSHOT.jar:3.0.0-alpha-4-SNAPSHOT]
> 	at org.apache.hadoop.hbase.zookeeper.ZKUtil.setData(ZKUtil.java:603) ~[hbase-zookeeper-3.0.0-alpha-4-SNAPSHOT.jar:3.0.0-alpha-4-SNAPSHOT]
> 	at org.apache.hadoop.hbase.zookeeper.ZKUtil.setData(ZKUtil.java:597) ~[hbase-zookeeper-3.0.0-alpha-4-SNAPSHOT.jar:3.0.0-alpha-4-SNAPSHOT]
> 	at org.apache.hadoop.hbase.master.zksyncer.ClientZKSyncer.setDataForClientZkUntilSuccess(ClientZKSyncer.java:175) ~[classes/:?]
> 	at org.apache.hadoop.hbase.master.zksyncer.ClientZKSyncer.access$300(ClientZKSyncer.java:45) ~[classes/:?]
> 	at org.apache.hadoop.hbase.master.zksyncer.ClientZKSyncer$ClientZkUpdater.run(ClientZKSyncer.java:319) ~[classes/:?]
> {noformat}
> The first retry is logged at 09.998, the second at 11.187, and the final failure at 14.124, so the attempts are spaced seconds apart.
> Let's just remove the line which sets the retry number to 2.
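
To make the timing argument concrete, the gaps between the three ClientZKUpdater timestamps quoted above can be computed directly. This is only an illustrative sketch: the class name `RetryGaps` and the method `gapsMillis` are hypothetical helpers, not part of HBase or of the fix, and the timestamp pattern is assumed from the log excerpt.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not part of HBase.
final class RetryGaps {
    // Timestamp layout assumed from the log excerpt, e.g. 2022-07-11T00:51:09,998
    private static final DateTimeFormatter FMT =
        DateTimeFormatter.ofPattern("uuuu-MM-dd'T'HH:mm:ss,SSS");

    /** Returns the gaps, in milliseconds, between consecutive timestamps. */
    static List<Long> gapsMillis(List<String> timestamps) {
        List<Long> gaps = new ArrayList<>();
        for (int i = 1; i < timestamps.size(); i++) {
            LocalDateTime prev = LocalDateTime.parse(timestamps.get(i - 1), FMT);
            LocalDateTime cur = LocalDateTime.parse(timestamps.get(i), FMT);
            gaps.add(Duration.between(prev, cur).toMillis());
        }
        return gaps;
    }

    public static void main(String[] args) {
        // The two "Retry, connectivity issue" lines and the final
        // "setData failed after 2 attempts" line from the excerpt.
        List<String> stamps = Arrays.asList(
            "2022-07-11T00:51:09,998",
            "2022-07-11T00:51:11,187",
            "2022-07-11T00:51:14,124");
        // Each gap is well above the 600ms mentioned in the description.
        System.out.println(gapsMillis(stamps)); // prints [1189, 2937]
    }
}
```

Both gaps are well over a second, which is consistent with the description's point that the ZooKeeper update retries take far longer than the 600ms in which a request fails.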



--
This message was sent by Atlassian Jira
(v8.20.10#820010)