Posted to dev@phoenix.apache.org by "Xinyi Yan (Jira)" <ji...@apache.org> on 2020/10/27 19:46:00 UTC

[jira] [Updated] (PHOENIX-4216) Figure out why tests randomly fail with master not able to initialize in 200 seconds

     [ https://issues.apache.org/jira/browse/PHOENIX-4216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xinyi Yan updated PHOENIX-4216:
-------------------------------
    Fix Version/s:     (was: 4.16.0)
                   4.17.0
                   4.16.1

> Figure out why tests randomly fail with master not able to initialize in 200 seconds
> ------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-4216
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4216
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 5.0.0, 4.15.0, 4.14.3
>            Reporter: Samarth Jain
>            Priority: Major
>              Labels: phoenix-hardening, precommit, quality-improvement
>             Fix For: 5.1.0, 4.16.1, 4.17.0
>
>         Attachments: Precommit-3849.log
>
>
> Sample failure:
>  [https://builds.apache.org/job/PreCommit-PHOENIX-Build/1450//testReport/]
> [~apurtell] - Looking at the thread dump in the above link, do you see why master startup failed? I couldn't see any obvious deadlocks.
>  
> Exception stacktrace:
> org.apache.hadoop.hbase.regionserver.HRegionServer(2414): Master rejected startup because clock is out of sync
> org.apache.hadoop.hbase.ClockOutOfSyncException: org.apache.hadoop.hbase.ClockOutOfSyncException: Server 2a3b1691db3a,42899,1590685404919 has been rejected; Reported time is too far out of sync with master.  Time difference of 1590685396313ms > max allowed of 30000ms
>     at org.apache.hadoop.hbase.master.ServerManager.checkClockSkew(ServerManager.java:411)
>     at org.apache.hadoop.hbase.master.ServerManager.regionServerStartup(ServerManager.java:277)
>     at org.apache.hadoop.hbase.master.MasterRpcServices.regionServerStartup(MasterRpcServices.java:368)
>     at org.apache.hadoop.hbase.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:8615)
>     at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2417)
>     at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124)
>     at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:186)
>     at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:166)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>     at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>     at org.apache.hadoop.hbase.ipc.RemoteWithExtrasException.instantiateException(RemoteWithExtrasException.java:95)
>     at org.apache.hadoop.hbase.ipc.RemoteWithExtrasException.unwrapRemoteException(RemoteWithExtrasException.java:85)
>     at org.apache.hadoop.hbase.protobuf.ProtobufUtil.makeIOExceptionOfException(ProtobufUtil.java:372)
>     at org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:331)
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.reportForDuty(HRegionServer.java:2412)
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:960)
>     at org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer.runRegionServer(MiniHBaseCluster.java:158)
>     at org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer.access$000(MiniHBaseCluster.java:110)
>     at org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer$1.run(MiniHBaseCluster.java:142)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:360)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1744)
>     at org.apache.hadoop.hbase.security.User$SecureHadoopUser.runAs(User.java:334)
>     at org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer.run(MiniHBaseCluster.java:139)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.ClockOutOfSyncException): org.apache.hadoop.hbase.ClockOutOfSyncException: Server 2a3b1691db3a,42899,1590685404919 has been rejected; Reported time is too far out of sync with master.  Time difference of 1590685396313ms > max allowed of 30000ms
>     at org.apache.hadoop.hbase.master.ServerManager.checkClockSkew(ServerManager.java:411)
>     at org.apache.hadoop.hbase.master.ServerManager.regionServerStartup(ServerManager.java:277)
>     at org.apache.hadoop.hbase.master.MasterRpcServices.regionServerStartup(MasterRpcServices.java:368)
>     at org.apache.hadoop.hbase.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:8615)
>     at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2417)
>     at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124)
>     at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:186)
>     at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:166)
>     at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1291)
>     at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:231)
>     at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:340)
>     at org.apache.hadoop.hbase.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$BlockingStub.regionServerStartup(RegionServerStatusProtos.java:8982)
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.reportForDuty(HRegionServer.java:2410)
>     ... 10 more
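
A note on the numbers in the exception message, with a small Java sketch of the comparison it reports. The sketch is illustrative only: ClockSkewSketch, skewMillis and isRejected are hypothetical names, not the HBase ServerManager code, and it assumes the third field of the server name (1590685404919) is an epoch-millisecond start timestamp. Under that assumption the reported skew (1590685396313 ms) is within about nine seconds of a full epoch timestamp, which is what the arithmetic would give if one of the two compared clocks read close to zero. This is an observation about the numbers, not a diagnosis of the failure.

```java
// Hypothetical sketch of a clock-skew check; not the HBase ServerManager source.
final class ClockSkewSketch {
    // 30000 ms is the "max allowed" value printed in the exception message.
    static final long MAX_SKEW_MS = 30_000L;

    /** Absolute difference between the master's clock and the reported time. */
    static long skewMillis(long masterTimeMs, long reportedTimeMs) {
        return Math.abs(masterTimeMs - reportedTimeMs);
    }

    /** True when the skew exceeds the allowed maximum. */
    static boolean isRejected(long masterTimeMs, long reportedTimeMs) {
        return skewMillis(masterTimeMs, reportedTimeMs) > MAX_SKEW_MS;
    }

    public static void main(String[] args) {
        long startCode = 1590685404919L;    // third field of the server name in the log
        long reportedSkew = 1590685396313L; // "Time difference of ..." in the log

        // Gap between the start timestamp and the reported skew, in ms.
        System.out.println(startCode - reportedSkew); // prints 8606

        // Assumed scenario: one clock reads a normal epoch time, the other
        // reads only a few seconds past zero. The skew is then roughly a
        // whole epoch timestamp and the check rejects the server.
        System.out.println(isRejected(startCode, 8606L)); // prints true

        // A real skew of five seconds would pass the same check.
        System.out.println(isRejected(startCode, startCode - 5_000L)); // prints false
    }
}
```

If that reading holds, a near-zero clock value on one side would be a more likely place to look than ordinary drift between hosts, since both master and region server run in the same mini cluster here.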



--
This message was sent by Atlassian Jira
(v8.3.4#803005)