Posted to issues@ozone.apache.org by "Arpit Agarwal (Jira)" <ji...@apache.org> on 2020/06/01 17:04:00 UTC

[jira] [Updated] (HDDS-1807) TestWatchForCommit#testWatchForCommitForRetryfailure fails as a result of no leader election for extended period of time

     [ https://issues.apache.org/jira/browse/HDDS-1807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arpit Agarwal updated HDDS-1807:
--------------------------------
    Labels: Triaged  (was: )

> TestWatchForCommit#testWatchForCommitForRetryfailure fails as a result of no leader election for extended period of time 
> -------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDDS-1807
>                 URL: https://issues.apache.org/jira/browse/HDDS-1807
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: Ozone Client
>            Reporter: Shashikant Banerjee
>            Assignee: Shashikant Banerjee
>            Priority: Major
>              Labels: Triaged
>
> {code:java}
> org.apache.ratis.protocol.RaftRetryFailureException: Failed RaftClientRequest:client-6C83DC527A4C->73bdd98d-b003-44ff-a45b-bd12dfd50509@group-75C642DF7AE9, cid=55, seq=1*, RW, org.apache.hadoop.hdds.scm.XceiverClientRatis$$Lambda$407/213850519@1a8843a2 for 10 attempts with RetryLimited(maxAttempts=10, sleepTime=1000ms)
> Stacktrace
> java.util.concurrent.ExecutionException: org.apache.ratis.protocol.RaftRetryFailureException: Failed RaftClientRequest:client-6C83DC527A4C->73bdd98d-b003-44ff-a45b-bd12dfd50509@group-75C642DF7AE9, cid=55, seq=1*, RW, org.apache.hadoop.hdds.scm.XceiverClientRatis$$Lambda$407/213850519@1a8843a2 for 10 attempts with RetryLimited(maxAttempts=10, sleepTime=1000ms)
> 	at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
> 	at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
> 	at org.apache.hadoop.ozone.client.rpc.TestWatchForCommit.testWatchForCommitForRetryfailure(TestWatchForCommit.java:345)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:498)
> 	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
> 	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> 	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
> 	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> 	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
> 	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
> 	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
> 	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
> 	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
> 	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
> 	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
> 	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
> 	at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
> 	at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
> 	at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
> 	at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
> 	at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
> 	at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
> 	at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
> 	at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
> 	at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
> {code}
> The client here retries 10 times with a delay of 1 sec between each retry, but leader election could not complete.
> {code:java}
> 2019-07-12 19:30:46,451 INFO  client.GrpcClientProtocolClient (GrpcClientProtocolClient.java:onNext(255)) - client-6C83DC527A4C->5931fd83-b899-480e-b15a-ecb8e7f7dd46: receive RaftClientReply:client-6C83DC527A4C->5931fd83-b899-480e-b15a-ecb8e7f7dd46@group-75C642DF7AE9, cid=55, FAILED org.apache.ratis.protocol.NotLeaderException: Server 5931fd83-b899-480e-b15a-ecb8e7f7dd46 is not the leader (null). Request must be sent to leader., logIndex=0, commits[5931fd83-b899-480e-b15a-ecb8e7f7dd46:c-1]
> 2019-07-12 19:30:47,469 INFO  client.GrpcClientProtocolClient (GrpcClientProtocolClient.java:onNext(255)) - client-6C83DC527A4C->d83929f1-c4db-499d-b67f-ad7f10dd7dde: receive RaftClientReply:client-6C83DC527A4C->d83929f1-c4db-499d-b67f-ad7f10dd7dde@group-75C642DF7AE9, cid=55, FAILED org.apache.ratis.protocol.NotLeaderException: Server d83929f1-c4db-499d-b67f-ad7f10dd7dde is not the leader (null). Request must be sent to leader., logIndex=0, commits[d83929f1-c4db-499d-b67f-ad7f10dd7dde:c-1]
> 2019-07-12 19:30:48,504 INFO  client.GrpcClientProtocolClient (GrpcClientProtocolClient.java:onNext(255)) - client-6C83DC527A4C->5931fd83-b899-480e-b15a-ecb8e7f7dd46: receive RaftClientReply:client-6C83DC527A4C->5931fd83-b899-480e-b15a-ecb8e7f7dd46@group-75C642DF7AE9, cid=55, FAILED org.apache.ratis.protocol.NotLeaderException: Server 5931fd83-b899-480e-b15a-ecb8e7f7dd46 is not the leader (null). Request must be sent to leader., logIndex=0, commits[5931fd83-b899-480e-b15a-ecb8e7f7dd46:c-1]
> 2019-07-12 19:30:49,540 INFO  client.GrpcClientProtocolClient (GrpcClientProtocolClient.java:onNext(255)) - client-6C83DC527A4C->73bdd98d-b003-44ff-a45b-bd12dfd50509: receive RaftClientReply:client-6C83DC527A4C->73bdd98d-b003-44ff-a45b-bd12dfd50509@group-75C642DF7AE9, cid=55, FAILED org.apache.ratis.protocol.NotLeaderException: Server 73bdd98d-b003-44ff-a45b-bd12dfd50509 is not the leader (null). Request must be sent to leader., 
> {code}
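
The retry behaviour described above (up to 10 attempts with a fixed sleep between them, ending in a retry-failure error when no leader is ever found) can be illustrated with a small standalone sketch. This is not the Ratis implementation or its API: the class `LimitedRetrySketch`, the method `callWithRetry`, and the exception `RetryFailureException` are hypothetical names made up for this illustration, and an empty `Optional` is only a stand-in for a `NotLeaderException` reply. The demo uses a 10 ms sleep instead of the 1000 ms seen in the log so that it finishes quickly.

```java
import java.util.Optional;
import java.util.function.Supplier;

/** Hypothetical sketch of a count-limited retry loop with a fixed sleep; not Ratis code. */
class LimitedRetrySketch {

  /** Hypothetical stand-in for the retry-failure error surfaced to the caller. */
  static class RetryFailureException extends RuntimeException {
    RetryFailureException(String message) {
      super(message);
    }
  }

  /**
   * Calls {@code attempt} up to {@code maxAttempts} times, sleeping
   * {@code sleepMillis} between tries. An empty Optional models a
   * "not the leader" reply; a present value models a successful reply.
   */
  static <T> T callWithRetry(Supplier<Optional<T>> attempt, int maxAttempts, long sleepMillis) {
    for (int i = 1; i <= maxAttempts; i++) {
      Optional<T> reply = attempt.get();
      if (reply.isPresent()) {
        return reply.get();
      }
      if (i < maxAttempts) {
        try {
          Thread.sleep(sleepMillis);
        } catch (InterruptedException e) {
          // Restore the interrupt flag and give up early.
          Thread.currentThread().interrupt();
          throw new RetryFailureException("Interrupted after " + i + " attempts");
        }
      }
    }
    throw new RetryFailureException("Failed for " + maxAttempts + " attempts");
  }

  public static void main(String[] args) {
    int[] calls = {0};
    try {
      // No leader is ever elected in this simulation, so every attempt fails.
      callWithRetry(() -> {
        calls[0]++;
        return Optional.<String>empty();
      }, 10, 10L);
    } catch (RetryFailureException e) {
      System.out.println(e.getMessage() + " (calls made: " + calls[0] + ")");
    }
  }
}
```

With a policy of this shape, the total time the client waits is bounded by roughly the attempt count times the sleep time, so an election that takes longer than that window makes the request fail even if a leader would eventually be chosen.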



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
