Posted to issues@ozone.apache.org by "Arpit Agarwal (Jira)" <ji...@apache.org> on 2020/06/01 01:40:00 UTC

[jira] [Updated] (HDDS-904) RATIS group not found thrown on datanodes while leader election

     [ https://issues.apache.org/jira/browse/HDDS-904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arpit Agarwal updated HDDS-904:
-------------------------------
    Target Version/s: 0.6.0
              Labels: TriagePending  (was: )

> RATIS group not found thrown on datanodes while leader election
> ---------------------------------------------------------------
>
>                 Key: HDDS-904
>                 URL: https://issues.apache.org/jira/browse/HDDS-904
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: Ozone Datanode, SCM
>            Reporter: Nilotpal Nandi
>            Priority: Major
>              Labels: TriagePending
>         Attachments: datanode_1.log, datanode_2.log, datanode_3.log, scm.log
>
>
> The following exception was seen in datanode.log of one of the docker nodes
> ---------------------------------------------------------------------------------------------
> {noformat}
> 2018-12-06 09:32:11 INFO LeaderElection:127 - 0e3aa95d-ab51-4b20-9bff-3f7bd7df0500: begin an election in Term 1
> 2018-12-06 09:32:12 INFO LeaderElection:46 - 0e3aa95d-ab51-4b20-9bff-3f7bd7df0500: Election TIMEOUT; received 0 response(s) [] and 0 exception(s); 0e3aa95d-ab51-4b20-9bff-3f7bd7df0500:t1, leader=null, voted=0e3aa95d-ab51-4b20-9bff-3f7bd7df0500, raftlog=0e3aa95d-ab51-4b20-9bff-3f7bd7df0500-SegmentedRaftLog:OPENED, conf=-1: [76153aab-4681-40b6-bc32-cc9ed5ef1daf:192.168.0.7:9858, 79ca7251-7514-4c53-968c-ade59d6df07b:192.168.0.6:9858, 0e3aa95d-ab51-4b20-9bff-3f7bd7df0500:192.168.0.4:9858], old=null
> 2018-12-06 09:32:13 INFO LeaderElection:127 - 0e3aa95d-ab51-4b20-9bff-3f7bd7df0500: begin an election in Term 2
> 2018-12-06 09:32:13 INFO LeaderElection:230 - 0e3aa95d-ab51-4b20-9bff-3f7bd7df0500 got exception when requesting votes: {}
> java.util.concurrent.ExecutionException: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: INTERNAL: 76153aab-4681-40b6-bc32-cc9ed5ef1daf: group-41B8C34A6DE4 not found.
>  at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>  at java.util.concurrent.FutureTask.get(FutureTask.java:192)
>  at org.apache.ratis.server.impl.LeaderElection.waitForResults(LeaderElection.java:214)
>  at org.apache.ratis.server.impl.LeaderElection.askForVotes(LeaderElection.java:146)
>  at org.apache.ratis.server.impl.LeaderElection.run(LeaderElection.java:102)
> Caused by: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: INTERNAL: 76153aab-4681-40b6-bc32-cc9ed5ef1daf: group-41B8C34A6DE4 not found.
>  at org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:222)
>  at org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:203)
>  at org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:132)
>  at org.apache.ratis.proto.grpc.RaftServerProtocolServiceGrpc$RaftServerProtocolServiceBlockingStub.requestVote(RaftServerProtocolServiceGrpc.java:265)
>  at org.apache.ratis.grpc.server.GrpcServerProtocolClient.requestVote(GrpcServerProtocolClient.java:63)
>  at org.apache.ratis.grpc.server.GrpcService.requestVote(GrpcService.java:150)
>  at org.apache.ratis.server.impl.LeaderElection.lambda$submitRequests$0(LeaderElection.java:188)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)
> 2018-12-06 09:32:14 INFO LeaderElection:46 - 0e3aa95d-ab51-4b20-9bff-3f7bd7df0500: Election TIMEOUT; received 0 response(s) [] and 1 exception(s); 0e3aa95d-ab51-4b20-9bff-3f7bd7df0500:t2, leader=null, voted=0e3aa95d-ab51-4b20-9bff-3f7bd7df0500, raftlog=0e3aa95d-ab51-4b20-9bff-3f7bd7df0500-SegmentedRaftLog:OPENED, conf=-1: [76153aab-4681-40b6-bc32-cc9ed5ef1daf:192.168.0.7:9858, 79ca7251-7514-4c53-968c-ade59d6df07b:192.168.0.6:9858, 0e3aa95d-ab51-4b20-9bff-3f7bd7df0500:192.168.0.4:9858], old=null{noformat}
>  
> cc: [~ljain]
> All logs are attached.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org