Posted to issues@ozone.apache.org by "Ethan Rose (Jira)" <ji...@apache.org> on 2021/10/20 20:35:10 UTC

[jira] [Updated] (HDDS-904) RATIS group not found thrown on datanodes while leader election

     [ https://issues.apache.org/jira/browse/HDDS-904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ethan Rose updated HDDS-904:
----------------------------
    Target Version/s: 1.3.0  (was: 1.2.0)

I am managing the 1.2.0 release, and we currently have more than 600 issues targeted for 1.2.0. I am moving the target field to 1.3.0.

If you are actively working on this Jira and believe it should be targeted for the 1.2.0 release, please reach out to me via Apache email or Slack.

> RATIS group not found thrown on datanodes while leader election
> ---------------------------------------------------------------
>
>                 Key: HDDS-904
>                 URL: https://issues.apache.org/jira/browse/HDDS-904
>             Project: Apache Ozone
>          Issue Type: Bug
>          Components: Ozone Datanode, SCM
>            Reporter: Nilotpal Nandi
>            Priority: Major
>              Labels: TriagePending
>         Attachments: datanode_1.log, datanode_2.log, datanode_3.log, scm.log
>
>
> The following exception was seen in datanode.log on one of the Docker nodes:
> ---------------------------------------------------------------------------------------------
> {noformat}
> 2018-12-06 09:32:11 INFO LeaderElection:127 - 0e3aa95d-ab51-4b20-9bff-3f7bd7df0500: begin an election in Term 1
> 2018-12-06 09:32:12 INFO LeaderElection:46 - 0e3aa95d-ab51-4b20-9bff-3f7bd7df0500: Election TIMEOUT; received 0 response(s) [] and 0 exception(s); 0e3aa95d-ab51-4b20-9bff-3f7bd7df0500:t1, leader=null, voted=0e3aa95d-ab51-4b20-9bff-3f7bd7df0500, raftlog=0e3aa95d-ab51-4b20-9bff-3f7bd7df0500-SegmentedRaftLog:OPENED, conf=-1: [76153aab-4681-40b6-bc32-cc9ed5ef1daf:192.168.0.7:9858, 79ca7251-7514-4c53-968c-ade59d6df07b:192.168.0.6:9858, 0e3aa95d-ab51-4b20-9bff-3f7bd7df0500:192.168.0.4:9858], old=null
> 2018-12-06 09:32:13 INFO LeaderElection:127 - 0e3aa95d-ab51-4b20-9bff-3f7bd7df0500: begin an election in Term 2
> 2018-12-06 09:32:13 INFO LeaderElection:230 - 0e3aa95d-ab51-4b20-9bff-3f7bd7df0500 got exception when requesting votes: {}
> java.util.concurrent.ExecutionException: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: INTERNAL: 76153aab-4681-40b6-bc32-cc9ed5ef1daf: group-41B8C34A6DE4 not found.
>  at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>  at java.util.concurrent.FutureTask.get(FutureTask.java:192)
>  at org.apache.ratis.server.impl.LeaderElection.waitForResults(LeaderElection.java:214)
>  at org.apache.ratis.server.impl.LeaderElection.askForVotes(LeaderElection.java:146)
>  at org.apache.ratis.server.impl.LeaderElection.run(LeaderElection.java:102)
> Caused by: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: INTERNAL: 76153aab-4681-40b6-bc32-cc9ed5ef1daf: group-41B8C34A6DE4 not found.
>  at org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:222)
>  at org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:203)
>  at org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:132)
>  at org.apache.ratis.proto.grpc.RaftServerProtocolServiceGrpc$RaftServerProtocolServiceBlockingStub.requestVote(RaftServerProtocolServiceGrpc.java:265)
>  at org.apache.ratis.grpc.server.GrpcServerProtocolClient.requestVote(GrpcServerProtocolClient.java:63)
>  at org.apache.ratis.grpc.server.GrpcService.requestVote(GrpcService.java:150)
>  at org.apache.ratis.server.impl.LeaderElection.lambda$submitRequests$0(LeaderElection.java:188)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)
> 2018-12-06 09:32:14 INFO LeaderElection:46 - 0e3aa95d-ab51-4b20-9bff-3f7bd7df0500: Election TIMEOUT; received 0 response(s) [] and 1 exception(s); 0e3aa95d-ab51-4b20-9bff-3f7bd7df0500:t2, leader=null, voted=0e3aa95d-ab51-4b20-9bff-3f7bd7df0500, raftlog=0e3aa95d-ab51-4b20-9bff-3f7bd7df0500-SegmentedRaftLog:OPENED, conf=-1: [76153aab-4681-40b6-bc32-cc9ed5ef1daf:192.168.0.7:9858, 79ca7251-7514-4c53-968c-ade59d6df07b:192.168.0.6:9858, 0e3aa95d-ab51-4b20-9bff-3f7bd7df0500:192.168.0.4:9858], old=null
> {noformat}
>  
> cc - [~ljain]
> All logs are attached.


