Posted to hdfs-dev@hadoop.apache.org by "Nilotpal Nandi (JIRA)" <ji...@apache.org> on 2018/12/06 11:08:00 UTC

[jira] [Created] (HDDS-904) RATIS group not found thrown on datanodes while leader election

Nilotpal Nandi created HDDS-904:
-----------------------------------

             Summary: RATIS group not found thrown on datanodes while leader election
                 Key: HDDS-904
                 URL: https://issues.apache.org/jira/browse/HDDS-904
             Project: Hadoop Distributed Data Store
          Issue Type: Bug
          Components: Ozone Datanode, SCM
            Reporter: Nilotpal Nandi
         Attachments: datanode_1.log, datanode_2.log, datanode_3.log, scm.log

The following exception was seen in datanode.log on one of the docker nodes:

---------------------------------------------------------------------------------------------
{noformat}
2018-12-06 09:32:11 INFO LeaderElection:127 - 0e3aa95d-ab51-4b20-9bff-3f7bd7df0500: begin an election in Term 1
2018-12-06 09:32:12 INFO LeaderElection:46 - 0e3aa95d-ab51-4b20-9bff-3f7bd7df0500: Election TIMEOUT; received 0 response(s) [] and 0 exception(s); 0e3aa95d-ab51-4b20-9bff-3f7bd7df0500:t1, leader=null, voted=0e3aa95d-ab51-4b20-9bff-3f7bd7df0500, raftlog=0e3aa95d-ab51-4b20-9bff-3f7bd7df0500-SegmentedRaftLog:OPENED, conf=-1: [76153aab-4681-40b6-bc32-cc9ed5ef1daf:192.168.0.7:9858, 79ca7251-7514-4c53-968c-ade59d6df07b:192.168.0.6:9858, 0e3aa95d-ab51-4b20-9bff-3f7bd7df0500:192.168.0.4:9858], old=null
2018-12-06 09:32:13 INFO LeaderElection:127 - 0e3aa95d-ab51-4b20-9bff-3f7bd7df0500: begin an election in Term 2
2018-12-06 09:32:13 INFO LeaderElection:230 - 0e3aa95d-ab51-4b20-9bff-3f7bd7df0500 got exception when requesting votes: {}
java.util.concurrent.ExecutionException: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: INTERNAL: 76153aab-4681-40b6-bc32-cc9ed5ef1daf: group-41B8C34A6DE4 not found.
 at java.util.concurrent.FutureTask.report(FutureTask.java:122)
 at java.util.concurrent.FutureTask.get(FutureTask.java:192)
 at org.apache.ratis.server.impl.LeaderElection.waitForResults(LeaderElection.java:214)
 at org.apache.ratis.server.impl.LeaderElection.askForVotes(LeaderElection.java:146)
 at org.apache.ratis.server.impl.LeaderElection.run(LeaderElection.java:102)
Caused by: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: INTERNAL: 76153aab-4681-40b6-bc32-cc9ed5ef1daf: group-41B8C34A6DE4 not found.
 at org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:222)
 at org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:203)
 at org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:132)
 at org.apache.ratis.proto.grpc.RaftServerProtocolServiceGrpc$RaftServerProtocolServiceBlockingStub.requestVote(RaftServerProtocolServiceGrpc.java:265)
 at org.apache.ratis.grpc.server.GrpcServerProtocolClient.requestVote(GrpcServerProtocolClient.java:63)
 at org.apache.ratis.grpc.server.GrpcService.requestVote(GrpcService.java:150)
 at org.apache.ratis.server.impl.LeaderElection.lambda$submitRequests$0(LeaderElection.java:188)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
 at java.lang.Thread.run(Thread.java:748)
2018-12-06 09:32:14 INFO LeaderElection:46 - 0e3aa95d-ab51-4b20-9bff-3f7bd7df0500: Election TIMEOUT; received 0 response(s) [] and 1 exception(s); 0e3aa95d-ab51-4b20-9bff-3f7bd7df0500:t2, leader=null, voted=0e3aa95d-ab51-4b20-9bff-3f7bd7df0500, raftlog=0e3aa95d-ab51-4b20-9bff-3f7bd7df0500-SegmentedRaftLog:OPENED, conf=-1: [76153aab-4681-40b6-bc32-cc9ed5ef1daf:192.168.0.7:9858, 79ca7251-7514-4c53-968c-ade59d6df07b:192.168.0.6:9858, 0e3aa95d-ab51-4b20-9bff-3f7bd7df0500:192.168.0.4:9858], old=null
{noformat}

cc - [~ljain]

All logs are attached.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-dev-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-help@hadoop.apache.org