Posted to issues@ratis.apache.org by "Attila Doroszlai (Jira)" <ji...@apache.org> on 2020/01/10 08:46:00 UTC

[jira] [Created] (RATIS-788) Server stuck due to exception while becoming leader

Attila Doroszlai created RATIS-788:
--------------------------------------

             Summary: Server stuck due to exception while becoming leader
                 Key: RATIS-788
                 URL: https://issues.apache.org/jira/browse/RATIS-788
             Project: Ratis
          Issue Type: Bug
          Components: server
            Reporter: Attila Doroszlai


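It seems single-node Ratis can get stuck if an exception is thrown while the server is becoming leader.  {{LeaderElection}} ignores the exception because it has already been shut down after the successful vote, so nothing retries the leader initialization and the server stays in LEADER state without ever becoming ready.  I guess a 3-node Ratis group might be able to recover.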
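
To make the suspected sequence concrete, here is a minimal sketch of the pattern as I read it from the log. It is not the Ratis code: the class {{StuckLeaderSketch}}, its fields, and the {{leaderInit}} callback are hypothetical stand-ins. The assumption is that the election marks itself CLOSING and switches the role to LEADER before the leader initialization runs, and that any exception raised afterwards is swallowed because the election is already CLOSING.

```java
import java.util.ConcurrentModificationException;

// Hypothetical, simplified model of the failure mode; these are NOT the Ratis classes.
class StuckLeaderSketch {
    enum Role { FOLLOWER, CANDIDATE, LEADER }
    enum Lifecycle { RUNNING, CLOSING }

    private Role role = Role.CANDIDATE;
    private Lifecycle electionLifecycle = Lifecycle.RUNNING;
    private boolean leaderReady = false;

    // Stand-in for the leader initialization step (assumed; may throw).
    private final Runnable leaderInit;

    StuckLeaderSketch(Runnable leaderInit) {
        this.leaderInit = leaderInit;
    }

    void runElection() {
        try {
            // Single node: the vote succeeds immediately.
            electionLifecycle = Lifecycle.CLOSING; // election is shut down first
            role = Role.LEADER;                    // role is switched next
            leaderInit.run();                      // initialization may throw here
            leaderReady = true;                    // only reached if nothing was thrown
        } catch (RuntimeException e) {
            if (electionLifecycle != Lifecycle.CLOSING) {
                throw e;
            }
            // Exception is ignored because the election is already CLOSING.
            // Nothing resets the role or retries the initialization.
        }
    }

    // True when the server claims LEADER but never finished initializing.
    boolean isStuck() {
        return role == Role.LEADER && !leaderReady;
    }

    public static void main(String[] args) {
        StuckLeaderSketch server = new StuckLeaderSketch(() -> {
            throw new ConcurrentModificationException();
        });
        server.runElection();
        System.out.println("stuck = " + server.isStuck()); // prints "stuck = true"
    }
}
```

In this sketch a single node has no peer that could time out and start a new election, which is why I would expect the stuck state to persist there, while a larger group might recover.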

{code}
2020-01-09 23:31:35,160 [Thread-95] INFO  impl.FollowerState (FollowerState.java:run(108)) - 6b60526e-eae6-4f33-854d-fa396187085c@group-C5BA1605619E-FollowerState: change to CANDIDATE, lastRpcTime:1117ms, electionTimeout:1103ms
2020-01-09 23:31:35,161 [Thread-95] INFO  impl.RoleInfo (RoleInfo.java:shutdownFollowerState(121)) - 6b60526e-eae6-4f33-854d-fa396187085c: shutdown FollowerState
2020-01-09 23:31:35,161 [Thread-95] INFO  impl.RaftServerImpl (RaftServerImpl.java:setRole(173)) - 6b60526e-eae6-4f33-854d-fa396187085c@group-C5BA1605619E: changes role from  FOLLOWER to CANDIDATE at term 0 for changeToCandidate
2020-01-09 23:31:35,165 [Thread-95] INFO  impl.RoleInfo (RoleInfo.java:updateAndGet(143)) - 6b60526e-eae6-4f33-854d-fa396187085c: start LeaderElection
2020-01-09 23:31:35,176 [6b60526e-eae6-4f33-854d-fa396187085c@group-C5BA1605619E-LeaderElection1] INFO  impl.LeaderElection (LeaderElection.java:askForVotes(206)) - 6b60526e-eae6-4f33-854d-fa396187085c@group-C5BA1605619E-LeaderElection1: begin an election at term 1 for -1: [6b60526e-eae6-4f33-854d-fa396187085c:localhost:9872], old=null
2020-01-09 23:31:35,177 [6b60526e-eae6-4f33-854d-fa396187085c@group-C5BA1605619E-LeaderElection1] INFO  impl.RoleInfo (RoleInfo.java:shutdownLeaderElection(134)) - 6b60526e-eae6-4f33-854d-fa396187085c: shutdown LeaderElection
2020-01-09 23:31:35,178 [6b60526e-eae6-4f33-854d-fa396187085c@group-C5BA1605619E-LeaderElection1] INFO  impl.RaftServerImpl (RaftServerImpl.java:setRole(173)) - 6b60526e-eae6-4f33-854d-fa396187085c@group-C5BA1605619E: changes role from CANDIDATE to LEADER at term 1 for changeToLeader
2020-01-09 23:31:35,178 [6b60526e-eae6-4f33-854d-fa396187085c@group-C5BA1605619E-LeaderElection1] INFO  impl.RaftServerImpl (ServerState.java:setLeader(255)) - 6b60526e-eae6-4f33-854d-fa396187085c@group-C5BA1605619E: change Leader from null to 6b60526e-eae6-4f33-854d-fa396187085c at term 1 for becomeLeader, leader elected after 1269ms
2020-01-09 23:31:35,183 [6b60526e-eae6-4f33-854d-fa396187085c@group-C5BA1605619E-LeaderElection1] INFO  server.RaftServerConfigKeys (ConfUtils.java:logGet(43)) - raft.server.staging.catchup.gap = 1000 (default)
2020-01-09 23:31:35,185 [6b60526e-eae6-4f33-854d-fa396187085c@group-C5BA1605619E-LeaderElection1] INFO  server.RaftServerConfigKeys (ConfUtils.java:logGet(43)) - raft.server.rpc.sleep.time = 25ms (default)
2020-01-09 23:31:35,217 [6b60526e-eae6-4f33-854d-fa396187085c@group-C5BA1605619E-LeaderElection1] INFO  impl.LeaderElection (LeaderElection.java:run(165)) - 6b60526e-eae6-4f33-854d-fa396187085c@group-C5BA1605619E-LeaderElection1: ConcurrentModificationException is safely ignored since this is already CLOSING
java.util.ConcurrentModificationException
	at java.util.ArrayList.forEach(ArrayList.java:1260)
	at org.apache.ratis.metrics.impl.MetricRegistriesImpl.lambda$create$1(MetricRegistriesImpl.java:66)
	at org.apache.ratis.metrics.impl.RefCountingMap.lambda$put$0(RefCountingMap.java:51)
	at java.util.concurrent.ConcurrentHashMap.compute(ConcurrentHashMap.java:1853)
	at org.apache.ratis.metrics.impl.RefCountingMap.put(RefCountingMap.java:46)
	at org.apache.ratis.metrics.impl.MetricRegistriesImpl.create(MetricRegistriesImpl.java:59)
	at org.apache.ratis.server.metrics.RatisMetrics.create(RatisMetrics.java:45)
	at org.apache.ratis.server.metrics.RatisMetrics.getMetricRegistryForLogAppender(RatisMetrics.java:82)
	at org.apache.ratis.server.metrics.LogAppenderMetrics.<init>(LogAppenderMetrics.java:32)
	at org.apache.ratis.server.impl.LeaderState.<init>(LeaderState.java:221)
	at org.apache.ratis.server.impl.RoleInfo.startLeaderState(RoleInfo.java:94)
	at org.apache.ratis.server.impl.RaftServerImpl.changeToLeader(RaftServerImpl.java:348)
	at org.apache.ratis.server.impl.LeaderElection.askForVotes(LeaderElection.java:238)
	at org.apache.ratis.server.impl.LeaderElection.run(LeaderElection.java:161)
	at java.lang.Thread.run(Thread.java:748)
...
2020-01-09 23:31:48,567 ... 6b60526e-eae6-4f33-854d-fa396187085c@group-C5BA1605619E is in LEADER state but not ready yet.
{code}
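
The top frame of the stack trace is {{ArrayList.forEach}}, which fails fast when the list is structurally modified during the iteration. The sketch below only illustrates that JDK behaviour; it is not taken from {{MetricRegistriesImpl}}, and the class and method names are hypothetical. For determinism the list is modified from inside the callback on the same thread, whereas in the log the modification presumably came from another thread. {{CopyOnWriteArrayList}} is shown as one possible way to avoid the exception, not as the fix chosen for this issue.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical demo class; illustrates JDK list behaviour only.
class ForEachModificationSketch {

    // Iterates the list and adds an element from inside the callback.
    // Returns true if the iteration threw ConcurrentModificationException.
    static boolean throwsWhenModifiedDuringForEach(List<String> callbacks) {
        try {
            callbacks.forEach(name -> callbacks.add(name + "-late"));
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> plain = new ArrayList<>();
        plain.add("reporter");
        // ArrayList.forEach detects the structural change and throws.
        System.out.println("ArrayList throws: "
            + throwsWhenModifiedDuringForEach(plain));

        List<String> copyOnWrite = new CopyOnWriteArrayList<>();
        copyOnWrite.add("reporter");
        // CopyOnWriteArrayList iterates over a snapshot, so no exception.
        System.out.println("CopyOnWriteArrayList throws: "
            + throwsWhenModifiedDuringForEach(copyOnWrite));
    }
}
```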



--
This message was sent by Atlassian Jira
(v8.3.4#803005)