You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@iotdb.apache.org by GitBox <gi...@apache.org> on 2021/11/04 08:31:19 UTC

[GitHub] [iotdb] chengjianyun opened a new issue #4320: Would raft-related request get the correct term in the situation?

chengjianyun opened a new issue #4320:
URL: https://github.com/apache/iotdb/issues/4320


   Let's consider the below situation:
   
   ## scenario1
   3 replica, 3 nodes
   ```
   t0 - term=1, node1: leader, node1, node2: follower
   t1 - node3 timeout when checking heartbeat, start election for term=2 and get vote from node1, 
   t2 - node2 timeout when checking heartbeat, going to start election, thread is hang by system, term=1
   t3 - node2 receive election request for term=2, accept the election, vote for node3, update term=2 and became follower
   t4 - node2 continue the election, get lock of `term`, construct an election request with term=3 and send out
   ```
   In the scenario, we can see that node2 timeout in term=1 but started an election for term=3. I think this should be a bug although it won't make system run into some wrong status. 
   
   If the bug in scenario1 is benign, then I think the next one is very serious.
   ## Senario2
   3 replica, 3 nodes
   ```
   t0 - term=1, node1: leader, node1, node2: follower
   t1 - node1 is going to appendLog() to node2, thread is scheduled out by system before construct the AppendEntryRequest.
   t2 - network partition happens between node1 and node2, node3. node2 and node3 can communication with each other.
   t3 - node3 becomes the leader in term=2 after election
   t4 - network partition recover
   t5 - node1 receives the election request from node3, and accept the election, vote for node3, updates term=2
   t6 - node1 continue the appendLog() thread and send appendLog() for term=2
   ```
   In the scenario, node1 has become follower in term=2 but still send out log to node2 with term=2. This will cause node2 run into mess and then ruin the whole raft group.
   
   ## Summary
   After reviewing code, I think both scenarios would happen in the latest code. Correct me if anything is wrong and welcome to discuss. No matter scenario1 or scenario2, the bugs happen we don't handle sync term well. But If we want to solve the bug by sync on term, it's really very hard and may make everything into serialize. Maybe a better way is create a totally new environment for each term as Ratis did.
   
   Actually the scenarios come from our test environment. As our server run into wired status because too many inodes, it run slowly at that time and we found the scenarios from log. Below are the log after organization. Please send email to me for raw logs if you want to analysis.
   
   ### meta group log for scenario1
   ```
   // Node09 received election request from Node08
   2021-10-28 20:55:45,812 [MetaHeartbeatClientThread-3] INFO  o.a.i.c.s.m.RaftMember:1550 - Meta has update it's term to 4 
   2021-10-28 20:55:45,872 [MetaHeartbeatClientThread-3] INFO  o.a.i.c.s.m.RaftMember:1592 - Meta has update it's voteFor to null 
   2021-10-28 20:55:45,872 [MetaHeartbeatClientThread-3] INFO  o.a.i.c.s.m.RaftMember:654 - Meta has been set to null in term 4 
   
   // Node 09 heartbeat timeout, going to start election by set character to ELECTOR, at the time, term should be 3
   2021-10-28 20:55:45,873 [Meta-HeartbeatThread@1634546688149] INFO  o.a.i.c.s.h.HeartbeatThread:97 - Meta: The leader Node(internalIp:10.0.0.10, metaPort:9003, nodeIdentifier:-1501659200, dataPort:40010, clientPort:6667, clientIp:10.0.0.10) timed out 
   
   // Node09 agreed to vote for Node08,change character to follower, term=4
   2021-10-28 20:55:45,873 [MetaHeartbeatClientThread-3] INFO  o.a.i.c.s.m.RaftMember:656 - Meta has become a follower of Node(internalIp:10.0.0.8, metaPort:9003, nodeIdentifier:-1501774531, dataPort:40010, clientPort:6667, clientIp:10.0.0.8) in term 4 
   
   // Node09 election thread continue execution
   2021-10-28 20:55:45,873 [Meta-HeartbeatThread@1634546688149] INFO  o.a.i.c.s.m.RaftMember:634 - Meta has become a ELECTOR 
   2021-10-28 20:55:45,873 [Meta-HeartbeatThread@1634546688149] INFO  o.a.i.c.s.m.RaftMember:654 - Meta has been set to null in term 4 
   2021-10-28 20:55:45,886 [Meta-HeartbeatThread@1634546688149] INFO  o.a.i.c.s.h.HeartbeatThread:128 - Meta: Start elections 
   
   // Node09 received an duplicated election request from Node08, term=4
   2021-10-28 20:55:45,886 [MetaHeartbeatClientThread-4] INFO  o.a.i.c.s.m.RaftMember:1193 - Meta accepted an election request, term:4/4, logIndex:792/792, logTerm:2/2 
   2021-10-28 20:55:45,886 [MetaHeartbeatClientThread-4] INFO  o.a.i.c.s.m.RaftMember:634 - Meta has become a FOLLOWER 
   2021-10-28 20:55:45,887 [MetaHeartbeatClientThread-4] INFO  o.a.i.c.s.m.RaftMember:1592 - Meta has update it's voteFor to Node(internalIp:10.0.0.8, metaPort:9003, nodeIdentifier:-1501774531, dataPort:40010, clientPort:6667, clientIp:10.0.0.8)
   2021-10-28 20:55:45,887 [MetaHeartbeatClientThread-4] INFO  o.a.i.c.s.m.RaftMember:468 - Meta sending response -1 to the elector Node(internalIp:10.0.0.8, metaPort:9003, nodeIdentifier:-1501774531, dataPort:40010, clientPort:6667, clientIp:10.0.0.8) 
   2021-10-28 20:55:45,887 [MetaHeartbeatClientThread-2] INFO  o.a.i.c.s.m.RaftMember:656 - Meta has become a follower of Node(internalIp:10.0.0.8, metaPort:9003, nodeIdentifier:-1501774531, dataPort:40010, clientPort:6667, clientIp:10.0.0.8) in term 4
   
   // Node09 election thread continue execution, start a election in term=5
   2021-10-28 20:55:45,887 [Meta-HeartbeatThread@1634546688149] INFO  o.a.i.c.s.m.RaftMember:1592 - Meta has update it's voteFor to Node(internalIp:10.0.0.9, metaPort:9003, nodeIdentifier:-1501660166, dataPort:40010, clientPort:6667, clientIp:10.0.0.9) 
   2021-10-28 20:55:45,888 [Meta-HeartbeatThread@1634546688149] INFO  o.a.i.c.s.h.HeartbeatThread:282 - Meta: Election 5 starts, quorum: 1 
   2021-10-28 20:55:45,888 [Meta-HeartbeatThread@1634546688149] INFO  o.a.i.c.s.h.HeartbeatThread:314 - Meta: Wait for 20000ms until election time out 
   ....
   ```
   
   ### log of Data(10.0.0.10) group for scenario2
   ```
   // Node09 heartbeat timeout in term=1, update term=2 and start election request
   2021-10-28 20:54:44,005 [Data(10.0.0.10:9003)-HeartbeatThread@1634546693722] INFO  o.a.i.c.s.h.HeartbeatThread:97 - Data(10.0.0.10:9003): The leader Node(internalIp:10.0.0.10, metaPort:9003, nodeIdentifier:-1501659200, dataPort:40010, clientPort:6667, clientIp:10.0.0.10) timed out 
   2021-10-28 20:54:44,005 [Data(10.0.0.10:9003)-HeartbeatThread@1634546693722] INFO  o.a.i.c.s.m.RaftMember:634 - Data(10.0.0.10:9003) has become a ELECTOR 
   2021-10-28 20:54:44,005 [Data(10.0.0.10:9003)-HeartbeatThread@1634546693722] INFO  o.a.i.c.s.m.RaftMember:654 - Data(10.0.0.10:9003) has been set to null in term 1 
   2021-10-28 20:54:44,006 [Data(10.0.0.10:9003)-HeartbeatThread@1634546693722] INFO  o.a.i.c.s.h.HeartbeatThread:128 - Data(10.0.0.10:9003): Start elections 
   2021-10-28 20:54:44,006 [Data(10.0.0.10:9003)-HeartbeatThread@1634546693722] INFO  o.a.i.c.s.m.RaftMember:1592 - Data(10.0.0.10:9003) has update it's voteFor to Node(internalIp:10.0.0.9, metaPort:9003, nodeIdentifier:-1501660166, dataPort:40010, clientPort:6667, clientIp:10.0.0.9) 
   2021-10-28 20:54:44,006 [Data(10.0.0.10:9003)-HeartbeatThread@1634546693722] INFO  o.a.i.c.s.h.HeartbeatThread:282 - Data(10.0.0.10:9003): Election 2 starts, quorum: 1 
   2021-10-28 20:54:44,006 [Data(10.0.0.10:9003)-HeartbeatThread@1634546693722] INFO  o.a.i.c.s.h.HeartbeatThread:314 - Data(10.0.0.10:9003): Wait for 20000ms until election time out 
   
   //Node09 get vote from Node08, became leader in term=2
   2021-10-28 20:54:44,009 [Data(10.0.0.10:9003)-SerialToParallel1] INFO  o.a.i.c.s.h.c.ElectionHandler:89 - Data(10.0.0.10:9003): Received a grant vote from Node(internalIp:10.0.0.8, metaPort:9003, nodeIdentifier:-1501774531, dataPort:40010, clientPort:6667, clientIp:10.0.0.8), remaining votes to succeed: 0 
   2021-10-28 20:54:44,009 [Data(10.0.0.10:9003)-SerialToParallel1] INFO  o.a.i.c.s.h.c.ElectionHandler:100 - Data(10.0.0.10:9003): Election 2 is won 
   2021-10-28 20:54:44,009 [Data(10.0.0.10:9003)-HeartbeatThread@1634546693722] INFO  o.a.i.c.s.h.HeartbeatThread:330 - Data(10.0.0.10:9003): Election 2 accepted 
   2021-10-28 20:54:44,009 [Data(10.0.0.10:9003)-HeartbeatThread@1634546693722] INFO  o.a.i.c.s.m.RaftMember:634 - Data(10.0.0.10:9003) has become a LEADER 
   2021-10-28 20:54:44,009 [Data(10.0.0.10:9003)-HeartbeatThread@1634546693722] INFO  o.a.i.c.s.h.HeartbeatThread:132 - Data(10.0.0.10:9003): End elections 
   
   // Node08 received election request from Node09 for term=2,Node08 accept the election and vote for Node09, became follower
   2021-10-28 20:54:44,014 [DataHeartbeatClientThread-7] INFO  o.a.i.c.s.m.RaftMember:498 - Data(10.0.0.10:9003) received an election from elector Node(internalIp:10.0.0.9, metaPort:9003, nodeIdentifier:-1501660166, dataPort:40010, clientPort:6667, clientIp:10.0.0.9) which has bigger term 2 than localTerm 1, raftMember should step down first and then continue to decide whether to grant it's vote by log status. 
   2021-10-28 20:54:44,014 [DataHeartbeatClientThread-7] INFO  o.a.i.c.s.m.RaftMember:1550 - Data(10.0.0.10:9003) has update it's term to 2 
   2021-10-28 20:54:44,014 [DataHeartbeatClientThread-7] INFO  o.a.i.c.s.m.RaftMember:1592 - Data(10.0.0.10:9003) has update it's voteFor to null 
   2021-10-28 20:54:44,014 [DataHeartbeatClientThread-7] INFO  o.a.i.c.s.m.RaftMember:654 - Data(10.0.0.10:9003) has been set to null in term 2 
   2021-10-28 20:54:44,015 [DataHeartbeatClientThread-7] INFO  o.a.i.c.s.m.DataGroupMember:358 - Data(10.0.0.10:9003) received an dataGroup election request, term:2, metaLastLogIndex:792, metaLastLogTerm:2, dataLastLogIndex:7828302, dataLastLogTerm:1 
   2021-10-28 20:54:44,015 [DataHeartbeatClientThread-7] INFO  o.a.i.c.s.m.DataGroupMember:377 - Data(10.0.0.10:9003) accepted an dataGroup election request, term:2/2, dataLogIndex:7828302/7828302, dataLogTerm:1/1, metaLogIndex:792/792,metaLogTerm:2/2 
   2021-10-28 20:54:44,015 [DataHeartbeatClientThread-7] INFO  o.a.i.c.s.m.RaftMember:1592 - Data(10.0.0.10:9003) has update it's voteFor to Node(internalIp:10.0.0.9, metaPort:9003, nodeIdentifier:-1501660166, dataPort:40010, clientPort:6667, clientIp:10.0.0.9) 
   2021-10-28 20:54:44,015 [DataHeartbeatClientThread-7] INFO  o.a.i.c.s.m.RaftMember:468 - Data(10.0.0.10:9003) sending response -1 to the elector Node(internalIp:10.0.0.9, metaPort:9003, nodeIdentifier:-1501660166, dataPort:40010, clientPort:6667, clientIp:10.0.0.9) 
   2021-10-28 20:54:44,016 [DataHeartbeatClientThread-7] INFO  o.a.i.c.s.m.RaftMember:656 - Data(10.0.0.10:9003) has become a follower of Node(internalIp:10.0.0.9, metaPort:9003, nodeIdentifier:-1501660166, dataPort:40010, clientPort:6667, clientIp:10.0.0.9) in term 2 
   
   // As some log should be lost, I guess:
   // Node10 received heartbeat from Node09 in term=2, Node 10 update its term=2 and became follower
   2021-10-28 20:54:49,133 [DataHeartbeatClientThread-12] INFO  o.a.i.c.s.m.RaftMember:1550 - Data(10.0.0.10:9003) has update it's term to 2 
   2021-10-28 20:54:49,133 [DataHeartbeatClientThread-12] INFO  o.a.i.c.s.m.RaftMember:1592 - Data(10.0.0.10:9003) has update it's voteFor to null 
   2021-10-28 20:54:49,133 [DataHeartbeatClientThread-12] INFO  o.a.i.c.s.m.RaftMember:654 - Data(10.0.0.10:9003) has been set to null in term 2 
   2021-10-28 20:54:49,134 [DataHeartbeatClientThread-12] INFO  o.a.i.c.s.m.RaftMember:634 - Data(10.0.0.10:9003) has become a FOLLOWER 
   2021-10-28 20:54:49,134 [DataHeartbeatClientThread-12] INFO  o.a.i.c.s.m.RaftMember:656 - Data(10.0.0.10:9003) has become a follower of Node(internalIp:10.0.0.9, metaPort:9003, nodeIdentifier:-1501660166, dataPort:40010, clientPort:6667, clientIp:10.0.0.9) in term 2 
   
   // Node09 received raft request like appendLog from Node10 in term = 2,Node09 set character to follower
   2021-10-28 20:54:49,134 [DataClientThread-349] INFO  o.a.i.c.s.m.RaftMember:656 - Data(10.0.0.10:9003) has become a follower of Node(internalIp:10.0.0.10, metaPort:9003, nodeIdentifier:-1501659200, dataPort:40010, clientPort:6667, clientIp:10.0.0.10) in term 2 
   
   // Node10 received election request from Node09 in term=2, it rejected the election as log mismatch
   2021-10-28 20:54:49,151 [DataHeartbeatClientThread-11] INFO  o.a.i.c.s.m.DataGroupMember:358 - Data(10.0.0.10:9003) received an dataGroup election request, term:2, metaLastLogIndex:792, metaLastLogTerm:2, dataLastLogIndex:7828302, dataLastLogTerm:1 
   2021-10-28 20:54:49,162 [DataHeartbeatClientThread-11] INFO  o.a.i.c.s.m.DataGroupMember:395 - Data(10.0.0.10:9003) rejected an dataGroup election request, term:2/2, dataLogIndex:7828302/7828313, dataLogTerm:1/2, metaLogIndex:792/792,metaLogTerm:2/2 
   2021-10-28 20:54:49,162 [DataHeartbeatClientThread-11] INFO  o.a.i.c.s.m.RaftMember:468 - Data(10.0.0.10:9003) sending response -2 to the elector Node(internalIp:10.0.0.9, metaPort:9003, nodeIdentifier:-1501660166, dataPort:40010, clientPort:6667, clientIp:10.0.0.9) 
   
   // Node09 received election for term=2 response from Node10
   2021-10-28 20:54:49,162 [Data(10.0.0.10:9003)-SerialToParallel0] INFO  o.a.i.c.s.h.c.ElectionHandler:79 - Data(10.0.0.10:9003): Terminated election received a election response -2 from Node(internalIp:10.0.0.10, metaPort:9003, nodeIdentifier:-1501659200, dataPort:40010, clientPort:6667, clientIp:10.0.0.10) 
   
   // Node08 became follower as received appendLog request from Node10, set leader to 10, term=2
   2021-10-28 20:54:49,141 [DataClientThread-300] INFO  o.a.i.c.s.m.RaftMember:656 - Data(10.0.0.10:9003) has become a follower of Node(internalIp:10.0.0.10, metaPort:9003, nodeIdentifier:-1501659200, dataPort:40010, clientPort:6667, clientIp:10.0.0.10) in term 2 
   
   // Node08 became follower as received appendLog request from Node10, set leader to 09, term=2
   2021-10-28 20:54:49,158 [DataHeartbeatClientThread-7] INFO  o.a.i.c.s.m.RaftMember:656 - Data(10.0.0.10:9003) has become a follower of Node(internalIp:10.0.0.9, metaPort:9003, nodeIdentifier:-1501660166, dataPort:40010, clientPort:6667, clientIp:10.0.0.9) in term 2 
   ```
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@iotdb.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org