You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@ratis.apache.org by GitBox <gi...@apache.org> on 2020/11/10 00:02:14 UTC

[GitHub] [incubator-ratis] amaliujia opened a new pull request #264: RATIS-1140. Add primary peer id to Raft proto and only create channels with peer for primary servers

amaliujia opened a new pull request #264:
URL: https://github.com/apache/incubator-ratis/pull/264


   ## What changes were proposed in this pull request?
   
   I found this issue when setting up MiniRaftCluster for data stream. In a normal case, every Raft peer can act as a primary server and clients will decide which Raft to use.
   
   However, when all Raft peers can act as a primary. in previous implementation, there will be a deadlock case that a server will receive a request, and then route to other peers, and other peers will do the same.
   
   Thus we should distinguish primary server by adding primary server id into the Raft proto, and do not forward requests to peers when the server is not the primary.
   
   ## What is the link to the Apache JIRA
   
   
   
   Please replace this section with the link to the Apache JIRA)
   
   ## How was this patch tested?
   
   UT
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] szetszwo commented on pull request #264: RATIS-1140. Add primary peer id to Raft proto and only create channels with peer for primary servers

Posted by GitBox <gi...@apache.org>.
szetszwo commented on pull request #264:
URL: https://github.com/apache/incubator-ratis/pull/264#issuecomment-724384251


   @amaliujia , after RATIS-1133, the stream header (DataStreamOutputImpl.header or StreamInfo.reqeust) in each the servers becomes the same.  The primary server can be determined by calling RaftClientRequest.getReplierId().  Therefore, we don't have to add the primary server id to each packet.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] amaliujia commented on a change in pull request #264: RATIS-1140. Add primary peer id to Raft proto and only create channels with peer for primary servers

Posted by GitBox <gi...@apache.org>.
amaliujia commented on a change in pull request #264:
URL: https://github.com/apache/incubator-ratis/pull/264#discussion_r520199444



##########
File path: ratis-netty/src/main/java/org/apache/ratis/netty/server/NettyServerStreamRpc.java
##########
@@ -247,13 +248,13 @@ public void addRaftPeers(Collection<RaftPeer> newPeers) {
     proxies.addPeers(newPeers);
   }
 
-  private StreamInfo newStreamInfo(ByteBuf buf) {
+  private StreamInfo newStreamInfo(ByteBuf buf, boolean isPrimary) {
     try {
       final RaftClientRequest request = ClientProtoUtils.toRaftClientRequest(
           RaftClientRequestProto.parseFrom(buf.nioBuffer()));
       final StateMachine stateMachine = server.getStateMachine(request.getRaftGroupId());
       return new StreamInfo(request, stateMachine.data().stream(request),
-          proxies.getDataStreamOutput(request));
+          isPrimary ? proxies.getDataStreamOutput(request) : Collections.EMPTY_LIST);

Review comment:
       Only primary server should create DataStreamOutputs for peers, otherwise it will build up a deadlock.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] runzhiwang edited a comment on pull request #264: RATIS-1140. Add primary peer id to Raft proto and only create channels with peer for primary servers

Posted by GitBox <gi...@apache.org>.
runzhiwang edited a comment on pull request #264:
URL: https://github.com/apache/incubator-ratis/pull/264#issuecomment-724377078


   > there will be a deadlock case that a server will receive a request, and then route to other peers, and other peers will do the same
   
   Could you explain the deadlock clearer ?  which lock was held and wait for which lock ?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] runzhiwang commented on pull request #264: RATIS-1140. Do not create DataStreamOutput for non-primary server

Posted by GitBox <gi...@apache.org>.
runzhiwang commented on pull request #264:
URL: https://github.com/apache/incubator-ratis/pull/264#issuecomment-724520869


   @amaliujia Thanks the patch. @szetszwo Thanks for review. I have merged it.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] runzhiwang commented on pull request #264: RATIS-1140. Add primary peer id to Raft proto and only create channels with peer for primary servers

Posted by GitBox <gi...@apache.org>.
runzhiwang commented on pull request #264:
URL: https://github.com/apache/incubator-ratis/pull/264#issuecomment-724375773


   @amaliujia PR title is too long. Please notice that.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] amaliujia edited a comment on pull request #264: RATIS-1140. Add primary peer id to Raft proto and only create channels with peer for primary servers

Posted by GitBox <gi...@apache.org>.
amaliujia edited a comment on pull request #264:
URL: https://github.com/apache/incubator-ratis/pull/264#issuecomment-724386492


   > > there will be a deadlock case that a server will receive a request, and then route to other peers, and other peers will do the same
   > 
   > Could you explain the deadlock clearer ? which lock was held and which lock was waited?
   
   For a normal cluster, every server can be a primary server
   
   ```
   A ----B
   \    /
   \   /
     C
   ```
   
   In previous implementation, if A receives a header request, it will forward to B and C. And then B and C will also forward the request to A and C ( A and B). So in this case, A will wait its peers to reply on the header request, same for B and C. Because each peer will wait for other to reply, I will call this is a "deadlock" situation. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] amaliujia commented on a change in pull request #264: RATIS-1140. Do not create DataStreamOutput for non-primary server

Posted by GitBox <gi...@apache.org>.
amaliujia commented on a change in pull request #264:
URL: https://github.com/apache/incubator-ratis/pull/264#discussion_r520276919



##########
File path: ratis-netty/src/main/java/org/apache/ratis/netty/server/NettyServerStreamRpc.java
##########
@@ -247,13 +248,13 @@ public void addRaftPeers(Collection<RaftPeer> newPeers) {
     proxies.addPeers(newPeers);
   }
 
-  private StreamInfo newStreamInfo(ByteBuf buf) {
+  private StreamInfo newStreamInfo(ByteBuf buf, boolean isPrimary) {
     try {
       final RaftClientRequest request = ClientProtoUtils.toRaftClientRequest(
           RaftClientRequestProto.parseFrom(buf.nioBuffer()));
       final StateMachine stateMachine = server.getStateMachine(request.getRaftGroupId());
       return new StreamInfo(request, stateMachine.data().stream(request),
-          proxies.getDataStreamOutput(request));
+          isPrimary ? proxies.getDataStreamOutput(request) : Collections.EMPTY_LIST);

Review comment:
       Ack. Didn't notice that the RaftClientRequest in header has the server id.
   
   Now this PR becomes very simple again :)




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] runzhiwang commented on pull request #264: RATIS-1140. Do not create DataStreamOutput for non-primary server

Posted by GitBox <gi...@apache.org>.
runzhiwang commented on pull request #264:
URL: https://github.com/apache/incubator-ratis/pull/264#issuecomment-724530980


   > @runzhiwang, RATIS-1133 is really great. After it, each server can determine who is the primary server. Therefore, we may remove DataStreamPacketHeaderProto.STREAM_CLOSE_FORWARD. What do you think?
   
   @szetszwo  Thanks a lot. I agree and create a new pr to do it.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] szetszwo commented on a change in pull request #264: RATIS-1140. Add primary peer id to Raft proto and only create channels with peer for primary servers

Posted by GitBox <gi...@apache.org>.
szetszwo commented on a change in pull request #264:
URL: https://github.com/apache/incubator-ratis/pull/264#discussion_r520225210



##########
File path: ratis-netty/src/main/java/org/apache/ratis/netty/server/NettyServerStreamRpc.java
##########
@@ -247,13 +248,13 @@ public void addRaftPeers(Collection<RaftPeer> newPeers) {
     proxies.addPeers(newPeers);
   }
 
-  private StreamInfo newStreamInfo(ByteBuf buf) {
+  private StreamInfo newStreamInfo(ByteBuf buf, boolean isPrimary) {
     try {
       final RaftClientRequest request = ClientProtoUtils.toRaftClientRequest(
           RaftClientRequestProto.parseFrom(buf.nioBuffer()));
       final StateMachine stateMachine = server.getStateMachine(request.getRaftGroupId());
       return new StreamInfo(request, stateMachine.data().stream(request),
-          proxies.getDataStreamOutput(request));
+          isPrimary ? proxies.getDataStreamOutput(request) : Collections.EMPTY_LIST);

Review comment:
       It is a good idea.   We may use
   ```
         final List<DataStreamOutputRpc> outs = server.getId().equals(request.getServerId())?  
             proxies.getDataStreamOutput(request): Collections.emptyList();
         return new StreamInfo(request, stateMachine.data().stream(request), outs);
   ```
   Note that request is RaftClientRequest.
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] runzhiwang merged pull request #264: RATIS-1140. Do not create DataStreamOutput for non-primary server

Posted by GitBox <gi...@apache.org>.
runzhiwang merged pull request #264:
URL: https://github.com/apache/incubator-ratis/pull/264


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] runzhiwang commented on pull request #264: RATIS-1140. Add primary peer id to Raft proto and only create channels with peer for primary servers

Posted by GitBox <gi...@apache.org>.
runzhiwang commented on pull request #264:
URL: https://github.com/apache/incubator-ratis/pull/264#issuecomment-724377078


   > there will be a deadlock case that a server will receive a request, and then route to other peers, and other peers will do the same
   
   Could you explain the deadlock clearer ? 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] runzhiwang commented on pull request #264: RATIS-1140. Add primary peer id to Raft proto and only create channels with peer for primary servers

Posted by GitBox <gi...@apache.org>.
runzhiwang commented on pull request #264:
URL: https://github.com/apache/incubator-ratis/pull/264#issuecomment-724388442


   > > > there will be a deadlock case that a server will receive a request, and then route to other peers, and other peers will do the same
   > > 
   > > 
   > > Could you explain the deadlock clearer ? which lock was held and which lock was waited?
   > 
   > For a normal cluster, every server can be a primary server
   > 
   > ```
   > A ----B
   > \    /
   > \   /
   >   C
   > ```
   > 
   > In previous implementation, if A receives a header request, it will forward to B and C. And then B and C will also forward the request to A and C ( A and B). So in this case, A will wait its peers to reply on the header request, same for B and C. Because each peer will wait for others to reply, I will call this is a "deadlock" situation.
   
   @amaliujia Thanks for explanation.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] amaliujia edited a comment on pull request #264: RATIS-1140. Add primary peer id to Raft proto and only create channels with peer for primary servers

Posted by GitBox <gi...@apache.org>.
amaliujia edited a comment on pull request #264:
URL: https://github.com/apache/incubator-ratis/pull/264#issuecomment-724386492


   > > there will be a deadlock case that a server will receive a request, and then route to other peers, and other peers will do the same
   > 
   > Could you explain the deadlock clearer ? which lock was held and which lock was waited?
   
   For a normal cluster, every server can be a primary server
   
   ```
   A ----B
   \    /
   \   /
     C
   ```
   
   In previous implementation, if A receives a header request, it will forward to B and C. And then B and C will also forward the request to A and C ( A and B). So in this case, A will wait its peers to reply on the header request, same for B and C. Because each peer will wait for others to reply, I will call this is a "deadlock" situation. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] amaliujia commented on a change in pull request #264: RATIS-1140. Add primary peer id to Raft proto and only create channels with peer for primary servers

Posted by GitBox <gi...@apache.org>.
amaliujia commented on a change in pull request #264:
URL: https://github.com/apache/incubator-ratis/pull/264#discussion_r520200313



##########
File path: ratis-test/src/test/java/org/apache/ratis/datastream/DataStreamBaseTest.java
##########
@@ -335,17 +335,18 @@ void setup(List<RaftPeer> peers, List<RaftServer> raftServers) {
     // start stream servers on raft peers.
     for (int i = 0; i < peers.size(); i++) {
       final Server server = new Server(peers.get(i), raftServers.get(i));
-      if (i == 0) {
-        // only the first server routes requests to peers.
-        List<RaftPeer> otherPeers = new ArrayList<>(peers);
-        otherPeers.remove(peers.get(i));
-        server.addRaftPeers(otherPeers);
-      }
+      server.addRaftPeers(removePeerFromList(peers.get(i), peers));

Review comment:
       I updated existing test setup to make every RaftPeer can be a primary server.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] szetszwo commented on pull request #264: RATIS-1140. Add primary peer id to Raft proto and only create channels with peer for primary servers

Posted by GitBox <gi...@apache.org>.
szetszwo commented on pull request #264:
URL: https://github.com/apache/incubator-ratis/pull/264#issuecomment-724385471


   It is even better to use RaftClientRequest.getServerId() instead of RaftClientRequest.getReplierId().


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] runzhiwang edited a comment on pull request #264: RATIS-1140. Add primary peer id to Raft proto and only create channels with peer for primary servers

Posted by GitBox <gi...@apache.org>.
runzhiwang edited a comment on pull request #264:
URL: https://github.com/apache/incubator-ratis/pull/264#issuecomment-724377078


   > there will be a deadlock case that a server will receive a request, and then route to other peers, and other peers will do the same
   
   Could you explain the deadlock clearer ?  which lock was held and which lock was waited?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] szetszwo commented on pull request #264: RATIS-1140. Add primary peer id to Raft proto and only create channels with peer for primary servers

Posted by GitBox <gi...@apache.org>.
szetszwo commented on pull request #264:
URL: https://github.com/apache/incubator-ratis/pull/264#issuecomment-724402299


   @runzhiwang,  RATIS-1133 is really great.  After it, each server can determine who is the primary server.  Therefore, we may remove DataStreamPacketHeaderProto.STREAM_CLOSE_FORWARD.  What do you think?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] amaliujia commented on pull request #264: RATIS-1140. Add primary peer id to Raft proto and only create channels with peer for primary servers

Posted by GitBox <gi...@apache.org>.
amaliujia commented on pull request #264:
URL: https://github.com/apache/incubator-ratis/pull/264#issuecomment-724386492


   > > there will be a deadlock case that a server will receive a request, and then route to other peers, and other peers will do the same
   > 
   > Could you explain the deadlock clearer ? which lock was held and which lock was waited?
   
   For a normal cluster, every server can be a primary server
   
   A ----B
   \      /
   \   /
     C
   
   In previous implementation, if A receives a header request, it will forward to B and C. And then B and C will also forward the request to A and C ( A and B). So in this case, A will wait its peers to reply on the header request, same for B and C. Because each peer will wait for other to reply, I will call this is a "deadlock" situation. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] amaliujia commented on pull request #264: RATIS-1140. Add primary peer id to Raft proto and only create channels with peer for primary servers

Posted by GitBox <gi...@apache.org>.
amaliujia commented on pull request #264:
URL: https://github.com/apache/incubator-ratis/pull/264#issuecomment-724358535


   R @szetszwo @runzhiwang 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-ratis] runzhiwang edited a comment on pull request #264: RATIS-1140. Add primary peer id to Raft proto and only create channels with peer for primary servers

Posted by GitBox <gi...@apache.org>.
runzhiwang edited a comment on pull request #264:
URL: https://github.com/apache/incubator-ratis/pull/264#issuecomment-724375773


   @amaliujia PR title is too long. Please notice that. PR title should be one line.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org