You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@ratis.apache.org by "adoroszlai (via GitHub)" <gi...@apache.org> on 2023/04/13 09:52:26 UTC

[GitHub] [ratis] adoroszlai opened a new pull request, #872: RATIS-1831. GrpcLogAppender spinning idle between heartbeats

adoroszlai opened a new pull request, #872:
URL: https://github.com/apache/ratis/pull/872

   ## What changes were proposed in this pull request?
   
   `GrpcLogAppender`'s main loop does not wait if there is nothing to be sent or follower is slow (last return in `getWaitTimeMs`):
   
   https://github.com/apache/ratis/blob/a9036487c5f09e8f1132be9655eb54a177573e7c/ratis-grpc/src/main/java/org/apache/ratis/grpc/server/GrpcLogAppender.java#L173-L183
   
   The problem is that after RATIS-1793 `getMinWaitTimeMs` is no longer a fixed time period (10 milliseconds by default), rather:
   
   https://github.com/apache/ratis/blob/a9036487c5f09e8f1132be9655eb54a177573e7c/ratis-server/src/main/java/org/apache/ratis/server/leader/LogAppenderBase.java#L139-L141
   
   When `GrpcLogAppender` has nothing to send, it first waits 10 milliseconds, but then wait time is calculated to be 0, -1, -2, etc.
   
   So the main loop is repeated without delay, executing only the calculations required for `mayWait()`, the `if` conditions and the health check at the end:
   
   https://github.com/apache/ratis/blob/a9036487c5f09e8f1132be9655eb54a177573e7c/ratis-grpc/src/main/java/org/apache/ratis/grpc/server/GrpcLogAppender.java#L144-L168
   
   Loops and actual `appendLog` calls in e.g. `TestRaftServerWithGrpc` (`Time elapsed: 43.617 s`):
   
   ```
   [s2@group-C70075F1A2A0->s0-GrpcLogAppender-LogAppenderDaemon] INFO  server.GrpcLogAppender (GrpcLogAppender.java:run(172)) - loops: 4495026, appends: 23050
   [s2@group-C70075F1A2A0->s1-GrpcLogAppender-LogAppenderDaemon] INFO  server.GrpcLogAppender (GrpcLogAppender.java:run(172)) - loops: 4557988, appends: 19479
   [s5@group-FB3DFE05D468->s4-GrpcLogAppender-LogAppenderDaemon] INFO  server.GrpcLogAppender (GrpcLogAppender.java:run(172)) - loops: 764760, appends: 26981
   [s5@group-FB3DFE05D468->s3-GrpcLogAppender-LogAppenderDaemon] INFO  server.GrpcLogAppender (GrpcLogAppender.java:run(172)) - loops: 811251, appends: 30514
   [s1@group-B7E5427EDB7E->s2-GrpcLogAppender-LogAppenderDaemon] INFO  server.GrpcLogAppender (GrpcLogAppender.java:run(172)) - loops: 560013, appends: 16701
   [s1@group-B7E5427EDB7E->s0-GrpcLogAppender-LogAppenderDaemon] INFO  server.GrpcLogAppender (GrpcLogAppender.java:run(172)) - loops: 550258, appends: 16038
   [s0@group-3FA11CC46038->s2-GrpcLogAppender-LogAppenderDaemon] INFO  server.GrpcLogAppender (GrpcLogAppender.java:run(172)) - loops: 170617, appends: 52885
   [s0@group-3FA11CC46038->s1-GrpcLogAppender-LogAppenderDaemon] INFO  server.GrpcLogAppender (GrpcLogAppender.java:run(172)) - loops: 176756, appends: 51417
   [s2@group-3FA11CC46038->s0-GrpcLogAppender-LogAppenderDaemon] INFO  server.GrpcLogAppender (GrpcLogAppender.java:run(172)) - loops: 4371812, appends: 14661
   [s2@group-3FA11CC46038->s1-GrpcLogAppender-LogAppenderDaemon] INFO  server.GrpcLogAppender (GrpcLogAppender.java:run(172)) - loops: 3944262, appends: 14548
   [s0@group-D9AF8EBABE98->s1-GrpcLogAppender-LogAppenderDaemon] INFO  server.GrpcLogAppender (GrpcLogAppender.java:run(172)) - loops: 4142687, appends: 34829
   [s0@group-D9AF8EBABE98->s2-GrpcLogAppender-LogAppenderDaemon] INFO  server.GrpcLogAppender (GrpcLogAppender.java:run(172)) - loops: 4396141, appends: 35557
   [s3@group-A19A2D3A94D0->s4-GrpcLogAppender-LogAppenderDaemon] INFO  server.GrpcLogAppender (GrpcLogAppender.java:run(172)) - loops: 513534, appends: 25769
   [s3@group-A19A2D3A94D0->s5-GrpcLogAppender-LogAppenderDaemon] INFO  server.GrpcLogAppender (GrpcLogAppender.java:run(172)) - loops: 488910, appends: 25721
   [s1@group-4DD6E4A8B460->s0-GrpcLogAppender-LogAppenderDaemon] INFO  server.GrpcLogAppender (GrpcLogAppender.java:run(172)) - loops: 527922, appends: 18814
   [s1@group-4DD6E4A8B460->s2-GrpcLogAppender-LogAppenderDaemon] INFO  server.GrpcLogAppender (GrpcLogAppender.java:run(172)) - loops: 553175, appends: 21177
   [s1@group-E53FC550349E->s2-GrpcLogAppender-LogAppenderDaemon] INFO  server.GrpcLogAppender (GrpcLogAppender.java:run(172)) - loops: 250066, appends: 69599
   [s1@group-E53FC550349E->s0-GrpcLogAppender-LogAppenderDaemon] INFO  server.GrpcLogAppender (GrpcLogAppender.java:run(172)) - loops: 244747, appends: 70276
   [s0@group-E53FC550349E->s1-GrpcLogAppender-LogAppenderDaemon] INFO  server.GrpcLogAppender (GrpcLogAppender.java:run(172)) - loops: 4326137, appends: 14724
   [s0@group-E53FC550349E->s2-GrpcLogAppender-LogAppenderDaemon] INFO  server.GrpcLogAppender (GrpcLogAppender.java:run(172)) - loops: 4307451, appends: 15278
   ```
   
   With the fix (time elapsed is similar):
   
   ```
   [s2@group-45E35B7301C2->s0-GrpcLogAppender-LogAppenderDaemon] INFO  server.GrpcLogAppender (GrpcLogAppender.java:run(172)) - loops: 496, appends: 77
   [s2@group-45E35B7301C2->s1-GrpcLogAppender-LogAppenderDaemon] INFO  server.GrpcLogAppender (GrpcLogAppender.java:run(172)) - loops: 497, appends: 76
   [s4@group-71BED91C6F16->s3-GrpcLogAppender-LogAppenderDaemon] INFO  server.GrpcLogAppender (GrpcLogAppender.java:run(172)) - loops: 103, appends: 45
   [s4@group-71BED91C6F16->s5-GrpcLogAppender-LogAppenderDaemon] INFO  server.GrpcLogAppender (GrpcLogAppender.java:run(172)) - loops: 98, appends: 41
   [s1@group-F29A85842EB6->s0-GrpcLogAppender-LogAppenderDaemon] INFO  server.GrpcLogAppender (GrpcLogAppender.java:run(172)) - loops: 85, appends: 27
   [s1@group-F29A85842EB6->s2-GrpcLogAppender-LogAppenderDaemon] INFO  server.GrpcLogAppender (GrpcLogAppender.java:run(172)) - loops: 86, appends: 28
   [s0@group-DF2C773274F6->s1-GrpcLogAppender-LogAppenderDaemon] INFO  server.GrpcLogAppender (GrpcLogAppender.java:run(172)) - loops: 70, appends: 48
   [s0@group-DF2C773274F6->s2-GrpcLogAppender-LogAppenderDaemon] INFO  server.GrpcLogAppender (GrpcLogAppender.java:run(172)) - loops: 67, appends: 47
   [s2@group-DF2C773274F6->s0-GrpcLogAppender-LogAppenderDaemon] INFO  server.GrpcLogAppender (GrpcLogAppender.java:run(172)) - loops: 508, appends: 69
   [s2@group-DF2C773274F6->s1-GrpcLogAppender-LogAppenderDaemon] INFO  server.GrpcLogAppender (GrpcLogAppender.java:run(172)) - loops: 508, appends: 69
   [s1@group-2BD0D83FC743->s2-GrpcLogAppender-LogAppenderDaemon] INFO  server.GrpcLogAppender (GrpcLogAppender.java:run(172)) - loops: 502, appends: 77
   [s1@group-2BD0D83FC743->s0-GrpcLogAppender-LogAppenderDaemon] INFO  server.GrpcLogAppender (GrpcLogAppender.java:run(172)) - loops: 500, appends: 76
   [s4@group-066452773F9D->s3-GrpcLogAppender-LogAppenderDaemon] INFO  server.GrpcLogAppender (GrpcLogAppender.java:run(172)) - loops: 105, appends: 37
   [s4@group-066452773F9D->s5-GrpcLogAppender-LogAppenderDaemon] INFO  server.GrpcLogAppender (GrpcLogAppender.java:run(172)) - loops: 105, appends: 46
   [s1@group-CD875777E36F->s0-GrpcLogAppender-LogAppenderDaemon] INFO  server.GrpcLogAppender (GrpcLogAppender.java:run(172)) - loops: 82, appends: 21
   [s1@group-CD875777E36F->s2-GrpcLogAppender-LogAppenderDaemon] INFO  server.GrpcLogAppender (GrpcLogAppender.java:run(172)) - loops: 83, appends: 26
   [s0@group-04394E731326->s1-GrpcLogAppender-LogAppenderDaemon] INFO  server.GrpcLogAppender (GrpcLogAppender.java:run(172)) - loops: 56, appends: 40
   [s0@group-04394E731326->s2-GrpcLogAppender-LogAppenderDaemon] INFO  server.GrpcLogAppender (GrpcLogAppender.java:run(172)) - loops: 56, appends: 42
   [s1@group-04394E731326->s0-GrpcLogAppender-LogAppenderDaemon] INFO  server.GrpcLogAppender (GrpcLogAppender.java:run(172)) - loops: 470, appends: 63
   [s1@group-04394E731326->s2-GrpcLogAppender-LogAppenderDaemon] INFO  server.GrpcLogAppender (GrpcLogAppender.java:run(172)) - loops: 470, appends: 64
   ```
   
   https://issues.apache.org/jira/browse/RATIS-1831


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ratis.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [ratis] adoroszlai merged pull request #872: RATIS-1831. GrpcLogAppender spinning idle between heartbeats

Posted by "adoroszlai (via GitHub)" <gi...@apache.org>.
adoroszlai merged PR #872:
URL: https://github.com/apache/ratis/pull/872


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ratis.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [ratis] adoroszlai commented on pull request #872: RATIS-1831. GrpcLogAppender spinning idle between heartbeats

Posted by "adoroszlai (via GitHub)" <gi...@apache.org>.
adoroszlai commented on PR #872:
URL: https://github.com/apache/ratis/pull/872#issuecomment-1507278923

   Thanks @szetszwo, @SzyWilliam for the review.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ratis.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [ratis] adoroszlai commented on a diff in pull request #872: RATIS-1831. GrpcLogAppender spinning idle between heartbeats

Posted by "adoroszlai (via GitHub)" <gi...@apache.org>.
adoroszlai commented on code in PR #872:
URL: https://github.com/apache/ratis/pull/872#discussion_r1165736568


##########
ratis-grpc/src/main/java/org/apache/ratis/grpc/server/GrpcLogAppender.java:
##########
@@ -177,7 +177,7 @@ public long getWaitTimeMs() {
       // For normal nodes, new entries should be sent ASAP
       // however for slow followers (especially when the follower is down),
       // keep sending without any wait time only ends up in high CPU load
-      return Math.max(getMinWaitTimeMs(), 0L);
+      return Math.max(getTimeToWaitUntilNextSend(), 0L);
     }
     return Math.min(getMinWaitTimeMs(), getHeartbeatWaitTimeMs());

Review Comment:
   Thanks @szetszwo for the suggestion, I think this will also work.  Counts with this change:
   
   ```
   [s2@group-F8FAB959D329->s2-GrpcLogAppender-LogAppenderDaemon] - loops: 123, appends: 64
   [s1@group-F8FAB959D329->s0-GrpcLogAppender-LogAppenderDaemon] - loops: 122, appends: 63
   [s3@group-1E6858E6BEE2->s4-GrpcLogAppender-LogAppenderDaemon] - loops: 37, appends: 21
   [s3@group-1E6858E6BEE2->s5-GrpcLogAppender-LogAppenderDaemon] - loops: 35, appends: 23
   [s0@group-6D3C124B49BF->s1-GrpcLogAppender-LogAppenderDaemon] - loops: 37, appends: 25
   [s0@group-6D3C124B49BF->s2-GrpcLogAppender-LogAppenderDaemon] - loops: 36, appends: 22
   [s2@group-94A34C37A647->s0-GrpcLogAppender-LogAppenderDaemon] - loops: 24, appends: 18
   [s2@group-94A34C37A647->s1-GrpcLogAppender-LogAppenderDaemon] - loops: 25, appends: 17
   [s1@group-94A34C37A647->s2-GrpcLogAppender-LogAppenderDaemon] - loops: 117, appends: 59
   [s1@group-94A34C37A647->s0-GrpcLogAppender-LogAppenderDaemon] - loops: 117, appends: 61
   [s1@group-F9F7394FB05A->s2-GrpcLogAppender-LogAppenderDaemon] - loops: 120, appends: 61
   [s1@group-F9F7394FB05A->s0-GrpcLogAppender-LogAppenderDaemon] - loops: 121, appends: 63
   [s3@group-B7E5BC575823->s5-GrpcLogAppender-LogAppenderDaemon] - loops: 37, appends: 23
   [s3@group-B7E5BC575823->s4-GrpcLogAppender-LogAppenderDaemon] - loops: 41, appends: 28
   [s0@group-8D067B36BF3B->s2-GrpcLogAppender-LogAppenderDaemon] - loops: 33, appends: 21
   [s0@group-8D067B36BF3B->s1-GrpcLogAppender-LogAppenderDaemon] - loops: 37, appends: 24
   [s0@group-2EC9A438CF0C->s1-GrpcLogAppender-LogAppenderDaemon] - loops: 21, appends: 16
   [s0@group-2EC9A438CF0C->s2-GrpcLogAppender-LogAppenderDaemon] - loops: 24, appends: 16
   [s2@group-2EC9A438CF0C->s0-GrpcLogAppender-LogAppenderDaemon] - loops: 121, appends: 61
   [s2@group-2EC9A438CF0C->s1-GrpcLogAppender-LogAppenderDaemon] - loops: 120, appends: 63
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ratis.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [ratis] szetszwo commented on a diff in pull request #872: RATIS-1831. GrpcLogAppender spinning idle between heartbeats

Posted by "szetszwo (via GitHub)" <gi...@apache.org>.
szetszwo commented on code in PR #872:
URL: https://github.com/apache/ratis/pull/872#discussion_r1165664971


##########
ratis-grpc/src/main/java/org/apache/ratis/grpc/server/GrpcLogAppender.java:
##########
@@ -177,7 +177,7 @@ public long getWaitTimeMs() {
       // For normal nodes, new entries should be sent ASAP
       // however for slow followers (especially when the follower is down),
       // keep sending without any wait time only ends up in high CPU load
-      return Math.max(getMinWaitTimeMs(), 0L);
+      return Math.max(getTimeToWaitUntilNextSend(), 0L);
     }
     return Math.min(getMinWaitTimeMs(), getHeartbeatWaitTimeMs());

Review Comment:
   @adoroszlai , thanks a lot for working on this!
   
   Since `appendLog(..)` already takes care the min wait time, how about we simply return `getHeartbeatWaitTimeMs()` and keep other code unchanged?
   ```java
   +++ b/ratis-grpc/src/main/java/org/apache/ratis/grpc/server/GrpcLogAppender.java
   @@ -179,7 +179,7 @@ public class GrpcLogAppender extends LogAppenderBase {
          // keep sending without any wait time only ends up in high CPU load
          return Math.max(getMinWaitTimeMs(), 0L);
        }
   -    return Math.min(getMinWaitTimeMs(), getHeartbeatWaitTimeMs());
   +    return getHeartbeatWaitTimeMs();
      }
    
      private boolean isSlowFollower() {
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ratis.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org