Posted to jira@kafka.apache.org by "hachikuji (via GitHub)" <gi...@apache.org> on 2023/02/06 23:03:44 UTC

[GitHub] [kafka] hachikuji opened a new pull request, #13207: KAFKA-14664; Fix inaccurate raft idle ratio metric

hachikuji opened a new pull request, #13207:
URL: https://github.com/apache/kafka/pull/13207

   The raft idle ratio is currently computed as the average of the idle ratios recorded for each poll. This tends to underestimate the actual idle ratio since it treats all measurements equally, regardless of how much time each one covers. For example, say we poll twice with the following durations:
   
   Poll 1: 2s
   Poll 2: 0s
   
   Assume that the busy time is negligible, so 2s passes overall.
   
   In the first measurement, 2s is spent waiting, so we compute and record a ratio of 1.0. In the second measurement, no time passes, and we record 0.0. The idle ratio is then computed as the average of these two values ((1.0 + 0.0) / 2 = 0.5). This suggests that the process was busy for 1s, which overestimates the true busy time.
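
The arithmetic in this example can be checked with a small standalone sketch. The class and method names (`IdleRatioExample`, `averageOfRatios`, `timeWeightedRatio`) are hypothetical and not part of the patch. It assumes, as in the example, that busy time is negligible, and it uses a minimum denominator of 1ms to avoid dividing by zero.

```java
public class IdleRatioExample {
    // Old approach: one ratio per poll, then an unweighted average of the ratios.
    static double averageOfRatios(long[] idleMs, long[] totalMs) {
        double sum = 0.0;
        for (int i = 0; i < idleMs.length; i++) {
            // Guard against division by zero when no time passed during a poll.
            sum += idleMs[i] / (double) Math.max(totalMs[i], 1L);
        }
        return sum / idleMs.length;
    }

    // Time-weighted approach: total idle time divided by total elapsed time.
    static double timeWeightedRatio(long[] idleMs, long[] totalMs) {
        long idle = 0L;
        long total = 0L;
        for (int i = 0; i < idleMs.length; i++) {
            idle += idleMs[i];
            total += totalMs[i];
        }
        return idle / (double) Math.max(total, 1L);
    }

    public static void main(String[] args) {
        long[] idleMs = {2000L, 0L};   // Poll 1: 2s, Poll 2: 0s
        long[] totalMs = {2000L, 0L};  // busy time assumed negligible
        System.out.println(averageOfRatios(idleMs, totalMs));   // 0.5
        System.out.println(timeWeightedRatio(idleMs, totalMs)); // 1.0
    }
}
```

The unweighted average reports 0.5 even though the thread was idle for the whole 2s, while weighting by elapsed time gives 1.0.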
   
   In this patch, I've created a new `TimeRatio` class which tracks the total duration of a periodic event over a full interval of time measurement.
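
For illustration, here is a rough standalone sketch of the interval-based idea, with no Kafka dependencies. `TimeRatioSketch` is a hypothetical name, and its `record` body is an assumption (it accumulates the duration and remembers the timestamp of the last recording); it is not a copy of the class in this patch.

```java
public class TimeRatioSketch {
    private long intervalStartMs = -1;
    private long lastRecordedMs = -1;
    private double totalDurationMs = 0;
    private final double defaultRatio;

    public TimeRatioSketch(double defaultRatio) {
        this.defaultRatio = defaultRatio;
    }

    // Record that an event lasting durationMs finished at nowMs.
    public void record(double durationMs, long nowMs) {
        if (intervalStartMs < 0) {
            // The first call only marks the interval start; its duration is discarded.
            intervalStartMs = nowMs;
        } else {
            totalDurationMs += durationMs;
            lastRecordedMs = nowMs;
        }
    }

    // Ratio of recorded duration to the time elapsed between the interval
    // start and the last recording. Resets the interval afterwards.
    public double measure() {
        if (lastRecordedMs < 0) {
            return defaultRatio;
        }
        double intervalMs = Math.max(lastRecordedMs - intervalStartMs, 0);
        double ratio;
        if (intervalMs == 0) {
            ratio = defaultRatio;
        } else {
            // Cap at 1.0 in case the recorded durations exceed the interval.
            ratio = Math.min(totalDurationMs / intervalMs, 1.0);
        }
        // The next interval starts at the last recording.
        intervalStartMs = lastRecordedMs;
        totalDurationMs = 0;
        return ratio;
    }

    public static void main(String[] args) {
        TimeRatioSketch idle = new TimeRatioSketch(1.0);
        idle.record(0, 0);      // aligns the interval start; value discarded
        idle.record(100, 100);  // idle for 100ms
        idle.record(200, 400);  // busy for 100ms, then idle for 200ms
        System.out.println(idle.measure()); // 300 / 400 = 0.75
    }
}
```

The sequence in `main` mirrors the first assertion in `KafkaRaftMetricsTest`: 300ms of idle time over a 400ms interval.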
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [kafka] hachikuji commented on a diff in pull request #13207: KAFKA-14664; Fix inaccurate raft idle ratio metric

Posted by "hachikuji (via GitHub)" <gi...@apache.org>.
hachikuji commented on code in PR #13207:
URL: https://github.com/apache/kafka/pull/13207#discussion_r1105000782


##########
raft/src/main/java/org/apache/kafka/raft/internals/TimeRatio.java:
##########
@@ -0,0 +1,78 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.kafka.raft.internals;
+
+import org.apache.kafka.common.metrics.MeasurableStat;
+import org.apache.kafka.common.metrics.MetricConfig;
+
+/**
+ * Maintains an approximate ratio of the duration of a specific event
+ * over all time. For example, this can be used to compute the ratio of
+ * time that a thread is busy or idle. The value is approximate since the
+ * measurement and recording intervals may not be aligned.
+ *
+ * Note that the duration of the event is assumed to be small relative to
+ * the interval of measurement.
+ *
+ */
+public class TimeRatio implements MeasurableStat {
+    private long intervalStartTimestampMs = -1;
+    private long lastRecordedTimestampMs = -1;
+    private double totalRecordedDurationMs = 0;
+
+    private final double defaultRatio;
+
+    public TimeRatio(double defaultRatio) {
+        this.defaultRatio = defaultRatio;
+    }
+
+    @Override
+    public double measure(MetricConfig config, long currentTimestampMs) {
+        if (lastRecordedTimestampMs < 0) {
+            // Return the default value if no recordings have been captured.
+            return defaultRatio;
+        } else {
+            // We measure the ratio over the
+            double intervalDurationMs = Math.max(lastRecordedTimestampMs - intervalStartTimestampMs, 0);
+            final double ratio;
+            if (intervalDurationMs == 0) {
+                ratio = defaultRatio;
+            } else if (totalRecordedDurationMs > intervalDurationMs) {
+                ratio = 1.0;
+            } else {
+                ratio = totalRecordedDurationMs / intervalDurationMs;
+            }
+
+            // The next interval begins at the
+            intervalStartTimestampMs = lastRecordedTimestampMs;
+            totalRecordedDurationMs = 0;
+            return ratio;
+        }
+    }
+
+    @Override
+    public void record(MetricConfig config, double value, long currentTimestampMs) {
+        if (intervalStartTimestampMs < 0) {
+            // Discard the initial value since the value occurred prior to the interval start
+            intervalStartTimestampMs = currentTimestampMs;

Review Comment:
   Yeah, the awkward part of this patch is the reliance on the measuring interval. We might be able to do without it if we tracked the non-poll time as well, but we'd still need some notion of a window of measurement. 





[GitHub] [kafka] jsancio commented on a diff in pull request #13207: KAFKA-14664; Fix inaccurate raft idle ratio metric

Posted by "jsancio (via GitHub)" <gi...@apache.org>.
jsancio commented on code in PR #13207:
URL: https://github.com/apache/kafka/pull/13207#discussion_r1098994575


##########
raft/src/main/java/org/apache/kafka/raft/internals/KafkaRaftMetrics.java:
##########
@@ -133,26 +131,27 @@ public KafkaRaftMetrics(Metrics metrics, String metricGrpPrefix, QuorumState sta
                 "The average number of records appended per sec as the leader of the raft quorum."),
                 new Rate(TimeUnit.SECONDS, new WindowedSum()));
 
-        this.pollIdleSensor = metrics.sensor("poll-idle-ratio");
-        this.pollIdleSensor.add(metrics.metricName("poll-idle-ratio-avg",
+        this.pollDurationSensor = metrics.sensor("poll-idle-ratio");
+        this.pollDurationSensor.add(metrics.metricName(

Review Comment:
   Minor but I would add a newline before `metrics.metricName`.



##########
raft/src/main/java/org/apache/kafka/raft/internals/KafkaRaftMetrics.java:
##########
@@ -133,26 +131,27 @@ public KafkaRaftMetrics(Metrics metrics, String metricGrpPrefix, QuorumState sta
                 "The average number of records appended per sec as the leader of the raft quorum."),
                 new Rate(TimeUnit.SECONDS, new WindowedSum()));
 
-        this.pollIdleSensor = metrics.sensor("poll-idle-ratio");
-        this.pollIdleSensor.add(metrics.metricName("poll-idle-ratio-avg",
+        this.pollDurationSensor = metrics.sensor("poll-idle-ratio");
+        this.pollDurationSensor.add(metrics.metricName(
+                "poll-idle-ratio-avg",
                 metricGroupName,
-                "The average fraction of time the client's poll() is idle as opposed to waiting for the user code to process records."),
-                new Avg());
+                "The ratio of time the Raft IO thread is idle as opposed to " +
+                    "doing work (e.g. handling requests or replicating from the leader)"
+            ),
+            new TimeRatio(1.0)
+        );
     }
 
     public void updatePollStart(long currentTimeMs) {
-        if (pollEndMs.isPresent() && pollStartMs.isPresent()) {
-            long pollTimeMs = Math.max(pollEndMs.getAsLong() - pollStartMs.getAsLong(), 0L);
-            long totalTimeMs = Math.max(currentTimeMs - pollStartMs.getAsLong(), 1L);
-            this.pollIdleSensor.record(pollTimeMs / (double) totalTimeMs, currentTimeMs);
-        }
-
         this.pollStartMs = OptionalLong.of(currentTimeMs);
-        this.pollEndMs = OptionalLong.empty();
     }
 
     public void updatePollEnd(long currentTimeMs) {
-        this.pollEndMs = OptionalLong.of(currentTimeMs);
+        if (pollStartMs.isPresent()) {
+            long pollDurationMs = Math.max(currentTimeMs - pollStartMs.getAsLong(), 0L);

Review Comment:
   Instead of taking the max, should we throw `IllegalArgumentException` or `IllegalStateException` if the difference is negative?



##########
raft/src/test/java/org/apache/kafka/raft/internals/KafkaRaftMetricsTest.java:
##########
@@ -190,25 +190,48 @@ public void shouldRecordNumUnknownVoterConnections() throws IOException {
     }
 
     @Test
-    public void shouldRecordPollIdleRatio() throws IOException {
+    public void shouldRecordPollIdleRatio() {
         QuorumState state = buildQuorumState(Collections.singleton(localId));
         state.initialize(new OffsetAndEpoch(0L, 0));
         raftMetrics = new KafkaRaftMetrics(metrics, "raft", state);
 
+        // First recording is discarded (in order to align the interval of measurement)
+        raftMetrics.updatePollStart(time.milliseconds());
+        raftMetrics.updatePollEnd(time.milliseconds());
+
+        // Idle for 100ms
         raftMetrics.updatePollStart(time.milliseconds());
         time.sleep(100L);
         raftMetrics.updatePollEnd(time.milliseconds());
-        time.sleep(900L);
+
+        // Busy for 100ms
+        time.sleep(100L);
+
+        // Idle for 200ms
         raftMetrics.updatePollStart(time.milliseconds());
+        time.sleep(200L);
+        raftMetrics.updatePollEnd(time.milliseconds());
 
-        assertEquals(0.1, getMetric(metrics, "poll-idle-ratio-avg").metricValue());
+        assertEquals(0.75, getMetric(metrics, "poll-idle-ratio-avg").metricValue());
 
+        // Busy for 100ms
         time.sleep(100L);
+
+        // Idle for 75ms
+        raftMetrics.updatePollStart(time.milliseconds());
+        time.sleep(75L);
         raftMetrics.updatePollEnd(time.milliseconds());
-        time.sleep(100L);
+
+        // Idle for 25ms
         raftMetrics.updatePollStart(time.milliseconds());
+        time.sleep(25L);
+        raftMetrics.updatePollEnd(time.milliseconds());
+
+        // Idle for 0ms
+        raftMetrics.updatePollStart(time.milliseconds());
+        raftMetrics.updatePollEnd(time.milliseconds());
 
-        assertEquals(0.3, getMetric(metrics, "poll-idle-ratio-avg").metricValue());
+        assertEquals(0.5, getMetric(metrics, "poll-idle-ratio-avg").metricValue());

Review Comment:
   Should we add a test for measuring the metric in between `updatePollStart` and `updatePollEnd`?



##########
raft/src/main/java/org/apache/kafka/raft/internals/TimeRatio.java:
##########
@@ -0,0 +1,78 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.kafka.raft.internals;
+
+import org.apache.kafka.common.metrics.MeasurableStat;
+import org.apache.kafka.common.metrics.MetricConfig;
+
+/**
+ * Maintains an approximate ratio of the duration of a specific event
+ * over all time. For example, this can be used to compute the ratio of
+ * time that a thread is busy or idle. The value is approximate since the
+ * measurement and recording intervals may not be aligned.
+ *
+ * Note that the duration of the event is assumed to be small relative to
+ * the interval of measurement.
+ *
+ */
+public class TimeRatio implements MeasurableStat {
+    private long intervalStartTimestampMs = -1;
+    private long lastRecordedTimestampMs = -1;
+    private double totalRecordedDurationMs = 0;
+
+    private final double defaultRatio;
+
+    public TimeRatio(double defaultRatio) {
+        this.defaultRatio = defaultRatio;

Review Comment:
   Should this check that `defaultRatio` is between `0.0` and `1.0`?
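
One possible shape for such a check, as a standalone sketch. `RatioValidation` and `requireValidRatio` are hypothetical names and not part of the patch; a constructor could call a helper like this before assigning the field.

```java
public class RatioValidation {
    // Returns the ratio unchanged if it lies in [0.0, 1.0]; otherwise throws.
    static double requireValidRatio(double ratio) {
        // Written as a negated conjunction so that NaN is rejected too.
        if (!(ratio >= 0.0 && ratio <= 1.0)) {
            throw new IllegalArgumentException(
                "Invalid ratio: expected a value in the range [0.0, 1.0], but got " + ratio);
        }
        return ratio;
    }

    public static void main(String[] args) {
        System.out.println(requireValidRatio(1.0)); // accepted
        try {
            requireValidRatio(1.5);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```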



##########
raft/src/test/java/org/apache/kafka/raft/internals/TimeRatioTest.java:
##########
@@ -0,0 +1,47 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.kafka.raft.internals;
+
+import org.apache.kafka.common.metrics.MetricConfig;
+import org.apache.kafka.common.utils.MockTime;
+import org.junit.jupiter.api.Test;
+
+import static org.junit.jupiter.api.Assertions.assertEquals;
+
+class TimeRatioTest {
+
+    @Test
+    public void testRatio() {
+        MetricConfig config = new MetricConfig();
+        MockTime time = new MockTime();
+        TimeRatio ratio = new TimeRatio(1.0);
+
+        ratio.record(config, 0.0, time.milliseconds());
+        time.sleep(10);
+        ratio.record(config, 10, time.milliseconds());
+        time.sleep(10);
+        ratio.record(config, 0, time.milliseconds());
+        assertEquals(0.5, ratio.measure(config, time.milliseconds()));
+
+        time.sleep(10);
+        ratio.record(config, 10, time.milliseconds());
+        time.sleep(40);
+        ratio.record(config, 0, time.milliseconds());
+        assertEquals(0.2, ratio.measure(config, time.milliseconds()));
+    }
+
+}

Review Comment:
   Missing newline.



##########
raft/src/main/java/org/apache/kafka/raft/internals/TimeRatio.java:
##########
@@ -0,0 +1,78 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.kafka.raft.internals;
+
+import org.apache.kafka.common.metrics.MeasurableStat;
+import org.apache.kafka.common.metrics.MetricConfig;
+
+/**
+ * Maintains an approximate ratio of the duration of a specific event
+ * over all time. For example, this can be used to compute the ratio of
+ * time that a thread is busy or idle. The value is approximate since the
+ * measurement and recording intervals may not be aligned.
+ *
+ * Note that the duration of the event is assumed to be small relative to
+ * the interval of measurement.
+ *
+ */
+public class TimeRatio implements MeasurableStat {
+    private long intervalStartTimestampMs = -1;
+    private long lastRecordedTimestampMs = -1;
+    private double totalRecordedDurationMs = 0;
+
+    private final double defaultRatio;
+
+    public TimeRatio(double defaultRatio) {
+        this.defaultRatio = defaultRatio;
+    }
+
+    @Override
+    public double measure(MetricConfig config, long currentTimestampMs) {
+        if (lastRecordedTimestampMs < 0) {
+            // Return the default value if no recordings have been captured.
+            return defaultRatio;
+        } else {
+            // We measure the ratio over the
+            double intervalDurationMs = Math.max(lastRecordedTimestampMs - intervalStartTimestampMs, 0);
+            final double ratio;
+            if (intervalDurationMs == 0) {
+                ratio = defaultRatio;
+            } else if (totalRecordedDurationMs > intervalDurationMs) {
+                ratio = 1.0;
+            } else {
+                ratio = totalRecordedDurationMs / intervalDurationMs;
+            }
+
+            // The next interval begins at the
+            intervalStartTimestampMs = lastRecordedTimestampMs;
+            totalRecordedDurationMs = 0;
+            return ratio;
+        }
+    }
+
+    @Override
+    public void record(MetricConfig config, double value, long currentTimestampMs) {
+        if (intervalStartTimestampMs < 0) {
+            // Discard the initial value since the value occurred prior to the interval start
+            intervalStartTimestampMs = currentTimestampMs;

Review Comment:
   Got it. To be able to remove this restriction we would have to change the `Sensor` API to allow the setting of this interval start time without recording a `value`.





[GitHub] [kafka] hachikuji merged pull request #13207: KAFKA-14664; Fix inaccurate raft idle ratio metric

Posted by "hachikuji (via GitHub)" <gi...@apache.org>.
hachikuji merged PR #13207:
URL: https://github.com/apache/kafka/pull/13207




[GitHub] [kafka] jsancio commented on a diff in pull request #13207: KAFKA-14664; Fix inaccurate raft idle ratio metric

Posted by "jsancio (via GitHub)" <gi...@apache.org>.
jsancio commented on code in PR #13207:
URL: https://github.com/apache/kafka/pull/13207#discussion_r1100891108


##########
raft/src/main/java/org/apache/kafka/raft/internals/KafkaRaftMetrics.java:
##########
@@ -133,26 +131,27 @@ public KafkaRaftMetrics(Metrics metrics, String metricGrpPrefix, QuorumState sta
                 "The average number of records appended per sec as the leader of the raft quorum."),
                 new Rate(TimeUnit.SECONDS, new WindowedSum()));
 
-        this.pollIdleSensor = metrics.sensor("poll-idle-ratio");
-        this.pollIdleSensor.add(metrics.metricName("poll-idle-ratio-avg",
+        this.pollDurationSensor = metrics.sensor("poll-idle-ratio");
+        this.pollDurationSensor.add(metrics.metricName(
+                "poll-idle-ratio-avg",
                 metricGroupName,
-                "The average fraction of time the client's poll() is idle as opposed to waiting for the user code to process records."),
-                new Avg());
+                "The ratio of time the Raft IO thread is idle as opposed to " +
+                    "doing work (e.g. handling requests or replicating from the leader)"
+            ),
+            new TimeRatio(1.0)
+        );
     }
 
     public void updatePollStart(long currentTimeMs) {
-        if (pollEndMs.isPresent() && pollStartMs.isPresent()) {
-            long pollTimeMs = Math.max(pollEndMs.getAsLong() - pollStartMs.getAsLong(), 0L);
-            long totalTimeMs = Math.max(currentTimeMs - pollStartMs.getAsLong(), 1L);
-            this.pollIdleSensor.record(pollTimeMs / (double) totalTimeMs, currentTimeMs);
-        }
-
         this.pollStartMs = OptionalLong.of(currentTimeMs);
-        this.pollEndMs = OptionalLong.empty();
     }
 
     public void updatePollEnd(long currentTimeMs) {
-        this.pollEndMs = OptionalLong.of(currentTimeMs);
+        if (pollStartMs.isPresent()) {
+            long pollDurationMs = Math.max(currentTimeMs - pollStartMs.getAsLong(), 0L);

Review Comment:
   Hmm. `KafkaRaftClient` uses `Time.milliseconds`. I think this is true if it used `Time.hiResClockMs`, no?





[GitHub] [kafka] hachikuji commented on a diff in pull request #13207: KAFKA-14664; Fix inaccurate raft idle ratio metric

Posted by "hachikuji (via GitHub)" <gi...@apache.org>.
hachikuji commented on code in PR #13207:
URL: https://github.com/apache/kafka/pull/13207#discussion_r1099065953


##########
raft/src/main/java/org/apache/kafka/raft/internals/KafkaRaftMetrics.java:
##########
@@ -133,26 +131,27 @@ public KafkaRaftMetrics(Metrics metrics, String metricGrpPrefix, QuorumState sta
                 "The average number of records appended per sec as the leader of the raft quorum."),
                 new Rate(TimeUnit.SECONDS, new WindowedSum()));
 
-        this.pollIdleSensor = metrics.sensor("poll-idle-ratio");
-        this.pollIdleSensor.add(metrics.metricName("poll-idle-ratio-avg",
+        this.pollDurationSensor = metrics.sensor("poll-idle-ratio");
+        this.pollDurationSensor.add(metrics.metricName(
+                "poll-idle-ratio-avg",
                 metricGroupName,
-                "The average fraction of time the client's poll() is idle as opposed to waiting for the user code to process records."),
-                new Avg());
+                "The ratio of time the Raft IO thread is idle as opposed to " +
+                    "doing work (e.g. handling requests or replicating from the leader)"
+            ),
+            new TimeRatio(1.0)
+        );
     }
 
     public void updatePollStart(long currentTimeMs) {
-        if (pollEndMs.isPresent() && pollStartMs.isPresent()) {
-            long pollTimeMs = Math.max(pollEndMs.getAsLong() - pollStartMs.getAsLong(), 0L);
-            long totalTimeMs = Math.max(currentTimeMs - pollStartMs.getAsLong(), 1L);
-            this.pollIdleSensor.record(pollTimeMs / (double) totalTimeMs, currentTimeMs);
-        }
-
         this.pollStartMs = OptionalLong.of(currentTimeMs);
-        this.pollEndMs = OptionalLong.empty();
     }
 
     public void updatePollEnd(long currentTimeMs) {
-        this.pollEndMs = OptionalLong.of(currentTimeMs);
+        if (pollStartMs.isPresent()) {
+            long pollDurationMs = Math.max(currentTimeMs - pollStartMs.getAsLong(), 0L);

Review Comment:
   The clock we rely on is not monotonic, so it is possible to go backwards. It seems better to handle this case gracefully with a default value than to crash the IO thread.
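
A minimal sketch of the clamping behavior, using a hypothetical `PollDuration` helper that is not part of the patch. It only shows what happens when the second wall-clock reading is earlier than the first.

```java
public class PollDuration {
    // Elapsed time between two wall-clock readings, clamped at zero so that a
    // clock that steps backwards yields 0 instead of a negative duration.
    static long elapsedMs(long startMs, long endMs) {
        return Math.max(endMs - startMs, 0L);
    }

    public static void main(String[] args) {
        System.out.println(elapsedMs(1_000L, 1_250L)); // 250
        System.out.println(elapsedMs(1_000L, 900L));   // 0: clock went backwards
    }
}
```

With the clamp, a backwards step simply contributes no idle time to the current interval rather than raising an exception on the IO thread.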





[GitHub] [kafka] jsancio commented on a diff in pull request #13207: KAFKA-14664; Fix inaccurate raft idle ratio metric

Posted by "jsancio (via GitHub)" <gi...@apache.org>.
jsancio commented on code in PR #13207:
URL: https://github.com/apache/kafka/pull/13207#discussion_r1105042885


##########
raft/src/test/java/org/apache/kafka/raft/internals/KafkaRaftMetricsTest.java:
##########
@@ -232,6 +232,32 @@ public void shouldRecordPollIdleRatio() {
         raftMetrics.updatePollEnd(time.milliseconds());
 
         assertEquals(0.5, getMetric(metrics, "poll-idle-ratio-avg").metricValue());
+
+        // Busy for 40ms
+        time.sleep(40);
+
+        // Idle for 60ms
+        raftMetrics.updatePollStart(time.milliseconds());
+        time.sleep(60);
+        raftMetrics.updatePollEnd(time.milliseconds());
+
+        // Busy for 10ms
+        time.sleep(10);
+
+        // Begin idle time for 5ms
+        raftMetrics.updatePollStart(time.milliseconds());
+        time.sleep(5);
+
+        // Measurement arrives before poll end

Review Comment:
   How about documenting that this measurement covers 40ms of busy time and 60ms of idle time?



##########
raft/src/test/java/org/apache/kafka/raft/internals/KafkaRaftMetricsTest.java:
##########
@@ -232,6 +232,32 @@ public void shouldRecordPollIdleRatio() {
         raftMetrics.updatePollEnd(time.milliseconds());
 
         assertEquals(0.5, getMetric(metrics, "poll-idle-ratio-avg").metricValue());
+
+        // Busy for 40ms
+        time.sleep(40);
+
+        // Idle for 60ms
+        raftMetrics.updatePollStart(time.milliseconds());
+        time.sleep(60);
+        raftMetrics.updatePollEnd(time.milliseconds());
+
+        // Busy for 10ms
+        time.sleep(10);
+
+        // Begin idle time for 5ms
+        raftMetrics.updatePollStart(time.milliseconds());
+        time.sleep(5);
+
+        // Measurement arrives before poll end
+        assertEquals(0.6, getMetric(metrics, "poll-idle-ratio-avg").metricValue());
+
+        // More idle time for 5ms
+        time.sleep(5);
+        raftMetrics.updatePollEnd(time.milliseconds());
+
+        // The measurement includes the interval beginning at the last recording.
+        // This counts 10ms of busy time and 10ms of idle time.

Review Comment:
   How about documenting this information: 10ms of busy time and 5ms + 5ms of idle time?


