You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@bookkeeper.apache.org by ch...@apache.org on 2022/12/07 03:00:08 UTC
[bookkeeper] branch master updated: use READ_ENTRY_SCHEDULING_DELAY to stable stickyReadsWithFailures (#3628)
This is an automated email from the ASF dual-hosted git repository.
chenhang pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/bookkeeper.git
The following commit(s) were added to refs/heads/master by this push:
new eda9c4284d use READ_ENTRY_SCHEDULING_DELAY to stable stickyReadsWithFailures (#3628)
eda9c4284d is described below
commit eda9c4284d434cdc4d94637f6257a1c6bf179745
Author: wenbingshen <ol...@gmail.com>
AuthorDate: Wed Dec 7 11:00:01 2022 +0800
use READ_ENTRY_SCHEDULING_DELAY to stable stickyReadsWithFailures (#3628)
### Motivation
I found the following flaky-test: org.apache.bookkeeper.bookie.BookieStickyReadsTest.stickyReadsWithFailures:
https://github.com/apache/bookkeeper/actions/runs/3367374609/jobs/5584792353
```
Error: Tests run: 4, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 8.925 s <<< FAILURE! - in org.apache.bookkeeper.bookie.BookieStickyReadsTest
Error: org.apache.bookkeeper.bookie.BookieStickyReadsTest.stickyReadsWithFailures Time elapsed: 1.752 s <<< ERROR!
java.lang.IndexOutOfBoundsException: Index: -1, Size: 3
at java.base/java.util.LinkedList.checkElementIndex(LinkedList.java:559)
at java.base/java.util.LinkedList.get(LinkedList.java:480)
at org.apache.bookkeeper.test.BookKeeperClusterTestCase.serverByIndex(BookKeeperClusterTestCase.java:369)
at org.apache.bookkeeper.bookie.BookieStickyReadsTest.stickyReadsWithFailures(BookieStickyReadsTest.java:153)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.lang.Thread.run(Thread.java:829)
```
In the stickyReadsWithFailures test, the client successfully reads the entry, but the `READ_ENTRY_REQUEST` metric does not get bonuses. After reading the `READ_ENTRY_REQUEST`update logic, I found the metric updating after netty channel sends the response successfully, the metric is updated through the `ChannelFutureListener` callback, and the asynchronous update causes the above test to fail.
```java
protected void sendResponse(StatusCode code, Object response, OpStatsLogger statsLogger) {
......
if (channel.isActive()) {
channel.writeAndFlush(response).addListener(new ChannelFutureListener() {
@Override
public void operationComplete(ChannelFuture future) throws Exception {
long writeElapsedNanos = MathUtils.elapsedNanos(writeNanos);
if (!future.isSuccess()) {
requestProcessor.getRequestStats().getChannelWriteStats()
.registerFailedEvent(writeElapsedNanos, TimeUnit.NANOSECONDS);
} else {
requestProcessor.getRequestStats().getChannelWriteStats()
.registerSuccessfulEvent(writeElapsedNanos, TimeUnit.NANOSECONDS);
}
if (StatusCode.EOK == code) {
statsLogger.registerSuccessfulEvent(MathUtils.elapsedNanos(enqueueNanos), TimeUnit.NANOSECONDS);
} else {
statsLogger.registerFailedEvent(MathUtils.elapsedNanos(enqueueNanos), TimeUnit.NANOSECONDS);
}
}
});
}
......
}
```
### Changes
The `READ_ENTRY_SCHEDULING_DELAY` metric is processed before the Read request is processed, which proved that bookie receives the read request from the client and can well meet the needs of `BookieStickyReadsTest`.
This makes the `BookieStickyReadsTest` test more stabled.
---
.../bookkeeper/bookie/BookieStickyReadsTest.java | 19 ++++++++++---------
1 file changed, 10 insertions(+), 9 deletions(-)
diff --git a/bookkeeper-server/src/test/java/org/apache/bookkeeper/bookie/BookieStickyReadsTest.java b/bookkeeper-server/src/test/java/org/apache/bookkeeper/bookie/BookieStickyReadsTest.java
index 698c816eb8..99bcc78845 100644
--- a/bookkeeper-server/src/test/java/org/apache/bookkeeper/bookie/BookieStickyReadsTest.java
+++ b/bookkeeper-server/src/test/java/org/apache/bookkeeper/bookie/BookieStickyReadsTest.java
@@ -48,7 +48,7 @@ public class BookieStickyReadsTest extends BookKeeperClusterTestCase {
private static final int NUM_BOOKIES = 3;
- private static final String READ_ENTRY_REQUEST_METRIC = "bookkeeper_server.READ_ENTRY_REQUEST";
+ private static final String READ_ENTRY_SCHEDULING_DELAY_METRIC = "bookkeeper_server.READ_ENTRY_SCHEDULING_DELAY";
public BookieStickyReadsTest() {
super(NUM_BOOKIES);
@@ -64,7 +64,8 @@ public class BookieStickyReadsTest extends BookKeeperClusterTestCase {
writeAndReadEntries(conf, 3, 3, 3);
// All bookies should have received at least some read request
- getBookieReadRequestStats().values().forEach(readRequests -> assertTrue(readRequests > 0));
+ getBookieReadEntrySchedulingDelayStats().values()
+ .forEach(readRequests -> assertTrue(readRequests > 0));
}
@Test
@@ -76,7 +77,8 @@ public class BookieStickyReadsTest extends BookKeeperClusterTestCase {
// All bookies should have received at least some read request since we
// don't enable sticky reads when striping is enabled
- getBookieReadRequestStats().values().forEach(readRequests -> assertTrue(readRequests > 0));
+ getBookieReadEntrySchedulingDelayStats().values()
+ .forEach(readRequests -> assertTrue(readRequests > 0));
}
@Test
@@ -87,7 +89,7 @@ public class BookieStickyReadsTest extends BookKeeperClusterTestCase {
writeAndReadEntries(conf, 3, 3, 3);
// All read requests should have been made to a single bookie
- Map<Integer, Long> stats = getBookieReadRequestStats();
+ Map<Integer, Long> stats = getBookieReadEntrySchedulingDelayStats();
boolean foundBookieWithRequests = false;
for (long readRequests : stats.values()) {
if (readRequests > 0) {
@@ -137,7 +139,7 @@ public class BookieStickyReadsTest extends BookKeeperClusterTestCase {
// All read requests should have been made to a single bookie
int bookieWithRequests = -1;
for (int i = 0; i < NUM_BOOKIES; i++) {
- long requests = getStatsProvider(i).getOpStatsLogger(READ_ENTRY_REQUEST_METRIC)
+ long requests = getStatsProvider(i).getOpStatsLogger(READ_ENTRY_SCHEDULING_DELAY_METRIC)
.getSuccessCount();
log.info("Bookie {} --- requests: {}", i, requests);
@@ -162,18 +164,17 @@ public class BookieStickyReadsTest extends BookKeeperClusterTestCase {
// At this point, we should have 1 bookie with 1 request (the initial
// request), and a second bookie with 10 requests. The 3rd bookie should
// have no requests
- List<Long> requestCounts = Lists.newArrayList(getBookieReadRequestStats().values());
+ List<Long> requestCounts = Lists.newArrayList(getBookieReadEntrySchedulingDelayStats().values());
Collections.sort(requestCounts);
assertEquals(0, requestCounts.get(0).longValue());
assertEquals(1, requestCounts.get(1).longValue());
assertEquals(10, requestCounts.get(2).longValue());
}
-
- private Map<Integer, Long> getBookieReadRequestStats() throws Exception {
+ private Map<Integer, Long> getBookieReadEntrySchedulingDelayStats() throws Exception {
Map<Integer, Long> stats = new TreeMap<>();
for (int i = 0; i < NUM_BOOKIES; i++) {
- stats.put(i, getStatsProvider(i).getOpStatsLogger(READ_ENTRY_REQUEST_METRIC)
+ stats.put(i, getStatsProvider(i).getOpStatsLogger(READ_ENTRY_SCHEDULING_DELAY_METRIC)
.getSuccessCount());
}