You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@bookkeeper.apache.org by GitBox <gi...@apache.org> on 2022/04/08 11:09:34 UTC
[GitHub] [bookkeeper] hangc0276 opened a new pull request, #3192: Fix compaction throttle imprecise
hangc0276 opened a new pull request, #3192:
URL: https://github.com/apache/bookkeeper/pull/3192
### Motivation
When do compaction for an entry log file, it will scan each entry header metadata from beginning to the end of the file. When an entry doesn't deleted, it will read the entry data and write into a new entry log file. The throttle only works on the entry which doesn't deleted.
https://github.com/apache/bookkeeper/blob/b0030ab5d9132e5c5cd5092b8e6dab79e9fbf16c/bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/EntryLogger.java#L1012-L1051
For those deleted entries, the throttle strategy doesn't applied. However, we use `BufferedChannel` to read the entry log file which will prefetch data from file when the read buffer missed.
https://github.com/apache/bookkeeper/blob/b0030ab5d9132e5c5cd5092b8e6dab79e9fbf16c/bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/BufferedReadChannel.java#L91-L97
For an entry log file, more than 90% of entries has been deleted, the compactor will scan those entry's header metadata one by one. When reading one entry's metadata, it will missed in BufferedChannel read buffer, it will trigger prefetch from disk. For the following entries, the header metadata reading will also missed in BufferedChannel read buffer, and will continue to prefetch from disk without throttle. It will lead to ledger disk IO util runs high.
Moreover, for each prefetch operation from disk, it will also trigger OS PageCache prefetch. For the compaction model, the OS PageCache prefetch will lead to PageCache pollution,and maybe also affect the journal sync latency. For this one, we can use the Direct IO to reduce the PageCache effect. https://github.com/apache/bookkeeper/issues/2943
### Changes
Add throttle on entry header metadata check stage.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@bookkeeper.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [bookkeeper] zymap commented on pull request #3192: Fix compaction throttle imprecise
Posted by GitBox <gi...@apache.org>.
zymap commented on PR #3192:
URL: https://github.com/apache/bookkeeper/pull/3192#issuecomment-1097453358
@hangc0276 Looks like the compaction tests failed. Could you please take a look?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@bookkeeper.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [bookkeeper] hangc0276 commented on pull request #3192: Fix compaction throttle imprecise
Posted by GitBox <gi...@apache.org>.
hangc0276 commented on PR #3192:
URL: https://github.com/apache/bookkeeper/pull/3192#issuecomment-1094246341
rerun failure checks
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@bookkeeper.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [bookkeeper] hangc0276 commented on pull request #3192: Fix compaction throttle imprecise
Posted by GitBox <gi...@apache.org>.
hangc0276 commented on PR #3192:
URL: https://github.com/apache/bookkeeper/pull/3192#issuecomment-1097464831
> @hangc0276 Looks like the compaction tests failed. Could you please take a look?
Ok, I will take a look today.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@bookkeeper.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [bookkeeper] hangc0276 commented on pull request #3192: Fix compaction throttle imprecise
Posted by GitBox <gi...@apache.org>.
hangc0276 commented on PR #3192:
URL: https://github.com/apache/bookkeeper/pull/3192#issuecomment-1098087867
rerun failure checks
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@bookkeeper.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [bookkeeper] hangc0276 commented on pull request #3192: Fix compaction throttle imprecise
Posted by GitBox <gi...@apache.org>.
hangc0276 commented on PR #3192:
URL: https://github.com/apache/bookkeeper/pull/3192#issuecomment-1093818553
rerun failure checks
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@bookkeeper.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [bookkeeper] hangc0276 commented on pull request #3192: Fix compaction throttle imprecise
Posted by GitBox <gi...@apache.org>.
hangc0276 commented on PR #3192:
URL: https://github.com/apache/bookkeeper/pull/3192#issuecomment-1109302185
> Good work
>
> We definitely need tests about these new methods
@eolivelli I removed the duplicated new methods, and return the prefetched bytes when calling `BufferedReadChannel.read()`. So we do not need extra unit tests.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@bookkeeper.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [bookkeeper] StevenLuMT commented on a diff in pull request #3192: Fix compaction throttle imprecise
Posted by GitBox <gi...@apache.org>.
StevenLuMT commented on code in PR #3192:
URL: https://github.com/apache/bookkeeper/pull/3192#discussion_r930092677
##########
bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/BufferedReadChannel.java:
##########
@@ -61,17 +63,18 @@ public BufferedReadChannel(FileChannel fileChannel, int readCapacity) {
* -1 if the given position is greater than or equal to the file's current size.
Review Comment:
I think @hangc0276 you should update this description for method's return
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@bookkeeper.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [bookkeeper] hangc0276 commented on pull request #3192: Fix compaction throttle imprecise
Posted by "hangc0276 (via GitHub)" <gi...@apache.org>.
hangc0276 commented on PR #3192:
URL: https://github.com/apache/bookkeeper/pull/3192#issuecomment-1568101353
The compaction introduced performance issue has been addressed by https://github.com/apache/bookkeeper/pull/3959, close this PR.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@bookkeeper.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [bookkeeper] hangc0276 closed pull request #3192: Fix compaction throttle imprecise
Posted by "hangc0276 (via GitHub)" <gi...@apache.org>.
hangc0276 closed pull request #3192: Fix compaction throttle imprecise
URL: https://github.com/apache/bookkeeper/pull/3192
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@bookkeeper.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [bookkeeper] hangc0276 commented on pull request #3192: Fix compaction throttle imprecise
Posted by GitBox <gi...@apache.org>.
hangc0276 commented on PR #3192:
URL: https://github.com/apache/bookkeeper/pull/3192#issuecomment-1093010936
rerun failure checks
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@bookkeeper.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [bookkeeper] dlg99 commented on a diff in pull request #3192: Fix compaction throttle imprecise
Posted by GitBox <gi...@apache.org>.
dlg99 commented on code in PR #3192:
URL: https://github.com/apache/bookkeeper/pull/3192#discussion_r858327213
##########
bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/BufferedReadChannel.java:
##########
@@ -95,9 +98,10 @@ public synchronized int read(ByteBuf dest, long pos, int length) throws IOExcept
throw new IOException("Reading from filechannel returned a non-positive value. Short read.");
}
readBuffer.writerIndex(readBytes);
+ bytesFetchedFromFile += readBytes;
}
}
- return (int) (currentPosition - pos);
+ return new ImmutablePair<> ((int) (currentPosition - pos), bytesFetchedFromFile);
Review Comment:
This is a very frequently called code, can we avoid creating a new object?
I think the throttler can be safely passed down the call chain into this method.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@bookkeeper.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [bookkeeper] hangc0276 commented on pull request #3192: Fix compaction throttle imprecise
Posted by GitBox <gi...@apache.org>.
hangc0276 commented on PR #3192:
URL: https://github.com/apache/bookkeeper/pull/3192#issuecomment-1098148230
rerun failure checks
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@bookkeeper.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [bookkeeper] zymap commented on a diff in pull request #3192: Fix compaction throttle imprecise
Posted by GitBox <gi...@apache.org>.
zymap commented on code in PR #3192:
URL: https://github.com/apache/bookkeeper/pull/3192#discussion_r846862216
##########
bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/EntryLogger.java:
##########
@@ -1014,10 +1041,18 @@ public void scanEntryLog(long entryLogId, EntryLogScanner scanner) throws IOExce
if (pos >= bc.size()) {
break;
}
- if (readFromLogChannel(entryLogId, bc, headerBuffer, pos) != headerBuffer.capacity()) {
+
+ Pair<Integer, Integer> readBytesPair =
+ readFromLogChannelWithPrefetchBytes(entryLogId, bc, headerBuffer, pos);
+ if (readBytesPair.getLeft() != headerBuffer.capacity()) {
LOG.warn("Short read for entry size from entrylog {}", entryLogId);
return;
}
+
+ if (throttler != null) {
+ throttler.acquire(readBytesPair.getRight());
Review Comment:
When do we release it?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@bookkeeper.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [bookkeeper] hangc0276 commented on a diff in pull request #3192: Fix compaction throttle imprecise
Posted by GitBox <gi...@apache.org>.
hangc0276 commented on code in PR #3192:
URL: https://github.com/apache/bookkeeper/pull/3192#discussion_r858372194
##########
bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/BufferedReadChannel.java:
##########
@@ -95,9 +98,10 @@ public synchronized int read(ByteBuf dest, long pos, int length) throws IOExcept
throw new IOException("Reading from filechannel returned a non-positive value. Short read.");
}
readBuffer.writerIndex(readBytes);
+ bytesFetchedFromFile += readBytes;
}
}
- return (int) (currentPosition - pos);
+ return new ImmutablePair<> ((int) (currentPosition - pos), bytesFetchedFromFile);
Review Comment:
Good catch. I'm considering how to avoiding creating new object.
The `BufferedReadChannel.read` method is synchronized, we should use throttle in the read method to block it when throttled, otherwise, it will block other threading calling. I still have no good idea about it.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@bookkeeper.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [bookkeeper] StevenLuMT commented on a diff in pull request #3192: Fix compaction throttle imprecise
Posted by GitBox <gi...@apache.org>.
StevenLuMT commented on code in PR #3192:
URL: https://github.com/apache/bookkeeper/pull/3192#discussion_r930132496
##########
bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/BufferedReadChannel.java:
##########
@@ -95,9 +98,10 @@ public synchronized int read(ByteBuf dest, long pos, int length) throws IOExcept
throw new IOException("Reading from filechannel returned a non-positive value. Short read.");
}
readBuffer.writerIndex(readBytes);
+ bytesFetchedFromFile += readBytes;
}
}
- return (int) (currentPosition - pos);
+ return new ImmutablePair<> ((int) (currentPosition - pos), bytesFetchedFromFile);
Review Comment:
@hangc0276 @dlg99
I think you can write like this simple code :
`return Pair.of((int) (currentPosition - pos), bytesFetchedFromFile); `
instead of
`return new ImmutablePair<> ((int) (currentPosition - pos), bytesFetchedFromFile); `
but I think it's ok, frequent creation of ImmutablePair has no affect
<img width="964" alt="image" src="https://user-images.githubusercontent.com/42990025/181050024-667fd915-ba55-4de9-9bb1-da395b60391e.png">
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@bookkeeper.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [bookkeeper] hangc0276 commented on a diff in pull request #3192: Fix compaction throttle imprecise
Posted by GitBox <gi...@apache.org>.
hangc0276 commented on code in PR #3192:
URL: https://github.com/apache/bookkeeper/pull/3192#discussion_r846881714
##########
bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/EntryLogger.java:
##########
@@ -1014,10 +1041,18 @@ public void scanEntryLog(long entryLogId, EntryLogScanner scanner) throws IOExce
if (pos >= bc.size()) {
break;
}
- if (readFromLogChannel(entryLogId, bc, headerBuffer, pos) != headerBuffer.capacity()) {
+
+ Pair<Integer, Integer> readBytesPair =
+ readFromLogChannelWithPrefetchBytes(entryLogId, bc, headerBuffer, pos);
+ if (readBytesPair.getLeft() != headerBuffer.capacity()) {
LOG.warn("Short read for entry size from entrylog {}", entryLogId);
return;
}
+
+ if (throttler != null) {
+ throttler.acquire(readBytesPair.getRight());
Review Comment:
The release operation controlled by `RateLimiter` library in guava
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@bookkeeper.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [bookkeeper] hangc0276 commented on pull request #3192: Fix compaction throttle imprecise
Posted by GitBox <gi...@apache.org>.
hangc0276 commented on PR #3192:
URL: https://github.com/apache/bookkeeper/pull/3192#issuecomment-1093559869
rerun failure checks
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@bookkeeper.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [bookkeeper] hangc0276 commented on pull request #3192: Fix compaction throttle imprecise
Posted by GitBox <gi...@apache.org>.
hangc0276 commented on PR #3192:
URL: https://github.com/apache/bookkeeper/pull/3192#issuecomment-1094153786
rerun failure checks
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@bookkeeper.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [bookkeeper] hangc0276 commented on pull request #3192: Fix compaction throttle imprecise
Posted by GitBox <gi...@apache.org>.
hangc0276 commented on PR #3192:
URL: https://github.com/apache/bookkeeper/pull/3192#issuecomment-1109392133
rerun failure checks
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@bookkeeper.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [bookkeeper] StevenLuMT commented on pull request #3192: Fix compaction throttle imprecise
Posted by GitBox <gi...@apache.org>.
StevenLuMT commented on PR #3192:
URL: https://github.com/apache/bookkeeper/pull/3192#issuecomment-1225342932
fix old workflow,please see #3455 for detail
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@bookkeeper.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org