You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pinot.apache.org by "jbewing (via GitHub)" <gi...@apache.org> on 2023/06/16 20:49:24 UTC
[GitHub] [pinot] jbewing opened a new pull request, #10936: Fix casting when prefetching mmap'd segment larger than 2GB
jbewing opened a new pull request, #10936:
URL: https://github.com/apache/pinot/pull/10936
### What
This PR fixes a bug which may trigger when an mmap'd segment larger than 2GB (Integer.MAX_VALUE bytes) is prefetched by SegmentLocalFSDirectory.
There is an (outdated) comment above the prefetching code that specifies that `buffer.size()` is 32-bit and therefore `pos` cannot be greater than `Integer.MAX_VALUE`. That isn't true as LArray-backed buffers can exceed Integer.MAX_VALUE in size (and `PinotDataBuffer#size` returns a `long`). If a buffer with a size that exceeds `Integer.MAX_VALUE` is pre-fetched, `pos` will be casted to an `int` (`Integer.MIN_VALUE`) and passed to `PinotDataBuffer#getByte`. On some implementations, this will loudly throw. On others (like LArray), the behavior is undefined as that buffer implementation is backed by Unsafe and the memory address being accessed is out of the range of the buffer.
I've observed this cause a (hard) JVM crash when running batch ingestion jobs on Java 17 with a recent patch https://github.com/apache/pinot/pull/10528 that adds a substitute for LArray for buffers that map more than 2^31-1 bytes. The logs for that crash are captured in this gist: https://gist.github.com/jbewing/471fb86f419c9975e5f22994673b0c1b
I hypothesize that this bug is less noticeable on java versions less than 17 as it may not result in a hard crash of the JVM when using the LArray large buffer implementation.
### Testing
I've tested this patch by re-running the batch ingestion job that caused the crash and verifying that it didn't cause the crash. I welcome any advice on additional testing steps needed here.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org
[GitHub] [pinot] klsince commented on a diff in pull request #10936: Fix casting when prefetching mmap'd segment larger than 2GB
Posted by "klsince (via GitHub)" <gi...@apache.org>.
klsince commented on code in PR #10936:
URL: https://github.com/apache/pinot/pull/10936#discussion_r1235967450
##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/store/SegmentLocalFSDirectory.java:
##########
@@ -332,11 +332,9 @@ private void prefetchMmapData(PinotDataBuffer buffer) {
PREFETCHED_PAGES.incrementAndGet();
}
} else {
- // pos needs to be long because buffer.size() is 32 bit but
- // adding 4k can make it go over int size
for (long pos = 0; pos < buffer.size() && PREFETCHED_PAGES.get() < prefetchSlowdownPageLimit;
pos += PAGE_SIZE_BYTES) {
- buffer.getByte((int) pos);
+ buffer.getByte(pos);
Review Comment:
as far as I can see, this is a good catch 👍
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org
[GitHub] [pinot] codecov-commenter commented on pull request #10936: Fix casting when prefetching mmap'd segment larger than 2GB
Posted by "codecov-commenter (via GitHub)" <gi...@apache.org>.
codecov-commenter commented on PR #10936:
URL: https://github.com/apache/pinot/pull/10936#issuecomment-1599698035
## [Codecov](https://app.codecov.io/gh/apache/pinot/pull/10936?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) Report
> Merging [#10936](https://app.codecov.io/gh/apache/pinot/pull/10936?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) (e75cc84) into [master](https://app.codecov.io/gh/apache/pinot/commit/85bb1fab54d44e847195e35e3e0b117022985d36?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) (85bb1fa) will **not change** coverage.
> The diff coverage is `0.00%`.
```diff
@@ Coverage Diff @@
## master #10936 +/- ##
=======================================
Coverage 0.11% 0.11%
=======================================
Files 2188 2188
Lines 117754 117754
Branches 17791 17791
=======================================
Hits 137 137
Misses 117597 117597
Partials 20 20
```
| Flag | Coverage Δ | |
|---|---|---|
| integration1temurin11 | `0.00% <0.00%> (?)` | |
| integration1temurin17 | `0.00% <0.00%> (ø)` | |
| integration1temurin20 | `0.00% <0.00%> (ø)` | |
| integration2temurin11 | `?` | |
| integration2temurin17 | `?` | |
| integration2temurin20 | `?` | |
| unittests1temurin17 | `?` | |
| unittests1temurin20 | `?` | |
| unittests2temurin11 | `0.11% <0.00%> (ø)` | |
| unittests2temurin17 | `0.11% <0.00%> (ø)` | |
| unittests2temurin20 | `0.11% <0.00%> (ø)` | |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://app.codecov.io/gh/apache/pinot/pull/10936?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | Coverage Δ | |
|---|---|---|
| [...t/local/segment/store/SegmentLocalFSDirectory.java](https://app.codecov.io/gh/apache/pinot/pull/10936?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9zZWdtZW50L3N0b3JlL1NlZ21lbnRMb2NhbEZTRGlyZWN0b3J5LmphdmE=) | `0.00% <0.00%> (ø)` | |
:mega: We’re building smart automated test selection to slash your CI/CD build times. [Learn more](https://about.codecov.io/iterative-testing/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org
[GitHub] [pinot] Jackie-Jiang commented on pull request #10936: Fix casting when prefetching mmap'd segment larger than 2GB
Posted by "Jackie-Jiang (via GitHub)" <gi...@apache.org>.
Jackie-Jiang commented on PR #10936:
URL: https://github.com/apache/pinot/pull/10936#issuecomment-1599648101
cc @klsince
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org
[GitHub] [pinot] Jackie-Jiang merged pull request #10936: Fix casting when prefetching mmap'd segment larger than 2GB
Posted by "Jackie-Jiang (via GitHub)" <gi...@apache.org>.
Jackie-Jiang merged PR #10936:
URL: https://github.com/apache/pinot/pull/10936
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org