You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pinot.apache.org by "jbewing (via GitHub)" <gi...@apache.org> on 2023/06/16 20:49:24 UTC

[GitHub] [pinot] jbewing opened a new pull request, #10936: Fix casting when prefetching mmap'd segment larger than 2GB

jbewing opened a new pull request, #10936:
URL: https://github.com/apache/pinot/pull/10936

   ### What
   This PR fixes a bug which may trigger when an mmap'd segment larger than 2GB (Integer.MAX_VALUE bytes) is prefetched by SegmentLocalFSDirectory.
   
   There is an (outdated) comment above the prefetching code that specifies that `buffer.size()` is 32-bit and therefore `pos` cannot be greater than `Integer.MAX_VALUE`. That isn't true as LArray-backed buffers can exceed Integer.MAX_VALUE in size (and `PinotDataBuffer#size` returns a `long`). If a buffer with a size that exceeds `Integer.MAX_VALUE` is pre-fetched, `pos` will be casted to an `int` (`Integer.MIN_VALUE`) and passed to `PinotDataBuffer#getByte`. On some implementations, this will loudly throw. On others (like LArray), the behavior is undefined as that buffer implementation is backed by Unsafe and the memory address being accessed is out of the range of the buffer.
   
   I've observed this cause a (hard) JVM crash when running batch ingestion jobs on Java 17 with a recent patch https://github.com/apache/pinot/pull/10528 that adds a substitute for LArray for buffers that map more than 2^31-1 bytes. The logs for that crash are captured in this gist: https://gist.github.com/jbewing/471fb86f419c9975e5f22994673b0c1b
   
   I hypothesize that this bug is less noticeable on java versions less than 17 as it may not result in a hard crash of the JVM when using the LArray large buffer implementation.
   
   ### Testing
   I've tested this patch by re-running the batch ingestion job that caused the crash and verifying that it didn't cause the crash. I welcome any advice on additional testing steps needed here.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] klsince commented on a diff in pull request #10936: Fix casting when prefetching mmap'd segment larger than 2GB

Posted by "klsince (via GitHub)" <gi...@apache.org>.
klsince commented on code in PR #10936:
URL: https://github.com/apache/pinot/pull/10936#discussion_r1235967450


##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/store/SegmentLocalFSDirectory.java:
##########
@@ -332,11 +332,9 @@ private void prefetchMmapData(PinotDataBuffer buffer) {
         PREFETCHED_PAGES.incrementAndGet();
       }
     } else {
-      // pos needs to be long because buffer.size() is 32 bit but
-      // adding 4k can make it go over int size
       for (long pos = 0; pos < buffer.size() && PREFETCHED_PAGES.get() < prefetchSlowdownPageLimit;
           pos += PAGE_SIZE_BYTES) {
-        buffer.getByte((int) pos);
+        buffer.getByte(pos);

Review Comment:
   as far as I can see, this is a good catch 👍 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] codecov-commenter commented on pull request #10936: Fix casting when prefetching mmap'd segment larger than 2GB

Posted by "codecov-commenter (via GitHub)" <gi...@apache.org>.
codecov-commenter commented on PR #10936:
URL: https://github.com/apache/pinot/pull/10936#issuecomment-1599698035

   ## [Codecov](https://app.codecov.io/gh/apache/pinot/pull/10936?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) Report
   > Merging [#10936](https://app.codecov.io/gh/apache/pinot/pull/10936?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) (e75cc84) into [master](https://app.codecov.io/gh/apache/pinot/commit/85bb1fab54d44e847195e35e3e0b117022985d36?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) (85bb1fa) will **not change** coverage.
   > The diff coverage is `0.00%`.
   
   ```diff
   @@           Coverage Diff           @@
   ##           master   #10936   +/-   ##
   =======================================
     Coverage    0.11%    0.11%           
   =======================================
     Files        2188     2188           
     Lines      117754   117754           
     Branches    17791    17791           
   =======================================
     Hits          137      137           
     Misses     117597   117597           
     Partials       20       20           
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | integration1temurin11 | `0.00% <0.00%> (?)` | |
   | integration1temurin17 | `0.00% <0.00%> (ø)` | |
   | integration1temurin20 | `0.00% <0.00%> (ø)` | |
   | integration2temurin11 | `?` | |
   | integration2temurin17 | `?` | |
   | integration2temurin20 | `?` | |
   | unittests1temurin17 | `?` | |
   | unittests1temurin20 | `?` | |
   | unittests2temurin11 | `0.11% <0.00%> (ø)` | |
   | unittests2temurin17 | `0.11% <0.00%> (ø)` | |
   | unittests2temurin20 | `0.11% <0.00%> (ø)` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://app.codecov.io/gh/apache/pinot/pull/10936?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | Coverage Δ | |
   |---|---|---|
   | [...t/local/segment/store/SegmentLocalFSDirectory.java](https://app.codecov.io/gh/apache/pinot/pull/10936?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9zZWdtZW50L3N0b3JlL1NlZ21lbnRMb2NhbEZTRGlyZWN0b3J5LmphdmE=) | `0.00% <0.00%> (ø)` | |
   
   :mega: We’re building smart automated test selection to slash your CI/CD build times. [Learn more](https://about.codecov.io/iterative-testing/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] Jackie-Jiang commented on pull request #10936: Fix casting when prefetching mmap'd segment larger than 2GB

Posted by "Jackie-Jiang (via GitHub)" <gi...@apache.org>.
Jackie-Jiang commented on PR #10936:
URL: https://github.com/apache/pinot/pull/10936#issuecomment-1599648101

   cc @klsince 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] Jackie-Jiang merged pull request #10936: Fix casting when prefetching mmap'd segment larger than 2GB

Posted by "Jackie-Jiang (via GitHub)" <gi...@apache.org>.
Jackie-Jiang merged PR #10936:
URL: https://github.com/apache/pinot/pull/10936


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org