Posted to commits@pinot.apache.org by "gortiz (via GitHub)" <gi...@apache.org> on 2023/06/30 15:10:25 UTC

[GitHub] [pinot] gortiz opened a new pull request, #11015: Apply some allocation optimizations on GrpcSendingMailbox

gortiz opened a new pull request, #11015:
URL: https://github.com/apache/pinot/pull/11015

   This is a simple PR that tries to reduce the number of `byte[]` instances created by `GrpcSendingMailbox`. There are still some places that can be improved, but these changes are easy to apply and very effective.
   
   The main changes are:
   - Use `org.apache.commons.io.output.UnsynchronizedByteArrayOutputStream` instead of `java.io.ByteArrayOutputStream`. As stated in its Javadoc, `UnsynchronizedByteArrayOutputStream` is a drop-in replacement for `ByteArrayOutputStream`, but instead of doubling the size of the stored `byte[]` when it fills up, it starts writing into a new `byte[]`. Given that we never read the `byte[]` in place but only copy it out with `toByteArray`, the Apache Commons implementation is faster for us. Maybe we should write our own implementation that works like the standard `ByteArrayOutputStream` but lets us access the underlying `byte[]` without a copy (as we know it is safe here) or, even better, think about another way to serialize the information.
   - Initialize `UnsynchronizedByteArrayOutputStream` with initial sizes closer to what is actually needed. For example, we know the exact number of bytes needed by `DataBlockBuilder._fixedSizeDataByteArrayOutputStream`.
   - Reuse a single `ByteBuffer` in `DataBlockBuilder.buildFromRows` instead of creating one per row.
   - Change `GrpcSendingMailbox.toMailboxContent` to call `UnsafeByteOperations.unsafeWrap(bytes)` instead of `ByteString.copyFrom`. The former is _unsafe_ because the wrapped content must not be modified afterwards, but we pass it a freshly created `byte[]`, so we know it won't be.
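
As a rough illustration of the presizing and buffer-reuse ideas in the list above, here is a minimal sketch that uses only JDK classes. It is not the Pinot code: `RowEncodingSketch`, `encodeRows`, and the all-`int` row layout are hypothetical, and the standard `ByteArrayOutputStream` stands in for the Commons IO stream.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;

// Hypothetical sketch, not the actual DataBlockBuilder implementation.
final class RowEncodingSketch {

  // Encodes rows of int columns into one byte[] (big-endian, the ByteBuffer default).
  static byte[] encodeRows(int[][] rows, int numColumns) {
    int rowSizeInBytes = numColumns * Integer.BYTES;
    // The total size is known up front, so the stream never has to grow.
    ByteArrayOutputStream out = new ByteArrayOutputStream(rows.length * rowSizeInBytes);
    // One scratch buffer shared by all rows instead of one allocation per row.
    ByteBuffer rowBuffer = ByteBuffer.allocate(rowSizeInBytes);
    for (int[] row : rows) {
      rowBuffer.clear(); // reset position to 0 so the buffer can be refilled
      for (int col = 0; col < numColumns; col++) {
        rowBuffer.putInt(row[col]);
      }
      out.write(rowBuffer.array(), 0, rowBuffer.position());
    }
    return out.toByteArray();
  }

  public static void main(String[] args) {
    byte[] encoded = encodeRows(new int[][] {{1, 2}, {3, 4}}, 2);
    System.out.println(encoded.length + " bytes written");
  }
}
```

With fixed-width columns the per-row buffer size is constant, which is what makes both the exact presizing and the reuse possible in this sketch.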
   
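The trade-off between copying and wrapping described in the last item can be sketched with JDK classes alone. This is an analogy, not the protobuf API: `ByteBuffer.wrap` plays the role of `UnsafeByteOperations.unsafeWrap`, an explicit array copy plays the role of `ByteString.copyFrom`, and the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical JDK-only analogy for "copy" versus "wrap without copy".
final class WrapVersusCopySketch {

  // Copying: the returned view owns a private array, at the cost of one extra allocation.
  static ByteBuffer copyOf(byte[] bytes) {
    return ByteBuffer.wrap(Arrays.copyOf(bytes, bytes.length)).asReadOnlyBuffer();
  }

  // Wrapping: no allocation for the payload, but the view shares the caller's array.
  static ByteBuffer wrapWithoutCopy(byte[] bytes) {
    return ByteBuffer.wrap(bytes).asReadOnlyBuffer();
  }

  public static void main(String[] args) {
    byte[] payload = {1, 2, 3};
    ByteBuffer copied = copyOf(payload);
    ByteBuffer wrapped = wrapWithoutCopy(payload);
    payload[0] = 9; // a later mutation by the caller
    System.out.println(copied.get(0));  // still 1: the copy is isolated
    System.out.println(wrapped.get(0)); // 9: the wrapper sees the mutation
  }
}
```

Wrapping is only sound when nothing mutates the array after the hand-off, which is the condition the PR relies on by passing a freshly created `byte[]`.
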
   Some initial performance tests I ran show a noticeable change in allocation. These flame graphs were obtained with async-profiler while executing the same queries the same number of times:
   
   Without optimization: 77.9 GB allocated
   
   ![image](https://github.com/apache/pinot/assets/1913993/ffd83a3b-1044-41a5-aa56-e31d8e9e0657)
   
   
   With optimization: 48 GB allocated
   ![image (1)](https://github.com/apache/pinot/assets/1913993/e3dd3dfe-3ab0-4fbb-b8a4-da1dc159a130)
   
   Labels: performance


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] codecov-commenter commented on pull request #11015: Apply some allocation optimizations on GrpcSendingMailbox

Posted by "codecov-commenter (via GitHub)" <gi...@apache.org>.
codecov-commenter commented on PR #11015:
URL: https://github.com/apache/pinot/pull/11015#issuecomment-1614841523

   ## [Codecov](https://app.codecov.io/gh/apache/pinot/pull/11015?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) Report
   > Merging [#11015](https://app.codecov.io/gh/apache/pinot/pull/11015?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) (8671006) into [master](https://app.codecov.io/gh/apache/pinot/commit/0b097a8dab7d2953d331010c856420967fe873e1?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) (0b097a8) will **increase** coverage by `0.00%`.
   > The diff coverage is `0.00%`.
   
   ```diff
   @@            Coverage Diff            @@
   ##           master   #11015     +/-   ##
   =========================================
     Coverage    0.11%    0.11%             
   =========================================
     Files        2192     2138     -54     
     Lines      118016   115527   -2489     
     Branches    17869    17567    -302     
   =========================================
     Hits          137      137             
   + Misses     117859   115370   -2489     
     Partials       20       20             
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | integration1temurin11 | `?` | |
   | integration1temurin17 | `?` | |
   | integration1temurin20 | `?` | |
   | integration2temurin11 | `?` | |
   | integration2temurin17 | `?` | |
   | unittests1temurin17 | `?` | |
   | unittests1temurin20 | `?` | |
   | unittests2temurin11 | `0.11% <0.00%> (-0.01%)` | :arrow_down: |
   | unittests2temurin17 | `?` | |
   | unittests2temurin20 | `0.11% <0.00%> (-0.01%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://app.codecov.io/gh/apache/pinot/pull/11015?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | Coverage Δ | |
   |---|---|---|
   | [...g/apache/pinot/common/datablock/BaseDataBlock.java](https://app.codecov.io/gh/apache/pinot/pull/11015?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vZGF0YWJsb2NrL0Jhc2VEYXRhQmxvY2suamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [.../pinot/core/common/datablock/DataBlockBuilder.java](https://app.codecov.io/gh/apache/pinot/pull/11015?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9jb21tb24vZGF0YWJsb2NrL0RhdGFCbG9ja0J1aWxkZXIuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...apache/pinot/query/mailbox/GrpcSendingMailbox.java](https://app.codecov.io/gh/apache/pinot/pull/11015?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-cGlub3QtcXVlcnktcnVudGltZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvcXVlcnkvbWFpbGJveC9HcnBjU2VuZGluZ01haWxib3guamF2YQ==) | `0.00% <0.00%> (ø)` | |
   
   ... and [54 files with indirect coverage changes](https://app.codecov.io/gh/apache/pinot/pull/11015/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)
   
   




[GitHub] [pinot] Jackie-Jiang commented on a diff in pull request #11015: Apply some allocation optimizations on GrpcSendingMailbox

Posted by "Jackie-Jiang (via GitHub)" <gi...@apache.org>.
Jackie-Jiang commented on code in PR #11015:
URL: https://github.com/apache/pinot/pull/11015#discussion_r1248573641


##########
pinot-common/src/main/java/org/apache/pinot/common/datablock/BaseDataBlock.java:
##########
@@ -387,7 +387,10 @@ public RoaringBitmap getNullRowIds(int colId) {
    */
   protected byte[] serializeStringDictionary()
       throws IOException {
-    ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
+    if (_stringDictionary.length == 0) {
+      return new byte[]{0, 0, 0, 0};

Review Comment:
   (minor) Equivalent to `new byte[4]`? Same for other places
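
The equivalence can be checked with a small JDK-only sketch. Java zero-initializes new arrays, so both expressions yield four zero bytes. The class and method names are hypothetical, and reading those four bytes as a big-endian `int` is an assumption about how the result is consumed, not something the diff states.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical check that the two array expressions are interchangeable.
final class ZeroArraySketch {

  // Compares the explicit literal with the default-initialized array.
  static boolean sameContents() {
    return Arrays.equals(new byte[] {0, 0, 0, 0}, new byte[4]);
  }

  // Four zero bytes decode to the int 0 (assumed to be how a reader interprets them).
  static int asInt() {
    return ByteBuffer.wrap(new byte[4]).getInt();
  }

  public static void main(String[] args) {
    System.out.println(sameContents()); // true
    System.out.println(asInt());        // 0
  }
}
```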





[GitHub] [pinot] walterddr merged pull request #11015: Apply some allocation optimizations on GrpcSendingMailbox

Posted by "walterddr (via GitHub)" <gi...@apache.org>.
walterddr merged PR #11015:
URL: https://github.com/apache/pinot/pull/11015

