Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2022/04/20 12:50:41 UTC

[GitHub] [pinot] richardstartin opened a new pull request, #8570: Startree streamlining

richardstartin opened a new pull request, #8570:
URL: https://github.com/apache/pinot/pull/8570

   1. Replace `LinkedList` with `ArrayDeque` for storing search entries in the breadth-first traversal.
   2. Delay deserialization of most of the `OffHeapStarTreeNode` fields until the fields are used, because most are never used during a traversal.
   
   Change 1 is motivated by a profile of a StarTree over a high-cardinality dimension, where the `LinkedList` grows large enough to cause problems: a large number of (doubly-linked) `LinkedList$Node` objects are allocated. `ArrayDeque` instead stores the values in an array, which it maintains as a circular buffer.
   <img width="1027" alt="Screenshot 2022-04-20 at 13 31 00" src="https://user-images.githubusercontent.com/16439049/164230787-6c304c49-2601-4209-b98e-6d11242e3532.png">
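
   As a minimal sketch of the idea behind change 1 (not the code in this PR), a breadth-first traversal can hold its pending entries in an `ArrayDeque`. `SketchNode` and `BreadthFirstSketch` are hypothetical names introduced for illustration and are not part of Pinot:

   ```java
   import java.util.ArrayDeque;
   import java.util.ArrayList;
   import java.util.List;
   import java.util.Queue;

   // Hypothetical stand-in for a tree node; not Pinot's StarTreeNode API.
   final class SketchNode {
     final int id;
     final List<SketchNode> children = new ArrayList<>();

     SketchNode(int id) {
       this.id = id;
     }

     // Appends a child and returns this node so trees can be built fluently.
     SketchNode add(SketchNode child) {
       children.add(child);
       return this;
     }
   }

   final class BreadthFirstSketch {
     // Visits nodes level by level and returns their ids in visit order.
     static List<Integer> traverse(SketchNode root) {
       // ArrayDeque keeps its entries in a resizable circular array, so
       // offer/poll do not allocate a linked node per entry as LinkedList does.
       Queue<SketchNode> queue = new ArrayDeque<>();
       List<Integer> visited = new ArrayList<>();
       queue.offer(root);
       SketchNode node;
       while ((node = queue.poll()) != null) {
         visited.add(node.id);
         for (SketchNode child : node.children) {
           queue.offer(child);
         }
       }
       return visited;
     }
   }
   ```

   Because both classes implement `Queue`, swapping one for the other is usually a one-line change at the declaration, provided no `null` entries are enqueued (`ArrayDeque` rejects them).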
   
   Change 2 is motivated by observing a lot of samples being taken within `OffHeapStarTreeNode$1.next`:
   <img width="847" alt="Screenshot 2022-04-20 at 13 35 01" src="https://user-images.githubusercontent.com/16439049/164231624-fbdeb8e7-b05f-43a6-8472-1c9e8d7fc885.png">
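
   The lazy-read approach of change 2 might look roughly like the sketch below. It is an illustration under stated assumptions, not the actual `OffHeapStarTreeNode`: a `java.nio.ByteBuffer` stands in for `PinotDataBuffer`, and the serialized record layout (seven ints per node, in the order given in the comment) is assumed for the example.

   ```java
   import java.nio.ByteBuffer;

   // Hypothetical node view over a serialized tree; names and layout are assumptions.
   final class LazyNodeSketch {
     // Assumed record: dimensionId, dimensionValue, startDocId, endDocId,
     // aggregatedDocId, firstChildId, lastChildId (7 ints per node).
     static final int NODE_BYTES = 7 * Integer.BYTES;

     private final ByteBuffer buffer;
     private final int nodeId;
     private final int firstChildId;
     private final int lastChildId;

     LazyNodeSketch(ByteBuffer buffer, int nodeId) {
       this.buffer = buffer;
       this.nodeId = nodeId;
       // Only the child range, which a traversal always needs, is decoded eagerly.
       this.firstChildId = buffer.getInt(nodeId * NODE_BYTES + 5 * Integer.BYTES);
       this.lastChildId = buffer.getInt(nodeId * NODE_BYTES + 6 * Integer.BYTES);
     }

     // Decodes one int field of this node with an absolute read.
     private int field(int index) {
       return buffer.getInt(nodeId * NODE_BYTES + index * Integer.BYTES);
     }

     int firstChildId() { return firstChildId; }
     int lastChildId() { return lastChildId; }

     // The remaining fields are read from the buffer only when requested.
     int dimensionId() { return field(0); }
     int dimensionValue() { return field(1); }
     int startDocId() { return field(2); }
     int endDocId() { return field(3); }
     int aggregatedDocId() { return field(4); }
   }
   ```

   The trade-off is that a field read more than once is decoded each time, which pays off only when most fields are rarely touched, as the profile above suggests.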
   
   There is also a high allocation rate of `OffHeapStarTreeNode` objects:
   <img width="1266" alt="Screenshot 2022-04-20 at 13 36 48" src="https://user-images.githubusercontent.com/16439049/164231771-1f9eed17-8860-48aa-9233-5b8fe18818de.png">
   
   Reducing the number of fields shrinks each instance by a third, from 48 to 32 bytes, when compressed references are enabled:
   
   before:
   ```
   org.apache.pinot.segment.local.startree.OffHeapStarTreeNode object internals:
   OFF  SZ                                                  TYPE DESCRIPTION                            VALUE
     0   8                                                       (object header: mark)                  N/A
     8   4                                                       (object header: class)                 N/A
    12   4                                                   int OffHeapStarTreeNode._dimensionId       N/A
    16   4                                                   int OffHeapStarTreeNode._dimensionValue    N/A
    20   4                                                   int OffHeapStarTreeNode._startDocId        N/A
    24   4                                                   int OffHeapStarTreeNode._endDocId          N/A
    28   4                                                   int OffHeapStarTreeNode._aggregatedDocId   N/A
    32   4                                                   int OffHeapStarTreeNode._firstChildId      N/A
    36   4                                                   int OffHeapStarTreeNode._lastChildId       N/A
    40   4   org.apache.pinot.segment.spi.memory.PinotDataBuffer OffHeapStarTreeNode._dataBuffer        N/A
    44   4                                                       (object alignment gap)                 
   Instance size: 48 bytes
   Space losses: 0 bytes internal + 4 bytes external = 4 bytes total
   ```
   
   after:
   ```
   org.apache.pinot.segment.local.startree.OffHeapStarTreeNode object internals:
   OFF  SZ                                                  TYPE DESCRIPTION                         VALUE
     0   8                                                       (object header: mark)               N/A
     8   4                                                       (object header: class)              N/A
    12   4                                                   int OffHeapStarTreeNode._nodeId         N/A
    16   4                                                   int OffHeapStarTreeNode._firstChildId   N/A
    20   4                                                   int OffHeapStarTreeNode._lastChildId    N/A
    24   4   org.apache.pinot.segment.spi.memory.PinotDataBuffer OffHeapStarTreeNode._dataBuffer     N/A
    28   4                                                       (object alignment gap)              
   Instance size: 32 bytes
   Space losses: 0 bytes internal + 4 bytes external = 4 bytes total
   ```
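
   The two instance sizes can be reproduced with a little arithmetic. The sketch below is a simplification that assumes the layout shown in the dumps above (12-byte header, 4-byte compressed references, 8-byte object alignment) and only `int` and reference fields; `LayoutSketch` is a hypothetical helper, not a JOL API:

   ```java
   // Hypothetical helper that mirrors the arithmetic in the layout dumps.
   final class LayoutSketch {
     static final int HEADER_BYTES = 12;   // 8-byte mark word + 4-byte compressed class pointer
     static final int REFERENCE_BYTES = 4; // compressed reference
     static final int ALIGNMENT = 8;       // object alignment assumed from the dumps

     // Sums header and field sizes, then rounds up to the next alignment boundary.
     static int instanceSize(int intFields, int referenceFields) {
       int raw = HEADER_BYTES + intFields * Integer.BYTES + referenceFields * REFERENCE_BYTES;
       return (raw + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
     }
   }
   ```

   With seven `int` fields and one reference, `instanceSize(7, 1)` gives 44 bytes rounded up to 48; with three `int` fields and one reference, `instanceSize(3, 1)` gives 28 bytes rounded up to 32.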
    


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] richardstartin merged pull request #8570: Improve StarTree traversal performance

Posted by GitBox <gi...@apache.org>.
richardstartin merged PR #8570:
URL: https://github.com/apache/pinot/pull/8570




[GitHub] [pinot] richardstartin commented on pull request #8570: Startree streamlining

Posted by GitBox <gi...@apache.org>.
richardstartin commented on PR #8570:
URL: https://github.com/apache/pinot/pull/8570#issuecomment-1104090758

   This yields up to a roughly 1.9x reduction in average query time (about 706 ms to 370 ms per query in the `EXP(0.999)` scenario) for a sum over a 2D startree:
   
   before:
   ```
   Benchmark                                                (_numRows)                                                                                                               (_query)  (_scenario)  Mode  Cnt           Score            Error   Units
   BenchmarkQueries.query                                      1500000  SELECT INT_COL,SORTED_COL,SUM(RAW_INT_COL) from MyTable group by INT_COL, SORTED_COL order by SORTED_COL, INT_COL ASC   EXP(0.001)  avgt    5      716065.843 ±     152704.805   us/op
   BenchmarkQueries.query:·gc.alloc.rate.norm                  1500000  SELECT INT_COL,SORTED_COL,SUM(RAW_INT_COL) from MyTable group by INT_COL, SORTED_COL order by SORTED_COL, INT_COL ASC   EXP(0.001)  avgt    5   913635041.333 ± 1942153323.171    B/op
   BenchmarkQueries.query                                      1500000  SELECT INT_COL,SORTED_COL,SUM(RAW_INT_COL) from MyTable group by INT_COL, SORTED_COL order by SORTED_COL, INT_COL ASC     EXP(0.5)  avgt    5      543558.326 ±     600710.721   us/op
   BenchmarkQueries.query:·gc.alloc.rate.norm                  1500000  SELECT INT_COL,SORTED_COL,SUM(RAW_INT_COL) from MyTable group by INT_COL, SORTED_COL order by SORTED_COL, INT_COL ASC     EXP(0.5)  avgt    5   910242066.373 ± 1937638062.800    B/op
   BenchmarkQueries.query                                      1500000  SELECT INT_COL,SORTED_COL,SUM(RAW_INT_COL) from MyTable group by INT_COL, SORTED_COL order by SORTED_COL, INT_COL ASC   EXP(0.999)  avgt    5      706489.120 ±     178988.130   us/op
   BenchmarkQueries.query:·gc.alloc.rate.norm                  1500000  SELECT INT_COL,SORTED_COL,SUM(RAW_INT_COL) from MyTable group by INT_COL, SORTED_COL order by SORTED_COL, INT_COL ASC   EXP(0.999)  avgt    5   908499340.533 ± 1936126801.021    B/op
   ```
   
   after:
   ```
   Benchmark                                                (_numRows)                                                                                                               (_query)  (_scenario)  Mode  Cnt           Score            Error   Units
   BenchmarkQueries.query                                      1500000  SELECT INT_COL,SORTED_COL,SUM(RAW_INT_COL) from MyTable group by INT_COL, SORTED_COL order by SORTED_COL, INT_COL ASC   EXP(0.001)  avgt    5      434788.384 ±      67478.376   us/op
   BenchmarkQueries.query:·gc.alloc.rate.norm                  1500000  SELECT INT_COL,SORTED_COL,SUM(RAW_INT_COL) from MyTable group by INT_COL, SORTED_COL order by SORTED_COL, INT_COL ASC   EXP(0.001)  avgt    5   831883216.640 ± 1766190864.794    B/op
   BenchmarkQueries.query                                      1500000  SELECT INT_COL,SORTED_COL,SUM(RAW_INT_COL) from MyTable group by INT_COL, SORTED_COL order by SORTED_COL, INT_COL ASC     EXP(0.5)  avgt    5      378584.087 ±      20401.492   us/op
   BenchmarkQueries.query:·gc.alloc.rate.norm                  1500000  SELECT INT_COL,SORTED_COL,SUM(RAW_INT_COL) from MyTable group by INT_COL, SORTED_COL order by SORTED_COL, INT_COL ASC     EXP(0.5)  avgt    5   828604188.533 ± 1761906482.829    B/op
   BenchmarkQueries.query                                      1500000  SELECT INT_COL,SORTED_COL,SUM(RAW_INT_COL) from MyTable group by INT_COL, SORTED_COL order by SORTED_COL, INT_COL ASC   EXP(0.999)  avgt    5      370225.020 ±      19187.406   us/op
   BenchmarkQueries.query:·gc.alloc.rate.norm                  1500000  SELECT INT_COL,SORTED_COL,SUM(RAW_INT_COL) from MyTable group by INT_COL, SORTED_COL order by SORTED_COL, INT_COL ASC   EXP(0.999)  avgt    5   826189353.600 ± 1758966803.625    B/op
   ```
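
   For reference, the per-scenario ratios can be derived from the `Score` columns above. This is only a sketch of the arithmetic; `SpeedupSketch` is a hypothetical name, and the wide error bounds in the "before" run mean the ratios are indicative rather than exact:

   ```java
   // Hypothetical helper: ratio of average time per query before and after the change.
   final class SpeedupSketch {
     static double speedup(double beforeMicros, double afterMicros) {
       return beforeMicros / afterMicros;
     }

     public static void main(String[] args) {
       // Scores copied from the us/op rows of the two benchmark tables.
       String[] scenarios = {"EXP(0.001)", "EXP(0.5)", "EXP(0.999)"};
       double[] before = {716065.843, 543558.326, 706489.120};
       double[] after = {434788.384, 378584.087, 370225.020};
       for (int i = 0; i < scenarios.length; i++) {
         System.out.printf("%s: %.2fx%n", scenarios[i], speedup(before[i], after[i]));
       }
     }
   }
   ```

   The ratios come out at roughly 1.65x, 1.44x and 1.91x for the three scenarios.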




[GitHub] [pinot] codecov-commenter commented on pull request #8570: Startree streamlining

Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on PR #8570:
URL: https://github.com/apache/pinot/pull/8570#issuecomment-1103963932

   # [Codecov](https://codecov.io/gh/apache/pinot/pull/8570?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#8570](https://codecov.io/gh/apache/pinot/pull/8570?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (87cfb19) into [master](https://codecov.io/gh/apache/pinot/commit/6eff17754a0b0ffdfb8670b54b5ada38961b7917?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (6eff177) will **decrease** coverage by `34.04%`.
   > The diff coverage is `7.14%`.
   
   > :exclamation: Current head 87cfb19 differs from pull request most recent head e28a0ef. Consider uploading reports for the commit e28a0ef to get more accurate results
   
   ```diff
   @@              Coverage Diff              @@
   ##             master    #8570       +/-   ##
   =============================================
   - Coverage     70.83%   36.78%   -34.05%     
   + Complexity     4307       84     -4223     
   =============================================
     Files          1688     1688               
     Lines         88295    88285       -10     
     Branches      13358    13358               
   =============================================
   - Hits          62542    32478    -30064     
   - Misses        21388    53180    +31792     
   + Partials       4365     2627     -1738     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | integration1 | `27.36% <7.14%> (+0.04%)` | :arrow_up: |
   | integration2 | `25.79% <7.14%> (-0.02%)` | :arrow_down: |
   | unittests1 | `?` | |
   | unittests2 | `14.12% <0.00%> (-0.01%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/pinot/pull/8570?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...ot/segment/local/startree/OffHeapStarTreeNode.java](https://codecov.io/gh/apache/pinot/pull/8570/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9zdGFydHJlZS9PZmZIZWFwU3RhclRyZWVOb2RlLmphdmE=) | `0.00% <0.00%> (-72.23%)` | :arrow_down: |
   | [...core/startree/operator/StarTreeFilterOperator.java](https://codecov.io/gh/apache/pinot/pull/8570/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9zdGFydHJlZS9vcGVyYXRvci9TdGFyVHJlZUZpbHRlck9wZXJhdG9yLmphdmE=) | `85.91% <100.00%> (ø)` | |
   | [.../java/org/apache/pinot/spi/utils/BooleanUtils.java](https://codecov.io/gh/apache/pinot/pull/8570/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvdXRpbHMvQm9vbGVhblV0aWxzLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...java/org/apache/pinot/spi/trace/BaseRecording.java](https://codecov.io/gh/apache/pinot/pull/8570/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvdHJhY2UvQmFzZVJlY29yZGluZy5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...java/org/apache/pinot/spi/trace/NoOpRecording.java](https://codecov.io/gh/apache/pinot/pull/8570/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvdHJhY2UvTm9PcFJlY29yZGluZy5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...ava/org/apache/pinot/spi/config/table/FSTType.java](https://codecov.io/gh/apache/pinot/pull/8570/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvY29uZmlnL3RhYmxlL0ZTVFR5cGUuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...ava/org/apache/pinot/spi/data/MetricFieldSpec.java](https://codecov.io/gh/apache/pinot/pull/8570/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvZGF0YS9NZXRyaWNGaWVsZFNwZWMuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...va/org/apache/pinot/spi/utils/BigDecimalUtils.java](https://codecov.io/gh/apache/pinot/pull/8570/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvdXRpbHMvQmlnRGVjaW1hbFV0aWxzLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...java/org/apache/pinot/common/tier/TierFactory.java](https://codecov.io/gh/apache/pinot/pull/8570/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vdGllci9UaWVyRmFjdG9yeS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...a/org/apache/pinot/spi/config/table/TableType.java](https://codecov.io/gh/apache/pinot/pull/8570/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvY29uZmlnL3RhYmxlL1RhYmxlVHlwZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | ... and [961 more](https://codecov.io/gh/apache/pinot/pull/8570/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/pinot/pull/8570?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/pinot/pull/8570?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [6eff177...e28a0ef](https://codecov.io/gh/apache/pinot/pull/8570?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   

