Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2022/02/11 18:42:49 UTC
[GitHub] [pinot] richardstartin opened a new pull request #8195: No dictionary group by perf
richardstartin opened a new pull request #8195:
URL: https://github.com/apache/pinot/pull/8195
This change is motivated by slow queries at one of our customers that group by a raw column. 30GB was seen to be allocated by `NoDictionaryMultiColumnGroupKeyGenerator.generateKeysForBlock`, which is also where most of the method samples were taken:
<img width="1590" alt="Screenshot 2022-02-11 at 18 33 25" src="https://user-images.githubusercontent.com/16439049/153649438-1d8a054a-5bc1-4313-ac08-6faf6b5a41e1.png">
<img width="1602" alt="Screenshot 2022-02-11 at 18 35 21" src="https://user-images.githubusercontent.com/16439049/153649797-f73bd1aa-233f-4e57-8007-ff38e417e14b.png">
This PR starts by generalising one of our pre-existing benchmarks which does a good job of exercising the entire query execution. It is parameterised so different queries can be added easily, and the generated data is parameterised too so that columns with different cardinalities can be created.
Then, the actual improvement is made in the second commit. It transposes the group key generation since the `BlockValSet`s will be cached by `DataBlockCache` anyway, then accumulates keys into a flyweight, which only needs to be allocated to memoize the group key on its first occurrence. This roughly halves average time and reduces allocation by at least a factor of 4:
```
Benchmark (_numRows) (_query) (_scenario) Mode Cnt Score Error Units
BenchmarkQueries.query 1500000 SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL EXP(0.001) avgt 5 200.573 ± 36.577 ms/op
BenchmarkQueries.query:·gc.alloc.rate 1500000 SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL EXP(0.001) avgt 5 454.459 ± 985.590 MB/sec
BenchmarkQueries.query:·gc.alloc.rate.norm 1500000 SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL EXP(0.001) avgt 5 139180218.880 ± 299249329.376 B/op
BenchmarkQueries.query:·gc.churn.G1_Eden_Space 1500000 SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL EXP(0.001) avgt 5 510.589 ± 414.979 MB/sec
BenchmarkQueries.query:·gc.churn.G1_Eden_Space.norm 1500000 SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL EXP(0.001) avgt 5 156957846.187 ± 109973654.666 B/op
BenchmarkQueries.query:·gc.churn.G1_Old_Gen 1500000 SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL EXP(0.001) avgt 5 12.236 ± 42.494 MB/sec
BenchmarkQueries.query:·gc.churn.G1_Old_Gen.norm 1500000 SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL EXP(0.001) avgt 5 3732222.293 ± 12807297.981 B/op
BenchmarkQueries.query:·gc.churn.G1_Survivor_Space 1500000 SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL EXP(0.001) avgt 5 4.412 ± 19.484 MB/sec
BenchmarkQueries.query:·gc.churn.G1_Survivor_Space.norm 1500000 SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL EXP(0.001) avgt 5 1398101.333 ± 6240670.451 B/op
BenchmarkQueries.query:·gc.count 1500000 SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL EXP(0.001) avgt 5 8.000 counts
BenchmarkQueries.query:·gc.time 1500000 SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL EXP(0.001) avgt 5 407.000 ms
BenchmarkQueries.query 1500000 SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL EXP(0.5) avgt 5 98.663 ± 7.845 ms/op
BenchmarkQueries.query:·gc.alloc.rate 1500000 SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL EXP(0.5) avgt 5 696.114 ± 1498.561 MB/sec
BenchmarkQueries.query:·gc.alloc.rate.norm 1500000 SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL EXP(0.5) avgt 5 106449429.440 ± 228808237.226 B/op
BenchmarkQueries.query:·gc.churn.G1_Eden_Space 1500000 SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL EXP(0.5) avgt 5 1029.174 ± 2217.348 MB/sec
BenchmarkQueries.query:·gc.churn.G1_Eden_Space.norm 1500000 SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL EXP(0.5) avgt 5 157963208.145 ± 341415795.326 B/op
BenchmarkQueries.query:·gc.churn.G1_Old_Gen 1500000 SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL EXP(0.5) avgt 5 0.109 ± 0.714 MB/sec
BenchmarkQueries.query:·gc.churn.G1_Old_Gen.norm 1500000 SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL EXP(0.5) avgt 5 16481.891 ± 107743.719 B/op
BenchmarkQueries.query:·gc.count 1500000 SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL EXP(0.5) avgt 5 4.000 counts
BenchmarkQueries.query:·gc.time 1500000 SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL EXP(0.5) avgt 5 10.000 ms
BenchmarkQueries.query 1500000 SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL EXP(0.999) avgt 5 90.816 ± 8.115 ms/op
BenchmarkQueries.query:·gc.alloc.rate 1500000 SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL EXP(0.999) avgt 5 752.671 ± 1621.248 MB/sec
BenchmarkQueries.query:·gc.alloc.rate.norm 1500000 SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL EXP(0.999) avgt 5 106393285.309 ± 228688298.313 B/op
BenchmarkQueries.query:·gc.churn.G1_Eden_Space 1500000 SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL EXP(0.999) avgt 5 1022.984 ± 2203.794 MB/sec
BenchmarkQueries.query:·gc.churn.G1_Eden_Space.norm 1500000 SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL EXP(0.999) avgt 5 143076606.448 ± 309117349.271 B/op
BenchmarkQueries.query:·gc.churn.G1_Old_Gen 1500000 SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL EXP(0.999) avgt 5 0.150 ± 0.832 MB/sec
BenchmarkQueries.query:·gc.churn.G1_Old_Gen.norm 1500000 SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL EXP(0.999) avgt 5 21384.158 ± 119911.555 B/op
BenchmarkQueries.query:·gc.churn.G1_Survivor_Space 1500000 SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL EXP(0.999) avgt 5 0.126 ± 1.082 MB/sec
BenchmarkQueries.query:·gc.churn.G1_Survivor_Space.norm 1500000 SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL EXP(0.999) avgt 5 17476.267 ± 150475.927 B/op
BenchmarkQueries.query:·gc.count 1500000 SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL EXP(0.999) avgt 5 4.000 counts
BenchmarkQueries.query:·gc.time 1500000 SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL EXP(0.999) avgt 5 52.000 ms
```
```
Benchmark (_numRows) (_query) (_scenario) Mode Cnt Score Error Units
BenchmarkQueries.query 1500000 SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL EXP(0.001) avgt 5 130.071 ± 5.744 ms/op
BenchmarkQueries.query:·gc.alloc.rate 1500000 SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL EXP(0.001) avgt 5 197.775 ± 424.170 MB/sec
BenchmarkQueries.query:·gc.alloc.rate.norm 1500000 SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL EXP(0.001) avgt 5 39989639.600 ± 85754314.959 B/op
BenchmarkQueries.query:·gc.churn.G1_Eden_Space 1500000 SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL EXP(0.001) avgt 5 210.675 ± 161.001 MB/sec
BenchmarkQueries.query:·gc.churn.G1_Eden_Space.norm 1500000 SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL EXP(0.001) avgt 5 42677043.200 ± 33405655.687 B/op
BenchmarkQueries.query:·gc.churn.G1_Old_Gen 1500000 SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL EXP(0.001) avgt 5 0.835 ± 6.634 MB/sec
BenchmarkQueries.query:·gc.churn.G1_Old_Gen.norm 1500000 SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL EXP(0.001) avgt 5 167488.000 ± 1331249.740 B/op
BenchmarkQueries.query:·gc.count 1500000 SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL EXP(0.001) avgt 5 11.000 counts
BenchmarkQueries.query:·gc.time 1500000 SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL EXP(0.001) avgt 5 268.000 ms
BenchmarkQueries.query 1500000 SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL EXP(0.5) avgt 5 54.864 ± 4.432 ms/op
BenchmarkQueries.query:·gc.alloc.rate 1500000 SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL EXP(0.5) avgt 5 10.390 ± 18.671 MB/sec
BenchmarkQueries.query:·gc.alloc.rate.norm 1500000 SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL EXP(0.5) avgt 5 883504.473 ± 1574108.609 B/op
BenchmarkQueries.query:·gc.count 1500000 SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL EXP(0.5) avgt 5 ≈ 0 counts
BenchmarkQueries.query 1500000 SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL EXP(0.999) avgt 5 47.429 ± 2.367 ms/op
BenchmarkQueries.query:·gc.alloc.rate 1500000 SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL EXP(0.999) avgt 5 10.961 ± 19.191 MB/sec
BenchmarkQueries.query:·gc.alloc.rate.norm 1500000 SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL EXP(0.999) avgt 5 811307.408 ± 1420111.422 B/op
BenchmarkQueries.query:·gc.count 1500000 SELECT RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_INT_COL,INT_COL EXP(0.999) avgt 5 ≈ 0 counts
```
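The flyweight idea described above can be sketched in isolation. This is a minimal illustration, not the code in the PR: `IntArrayKey` and `FlyweightGroupKeys` are hypothetical names, a boxed `HashMap` stands in for whatever map the generator really uses, and the per-column arrays are assumed to already hold int ids.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a fixed-size int array key (not Pinot's FixedIntArray).
final class IntArrayKey {
  final int[] values;

  IntArrayKey(int[] values) {
    this.values = values;
  }

  // Copies the backing array so the stored key is independent of the flyweight.
  IntArrayKey copy() {
    return new IntArrayKey(values.clone());
  }

  @Override
  public boolean equals(Object o) {
    return o instanceof IntArrayKey && Arrays.equals(values, ((IntArrayKey) o).values);
  }

  @Override
  public int hashCode() {
    return Arrays.hashCode(values);
  }
}

final class FlyweightGroupKeys {
  private final Map<IntArrayKey, Integer> groupIds = new HashMap<>();

  // columns[c][row] is the id of column c at the given row (column-major input).
  int[] assign(int[][] columns, int numRows) {
    int[] result = new int[numRows];
    int[] scratch = new int[columns.length];
    // One key object per block; its backing array is overwritten for every row.
    IntArrayKey flyweight = new IntArrayKey(scratch);
    for (int row = 0; row < numRows; row++) {
      for (int c = 0; c < columns.length; c++) {
        scratch[c] = columns[c][row];
      }
      Integer id = groupIds.get(flyweight);
      if (id == null) {
        id = groupIds.size();
        // Allocate a copy only the first time this key is seen.
        groupIds.put(flyweight.copy(), id);
      }
      result[row] = id;
    }
    return result;
  }

  int numGroups() {
    return groupIds.size();
  }
}
```

In this sketch the number of key allocations scales with the number of distinct groups rather than the number of rows, which is the effect the allocation figures above are measuring.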
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org
[GitHub] [pinot] richardstartin commented on pull request #8195: No dictionary group by perf
Posted by GitBox <gi...@apache.org>.
richardstartin commented on pull request #8195:
URL: https://github.com/apache/pinot/pull/8195#issuecomment-1040587358
> This PR definitely improves the performance by reducing the unnecessary allocations. My concern is about whether adding the branches can hurt the performance comparing to reusing the `keys` (even though the overall performance is still much better because of the reduced allocations). If we are confident that JVM can optimize the branches so that the overhead is negligible, then it is good to go.
I'm not too concerned about the branches that don't get pruned because they are predictable (they go the same way for the entire block), but I think we could have the best of both worlds with code generation: if we can generate a tuple, and code to lazily materialise it, for each shape we see in queries (I doubt there would be more than 20 patterns in a realistic deployment), we can have both a low memory footprint and no conditionals.
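To make the trade-off concrete, here is a hand-written sketch contrasting a generic loop, whose per-value switch always takes the same arm for a given column, with what code specialised for a single shape might look like. Everything here is hypothetical (`ShapeSketch`, `Kind`, `idOf`, and the choice of the shape raw int followed by dictionary id), and it does not attempt the lazy materialisation mentioned above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ShapeSketch {
  // Hypothetical column kinds; a real generator would distinguish more stored types.
  enum Kind { DICT_ID, RAW_INT }

  // Hypothetical on-the-fly dictionary: dense ids in order of first appearance.
  static int idOf(Map<Integer, Integer> dict, int rawValue) {
    Integer id = dict.get(rawValue);
    if (id == null) {
      id = dict.size();
      dict.put(rawValue, id);
    }
    return id;
  }

  // Generic path: the switch runs for every value, but for a given column it
  // takes the same arm on every row of the block, so it is easy to predict.
  static int[] keysGeneric(Kind[] kinds, int[][] columns, int numRows) {
    List<Map<Integer, Integer>> dicts = new ArrayList<>();
    for (int c = 0; c < kinds.length; c++) {
      dicts.add(new HashMap<>());
    }
    int[] keys = new int[numRows * kinds.length];
    for (int row = 0; row < numRows; row++) {
      for (int c = 0; c < kinds.length; c++) {
        int value = columns[c][row];
        switch (kinds[c]) {
          case DICT_ID:
            keys[row * kinds.length + c] = value;
            break;
          case RAW_INT:
            keys[row * kinds.length + c] = idOf(dicts.get(c), value);
            break;
        }
      }
    }
    return keys;
  }

  // Specialised for the shape (RAW_INT, DICT_ID): no conditionals on column kind.
  static int[] keysRawIntThenDictId(int[] raw, int[] dictIds, int numRows) {
    Map<Integer, Integer> dict = new HashMap<>();
    int[] keys = new int[numRows * 2];
    for (int row = 0; row < numRows; row++) {
      keys[2 * row] = idOf(dict, raw[row]);
      keys[2 * row + 1] = dictIds[row];
    }
    return keys;
  }
}
```

Both methods produce the same row-major keys for the same input; the specialised one simply has the column kinds baked in, which is the property generated code would aim for.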
[GitHub] [pinot] richardstartin commented on a change in pull request #8195: No dictionary group by perf
Posted by GitBox <gi...@apache.org>.
richardstartin commented on a change in pull request #8195:
URL: https://github.com/apache/pinot/pull/8195#discussion_r806367799
##########
File path: pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/groupby/NoDictionaryMultiColumnGroupKeyGenerator.java
##########
@@ -92,60 +92,62 @@ public int getGlobalGroupKeyUpperBound() {
   @Override
   public void generateKeysForBlock(TransformBlock transformBlock, int[] groupKeys) {
     int numDocs = transformBlock.getNumDocs();
-    int[][] keys = new int[numDocs][_numGroupByExpressions];
+    Object[] values = new Object[_numGroupByExpressions];
     for (int i = 0; i < _numGroupByExpressions; i++) {
       BlockValSet blockValSet = transformBlock.getBlockValueSet(_groupByExpressions[i]);
       if (_dictionaries[i] != null) {
-        int[] dictIds = blockValSet.getDictionaryIdsSV();
-        for (int j = 0; j < numDocs; j++) {
-          keys[j][i] = dictIds[j];
-        }
+        values[i] = blockValSet.getDictionaryIdsSV();
       } else {
-        ValueToIdMap onTheFlyDictionary = _onTheFlyDictionaries[i];
         switch (_storedTypes[i]) {
           case INT:
-            int[] intValues = blockValSet.getIntValuesSV();
-            for (int j = 0; j < numDocs; j++) {
-              keys[j][i] = onTheFlyDictionary.put(intValues[j]);
-            }
+            values[i] = blockValSet.getIntValuesSV();
             break;
           case LONG:
-            long[] longValues = blockValSet.getLongValuesSV();
-            for (int j = 0; j < numDocs; j++) {
-              keys[j][i] = onTheFlyDictionary.put(longValues[j]);
-            }
+            values[i] = blockValSet.getLongValuesSV();
             break;
           case FLOAT:
-            float[] floatValues = blockValSet.getFloatValuesSV();
-            for (int j = 0; j < numDocs; j++) {
-              keys[j][i] = onTheFlyDictionary.put(floatValues[j]);
-            }
+            values[i] = blockValSet.getFloatValuesSV();
             break;
           case DOUBLE:
-            double[] doubleValues = blockValSet.getDoubleValuesSV();
-            for (int j = 0; j < numDocs; j++) {
-              keys[j][i] = onTheFlyDictionary.put(doubleValues[j]);
-            }
+            values[i] = blockValSet.getDoubleValuesSV();
             break;
           case STRING:
-            String[] stringValues = blockValSet.getStringValuesSV();
-            for (int j = 0; j < numDocs; j++) {
-              keys[j][i] = onTheFlyDictionary.put(stringValues[j]);
-            }
+            values[i] = blockValSet.getStringValuesSV();
             break;
           case BYTES:
-            byte[][] bytesValues = blockValSet.getBytesValuesSV();
-            for (int j = 0; j < numDocs; j++) {
-              keys[j][i] = onTheFlyDictionary.put(new ByteArray(bytesValues[j]));
-            }
+            values[i] = blockValSet.getBytesValuesSV();
             break;
           default:
             throw new IllegalArgumentException("Illegal data type for no-dictionary key generator: " + _storedTypes[i]);
         }
       }
     }
-    for (int i = 0; i < numDocs; i++) {
-      groupKeys[i] = getGroupIdForKey(new FixedIntArray(keys[i]));
+    int[] keyValues = new int[_numGroupByExpressions];
+    // note that we are mutating its backing array for memory efficiency
+    FixedIntArray flyweightKey = new FixedIntArray(keyValues);
+    for (int row = 0; row < numDocs; row++) {
Review comment:
I posted benchmark results. Please read them.
##########
File path: pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/groupby/NoDictionaryMultiColumnGroupKeyGenerator.java
##########
@@ -306,7 +325,7 @@ private int getGroupIdForKey(FixedIntArray keyList) {
     if (groupId == INVALID_ID) {
       if (_numGroups < _globalGroupIdUpperBound) {
         groupId = _numGroups;
-        _groupKeyMap.put(keyList, _numGroups++);
+        _groupKeyMap.put(keyList.clone(), _numGroups++);
Review comment:
Good catch. Note that this does not affect the baseline benchmark results, which were taken at 2fa525253a62108dbc91874c77e112eb349337d9.
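As general background on why a reused, mutable key has to be copied before it is stored in a hash map, here is a small self-contained illustration. It is not taken from the PR and makes no claim about which call site should do the copying; `MutableKeyPitfall` and its nested `Key` are hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

final class MutableKeyPitfall {
  // Hypothetical array-backed key with value semantics (not Pinot's FixedIntArray).
  static final class Key {
    final int[] values;

    Key(int[] values) {
      this.values = values;
    }

    Key copy() {
      return new Key(values.clone());
    }

    @Override
    public boolean equals(Object o) {
      return o instanceof Key && Arrays.equals(values, ((Key) o).values);
    }

    @Override
    public int hashCode() {
      return Arrays.hashCode(values);
    }
  }

  // Inserts each input value through one reused key, then returns the first
  // element of every key held by the map, sorted ascending.
  static int[] storedKeys(int[] input, boolean copyOnInsert) {
    Map<Key, Integer> map = new HashMap<>();
    int[] scratch = new int[1];
    Key flyweight = new Key(scratch);
    for (int value : input) {
      scratch[0] = value;
      if (!map.containsKey(flyweight)) {
        map.put(copyOnInsert ? flyweight.copy() : flyweight, map.size());
      }
    }
    int[] stored = new int[map.size()];
    int i = 0;
    for (Key key : map.keySet()) {
      stored[i++] = key.values[0];
    }
    Arrays.sort(stored);
    return stored;
  }
}
```

For the input 1, 2, 3, copying on insert leaves keys that read back as 1, 2, 3. Without the copy, every entry holds the same mutable object, so all three stored keys read back as 3, the last value written.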
[GitHub] [pinot] codecov-commenter edited a comment on pull request #8195: No dictionary group by perf
Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #8195:
URL: https://github.com/apache/pinot/pull/8195#issuecomment-1036547667
# [Codecov](https://codecov.io/gh/apache/pinot/pull/8195?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#8195](https://codecov.io/gh/apache/pinot/pull/8195?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (9936d55) into [master](https://codecov.io/gh/apache/pinot/commit/bad7106d5c714edf9d52b63dd4428d15cabd79c8?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (bad7106) will **decrease** coverage by `1.27%`.
> The diff coverage is `94.44%`.
[![Impacted file tree graph](https://codecov.io/gh/apache/pinot/pull/8195/graphs/tree.svg?width=650&height=150&src=pr&token=4ibza2ugkz&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/pinot/pull/8195?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
```diff
@@ Coverage Diff @@
## master #8195 +/- ##
============================================
- Coverage 71.33% 70.06% -1.28%
- Complexity 4308 4314 +6
============================================
Files 1623 1624 +1
Lines 84365 84883 +518
Branches 12657 12794 +137
============================================
- Hits 60183 59472 -711
- Misses 20050 21296 +1246
+ Partials 4132 4115 -17
```
| Flag | Coverage Δ | |
|---|---|---|
| integration1 | `28.73% <66.66%> (-0.11%)` | :arrow_down: |
| integration2 | `?` | |
| unittests1 | `67.45% <94.44%> (-0.43%)` | :arrow_down: |
| unittests2 | `14.14% <0.00%> (-0.08%)` | :arrow_down: |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/pinot/pull/8195?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...upby/NoDictionaryMultiColumnGroupKeyGenerator.java](https://codecov.io/gh/apache/pinot/pull/8195/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9xdWVyeS9hZ2dyZWdhdGlvbi9ncm91cGJ5L05vRGljdGlvbmFyeU11bHRpQ29sdW1uR3JvdXBLZXlHZW5lcmF0b3IuamF2YQ==) | `63.55% <94.28%> (+0.62%)` | :arrow_up: |
| [...java/org/apache/pinot/spi/utils/FixedIntArray.java](https://codecov.io/gh/apache/pinot/pull/8195/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvdXRpbHMvRml4ZWRJbnRBcnJheS5qYXZh) | `61.53% <100.00%> (+3.20%)` | :arrow_up: |
| [...t/core/plan/StreamingInstanceResponsePlanNode.java](https://codecov.io/gh/apache/pinot/pull/8195/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9wbGFuL1N0cmVhbWluZ0luc3RhbmNlUmVzcG9uc2VQbGFuTm9kZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...ore/operator/streaming/StreamingResponseUtils.java](https://codecov.io/gh/apache/pinot/pull/8195/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9vcGVyYXRvci9zdHJlYW1pbmcvU3RyZWFtaW5nUmVzcG9uc2VVdGlscy5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...ager/realtime/PeerSchemeSplitSegmentCommitter.java](https://codecov.io/gh/apache/pinot/pull/8195/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9kYXRhL21hbmFnZXIvcmVhbHRpbWUvUGVlclNjaGVtZVNwbGl0U2VnbWVudENvbW1pdHRlci5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...pache/pinot/common/utils/grpc/GrpcQueryClient.java](https://codecov.io/gh/apache/pinot/pull/8195/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vdXRpbHMvZ3JwYy9HcnBjUXVlcnlDbGllbnQuamF2YQ==) | `0.00% <0.00%> (-94.74%)` | :arrow_down: |
| [...he/pinot/core/plan/StreamingSelectionPlanNode.java](https://codecov.io/gh/apache/pinot/pull/8195/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9wbGFuL1N0cmVhbWluZ1NlbGVjdGlvblBsYW5Ob2RlLmphdmE=) | `0.00% <0.00%> (-88.89%)` | :arrow_down: |
| [...ator/streaming/StreamingSelectionOnlyOperator.java](https://codecov.io/gh/apache/pinot/pull/8195/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9vcGVyYXRvci9zdHJlYW1pbmcvU3RyZWFtaW5nU2VsZWN0aW9uT25seU9wZXJhdG9yLmphdmE=) | `0.00% <0.00%> (-87.81%)` | :arrow_down: |
| [...re/query/reduce/SelectionOnlyStreamingReducer.java](https://codecov.io/gh/apache/pinot/pull/8195/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9xdWVyeS9yZWR1Y2UvU2VsZWN0aW9uT25seVN0cmVhbWluZ1JlZHVjZXIuamF2YQ==) | `0.00% <0.00%> (-85.72%)` | :arrow_down: |
| [...oker/broker/BrokerServiceAutoDiscoveryFeature.java](https://codecov.io/gh/apache/pinot/pull/8195/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtYnJva2VyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9icm9rZXIvYnJva2VyL0Jyb2tlclNlcnZpY2VBdXRvRGlzY292ZXJ5RmVhdHVyZS5qYXZh) | `0.00% <0.00%> (-81.82%)` | :arrow_down: |
| ... and [100 more](https://codecov.io/gh/apache/pinot/pull/8195/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/pinot/pull/8195?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/pinot/pull/8195?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [bad7106...9936d55](https://codecov.io/gh/apache/pinot/pull/8195?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
[GitHub] [pinot] Jackie-Jiang commented on a change in pull request #8195: No dictionary group by perf
Posted by GitBox <gi...@apache.org>.
Jackie-Jiang commented on a change in pull request #8195:
URL: https://github.com/apache/pinot/pull/8195#discussion_r806355760
##########
File path: pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/groupby/NoDictionaryMultiColumnGroupKeyGenerator.java
##########
@@ -306,7 +325,7 @@ private int getGroupIdForKey(FixedIntArray keyList) {
     if (groupId == INVALID_ID) {
       if (_numGroups < _globalGroupIdUpperBound) {
         groupId = _numGroups;
-        _groupKeyMap.put(keyList, _numGroups++);
+        _groupKeyMap.put(keyList.clone(), _numGroups++);
Review comment:
This should be reverted?
##########
File path: pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/groupby/NoDictionaryMultiColumnGroupKeyGenerator.java
##########
@@ -92,60 +92,62 @@ public int getGlobalGroupKeyUpperBound() {
   @Override
   public void generateKeysForBlock(TransformBlock transformBlock, int[] groupKeys) {
     int numDocs = transformBlock.getNumDocs();
-    int[][] keys = new int[numDocs][_numGroupByExpressions];
+    Object[] values = new Object[_numGroupByExpressions];
     for (int i = 0; i < _numGroupByExpressions; i++) {
       BlockValSet blockValSet = transformBlock.getBlockValueSet(_groupByExpressions[i]);
       if (_dictionaries[i] != null) {
-        int[] dictIds = blockValSet.getDictionaryIdsSV();
-        for (int j = 0; j < numDocs; j++) {
-          keys[j][i] = dictIds[j];
-        }
+        values[i] = blockValSet.getDictionaryIdsSV();
       } else {
-        ValueToIdMap onTheFlyDictionary = _onTheFlyDictionaries[i];
         switch (_storedTypes[i]) {
           case INT:
-            int[] intValues = blockValSet.getIntValuesSV();
-            for (int j = 0; j < numDocs; j++) {
-              keys[j][i] = onTheFlyDictionary.put(intValues[j]);
-            }
+            values[i] = blockValSet.getIntValuesSV();
             break;
           case LONG:
-            long[] longValues = blockValSet.getLongValuesSV();
-            for (int j = 0; j < numDocs; j++) {
-              keys[j][i] = onTheFlyDictionary.put(longValues[j]);
-            }
+            values[i] = blockValSet.getLongValuesSV();
             break;
           case FLOAT:
-            float[] floatValues = blockValSet.getFloatValuesSV();
-            for (int j = 0; j < numDocs; j++) {
-              keys[j][i] = onTheFlyDictionary.put(floatValues[j]);
-            }
+            values[i] = blockValSet.getFloatValuesSV();
             break;
           case DOUBLE:
-            double[] doubleValues = blockValSet.getDoubleValuesSV();
-            for (int j = 0; j < numDocs; j++) {
-              keys[j][i] = onTheFlyDictionary.put(doubleValues[j]);
-            }
+            values[i] = blockValSet.getDoubleValuesSV();
             break;
           case STRING:
-            String[] stringValues = blockValSet.getStringValuesSV();
-            for (int j = 0; j < numDocs; j++) {
-              keys[j][i] = onTheFlyDictionary.put(stringValues[j]);
-            }
+            values[i] = blockValSet.getStringValuesSV();
             break;
           case BYTES:
-            byte[][] bytesValues = blockValSet.getBytesValuesSV();
-            for (int j = 0; j < numDocs; j++) {
-              keys[j][i] = onTheFlyDictionary.put(new ByteArray(bytesValues[j]));
-            }
+            values[i] = blockValSet.getBytesValuesSV();
             break;
           default:
             throw new IllegalArgumentException("Illegal data type for no-dictionary key generator: " + _storedTypes[i]);
         }
       }
     }
-    for (int i = 0; i < numDocs; i++) {
-      groupKeys[i] = getGroupIdForKey(new FixedIntArray(keys[i]));
+    int[] keyValues = new int[_numGroupByExpressions];
+    // note that we are mutating its backing array for memory efficiency
+    FixedIntArray flyweightKey = new FixedIntArray(keyValues);
+    for (int row = 0; row < numDocs; row++) {
Review comment:
Is this faster than the original way of processing values? For the new approach we need to do a lot of if checks on a per value basis, which can potentially hurt the performance. If we want to avoid allocating the `keys` multiple times, we can actually reuse the `keys`.
[GitHub] [pinot] richardstartin merged pull request #8195: No dictionary group by perf
Posted by GitBox <gi...@apache.org>.
richardstartin merged pull request #8195:
URL: https://github.com/apache/pinot/pull/8195
[GitHub] [pinot] codecov-commenter commented on pull request #8195: No dictionary group by perf
Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on pull request #8195:
URL: https://github.com/apache/pinot/pull/8195#issuecomment-1036547667
# [Codecov](https://codecov.io/gh/apache/pinot/pull/8195?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#8195](https://codecov.io/gh/apache/pinot/pull/8195?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (6f4d2b6) into [master](https://codecov.io/gh/apache/pinot/commit/5fa4737072d7b4e38164f44bdf563bbf3d6a62ba?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (5fa4737) will **decrease** coverage by `8.48%`.
> The diff coverage is `94.59%`.
[![Impacted file tree graph](https://codecov.io/gh/apache/pinot/pull/8195/graphs/tree.svg?width=650&height=150&src=pr&token=4ibza2ugkz&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/pinot/pull/8195?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
```diff
@@ Coverage Diff @@
## master #8195 +/- ##
============================================
- Coverage 71.41% 62.92% -8.49%
+ Complexity 4302 4227 -75
============================================
Files 1623 1612 -11
Lines 84312 84023 -289
Branches 12639 12622 -17
============================================
- Hits 60213 52874 -7339
- Misses 19974 27268 +7294
+ Partials 4125 3881 -244
```
| Flag | Coverage Δ | |
|---|---|---|
| integration1 | `28.80% <64.86%> (-0.10%)` | :arrow_down: |
| integration2 | `?` | |
| unittests1 | `67.90% <94.59%> (+0.01%)` | :arrow_up: |
| unittests2 | `?` | |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/pinot/pull/8195?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...upby/NoDictionaryMultiColumnGroupKeyGenerator.java](https://codecov.io/gh/apache/pinot/pull/8195/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9xdWVyeS9hZ2dyZWdhdGlvbi9ncm91cGJ5L05vRGljdGlvbmFyeU11bHRpQ29sdW1uR3JvdXBLZXlHZW5lcmF0b3IuamF2YQ==) | `63.55% <94.44%> (+0.62%)` | :arrow_up: |
| [...java/org/apache/pinot/spi/utils/FixedIntArray.java](https://codecov.io/gh/apache/pinot/pull/8195/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvdXRpbHMvRml4ZWRJbnRBcnJheS5qYXZh) | `61.53% <100.00%> (+3.20%)` | :arrow_up: |
| [...t/core/plan/StreamingInstanceResponsePlanNode.java](https://codecov.io/gh/apache/pinot/pull/8195/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9wbGFuL1N0cmVhbWluZ0luc3RhbmNlUmVzcG9uc2VQbGFuTm9kZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...pinot/controller/recommender/io/ConfigManager.java](https://codecov.io/gh/apache/pinot/pull/8195/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29udHJvbGxlci9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29udHJvbGxlci9yZWNvbW1lbmRlci9pby9Db25maWdNYW5hZ2VyLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...ore/operator/streaming/StreamingResponseUtils.java](https://codecov.io/gh/apache/pinot/pull/8195/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9vcGVyYXRvci9zdHJlYW1pbmcvU3RyZWFtaW5nUmVzcG9uc2VVdGlscy5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...troller/recommender/io/metadata/FieldMetadata.java](https://codecov.io/gh/apache/pinot/pull/8195/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29udHJvbGxlci9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29udHJvbGxlci9yZWNvbW1lbmRlci9pby9tZXRhZGF0YS9GaWVsZE1ldGFkYXRhLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...roller/recommender/rules/impl/BloomFilterRule.java](https://codecov.io/gh/apache/pinot/pull/8195/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29udHJvbGxlci9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29udHJvbGxlci9yZWNvbW1lbmRlci9ydWxlcy9pbXBsL0Jsb29tRmlsdGVyUnVsZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...oller/api/resources/PinotControllerAppConfigs.java](https://codecov.io/gh/apache/pinot/pull/8195/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29udHJvbGxlci9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29udHJvbGxlci9hcGkvcmVzb3VyY2VzL1Bpbm90Q29udHJvbGxlckFwcENvbmZpZ3MuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...ler/recommender/data/generator/BytesGenerator.java](https://codecov.io/gh/apache/pinot/pull/8195/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29udHJvbGxlci9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29udHJvbGxlci9yZWNvbW1lbmRlci9kYXRhL2dlbmVyYXRvci9CeXRlc0dlbmVyYXRvci5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...ager/realtime/PeerSchemeSplitSegmentCommitter.java](https://codecov.io/gh/apache/pinot/pull/8195/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9kYXRhL21hbmFnZXIvcmVhbHRpbWUvUGVlclNjaGVtZVNwbGl0U2VnbWVudENvbW1pdHRlci5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| ... and [290 more](https://codecov.io/gh/apache/pinot/pull/8195/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/pinot/pull/8195?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/pinot/pull/8195?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [5fa4737...6f4d2b6](https://codecov.io/gh/apache/pinot/pull/8195?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org
[GitHub] [pinot] richardstartin commented on pull request #8195: No dictionary group by perf
Posted by GitBox <gi...@apache.org>.
richardstartin commented on pull request #8195:
URL: https://github.com/apache/pinot/pull/8195#issuecomment-1040135103
Added another benchmark with a group by on a raw string column. The effect is mostly drowned out by excessive String allocation (no interning 😢), but this PR still offers a significant improvement. Note that the majority of the conditional branches in the new code are pruned by tiered compilation and are entirely predictable.
Before
```
Benchmark (_numRows) (_query) (_scenario) Mode Cnt Score Error Units
BenchmarkQueries.query 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL EXP(0.001) avgt 5 401.449 ± 40.560 ms/op
BenchmarkQueries.query:·gc.alloc.rate 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL EXP(0.001) avgt 5 594.902 ± 1280.361 MB/sec
BenchmarkQueries.query:·gc.alloc.rate.norm 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL EXP(0.001) avgt 5 352657742.400 ± 758730386.850 B/op
BenchmarkQueries.query:·gc.churn.G1_Eden_Space 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL EXP(0.001) avgt 5 816.002 ± 1783.388 MB/sec
BenchmarkQueries.query:·gc.churn.G1_Eden_Space.norm 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL EXP(0.001) avgt 5 490244232.533 ± 1075546345.417 B/op
BenchmarkQueries.query:·gc.churn.G1_Old_Gen 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL EXP(0.001) avgt 5 13.856 ± 50.128 MB/sec
BenchmarkQueries.query:·gc.churn.G1_Old_Gen.norm 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL EXP(0.001) avgt 5 8247320.000 ± 29878654.316 B/op
BenchmarkQueries.query:·gc.churn.G1_Survivor_Space 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL EXP(0.001) avgt 5 1.813 ± 15.611 MB/sec
BenchmarkQueries.query:·gc.churn.G1_Survivor_Space.norm 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL EXP(0.001) avgt 5 1118481.067 ± 9630459.297 B/op
BenchmarkQueries.query:·gc.count 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL EXP(0.001) avgt 5 4.000 counts
BenchmarkQueries.query:·gc.time 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL EXP(0.001) avgt 5 259.000 ms
BenchmarkQueries.query 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL EXP(0.5) avgt 5 219.553 ± 2.789 ms/op
BenchmarkQueries.query:·gc.alloc.rate 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL EXP(0.5) avgt 5 947.525 ± 2038.543 MB/sec
BenchmarkQueries.query:·gc.alloc.rate.norm 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL EXP(0.5) avgt 5 318633695.680 ± 685512352.413 B/op
BenchmarkQueries.query:·gc.churn.G1_Eden_Space 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL EXP(0.5) avgt 5 1342.334 ± 1944.947 MB/sec
BenchmarkQueries.query:·gc.churn.G1_Eden_Space.norm 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL EXP(0.5) avgt 5 451726540.800 ± 656852332.622 B/op
BenchmarkQueries.query:·gc.churn.G1_Old_Gen 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL EXP(0.5) avgt 5 0.056 ± 0.383 MB/sec
BenchmarkQueries.query:·gc.churn.G1_Old_Gen.norm 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL EXP(0.5) avgt 5 18864.320 ± 128684.949 B/op
BenchmarkQueries.query:·gc.count 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL EXP(0.5) avgt 5 6.000 counts
BenchmarkQueries.query:·gc.time 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL EXP(0.5) avgt 5 20.000 ms
BenchmarkQueries.query 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL EXP(0.999) avgt 5 211.974 ± 4.234 ms/op
BenchmarkQueries.query:·gc.alloc.rate 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL EXP(0.999) avgt 5 970.031 ± 2086.994 MB/sec
BenchmarkQueries.query:·gc.alloc.rate.norm 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL EXP(0.999) avgt 5 318102240.640 ± 684372243.782 B/op
BenchmarkQueries.query:·gc.churn.G1_Eden_Space 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL EXP(0.999) avgt 5 1127.183 ± 696.552 MB/sec
BenchmarkQueries.query:·gc.churn.G1_Eden_Space.norm 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL EXP(0.999) avgt 5 370147328.000 ± 229406341.423 B/op
BenchmarkQueries.query:·gc.churn.G1_Old_Gen 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL EXP(0.999) avgt 5 0.175 ± 0.974 MB/sec
BenchmarkQueries.query:·gc.churn.G1_Old_Gen.norm 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL EXP(0.999) avgt 5 57655.680 ± 321478.066 B/op
BenchmarkQueries.query:·gc.churn.G1_Survivor_Space 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL EXP(0.999) avgt 5 0.255 ± 2.194 MB/sec
BenchmarkQueries.query:·gc.churn.G1_Survivor_Space.norm 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL EXP(0.999) avgt 5 83886.080 ± 722284.447 B/op
BenchmarkQueries.query:·gc.count 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL EXP(0.999) avgt 5 6.000 counts
BenchmarkQueries.query:·gc.time 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL EXP(0.999) avgt 5 14.000 ms
```
After
```
Benchmark (_numRows) (_query) (_scenario) Mode Cnt Score Error Units
BenchmarkQueries.query 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL EXP(0.001) avgt 5 347.712 ± 27.737 ms/op
BenchmarkQueries.query:·gc.alloc.rate 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL EXP(0.001) avgt 5 434.512 ± 935.011 MB/sec
BenchmarkQueries.query:·gc.alloc.rate.norm 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL EXP(0.001) avgt 5 235552355.733 ± 506651891.452 B/op
BenchmarkQueries.query:·gc.churn.G1_Eden_Space 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL EXP(0.001) avgt 5 446.016 ± 1000.169 MB/sec
BenchmarkQueries.query:·gc.churn.G1_Eden_Space.norm 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL EXP(0.001) avgt 5 241731720.533 ± 541560831.721 B/op
BenchmarkQueries.query:·gc.churn.G1_Old_Gen 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL EXP(0.001) avgt 5 11.443 ± 44.382 MB/sec
BenchmarkQueries.query:·gc.churn.G1_Old_Gen.norm 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL EXP(0.001) avgt 5 6151645.867 ± 23774788.552 B/op
BenchmarkQueries.query:·gc.churn.G1_Survivor_Space 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL EXP(0.001) avgt 5 1.013 ± 8.723 MB/sec
BenchmarkQueries.query:·gc.churn.G1_Survivor_Space.norm 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL EXP(0.001) avgt 5 559240.533 ± 4815229.649 B/op
BenchmarkQueries.query:·gc.count 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL EXP(0.001) avgt 5 4.000 counts
BenchmarkQueries.query:·gc.time 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL EXP(0.001) avgt 5 129.000 ms
BenchmarkQueries.query 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL EXP(0.5) avgt 5 185.713 ± 4.709 ms/op
BenchmarkQueries.query:·gc.alloc.rate 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL EXP(0.5) avgt 5 684.654 ± 1472.475 MB/sec
BenchmarkQueries.query:·gc.alloc.rate.norm 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL EXP(0.5) avgt 5 194064742.933 ± 417372573.618 B/op
BenchmarkQueries.query:·gc.churn.G1_Eden_Space 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL EXP(0.5) avgt 5 742.663 ± 2610.572 MB/sec
BenchmarkQueries.query:·gc.churn.G1_Eden_Space.norm 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL EXP(0.5) avgt 5 210449203.200 ± 739758601.746 B/op
BenchmarkQueries.query:·gc.churn.G1_Old_Gen 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL EXP(0.5) avgt 5 0.022 ± 0.186 MB/sec
BenchmarkQueries.query:·gc.churn.G1_Old_Gen.norm 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL EXP(0.5) avgt 5 6095.467 ± 52483.806 B/op
BenchmarkQueries.query:·gc.count 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL EXP(0.5) avgt 5 3.000 counts
BenchmarkQueries.query:·gc.time 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL EXP(0.5) avgt 5 7.000 ms
BenchmarkQueries.query 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL EXP(0.999) avgt 5 183.992 ± 7.627 ms/op
BenchmarkQueries.query:·gc.alloc.rate 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL EXP(0.999) avgt 5 686.789 ± 1477.255 MB/sec
BenchmarkQueries.query:·gc.alloc.rate.norm 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL EXP(0.999) avgt 5 193360184.800 ± 415859888.283 B/op
BenchmarkQueries.query:·gc.churn.G1_Eden_Space 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL EXP(0.999) avgt 5 919.579 ± 1979.891 MB/sec
BenchmarkQueries.query:·gc.churn.G1_Eden_Space.norm 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL EXP(0.999) avgt 5 259487607.467 ± 558628861.127 B/op
BenchmarkQueries.query:·gc.churn.G1_Old_Gen 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL EXP(0.999) avgt 5 0.032 ± 0.190 MB/sec
BenchmarkQueries.query:·gc.churn.G1_Old_Gen.norm 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL EXP(0.999) avgt 5 8994.933 ± 53455.434 B/op
BenchmarkQueries.query:·gc.count 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL EXP(0.999) avgt 5 4.000 counts
BenchmarkQueries.query:·gc.time 1500000 SELECT RAW_STRING_COL,RAW_INT_COL,INT_COL,COUNT(*) FROM MyTable GROUP BY RAW_STRING_COL,RAW_INT_COL,INT_COL EXP(0.999) avgt 5 22.000 ms
```
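To make the before/after comparison easier to read, here is a small sketch that computes the relative change for the headline rows. The class and method names (`BenchmarkDelta`, `percentChange`) are made up for illustration and are not part of Pinot or JMH; the numbers are copied from the `BenchmarkQueries.query` and `gc.alloc.rate.norm` rows of the two tables above.

```java
public class BenchmarkDelta {
    /** Relative change from before to after, in percent (negative means a reduction). */
    public static double percentChange(double before, double after) {
        return (after - before) / before * 100.0;
    }

    public static void main(String[] args) {
        String[] scenarios = {"EXP(0.001)", "EXP(0.5)", "EXP(0.999)"};
        // BenchmarkQueries.query scores in ms/op: {before, after}
        double[][] timeMs = {{401.449, 347.712}, {219.553, 185.713}, {211.974, 183.992}};
        // gc.alloc.rate.norm scores in B/op: {before, after}
        double[][] allocBytes = {
            {352657742.400, 235552355.733},
            {318633695.680, 194064742.933},
            {318102240.640, 193360184.800}
        };
        for (int i = 0; i < scenarios.length; i++) {
            // One line per scenario: change in mean time and in bytes allocated per op
            System.out.printf("%-10s time %+.1f%%  alloc/op %+.1f%%%n", scenarios[i],
                percentChange(timeMs[i][0], timeMs[i][1]),
                percentChange(allocBytes[i][0], allocBytes[i][1]));
        }
    }
}
```

On these scores the mean time drops by roughly 13 to 15 percent and normalized allocation by roughly 33 to 39 percent. The reported error intervals on the allocation rows are wider than the scores themselves, so the allocation figures are best treated as indicative only.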
[GitHub] [pinot] codecov-commenter edited a comment on pull request #8195: No dictionary group by perf
Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #8195:
URL: https://github.com/apache/pinot/pull/8195#issuecomment-1036547667
# [Codecov](https://codecov.io/gh/apache/pinot/pull/8195?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#8195](https://codecov.io/gh/apache/pinot/pull/8195?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (9936d55) into [master](https://codecov.io/gh/apache/pinot/commit/bad7106d5c714edf9d52b63dd4428d15cabd79c8?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (bad7106) will **decrease** coverage by `1.26%`.
> The diff coverage is `94.44%`.
[![Impacted file tree graph](https://codecov.io/gh/apache/pinot/pull/8195/graphs/tree.svg?width=650&height=150&src=pr&token=4ibza2ugkz&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/pinot/pull/8195?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
```diff
@@ Coverage Diff @@
## master #8195 +/- ##
============================================
- Coverage 71.33% 70.06% -1.27%
- Complexity 4308 4314 +6
============================================
Files 1623 1624 +1
Lines 84365 84883 +518
Branches 12657 12794 +137
============================================
- Hits 60183 59476 -707
- Misses 20050 21292 +1242
+ Partials 4132 4115 -17
```
| Flag | Coverage Δ | |
|---|---|---|
| integration1 | `28.73% <66.66%> (-0.11%)` | :arrow_down: |
| integration2 | `?` | |
| unittests1 | `67.45% <94.44%> (-0.43%)` | :arrow_down: |
| unittests2 | `14.14% <0.00%> (-0.07%)` | :arrow_down: |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/pinot/pull/8195?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...upby/NoDictionaryMultiColumnGroupKeyGenerator.java](https://codecov.io/gh/apache/pinot/pull/8195/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9xdWVyeS9hZ2dyZWdhdGlvbi9ncm91cGJ5L05vRGljdGlvbmFyeU11bHRpQ29sdW1uR3JvdXBLZXlHZW5lcmF0b3IuamF2YQ==) | `63.55% <94.28%> (+0.62%)` | :arrow_up: |
| [...java/org/apache/pinot/spi/utils/FixedIntArray.java](https://codecov.io/gh/apache/pinot/pull/8195/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvdXRpbHMvRml4ZWRJbnRBcnJheS5qYXZh) | `61.53% <100.00%> (+3.20%)` | :arrow_up: |
| [...t/core/plan/StreamingInstanceResponsePlanNode.java](https://codecov.io/gh/apache/pinot/pull/8195/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9wbGFuL1N0cmVhbWluZ0luc3RhbmNlUmVzcG9uc2VQbGFuTm9kZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...ore/operator/streaming/StreamingResponseUtils.java](https://codecov.io/gh/apache/pinot/pull/8195/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9vcGVyYXRvci9zdHJlYW1pbmcvU3RyZWFtaW5nUmVzcG9uc2VVdGlscy5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...ager/realtime/PeerSchemeSplitSegmentCommitter.java](https://codecov.io/gh/apache/pinot/pull/8195/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9kYXRhL21hbmFnZXIvcmVhbHRpbWUvUGVlclNjaGVtZVNwbGl0U2VnbWVudENvbW1pdHRlci5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...pache/pinot/common/utils/grpc/GrpcQueryClient.java](https://codecov.io/gh/apache/pinot/pull/8195/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vdXRpbHMvZ3JwYy9HcnBjUXVlcnlDbGllbnQuamF2YQ==) | `0.00% <0.00%> (-94.74%)` | :arrow_down: |
| [...he/pinot/core/plan/StreamingSelectionPlanNode.java](https://codecov.io/gh/apache/pinot/pull/8195/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9wbGFuL1N0cmVhbWluZ1NlbGVjdGlvblBsYW5Ob2RlLmphdmE=) | `0.00% <0.00%> (-88.89%)` | :arrow_down: |
| [...ator/streaming/StreamingSelectionOnlyOperator.java](https://codecov.io/gh/apache/pinot/pull/8195/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9vcGVyYXRvci9zdHJlYW1pbmcvU3RyZWFtaW5nU2VsZWN0aW9uT25seU9wZXJhdG9yLmphdmE=) | `0.00% <0.00%> (-87.81%)` | :arrow_down: |
| [...re/query/reduce/SelectionOnlyStreamingReducer.java](https://codecov.io/gh/apache/pinot/pull/8195/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9xdWVyeS9yZWR1Y2UvU2VsZWN0aW9uT25seVN0cmVhbWluZ1JlZHVjZXIuamF2YQ==) | `0.00% <0.00%> (-85.72%)` | :arrow_down: |
| [...oker/broker/BrokerServiceAutoDiscoveryFeature.java](https://codecov.io/gh/apache/pinot/pull/8195/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtYnJva2VyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9icm9rZXIvYnJva2VyL0Jyb2tlclNlcnZpY2VBdXRvRGlzY292ZXJ5RmVhdHVyZS5qYXZh) | `0.00% <0.00%> (-81.82%)` | :arrow_down: |
| ... and [100 more](https://codecov.io/gh/apache/pinot/pull/8195/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/pinot/pull/8195?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/pinot/pull/8195?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [bad7106...9936d55](https://codecov.io/gh/apache/pinot/pull/8195?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
[GitHub] [pinot] codecov-commenter edited a comment on pull request #8195: No dictionary group by perf
Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #8195:
URL: https://github.com/apache/pinot/pull/8195#issuecomment-1036547667
# [Codecov](https://codecov.io/gh/apache/pinot/pull/8195?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#8195](https://codecov.io/gh/apache/pinot/pull/8195?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (6f4d2b6) into [master](https://codecov.io/gh/apache/pinot/commit/5fa4737072d7b4e38164f44bdf563bbf3d6a62ba?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (5fa4737) will **decrease** coverage by `7.32%`.
> The diff coverage is `94.59%`.
[![Impacted file tree graph](https://codecov.io/gh/apache/pinot/pull/8195/graphs/tree.svg?width=650&height=150&src=pr&token=4ibza2ugkz&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/pinot/pull/8195?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
```diff
@@ Coverage Diff @@
## master #8195 +/- ##
============================================
- Coverage 71.41% 64.09% -7.33%
+ Complexity 4302 4227 -75
============================================
Files 1623 1612 -11
Lines 84312 84023 -289
Branches 12639 12622 -17
============================================
- Hits 60213 53856 -6357
- Misses 19974 26249 +6275
+ Partials 4125 3918 -207
```
| Flag | Coverage Δ | |
|---|---|---|
| integration1 | `28.80% <64.86%> (-0.10%)` | :arrow_down: |
| integration2 | `27.60% <64.86%> (-0.10%)` | :arrow_down: |
| unittests1 | `67.90% <94.59%> (+0.01%)` | :arrow_up: |
| unittests2 | `?` | |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/pinot/pull/8195?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...upby/NoDictionaryMultiColumnGroupKeyGenerator.java](https://codecov.io/gh/apache/pinot/pull/8195/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9xdWVyeS9hZ2dyZWdhdGlvbi9ncm91cGJ5L05vRGljdGlvbmFyeU11bHRpQ29sdW1uR3JvdXBLZXlHZW5lcmF0b3IuamF2YQ==) | `63.55% <94.44%> (+0.62%)` | :arrow_up: |
| [...java/org/apache/pinot/spi/utils/FixedIntArray.java](https://codecov.io/gh/apache/pinot/pull/8195/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvdXRpbHMvRml4ZWRJbnRBcnJheS5qYXZh) | `61.53% <100.00%> (+3.20%)` | :arrow_up: |
| [...pinot/controller/recommender/io/ConfigManager.java](https://codecov.io/gh/apache/pinot/pull/8195/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29udHJvbGxlci9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29udHJvbGxlci9yZWNvbW1lbmRlci9pby9Db25maWdNYW5hZ2VyLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...troller/recommender/io/metadata/FieldMetadata.java](https://codecov.io/gh/apache/pinot/pull/8195/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29udHJvbGxlci9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29udHJvbGxlci9yZWNvbW1lbmRlci9pby9tZXRhZGF0YS9GaWVsZE1ldGFkYXRhLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...roller/recommender/rules/impl/BloomFilterRule.java](https://codecov.io/gh/apache/pinot/pull/8195/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29udHJvbGxlci9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29udHJvbGxlci9yZWNvbW1lbmRlci9ydWxlcy9pbXBsL0Jsb29tRmlsdGVyUnVsZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...oller/api/resources/PinotControllerAppConfigs.java](https://codecov.io/gh/apache/pinot/pull/8195/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29udHJvbGxlci9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29udHJvbGxlci9hcGkvcmVzb3VyY2VzL1Bpbm90Q29udHJvbGxlckFwcENvbmZpZ3MuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...ler/recommender/data/generator/BytesGenerator.java](https://codecov.io/gh/apache/pinot/pull/8195/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29udHJvbGxlci9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29udHJvbGxlci9yZWNvbW1lbmRlci9kYXRhL2dlbmVyYXRvci9CeXRlc0dlbmVyYXRvci5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...er/recommender/io/metadata/SchemaWithMetaData.java](https://codecov.io/gh/apache/pinot/pull/8195/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29udHJvbGxlci9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29udHJvbGxlci9yZWNvbW1lbmRlci9pby9tZXRhZGF0YS9TY2hlbWFXaXRoTWV0YURhdGEuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...r/recommender/rules/impl/AggregateMetricsRule.java](https://codecov.io/gh/apache/pinot/pull/8195/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29udHJvbGxlci9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29udHJvbGxlci9yZWNvbW1lbmRlci9ydWxlcy9pbXBsL0FnZ3JlZ2F0ZU1ldHJpY3NSdWxlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../recommender/exceptions/InvalidInputException.java](https://codecov.io/gh/apache/pinot/pull/8195/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29udHJvbGxlci9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29udHJvbGxlci9yZWNvbW1lbmRlci9leGNlcHRpb25zL0ludmFsaWRJbnB1dEV4Y2VwdGlvbi5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| ... and [224 more](https://codecov.io/gh/apache/pinot/pull/8195/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/pinot/pull/8195?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/pinot/pull/8195?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [5fa4737...6f4d2b6](https://codecov.io/gh/apache/pinot/pull/8195?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org
[GitHub] [pinot] richardstartin merged pull request #8195: No dictionary group by perf
Posted by GitBox <gi...@apache.org>.
richardstartin merged pull request #8195:
URL: https://github.com/apache/pinot/pull/8195
[GitHub] [pinot] richardstartin commented on a change in pull request #8195: No dictionary group by perf
Posted by GitBox <gi...@apache.org>.
richardstartin commented on a change in pull request #8195:
URL: https://github.com/apache/pinot/pull/8195#discussion_r806367799
##########
File path: pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/groupby/NoDictionaryMultiColumnGroupKeyGenerator.java
##########
@@ -92,60 +92,62 @@ public int getGlobalGroupKeyUpperBound() {
@Override
public void generateKeysForBlock(TransformBlock transformBlock, int[] groupKeys) {
int numDocs = transformBlock.getNumDocs();
- int[][] keys = new int[numDocs][_numGroupByExpressions];
+ Object[] values = new Object[_numGroupByExpressions];
for (int i = 0; i < _numGroupByExpressions; i++) {
BlockValSet blockValSet = transformBlock.getBlockValueSet(_groupByExpressions[i]);
if (_dictionaries[i] != null) {
- int[] dictIds = blockValSet.getDictionaryIdsSV();
- for (int j = 0; j < numDocs; j++) {
- keys[j][i] = dictIds[j];
- }
+ values[i] = blockValSet.getDictionaryIdsSV();
} else {
- ValueToIdMap onTheFlyDictionary = _onTheFlyDictionaries[i];
switch (_storedTypes[i]) {
case INT:
- int[] intValues = blockValSet.getIntValuesSV();
- for (int j = 0; j < numDocs; j++) {
- keys[j][i] = onTheFlyDictionary.put(intValues[j]);
- }
+ values[i] = blockValSet.getIntValuesSV();
break;
case LONG:
- long[] longValues = blockValSet.getLongValuesSV();
- for (int j = 0; j < numDocs; j++) {
- keys[j][i] = onTheFlyDictionary.put(longValues[j]);
- }
+ values[i] = blockValSet.getLongValuesSV();
break;
case FLOAT:
- float[] floatValues = blockValSet.getFloatValuesSV();
- for (int j = 0; j < numDocs; j++) {
- keys[j][i] = onTheFlyDictionary.put(floatValues[j]);
- }
+ values[i] = blockValSet.getFloatValuesSV();
break;
case DOUBLE:
- double[] doubleValues = blockValSet.getDoubleValuesSV();
- for (int j = 0; j < numDocs; j++) {
- keys[j][i] = onTheFlyDictionary.put(doubleValues[j]);
- }
+ values[i] = blockValSet.getDoubleValuesSV();
break;
case STRING:
- String[] stringValues = blockValSet.getStringValuesSV();
- for (int j = 0; j < numDocs; j++) {
- keys[j][i] = onTheFlyDictionary.put(stringValues[j]);
- }
+ values[i] = blockValSet.getStringValuesSV();
break;
case BYTES:
- byte[][] bytesValues = blockValSet.getBytesValuesSV();
- for (int j = 0; j < numDocs; j++) {
- keys[j][i] = onTheFlyDictionary.put(new ByteArray(bytesValues[j]));
- }
+ values[i] = blockValSet.getBytesValuesSV();
break;
default:
throw new IllegalArgumentException("Illegal data type for no-dictionary key generator: " + _storedTypes[i]);
}
}
}
- for (int i = 0; i < numDocs; i++) {
- groupKeys[i] = getGroupIdForKey(new FixedIntArray(keys[i]));
+ int[] keyValues = new int[_numGroupByExpressions];
+ // note that we are mutating its backing array for memory efficiency
+ FixedIntArray flyweightKey = new FixedIntArray(keyValues);
+ for (int row = 0; row < numDocs; row++) {
Review comment:
I posted benchmark results. Please read them.
[GitHub] [pinot] richardstartin commented on a change in pull request #8195: No dictionary group by perf
Posted by GitBox <gi...@apache.org>.
richardstartin commented on a change in pull request #8195:
URL: https://github.com/apache/pinot/pull/8195#discussion_r806368204
##########
File path: pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/groupby/NoDictionaryMultiColumnGroupKeyGenerator.java
##########
@@ -306,7 +325,7 @@ private int getGroupIdForKey(FixedIntArray keyList) {
if (groupId == INVALID_ID) {
if (_numGroups < _globalGroupIdUpperBound) {
groupId = _numGroups;
- _groupKeyMap.put(keyList, _numGroups++);
+ _groupKeyMap.put(keyList.clone(), _numGroups++);
Review comment:
Good catch. Note that this does not affect the baseline benchmark results, which were taken at commit 2fa525253a62108dbc91874c77e112eb349337d9.
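To illustrate why the key has to be copied before it goes into the map, here is a minimal, self-contained sketch of the flyweight pattern discussed in this thread. It is not the Pinot implementation: `IntArrayKey` is a hypothetical stand-in for `FixedIntArray`, and `FlyweightKeyDemo.groupIds` is an invented helper that assumes all columns are already `int` ids.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for FixedIntArray: equality and hash code depend on array content. */
final class IntArrayKey {
  private final int[] _values;

  IntArrayKey(int[] values) {
    _values = values;
  }

  /** Returns a key backed by its own copy of the array, safe to store in a map. */
  IntArrayKey copy() {
    return new IntArrayKey(_values.clone());
  }

  @Override
  public boolean equals(Object o) {
    return o instanceof IntArrayKey && Arrays.equals(_values, ((IntArrayKey) o)._values);
  }

  @Override
  public int hashCode() {
    return Arrays.hashCode(_values);
  }
}

final class FlyweightKeyDemo {
  /** Assigns a dense group id to each row of column-major input, reusing one key for all lookups. */
  static int[] groupIds(int[][] columns, int numRows) {
    Map<IntArrayKey, Integer> groupIdsByKey = new HashMap<>();
    int[] scratch = new int[columns.length];
    // The flyweight wraps the scratch array, so mutating scratch changes the key in place.
    IntArrayKey flyweight = new IntArrayKey(scratch);
    int[] result = new int[numRows];
    for (int row = 0; row < numRows; row++) {
      for (int col = 0; col < columns.length; col++) {
        scratch[col] = columns[col][row];
      }
      Integer groupId = groupIdsByKey.get(flyweight);
      if (groupId == null) {
        groupId = groupIdsByKey.size();
        // Store a copy: the flyweight is overwritten on the next row, and a stored
        // key whose content changes after insertion would corrupt the map.
        groupIdsByKey.put(flyweight.copy(), groupId);
      }
      result[row] = groupId;
    }
    return result;
  }

  public static void main(String[] args) {
    int[][] columns = {{1, 1, 2, 1}, {7, 8, 7, 7}};
    System.out.println(Arrays.toString(groupIds(columns, 4)));
  }
}
```

With this shape, an allocation happens only the first time a distinct key is seen, so the number of key objects tracks the number of groups rather than the number of rows.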
[GitHub] [pinot] codecov-commenter edited a comment on pull request #8195: No dictionary group by perf
Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #8195:
URL: https://github.com/apache/pinot/pull/8195#issuecomment-1036547667
# [Codecov](https://codecov.io/gh/apache/pinot/pull/8195?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#8195](https://codecov.io/gh/apache/pinot/pull/8195?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (9936d55) into [master](https://codecov.io/gh/apache/pinot/commit/bad7106d5c714edf9d52b63dd4428d15cabd79c8?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (bad7106) will **decrease** coverage by `1.25%`.
> The diff coverage is `94.44%`.
[![Impacted file tree graph](https://codecov.io/gh/apache/pinot/pull/8195/graphs/tree.svg?width=650&height=150&src=pr&token=4ibza2ugkz&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/pinot/pull/8195?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
```diff
@@ Coverage Diff @@
## master #8195 +/- ##
============================================
- Coverage 71.33% 70.07% -1.26%
- Complexity 4308 4314 +6
============================================
Files 1623 1624 +1
Lines 84365 84883 +518
Branches 12657 12794 +137
============================================
- Hits 60183 59485 -698
- Misses 20050 21286 +1236
+ Partials 4132 4112 -20
```
| Flag | Coverage Δ | |
|---|---|---|
| integration1 | `28.73% <66.66%> (-0.11%)` | :arrow_down: |
| integration2 | `?` | |
| unittests1 | `67.46% <94.44%> (-0.41%)` | :arrow_down: |
| unittests2 | `14.14% <0.00%> (-0.07%)` | :arrow_down: |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/pinot/pull/8195?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...upby/NoDictionaryMultiColumnGroupKeyGenerator.java](https://codecov.io/gh/apache/pinot/pull/8195/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9xdWVyeS9hZ2dyZWdhdGlvbi9ncm91cGJ5L05vRGljdGlvbmFyeU11bHRpQ29sdW1uR3JvdXBLZXlHZW5lcmF0b3IuamF2YQ==) | `63.55% <94.28%> (+0.62%)` | :arrow_up: |
| [...java/org/apache/pinot/spi/utils/FixedIntArray.java](https://codecov.io/gh/apache/pinot/pull/8195/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvdXRpbHMvRml4ZWRJbnRBcnJheS5qYXZh) | `61.53% <100.00%> (+3.20%)` | :arrow_up: |
| [...t/core/plan/StreamingInstanceResponsePlanNode.java](https://codecov.io/gh/apache/pinot/pull/8195/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9wbGFuL1N0cmVhbWluZ0luc3RhbmNlUmVzcG9uc2VQbGFuTm9kZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...ore/operator/streaming/StreamingResponseUtils.java](https://codecov.io/gh/apache/pinot/pull/8195/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9vcGVyYXRvci9zdHJlYW1pbmcvU3RyZWFtaW5nUmVzcG9uc2VVdGlscy5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...ager/realtime/PeerSchemeSplitSegmentCommitter.java](https://codecov.io/gh/apache/pinot/pull/8195/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9kYXRhL21hbmFnZXIvcmVhbHRpbWUvUGVlclNjaGVtZVNwbGl0U2VnbWVudENvbW1pdHRlci5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...pache/pinot/common/utils/grpc/GrpcQueryClient.java](https://codecov.io/gh/apache/pinot/pull/8195/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vdXRpbHMvZ3JwYy9HcnBjUXVlcnlDbGllbnQuamF2YQ==) | `0.00% <0.00%> (-94.74%)` | :arrow_down: |
| [...he/pinot/core/plan/StreamingSelectionPlanNode.java](https://codecov.io/gh/apache/pinot/pull/8195/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9wbGFuL1N0cmVhbWluZ1NlbGVjdGlvblBsYW5Ob2RlLmphdmE=) | `0.00% <0.00%> (-88.89%)` | :arrow_down: |
| [...ator/streaming/StreamingSelectionOnlyOperator.java](https://codecov.io/gh/apache/pinot/pull/8195/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9vcGVyYXRvci9zdHJlYW1pbmcvU3RyZWFtaW5nU2VsZWN0aW9uT25seU9wZXJhdG9yLmphdmE=) | `0.00% <0.00%> (-87.81%)` | :arrow_down: |
| [...re/query/reduce/SelectionOnlyStreamingReducer.java](https://codecov.io/gh/apache/pinot/pull/8195/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9xdWVyeS9yZWR1Y2UvU2VsZWN0aW9uT25seVN0cmVhbWluZ1JlZHVjZXIuamF2YQ==) | `0.00% <0.00%> (-85.72%)` | :arrow_down: |
| [...oker/broker/BrokerServiceAutoDiscoveryFeature.java](https://codecov.io/gh/apache/pinot/pull/8195/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtYnJva2VyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9icm9rZXIvYnJva2VyL0Jyb2tlclNlcnZpY2VBdXRvRGlzY292ZXJ5RmVhdHVyZS5qYXZh) | `0.00% <0.00%> (-81.82%)` | :arrow_down: |
| ... and [103 more](https://codecov.io/gh/apache/pinot/pull/8195/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/pinot/pull/8195?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/pinot/pull/8195?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [bad7106...9936d55](https://codecov.io/gh/apache/pinot/pull/8195?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
[GitHub] [pinot] Jackie-Jiang commented on a change in pull request #8195: No dictionary group by perf
Posted by GitBox <gi...@apache.org>.
Jackie-Jiang commented on a change in pull request #8195:
URL: https://github.com/apache/pinot/pull/8195#discussion_r806355760
##########
File path: pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/groupby/NoDictionaryMultiColumnGroupKeyGenerator.java
##########
@@ -306,7 +325,7 @@ private int getGroupIdForKey(FixedIntArray keyList) {
if (groupId == INVALID_ID) {
if (_numGroups < _globalGroupIdUpperBound) {
groupId = _numGroups;
- _groupKeyMap.put(keyList, _numGroups++);
+ _groupKeyMap.put(keyList.clone(), _numGroups++);
Review comment:
Should this be reverted?
##########
File path: pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/groupby/NoDictionaryMultiColumnGroupKeyGenerator.java
##########
@@ -92,60 +92,62 @@ public int getGlobalGroupKeyUpperBound() {
@Override
public void generateKeysForBlock(TransformBlock transformBlock, int[] groupKeys) {
int numDocs = transformBlock.getNumDocs();
- int[][] keys = new int[numDocs][_numGroupByExpressions];
+ Object[] values = new Object[_numGroupByExpressions];
for (int i = 0; i < _numGroupByExpressions; i++) {
BlockValSet blockValSet = transformBlock.getBlockValueSet(_groupByExpressions[i]);
if (_dictionaries[i] != null) {
- int[] dictIds = blockValSet.getDictionaryIdsSV();
- for (int j = 0; j < numDocs; j++) {
- keys[j][i] = dictIds[j];
- }
+ values[i] = blockValSet.getDictionaryIdsSV();
} else {
- ValueToIdMap onTheFlyDictionary = _onTheFlyDictionaries[i];
switch (_storedTypes[i]) {
case INT:
- int[] intValues = blockValSet.getIntValuesSV();
- for (int j = 0; j < numDocs; j++) {
- keys[j][i] = onTheFlyDictionary.put(intValues[j]);
- }
+ values[i] = blockValSet.getIntValuesSV();
break;
case LONG:
- long[] longValues = blockValSet.getLongValuesSV();
- for (int j = 0; j < numDocs; j++) {
- keys[j][i] = onTheFlyDictionary.put(longValues[j]);
- }
+ values[i] = blockValSet.getLongValuesSV();
break;
case FLOAT:
- float[] floatValues = blockValSet.getFloatValuesSV();
- for (int j = 0; j < numDocs; j++) {
- keys[j][i] = onTheFlyDictionary.put(floatValues[j]);
- }
+ values[i] = blockValSet.getFloatValuesSV();
break;
case DOUBLE:
- double[] doubleValues = blockValSet.getDoubleValuesSV();
- for (int j = 0; j < numDocs; j++) {
- keys[j][i] = onTheFlyDictionary.put(doubleValues[j]);
- }
+ values[i] = blockValSet.getDoubleValuesSV();
break;
case STRING:
- String[] stringValues = blockValSet.getStringValuesSV();
- for (int j = 0; j < numDocs; j++) {
- keys[j][i] = onTheFlyDictionary.put(stringValues[j]);
- }
+ values[i] = blockValSet.getStringValuesSV();
break;
case BYTES:
- byte[][] bytesValues = blockValSet.getBytesValuesSV();
- for (int j = 0; j < numDocs; j++) {
- keys[j][i] = onTheFlyDictionary.put(new ByteArray(bytesValues[j]));
- }
+ values[i] = blockValSet.getBytesValuesSV();
break;
default:
throw new IllegalArgumentException("Illegal data type for no-dictionary key generator: " + _storedTypes[i]);
}
}
}
- for (int i = 0; i < numDocs; i++) {
- groupKeys[i] = getGroupIdForKey(new FixedIntArray(keys[i]));
+ int[] keyValues = new int[_numGroupByExpressions];
+ // note that we are mutating its backing array for memory efficiency
+ FixedIntArray flyweightKey = new FixedIntArray(keyValues);
+ for (int row = 0; row < numDocs; row++) {
Review comment:
Is this faster than the original way of processing values? The new approach needs a lot of `if` checks on a per-value basis, which could hurt performance. If the goal is to avoid allocating `keys` multiple times, we can reuse the `keys` array instead.
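One possible reading of the "reuse the `keys`" suggestion is sketched below. This is an assumption about what was meant, not code from the PR: `ReusableKeyBuffer` and its methods are hypothetical names, and the sketch only covers columns whose values are already `int` ids.

```java
import java.util.Arrays;

/** Hypothetical helper: a row-major key buffer that is kept across blocks instead of reallocated. */
final class ReusableKeyBuffer {
  private final int _numColumns;
  private int[][] _keys = new int[0][];

  ReusableKeyBuffer(int numColumns) {
    _numColumns = numColumns;
  }

  /** Returns a buffer with at least numDocs rows, allocating only when a block is larger than any seen so far. */
  int[][] ensureCapacity(int numDocs) {
    if (_keys.length < numDocs) {
      // Keep the rows that already exist and allocate only the missing ones.
      int[][] grown = Arrays.copyOf(_keys, numDocs);
      for (int i = _keys.length; i < numDocs; i++) {
        grown[i] = new int[_numColumns];
      }
      _keys = grown;
    }
    return _keys;
  }

  /** Column-at-a-time fill: one tight loop per column, with no per-value type dispatch. */
  void fillColumn(int column, int[] ids, int numDocs) {
    int[][] keys = ensureCapacity(numDocs);
    for (int row = 0; row < numDocs; row++) {
      keys[row][column] = ids[row];
    }
  }

  /** Returns the backing row, which is overwritten when the next block is filled. */
  int[] row(int row) {
    return _keys[row];
  }
}
```

Because the rows are overwritten by the next block, any row stored as a map key would still need to be copied on insertion, just as with the flyweight approach. Which variant is faster is an empirical question for the benchmark to answer.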