You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2021/10/28 07:32:00 UTC
[GitHub] [pinot] Jackie-Jiang opened a new pull request #7654: Enhance ColumnValueSegmentPruner and support bloom filter prefetch
Jackie-Jiang opened a new pull request #7654:
URL: https://github.com/apache/pinot/pull/7654
## Description
Enhance ColumnValueSegmentPruner to early terminate when there is no EQ/IN/RANGE predicate on columns. Also support prefetch bloom filters when prefetch is required.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org
[GitHub] [pinot] Jackie-Jiang commented on a change in pull request #7654: Enhance ColumnValueSegmentPruner and support bloom filter prefetch
Posted by GitBox <gi...@apache.org>.
Jackie-Jiang commented on a change in pull request #7654:
URL: https://github.com/apache/pinot/pull/7654#discussion_r739590251
##########
File path: pinot-core/src/main/java/org/apache/pinot/core/query/pruner/ColumnValueSegmentPruner.java
##########
@@ -72,29 +80,127 @@ public void init(PinotConfiguration config) {
}
@Override
- public boolean prune(IndexSegment segment, QueryContext query) {
+ public List<IndexSegment> prune(List<IndexSegment> segments, QueryContext query) {
+ if (segments.isEmpty()) {
+ return segments;
+ }
FilterContext filter = query.getFilter();
if (filter == null) {
- return false;
+ return segments;
+ }
+
+ // Extract EQ/IN/RANGE predicate columns
+ Set<String> eqInColumns = new HashSet<>();
+ Set<String> rangeColumns = new HashSet<>();
+ extractPredicateColumns(filter, eqInColumns, rangeColumns);
+
+ if (eqInColumns.isEmpty() && rangeColumns.isEmpty()) {
+ return segments;
+ }
+
+ int numSegments = segments.size();
+ List<IndexSegment> selectedSegments = new ArrayList<>(numSegments);
+ if (!eqInColumns.isEmpty() && query.isEnablePrefetch()) {
+ Map[] dataSourceCaches = new Map[numSegments];
+ FetchContext[] fetchContexts = new FetchContext[numSegments];
+ try {
+ // Prefetch bloom filter for columns within the EQ/IN predicate if exists
+ for (int i = 0; i < numSegments; i++) {
+ IndexSegment segment = segments.get(i);
+ Map<String, DataSource> dataSourceCache = new HashMap<>();
+ Map<String, List<ColumnIndexType>> columnToIndexList = new HashMap<>();
+ for (String column : eqInColumns) {
+ DataSource dataSource = segment.getDataSource(column);
+ // NOTE: Column must exist after DataSchemaSegmentPruner
+ assert dataSource != null;
+ dataSourceCache.put(column, dataSource);
+ if (dataSource.getBloomFilter() != null) {
+ columnToIndexList.put(column, Collections.singletonList(ColumnIndexType.BLOOM_FILTER));
+ }
+ }
+ dataSourceCaches[i] = dataSourceCache;
+ if (!columnToIndexList.isEmpty()) {
+ FetchContext fetchContext =
+ new FetchContext(UUID.randomUUID(), segment.getSegmentName(), columnToIndexList);
+ segment.prefetch(fetchContext);
+ fetchContexts[i] = fetchContext;
+ }
+ }
+
+ // Prune segments
+ for (int i = 0; i < numSegments; i++) {
+ IndexSegment segment = segments.get(i);
+ if (!pruneSegment(segment, filter, dataSourceCaches[i], fetchContexts[i])) {
+ selectedSegments.add(segment);
Review comment:
Addressed in #7668
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org
[GitHub] [pinot] codecov-commenter edited a comment on pull request #7654: Enhance ColumnValueSegmentPruner and support bloom filter prefetch
Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #7654:
URL: https://github.com/apache/pinot/pull/7654#issuecomment-953597423
# [Codecov](https://codecov.io/gh/apache/pinot/pull/7654?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#7654](https://codecov.io/gh/apache/pinot/pull/7654?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (883b75e) into [master](https://codecov.io/gh/apache/pinot/commit/3734ee2021a2141a359aaff28a8ec37345bc80f7?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (3734ee2) will **decrease** coverage by `57.13%`.
> The diff coverage is `0.00%`.
> :exclamation: Current head 883b75e differs from pull request most recent head 837d152. Consider uploading reports for the commit 837d152 to get more accurate results
[![Impacted file tree graph](https://codecov.io/gh/apache/pinot/pull/7654/graphs/tree.svg?width=650&height=150&src=pr&token=4ibza2ugkz&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/pinot/pull/7654?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
```diff
@@ Coverage Diff @@
## master #7654 +/- ##
=============================================
- Coverage 71.70% 14.57% -57.14%
+ Complexity 3998 80 -3918
=============================================
Files 1576 1530 -46
Lines 79989 78201 -1788
Branches 11854 11674 -180
=============================================
- Hits 57357 11396 -45961
- Misses 18767 65982 +47215
+ Partials 3865 823 -3042
```
| Flag | Coverage Δ | |
|---|---|---|
| integration1 | `?` | |
| integration2 | `?` | |
| unittests1 | `?` | |
| unittests2 | `14.57% <0.00%> (-0.02%)` | :arrow_down: |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/pinot/pull/7654?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...ot/core/query/pruner/ColumnValueSegmentPruner.java](https://codecov.io/gh/apache/pinot/pull/7654/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9xdWVyeS9wcnVuZXIvQ29sdW1uVmFsdWVTZWdtZW50UHJ1bmVyLmphdmE=) | `0.00% <0.00%> (-94.17%)` | :arrow_down: |
| [...c/main/java/org/apache/pinot/common/tier/Tier.java](https://codecov.io/gh/apache/pinot/pull/7654/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vdGllci9UaWVyLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...ain/java/org/apache/pinot/core/data/table/Key.java](https://codecov.io/gh/apache/pinot/pull/7654/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9kYXRhL3RhYmxlL0tleS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../java/org/apache/pinot/spi/utils/BooleanUtils.java](https://codecov.io/gh/apache/pinot/pull/7654/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvdXRpbHMvQm9vbGVhblV0aWxzLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../java/org/apache/pinot/core/data/table/Record.java](https://codecov.io/gh/apache/pinot/pull/7654/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9kYXRhL3RhYmxlL1JlY29yZC5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../java/org/apache/pinot/core/util/GroupByUtils.java](https://codecov.io/gh/apache/pinot/pull/7654/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS91dGlsL0dyb3VwQnlVdGlscy5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...ava/org/apache/pinot/spi/data/MetricFieldSpec.java](https://codecov.io/gh/apache/pinot/pull/7654/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvZGF0YS9NZXRyaWNGaWVsZFNwZWMuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...va/org/apache/pinot/spi/utils/BigDecimalUtils.java](https://codecov.io/gh/apache/pinot/pull/7654/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvdXRpbHMvQmlnRGVjaW1hbFV0aWxzLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...java/org/apache/pinot/common/tier/TierFactory.java](https://codecov.io/gh/apache/pinot/pull/7654/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vdGllci9UaWVyRmFjdG9yeS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...java/org/apache/pinot/common/utils/StringUtil.java](https://codecov.io/gh/apache/pinot/pull/7654/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vdXRpbHMvU3RyaW5nVXRpbC5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| ... and [1248 more](https://codecov.io/gh/apache/pinot/pull/7654/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/pinot/pull/7654?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/pinot/pull/7654?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [3734ee2...837d152](https://codecov.io/gh/apache/pinot/pull/7654?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org
[GitHub] [pinot] Jackie-Jiang merged pull request #7654: Enhance ColumnValueSegmentPruner and support bloom filter prefetch
Posted by GitBox <gi...@apache.org>.
Jackie-Jiang merged pull request #7654:
URL: https://github.com/apache/pinot/pull/7654
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org
[GitHub] [pinot] Jackie-Jiang commented on a change in pull request #7654: Enhance ColumnValueSegmentPruner and support bloom filter prefetch
Posted by GitBox <gi...@apache.org>.
Jackie-Jiang commented on a change in pull request #7654:
URL: https://github.com/apache/pinot/pull/7654#discussion_r738937559
##########
File path: pinot-core/src/main/java/org/apache/pinot/core/query/pruner/ColumnValueSegmentPruner.java
##########
@@ -164,7 +274,73 @@ private boolean pruneEqPredicate(IndexSegment segment, EqPredicate eqPredicate,
}
/**
- * For RANGE predicate, prune the segment based on:
+ * For IN predicate, prune the segments based on:
+ * <ul>
+ * <li>Column min/max value</li>
+ * <li>Column bloom filter</li>
+ * </ul>
+ * <p>NOTE: segments will not be pruned if the number of values is greater than the threshold.
+ */
+ private boolean pruneInPredicate(IndexSegment segment, InPredicate inPredicate,
Review comment:
I moved some methods around to keep the similar methods together, which is identified as delete and add by github
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org
[GitHub] [pinot] Jackie-Jiang commented on a change in pull request #7654: Enhance ColumnValueSegmentPruner and support bloom filter prefetch
Posted by GitBox <gi...@apache.org>.
Jackie-Jiang commented on a change in pull request #7654:
URL: https://github.com/apache/pinot/pull/7654#discussion_r739482293
##########
File path: pinot-core/src/main/java/org/apache/pinot/core/query/pruner/ColumnValueSegmentPruner.java
##########
@@ -72,29 +80,127 @@ public void init(PinotConfiguration config) {
}
@Override
- public boolean prune(IndexSegment segment, QueryContext query) {
+ public List<IndexSegment> prune(List<IndexSegment> segments, QueryContext query) {
+ if (segments.isEmpty()) {
+ return segments;
+ }
FilterContext filter = query.getFilter();
if (filter == null) {
- return false;
+ return segments;
+ }
+
+ // Extract EQ/IN/RANGE predicate columns
+ Set<String> eqInColumns = new HashSet<>();
+ Set<String> rangeColumns = new HashSet<>();
+ extractPredicateColumns(filter, eqInColumns, rangeColumns);
+
+ if (eqInColumns.isEmpty() && rangeColumns.isEmpty()) {
+ return segments;
+ }
+
+ int numSegments = segments.size();
+ List<IndexSegment> selectedSegments = new ArrayList<>(numSegments);
+ if (!eqInColumns.isEmpty() && query.isEnablePrefetch()) {
+ Map[] dataSourceCaches = new Map[numSegments];
+ FetchContext[] fetchContexts = new FetchContext[numSegments];
+ try {
+ // Prefetch bloom filter for columns within the EQ/IN predicate if exists
+ for (int i = 0; i < numSegments; i++) {
+ IndexSegment segment = segments.get(i);
+ Map<String, DataSource> dataSourceCache = new HashMap<>();
+ Map<String, List<ColumnIndexType>> columnToIndexList = new HashMap<>();
+ for (String column : eqInColumns) {
+ DataSource dataSource = segment.getDataSource(column);
+ // NOTE: Column must exist after DataSchemaSegmentPruner
+ assert dataSource != null;
+ dataSourceCache.put(column, dataSource);
+ if (dataSource.getBloomFilter() != null) {
+ columnToIndexList.put(column, Collections.singletonList(ColumnIndexType.BLOOM_FILTER));
+ }
+ }
+ dataSourceCaches[i] = dataSourceCache;
+ if (!columnToIndexList.isEmpty()) {
+ FetchContext fetchContext =
+ new FetchContext(UUID.randomUUID(), segment.getSegmentName(), columnToIndexList);
+ segment.prefetch(fetchContext);
+ fetchContexts[i] = fetchContext;
+ }
+ }
+
+ // Prune segments
+ for (int i = 0; i < numSegments; i++) {
+ IndexSegment segment = segments.get(i);
+ if (!pruneSegment(segment, filter, dataSourceCaches[i], fetchContexts[i])) {
+ selectedSegments.add(segment);
Review comment:
Good point! Let me see if I can reduce the window between acquire and release
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org
[GitHub] [pinot] codecov-commenter commented on pull request #7654: Enhance ColumnValueSegmentPruner and support bloom filter prefetch
Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on pull request #7654:
URL: https://github.com/apache/pinot/pull/7654#issuecomment-953597423
# [Codecov](https://codecov.io/gh/apache/pinot/pull/7654?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#7654](https://codecov.io/gh/apache/pinot/pull/7654?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (837d152) into [master](https://codecov.io/gh/apache/pinot/commit/3734ee2021a2141a359aaff28a8ec37345bc80f7?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (3734ee2) will **decrease** coverage by `3.05%`.
> The diff coverage is `59.00%`.
[![Impacted file tree graph](https://codecov.io/gh/apache/pinot/pull/7654/graphs/tree.svg?width=650&height=150&src=pr&token=4ibza2ugkz&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/pinot/pull/7654?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
```diff
@@ Coverage Diff @@
## master #7654 +/- ##
============================================
- Coverage 71.70% 68.65% -3.06%
+ Complexity 3998 3916 -82
============================================
Files 1576 1181 -395
Lines 79989 57710 -22279
Branches 11854 8865 -2989
============================================
- Hits 57357 39621 -17736
+ Misses 18767 15278 -3489
+ Partials 3865 2811 -1054
```
| Flag | Coverage Δ | |
|---|---|---|
| integration1 | `?` | |
| integration2 | `?` | |
| unittests1 | `68.65% <59.00%> (-0.05%)` | :arrow_down: |
| unittests2 | `?` | |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/pinot/pull/7654?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...ot/core/query/pruner/ColumnValueSegmentPruner.java](https://codecov.io/gh/apache/pinot/pull/7654/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9xdWVyeS9wcnVuZXIvQ29sdW1uVmFsdWVTZWdtZW50UHJ1bmVyLmphdmE=) | `69.56% <59.00%> (-24.61%)` | :arrow_down: |
| [...a/org/apache/pinot/common/metrics/MinionMeter.java](https://codecov.io/gh/apache/pinot/pull/7654/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vbWV0cmljcy9NaW5pb25NZXRlci5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...g/apache/pinot/common/metrics/ControllerMeter.java](https://codecov.io/gh/apache/pinot/pull/7654/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vbWV0cmljcy9Db250cm9sbGVyTWV0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../apache/pinot/common/metrics/BrokerQueryPhase.java](https://codecov.io/gh/apache/pinot/pull/7654/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vbWV0cmljcy9Ccm9rZXJRdWVyeVBoYXNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../apache/pinot/common/metrics/MinionQueryPhase.java](https://codecov.io/gh/apache/pinot/pull/7654/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vbWV0cmljcy9NaW5pb25RdWVyeVBoYXNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...he/pinot/common/messages/SegmentReloadMessage.java](https://codecov.io/gh/apache/pinot/pull/7654/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vbWVzc2FnZXMvU2VnbWVudFJlbG9hZE1lc3NhZ2UuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...pinot/core/data/manager/realtime/TimerService.java](https://codecov.io/gh/apache/pinot/pull/7654/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9kYXRhL21hbmFnZXIvcmVhbHRpbWUvVGltZXJTZXJ2aWNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...not/common/exception/HttpErrorStatusException.java](https://codecov.io/gh/apache/pinot/pull/7654/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vZXhjZXB0aW9uL0h0dHBFcnJvclN0YXR1c0V4Y2VwdGlvbi5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...t/core/startree/plan/StarTreeDocIdSetPlanNode.java](https://codecov.io/gh/apache/pinot/pull/7654/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9zdGFydHJlZS9wbGFuL1N0YXJUcmVlRG9jSWRTZXRQbGFuTm9kZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...ot/common/restlet/resources/TableMetadataInfo.java](https://codecov.io/gh/apache/pinot/pull/7654/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vcmVzdGxldC9yZXNvdXJjZXMvVGFibGVNZXRhZGF0YUluZm8uamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| ... and [598 more](https://codecov.io/gh/apache/pinot/pull/7654/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/pinot/pull/7654?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/pinot/pull/7654?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [3734ee2...837d152](https://codecov.io/gh/apache/pinot/pull/7654?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org
[GitHub] [pinot] npawar commented on a change in pull request #7654: Enhance ColumnValueSegmentPruner and support bloom filter prefetch
Posted by GitBox <gi...@apache.org>.
npawar commented on a change in pull request #7654:
URL: https://github.com/apache/pinot/pull/7654#discussion_r739432087
##########
File path: pinot-core/src/main/java/org/apache/pinot/core/query/pruner/ColumnValueSegmentPruner.java
##########
@@ -72,29 +80,127 @@ public void init(PinotConfiguration config) {
}
@Override
- public boolean prune(IndexSegment segment, QueryContext query) {
+ public List<IndexSegment> prune(List<IndexSegment> segments, QueryContext query) {
+ if (segments.isEmpty()) {
+ return segments;
+ }
FilterContext filter = query.getFilter();
if (filter == null) {
- return false;
+ return segments;
+ }
+
+ // Extract EQ/IN/RANGE predicate columns
+ Set<String> eqInColumns = new HashSet<>();
+ Set<String> rangeColumns = new HashSet<>();
+ extractPredicateColumns(filter, eqInColumns, rangeColumns);
+
+ if (eqInColumns.isEmpty() && rangeColumns.isEmpty()) {
+ return segments;
+ }
+
+ int numSegments = segments.size();
+ List<IndexSegment> selectedSegments = new ArrayList<>(numSegments);
+ if (!eqInColumns.isEmpty() && query.isEnablePrefetch()) {
+ Map[] dataSourceCaches = new Map[numSegments];
+ FetchContext[] fetchContexts = new FetchContext[numSegments];
+ try {
+ // Prefetch bloom filter for columns within the EQ/IN predicate if exists
+ for (int i = 0; i < numSegments; i++) {
+ IndexSegment segment = segments.get(i);
+ Map<String, DataSource> dataSourceCache = new HashMap<>();
+ Map<String, List<ColumnIndexType>> columnToIndexList = new HashMap<>();
+ for (String column : eqInColumns) {
+ DataSource dataSource = segment.getDataSource(column);
+ // NOTE: Column must exist after DataSchemaSegmentPruner
+ assert dataSource != null;
+ dataSourceCache.put(column, dataSource);
+ if (dataSource.getBloomFilter() != null) {
+ columnToIndexList.put(column, Collections.singletonList(ColumnIndexType.BLOOM_FILTER));
+ }
+ }
+ dataSourceCaches[i] = dataSourceCache;
+ if (!columnToIndexList.isEmpty()) {
+ FetchContext fetchContext =
+ new FetchContext(UUID.randomUUID(), segment.getSegmentName(), columnToIndexList);
+ segment.prefetch(fetchContext);
+ fetchContexts[i] = fetchContext;
+ }
+ }
+
+ // Prune segments
+ for (int i = 0; i < numSegments; i++) {
+ IndexSegment segment = segments.get(i);
+ if (!pruneSegment(segment, filter, dataSourceCaches[i], fetchContexts[i])) {
+ selectedSegments.add(segment);
Review comment:
it looks like we'll acquire for all segments, and then later release for all. Unlike the segment execution where we are guaranteed to release an acquired segment. Will this not deadlock if all acquire are not able to complete?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org
[GitHub] [pinot] siddharthteotia commented on a change in pull request #7654: Enhance ColumnValueSegmentPruner and support bloom filter prefetch
Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on a change in pull request #7654:
URL: https://github.com/apache/pinot/pull/7654#discussion_r738868832
##########
File path: pinot-core/src/main/java/org/apache/pinot/core/query/pruner/ColumnValueSegmentPruner.java
##########
@@ -164,7 +274,73 @@ private boolean pruneEqPredicate(IndexSegment segment, EqPredicate eqPredicate,
}
/**
- * For RANGE predicate, prune the segment based on:
+ * For IN predicate, prune the segments based on:
+ * <ul>
+ * <li>Column min/max value</li>
+ * <li>Column bloom filter</li>
+ * </ul>
+ * <p>NOTE: segments will not be pruned if the number of values is greater than the threshold.
+ */
+ private boolean pruneInPredicate(IndexSegment segment, InPredicate inPredicate,
Review comment:
Maybe this is just github thing that it is showing both functions as deleted and rewritten ? Apart from fetchcontext, nothing else seems to have changed
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org