You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2021/10/28 07:32:00 UTC

[GitHub] [pinot] Jackie-Jiang opened a new pull request #7654: Enhance ColumnValueSegmentPruner and support bloom filter prefetch

Jackie-Jiang opened a new pull request #7654:
URL: https://github.com/apache/pinot/pull/7654


   ## Description
   Enhance ColumnValueSegmentPruner to early terminate when there is no EQ/IN/RANGE predicate on columns. Also support prefetch bloom filters when prefetch is required.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] Jackie-Jiang commented on a change in pull request #7654: Enhance ColumnValueSegmentPruner and support bloom filter prefetch

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang commented on a change in pull request #7654:
URL: https://github.com/apache/pinot/pull/7654#discussion_r739590251



##########
File path: pinot-core/src/main/java/org/apache/pinot/core/query/pruner/ColumnValueSegmentPruner.java
##########
@@ -72,29 +80,127 @@ public void init(PinotConfiguration config) {
   }
 
   @Override
-  public boolean prune(IndexSegment segment, QueryContext query) {
+  public List<IndexSegment> prune(List<IndexSegment> segments, QueryContext query) {
+    if (segments.isEmpty()) {
+      return segments;
+    }
     FilterContext filter = query.getFilter();
     if (filter == null) {
-      return false;
+      return segments;
+    }
+
+    // Extract EQ/IN/RANGE predicate columns
+    Set<String> eqInColumns = new HashSet<>();
+    Set<String> rangeColumns = new HashSet<>();
+    extractPredicateColumns(filter, eqInColumns, rangeColumns);
+
+    if (eqInColumns.isEmpty() && rangeColumns.isEmpty()) {
+      return segments;
+    }
+
+    int numSegments = segments.size();
+    List<IndexSegment> selectedSegments = new ArrayList<>(numSegments);
+    if (!eqInColumns.isEmpty() && query.isEnablePrefetch()) {
+      Map[] dataSourceCaches = new Map[numSegments];
+      FetchContext[] fetchContexts = new FetchContext[numSegments];
+      try {
+        // Prefetch bloom filter for columns within the EQ/IN predicate if exists
+        for (int i = 0; i < numSegments; i++) {
+          IndexSegment segment = segments.get(i);
+          Map<String, DataSource> dataSourceCache = new HashMap<>();
+          Map<String, List<ColumnIndexType>> columnToIndexList = new HashMap<>();
+          for (String column : eqInColumns) {
+            DataSource dataSource = segment.getDataSource(column);
+            // NOTE: Column must exist after DataSchemaSegmentPruner
+            assert dataSource != null;
+            dataSourceCache.put(column, dataSource);
+            if (dataSource.getBloomFilter() != null) {
+              columnToIndexList.put(column, Collections.singletonList(ColumnIndexType.BLOOM_FILTER));
+            }
+          }
+          dataSourceCaches[i] = dataSourceCache;
+          if (!columnToIndexList.isEmpty()) {
+            FetchContext fetchContext =
+                new FetchContext(UUID.randomUUID(), segment.getSegmentName(), columnToIndexList);
+            segment.prefetch(fetchContext);
+            fetchContexts[i] = fetchContext;
+          }
+        }
+
+        // Prune segments
+        for (int i = 0; i < numSegments; i++) {
+          IndexSegment segment = segments.get(i);
+          if (!pruneSegment(segment, filter, dataSourceCaches[i], fetchContexts[i])) {
+            selectedSegments.add(segment);

Review comment:
       Addressed in #7668 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] codecov-commenter edited a comment on pull request #7654: Enhance ColumnValueSegmentPruner and support bloom filter prefetch

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #7654:
URL: https://github.com/apache/pinot/pull/7654#issuecomment-953597423


   # [Codecov](https://codecov.io/gh/apache/pinot/pull/7654?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#7654](https://codecov.io/gh/apache/pinot/pull/7654?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (883b75e) into [master](https://codecov.io/gh/apache/pinot/commit/3734ee2021a2141a359aaff28a8ec37345bc80f7?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (3734ee2) will **decrease** coverage by `57.13%`.
   > The diff coverage is `0.00%`.
   
   > :exclamation: Current head 883b75e differs from pull request most recent head 837d152. Consider uploading reports for the commit 837d152 to get more accurate results
   [![Impacted file tree graph](https://codecov.io/gh/apache/pinot/pull/7654/graphs/tree.svg?width=650&height=150&src=pr&token=4ibza2ugkz&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/pinot/pull/7654?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@              Coverage Diff              @@
   ##             master    #7654       +/-   ##
   =============================================
   - Coverage     71.70%   14.57%   -57.14%     
   + Complexity     3998       80     -3918     
   =============================================
     Files          1576     1530       -46     
     Lines         79989    78201     -1788     
     Branches      11854    11674      -180     
   =============================================
   - Hits          57357    11396    -45961     
   - Misses        18767    65982    +47215     
   + Partials       3865      823     -3042     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | integration1 | `?` | |
   | integration2 | `?` | |
   | unittests1 | `?` | |
   | unittests2 | `14.57% <0.00%> (-0.02%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/pinot/pull/7654?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...ot/core/query/pruner/ColumnValueSegmentPruner.java](https://codecov.io/gh/apache/pinot/pull/7654/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9xdWVyeS9wcnVuZXIvQ29sdW1uVmFsdWVTZWdtZW50UHJ1bmVyLmphdmE=) | `0.00% <0.00%> (-94.17%)` | :arrow_down: |
   | [...c/main/java/org/apache/pinot/common/tier/Tier.java](https://codecov.io/gh/apache/pinot/pull/7654/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vdGllci9UaWVyLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...ain/java/org/apache/pinot/core/data/table/Key.java](https://codecov.io/gh/apache/pinot/pull/7654/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9kYXRhL3RhYmxlL0tleS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [.../java/org/apache/pinot/spi/utils/BooleanUtils.java](https://codecov.io/gh/apache/pinot/pull/7654/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvdXRpbHMvQm9vbGVhblV0aWxzLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [.../java/org/apache/pinot/core/data/table/Record.java](https://codecov.io/gh/apache/pinot/pull/7654/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9kYXRhL3RhYmxlL1JlY29yZC5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [.../java/org/apache/pinot/core/util/GroupByUtils.java](https://codecov.io/gh/apache/pinot/pull/7654/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS91dGlsL0dyb3VwQnlVdGlscy5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...ava/org/apache/pinot/spi/data/MetricFieldSpec.java](https://codecov.io/gh/apache/pinot/pull/7654/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvZGF0YS9NZXRyaWNGaWVsZFNwZWMuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...va/org/apache/pinot/spi/utils/BigDecimalUtils.java](https://codecov.io/gh/apache/pinot/pull/7654/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvdXRpbHMvQmlnRGVjaW1hbFV0aWxzLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...java/org/apache/pinot/common/tier/TierFactory.java](https://codecov.io/gh/apache/pinot/pull/7654/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vdGllci9UaWVyRmFjdG9yeS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...java/org/apache/pinot/common/utils/StringUtil.java](https://codecov.io/gh/apache/pinot/pull/7654/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vdXRpbHMvU3RyaW5nVXRpbC5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | ... and [1248 more](https://codecov.io/gh/apache/pinot/pull/7654/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/pinot/pull/7654?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/pinot/pull/7654?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [3734ee2...837d152](https://codecov.io/gh/apache/pinot/pull/7654?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] Jackie-Jiang merged pull request #7654: Enhance ColumnValueSegmentPruner and support bloom filter prefetch

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang merged pull request #7654:
URL: https://github.com/apache/pinot/pull/7654


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] Jackie-Jiang commented on a change in pull request #7654: Enhance ColumnValueSegmentPruner and support bloom filter prefetch

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang commented on a change in pull request #7654:
URL: https://github.com/apache/pinot/pull/7654#discussion_r738937559



##########
File path: pinot-core/src/main/java/org/apache/pinot/core/query/pruner/ColumnValueSegmentPruner.java
##########
@@ -164,7 +274,73 @@ private boolean pruneEqPredicate(IndexSegment segment, EqPredicate eqPredicate,
   }
 
   /**
-   * For RANGE predicate, prune the segment based on:
+   * For IN predicate, prune the segments based on:
+   * <ul>
+   *   <li>Column min/max value</li>
+   *   <li>Column bloom filter</li>
+   * </ul>
+   * <p>NOTE: segments will not be pruned if the number of values is greater than the threshold.
+   */
+  private boolean pruneInPredicate(IndexSegment segment, InPredicate inPredicate,

Review comment:
       I moved some methods around to keep the similar methods together, which is identified as delete and add by github




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] Jackie-Jiang commented on a change in pull request #7654: Enhance ColumnValueSegmentPruner and support bloom filter prefetch

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang commented on a change in pull request #7654:
URL: https://github.com/apache/pinot/pull/7654#discussion_r739482293



##########
File path: pinot-core/src/main/java/org/apache/pinot/core/query/pruner/ColumnValueSegmentPruner.java
##########
@@ -72,29 +80,127 @@ public void init(PinotConfiguration config) {
   }
 
   @Override
-  public boolean prune(IndexSegment segment, QueryContext query) {
+  public List<IndexSegment> prune(List<IndexSegment> segments, QueryContext query) {
+    if (segments.isEmpty()) {
+      return segments;
+    }
     FilterContext filter = query.getFilter();
     if (filter == null) {
-      return false;
+      return segments;
+    }
+
+    // Extract EQ/IN/RANGE predicate columns
+    Set<String> eqInColumns = new HashSet<>();
+    Set<String> rangeColumns = new HashSet<>();
+    extractPredicateColumns(filter, eqInColumns, rangeColumns);
+
+    if (eqInColumns.isEmpty() && rangeColumns.isEmpty()) {
+      return segments;
+    }
+
+    int numSegments = segments.size();
+    List<IndexSegment> selectedSegments = new ArrayList<>(numSegments);
+    if (!eqInColumns.isEmpty() && query.isEnablePrefetch()) {
+      Map[] dataSourceCaches = new Map[numSegments];
+      FetchContext[] fetchContexts = new FetchContext[numSegments];
+      try {
+        // Prefetch bloom filter for columns within the EQ/IN predicate if exists
+        for (int i = 0; i < numSegments; i++) {
+          IndexSegment segment = segments.get(i);
+          Map<String, DataSource> dataSourceCache = new HashMap<>();
+          Map<String, List<ColumnIndexType>> columnToIndexList = new HashMap<>();
+          for (String column : eqInColumns) {
+            DataSource dataSource = segment.getDataSource(column);
+            // NOTE: Column must exist after DataSchemaSegmentPruner
+            assert dataSource != null;
+            dataSourceCache.put(column, dataSource);
+            if (dataSource.getBloomFilter() != null) {
+              columnToIndexList.put(column, Collections.singletonList(ColumnIndexType.BLOOM_FILTER));
+            }
+          }
+          dataSourceCaches[i] = dataSourceCache;
+          if (!columnToIndexList.isEmpty()) {
+            FetchContext fetchContext =
+                new FetchContext(UUID.randomUUID(), segment.getSegmentName(), columnToIndexList);
+            segment.prefetch(fetchContext);
+            fetchContexts[i] = fetchContext;
+          }
+        }
+
+        // Prune segments
+        for (int i = 0; i < numSegments; i++) {
+          IndexSegment segment = segments.get(i);
+          if (!pruneSegment(segment, filter, dataSourceCaches[i], fetchContexts[i])) {
+            selectedSegments.add(segment);

Review comment:
       Good point! Let me see if I can reduce the window between acquire and release




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] codecov-commenter commented on pull request #7654: Enhance ColumnValueSegmentPruner and support bloom filter prefetch

Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on pull request #7654:
URL: https://github.com/apache/pinot/pull/7654#issuecomment-953597423


   # [Codecov](https://codecov.io/gh/apache/pinot/pull/7654?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#7654](https://codecov.io/gh/apache/pinot/pull/7654?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (837d152) into [master](https://codecov.io/gh/apache/pinot/commit/3734ee2021a2141a359aaff28a8ec37345bc80f7?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (3734ee2) will **decrease** coverage by `3.05%`.
   > The diff coverage is `59.00%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/pinot/pull/7654/graphs/tree.svg?width=650&height=150&src=pr&token=4ibza2ugkz&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/pinot/pull/7654?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #7654      +/-   ##
   ============================================
   - Coverage     71.70%   68.65%   -3.06%     
   + Complexity     3998     3916      -82     
   ============================================
     Files          1576     1181     -395     
     Lines         79989    57710   -22279     
     Branches      11854     8865    -2989     
   ============================================
   - Hits          57357    39621   -17736     
   + Misses        18767    15278    -3489     
   + Partials       3865     2811    -1054     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | integration1 | `?` | |
   | integration2 | `?` | |
   | unittests1 | `68.65% <59.00%> (-0.05%)` | :arrow_down: |
   | unittests2 | `?` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/pinot/pull/7654?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...ot/core/query/pruner/ColumnValueSegmentPruner.java](https://codecov.io/gh/apache/pinot/pull/7654/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9xdWVyeS9wcnVuZXIvQ29sdW1uVmFsdWVTZWdtZW50UHJ1bmVyLmphdmE=) | `69.56% <59.00%> (-24.61%)` | :arrow_down: |
   | [...a/org/apache/pinot/common/metrics/MinionMeter.java](https://codecov.io/gh/apache/pinot/pull/7654/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vbWV0cmljcy9NaW5pb25NZXRlci5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...g/apache/pinot/common/metrics/ControllerMeter.java](https://codecov.io/gh/apache/pinot/pull/7654/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vbWV0cmljcy9Db250cm9sbGVyTWV0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [.../apache/pinot/common/metrics/BrokerQueryPhase.java](https://codecov.io/gh/apache/pinot/pull/7654/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vbWV0cmljcy9Ccm9rZXJRdWVyeVBoYXNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [.../apache/pinot/common/metrics/MinionQueryPhase.java](https://codecov.io/gh/apache/pinot/pull/7654/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vbWV0cmljcy9NaW5pb25RdWVyeVBoYXNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...he/pinot/common/messages/SegmentReloadMessage.java](https://codecov.io/gh/apache/pinot/pull/7654/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vbWVzc2FnZXMvU2VnbWVudFJlbG9hZE1lc3NhZ2UuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...pinot/core/data/manager/realtime/TimerService.java](https://codecov.io/gh/apache/pinot/pull/7654/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9kYXRhL21hbmFnZXIvcmVhbHRpbWUvVGltZXJTZXJ2aWNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...not/common/exception/HttpErrorStatusException.java](https://codecov.io/gh/apache/pinot/pull/7654/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vZXhjZXB0aW9uL0h0dHBFcnJvclN0YXR1c0V4Y2VwdGlvbi5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...t/core/startree/plan/StarTreeDocIdSetPlanNode.java](https://codecov.io/gh/apache/pinot/pull/7654/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9zdGFydHJlZS9wbGFuL1N0YXJUcmVlRG9jSWRTZXRQbGFuTm9kZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...ot/common/restlet/resources/TableMetadataInfo.java](https://codecov.io/gh/apache/pinot/pull/7654/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vcmVzdGxldC9yZXNvdXJjZXMvVGFibGVNZXRhZGF0YUluZm8uamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | ... and [598 more](https://codecov.io/gh/apache/pinot/pull/7654/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/pinot/pull/7654?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/pinot/pull/7654?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [3734ee2...837d152](https://codecov.io/gh/apache/pinot/pull/7654?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] npawar commented on a change in pull request #7654: Enhance ColumnValueSegmentPruner and support bloom filter prefetch

Posted by GitBox <gi...@apache.org>.
npawar commented on a change in pull request #7654:
URL: https://github.com/apache/pinot/pull/7654#discussion_r739432087



##########
File path: pinot-core/src/main/java/org/apache/pinot/core/query/pruner/ColumnValueSegmentPruner.java
##########
@@ -72,29 +80,127 @@ public void init(PinotConfiguration config) {
   }
 
   @Override
-  public boolean prune(IndexSegment segment, QueryContext query) {
+  public List<IndexSegment> prune(List<IndexSegment> segments, QueryContext query) {
+    if (segments.isEmpty()) {
+      return segments;
+    }
     FilterContext filter = query.getFilter();
     if (filter == null) {
-      return false;
+      return segments;
+    }
+
+    // Extract EQ/IN/RANGE predicate columns
+    Set<String> eqInColumns = new HashSet<>();
+    Set<String> rangeColumns = new HashSet<>();
+    extractPredicateColumns(filter, eqInColumns, rangeColumns);
+
+    if (eqInColumns.isEmpty() && rangeColumns.isEmpty()) {
+      return segments;
+    }
+
+    int numSegments = segments.size();
+    List<IndexSegment> selectedSegments = new ArrayList<>(numSegments);
+    if (!eqInColumns.isEmpty() && query.isEnablePrefetch()) {
+      Map[] dataSourceCaches = new Map[numSegments];
+      FetchContext[] fetchContexts = new FetchContext[numSegments];
+      try {
+        // Prefetch bloom filter for columns within the EQ/IN predicate if exists
+        for (int i = 0; i < numSegments; i++) {
+          IndexSegment segment = segments.get(i);
+          Map<String, DataSource> dataSourceCache = new HashMap<>();
+          Map<String, List<ColumnIndexType>> columnToIndexList = new HashMap<>();
+          for (String column : eqInColumns) {
+            DataSource dataSource = segment.getDataSource(column);
+            // NOTE: Column must exist after DataSchemaSegmentPruner
+            assert dataSource != null;
+            dataSourceCache.put(column, dataSource);
+            if (dataSource.getBloomFilter() != null) {
+              columnToIndexList.put(column, Collections.singletonList(ColumnIndexType.BLOOM_FILTER));
+            }
+          }
+          dataSourceCaches[i] = dataSourceCache;
+          if (!columnToIndexList.isEmpty()) {
+            FetchContext fetchContext =
+                new FetchContext(UUID.randomUUID(), segment.getSegmentName(), columnToIndexList);
+            segment.prefetch(fetchContext);
+            fetchContexts[i] = fetchContext;
+          }
+        }
+
+        // Prune segments
+        for (int i = 0; i < numSegments; i++) {
+          IndexSegment segment = segments.get(i);
+          if (!pruneSegment(segment, filter, dataSourceCaches[i], fetchContexts[i])) {
+            selectedSegments.add(segment);

Review comment:
       it looks like we'll acquire for all segments, and then later release for all. Unlike the segment execution where we are guaranteed to release an acquired segment. Will this not deadlock if all acquire are not able to complete?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] siddharthteotia commented on a change in pull request #7654: Enhance ColumnValueSegmentPruner and support bloom filter prefetch

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on a change in pull request #7654:
URL: https://github.com/apache/pinot/pull/7654#discussion_r738868832



##########
File path: pinot-core/src/main/java/org/apache/pinot/core/query/pruner/ColumnValueSegmentPruner.java
##########
@@ -164,7 +274,73 @@ private boolean pruneEqPredicate(IndexSegment segment, EqPredicate eqPredicate,
   }
 
   /**
-   * For RANGE predicate, prune the segment based on:
+   * For IN predicate, prune the segments based on:
+   * <ul>
+   *   <li>Column min/max value</li>
+   *   <li>Column bloom filter</li>
+   * </ul>
+   * <p>NOTE: segments will not be pruned if the number of values is greater than the threshold.
+   */
+  private boolean pruneInPredicate(IndexSegment segment, InPredicate inPredicate,

Review comment:
       Maybe this is just github thing that it is showing both functions as deleted and rewritten ? Apart from fetchcontext, nothing else seems to have changed




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org