You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pinot.apache.org by "xiangfu0 (via GitHub)" <gi...@apache.org> on 2023/02/11 08:34:28 UTC

[GitHub] [pinot] xiangfu0 opened a new pull request, #10270: Adding partition pruner in QueryContext

xiangfu0 opened a new pull request, #10270:
URL: https://github.com/apache/pinot/pull/10270

   This is the support for querying a specific data partition.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] 61yao commented on pull request #10270: Adding partition pruner in QueryContext

Posted by "61yao (via GitHub)" <gi...@apache.org>.
61yao commented on PR #10270:
URL: https://github.com/apache/pinot/pull/10270#issuecomment-1427155386

   Does the partition have some metadata/index to help further proving? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] gortiz commented on a diff in pull request #10270: Adding partition pruner in QueryContext

Posted by "gortiz (via GitHub)" <gi...@apache.org>.
gortiz commented on code in PR #10270:
URL: https://github.com/apache/pinot/pull/10270#discussion_r1112963482


##########
pinot-common/src/main/java/org/apache/pinot/common/utils/config/QueryOptionsUtils.java:
##########
@@ -190,4 +191,25 @@ public static Integer getGroupTrimThreshold(Map<String, String> queryOptions) {
     String groupByTrimThreshold = queryOptions.get(QueryOptionKey.GROUP_TRIM_THRESHOLD);
     return groupByTrimThreshold != null ? Integer.parseInt(groupByTrimThreshold) : null;
   }
+
+  // Sample input: k1:1/k1:2/k1:3/k2:4/k2:5/k2:6
+  // Sample output: {k1=[1, 2, 3], k2=[4, 5, 6]}
+  public static Map<String, Set<Integer>> getColumnPartitionMap(Map<String, String> queryOptions) {

Review Comment:
   Couldn't we use other standard format (like json) here? Or json without quotes or something like that. Adding more formats makes it complex for the writer and introduce new bugs. For example, this method does not allow to use `:` or `/` in the values.
   
   By using json without quotes we can implement the method with jackson (changing the config as explained [here](https://stackoverflow.com/questions/28788331/creating-json-without-quotes)) and the input would be something like:
   
   ```js
   {k1: [1, 2, 3], k2: [4,5,6]}
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] GSharayu commented on a diff in pull request #10270: Adding partition pruner in QueryContext

Posted by "GSharayu (via GitHub)" <gi...@apache.org>.
GSharayu commented on code in PR #10270:
URL: https://github.com/apache/pinot/pull/10270#discussion_r1119699319


##########
pinot-common/src/main/java/org/apache/pinot/common/utils/config/QueryOptionsUtils.java:
##########
@@ -190,4 +191,25 @@ public static Integer getGroupTrimThreshold(Map<String, String> queryOptions) {
     String groupByTrimThreshold = queryOptions.get(QueryOptionKey.GROUP_TRIM_THRESHOLD);
     return groupByTrimThreshold != null ? Integer.parseInt(groupByTrimThreshold) : null;
   }
+
+  // Sample input: k1:1/k1:2/k1:3/k2:4/k2:5/k2:6
+  // Sample output: {k1=[1, 2, 3], k2=[4, 5, 6]}
+  public static Map<String, Set<Integer>> getColumnPartitionMap(Map<String, String> queryOptions) {

Review Comment:
   +1, just curious why this format?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] xiangfu0 commented on pull request #10270: Adding partition pruner in QueryContext

Posted by "xiangfu0 (via GitHub)" <gi...@apache.org>.
xiangfu0 commented on PR #10270:
URL: https://github.com/apache/pinot/pull/10270#issuecomment-1499490100

   closed as not useful


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] codecov-commenter commented on pull request #10270: Adding partition pruner in QueryContext

Posted by "codecov-commenter (via GitHub)" <gi...@apache.org>.
codecov-commenter commented on PR #10270:
URL: https://github.com/apache/pinot/pull/10270#issuecomment-1426670666

   # [Codecov](https://codecov.io/gh/apache/pinot/pull/10270?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#10270](https://codecov.io/gh/apache/pinot/pull/10270?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (7904492) into [master](https://codecov.io/gh/apache/pinot/commit/cb952899d4472984665f6051d0dd14c08aee9811?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (cb95289) will **decrease** coverage by `45.79%`.
   > The diff coverage is `13.72%`.
   
   ```diff
   @@              Coverage Diff              @@
   ##             master   #10270       +/-   ##
   =============================================
   - Coverage     70.39%   24.61%   -45.79%     
   + Complexity     5771       48     -5723     
   =============================================
     Files          2017     2006       -11     
     Lines        109295   108993      -302     
     Branches      16615    16589       -26     
   =============================================
   - Hits          76940    26830    -50110     
   - Misses        26956    79348    +52392     
   + Partials       5399     2815     -2584     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | integration1 | `24.61% <13.72%> (-0.08%)` | :arrow_down: |
   | integration2 | `?` | |
   | unittests1 | `?` | |
   | unittests2 | `?` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/pinot/pull/10270?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...not/core/query/pruner/PartitionsSegmentPruner.java](https://codecov.io/gh/apache/pinot/pull/10270?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9xdWVyeS9wcnVuZXIvUGFydGl0aW9uc1NlZ21lbnRQcnVuZXIuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [.../pinot/core/query/pruner/SegmentPrunerService.java](https://codecov.io/gh/apache/pinot/pull/10270?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9xdWVyeS9wcnVuZXIvU2VnbWVudFBydW5lclNlcnZpY2UuamF2YQ==) | `76.78% <0.00%> (-6.55%)` | :arrow_down: |
   | [...not/core/query/pruner/SegmentPrunerStatistics.java](https://codecov.io/gh/apache/pinot/pull/10270?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9xdWVyeS9wcnVuZXIvU2VnbWVudFBydW5lclN0YXRpc3RpY3MuamF2YQ==) | `76.92% <0.00%> (-23.08%)` | :arrow_down: |
   | [...va/org/apache/pinot/spi/utils/CommonConstants.java](https://codecov.io/gh/apache/pinot/pull/10270?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvdXRpbHMvQ29tbW9uQ29uc3RhbnRzLmphdmE=) | `0.00% <ø> (-22.50%)` | :arrow_down: |
   | [...e/pinot/common/utils/config/QueryOptionsUtils.java](https://codecov.io/gh/apache/pinot/pull/10270?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vdXRpbHMvY29uZmlnL1F1ZXJ5T3B0aW9uc1V0aWxzLmphdmE=) | `60.56% <25.00%> (-7.24%)` | :arrow_down: |
   | [...pinot/core/query/request/context/QueryContext.java](https://codecov.io/gh/apache/pinot/pull/10270?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9xdWVyeS9yZXF1ZXN0L2NvbnRleHQvUXVlcnlDb250ZXh0LmphdmE=) | `94.03% <75.00%> (-5.03%)` | :arrow_down: |
   | [...pinot/core/query/pruner/SegmentPrunerProvider.java](https://codecov.io/gh/apache/pinot/pull/10270?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9xdWVyeS9wcnVuZXIvU2VnbWVudFBydW5lclByb3ZpZGVyLmphdmE=) | `64.28% <100.00%> (-20.33%)` | :arrow_down: |
   | [...in/java/org/apache/pinot/spi/utils/BytesUtils.java](https://codecov.io/gh/apache/pinot/pull/10270?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvdXRpbHMvQnl0ZXNVdGlscy5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...java/org/apache/pinot/spi/trace/BaseRecording.java](https://codecov.io/gh/apache/pinot/pull/10270?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvdHJhY2UvQmFzZVJlY29yZGluZy5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...java/org/apache/pinot/spi/trace/NoOpRecording.java](https://codecov.io/gh/apache/pinot/pull/10270?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvdHJhY2UvTm9PcFJlY29yZGluZy5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | ... and [1491 more](https://codecov.io/gh/apache/pinot/pull/10270?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   :mega: We’re building smart automated test selection to slash your CI/CD build times. [Learn more](https://about.codecov.io/iterative-testing/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] xiangfu0 closed pull request #10270: Adding partition pruner in QueryContext

Posted by "xiangfu0 (via GitHub)" <gi...@apache.org>.
xiangfu0 closed pull request #10270: Adding partition pruner in QueryContext
URL: https://github.com/apache/pinot/pull/10270


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] gortiz commented on a diff in pull request #10270: Adding partition pruner in QueryContext

Posted by "gortiz (via GitHub)" <gi...@apache.org>.
gortiz commented on code in PR #10270:
URL: https://github.com/apache/pinot/pull/10270#discussion_r1112964704


##########
pinot-common/src/main/java/org/apache/pinot/common/utils/config/QueryOptionsUtils.java:
##########
@@ -190,4 +191,25 @@ public static Integer getGroupTrimThreshold(Map<String, String> queryOptions) {
     String groupByTrimThreshold = queryOptions.get(QueryOptionKey.GROUP_TRIM_THRESHOLD);
     return groupByTrimThreshold != null ? Integer.parseInt(groupByTrimThreshold) : null;
   }
+
+  // Sample input: k1:1/k1:2/k1:3/k2:4/k2:5/k2:6
+  // Sample output: {k1=[1, 2, 3], k2=[4, 5, 6]}
+  public static Map<String, Set<Integer>> getColumnPartitionMap(Map<String, String> queryOptions) {
+    Map<String, Set<Integer>> columnPartitionMap = new HashMap<>();
+    String columnPartitionConfig = queryOptions.get(QueryOptionKey.COLUMN_PARTITION_MAP);
+    if (columnPartitionConfig != null) {
+      String[] columnPartitions = columnPartitionConfig.split("/");

Review Comment:
   If we insist to support this syntax, I would include a preprocess phase where `\/` means to escape a `/`, `\:` means to skip a `:` and `\\` means to skip a `\`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] ankitsultana commented on a diff in pull request #10270: Adding partition pruner in QueryContext

Posted by "ankitsultana (via GitHub)" <gi...@apache.org>.
ankitsultana commented on code in PR #10270:
URL: https://github.com/apache/pinot/pull/10270#discussion_r1108914529


##########
pinot-core/src/main/java/org/apache/pinot/core/query/pruner/PartitionsSegmentPruner.java:
##########
@@ -0,0 +1,82 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.core.query.pruner;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import org.apache.pinot.core.query.request.context.QueryContext;
+import org.apache.pinot.segment.spi.ColumnMetadata;
+import org.apache.pinot.segment.spi.IndexSegment;
+import org.apache.pinot.segment.spi.SegmentMetadata;
+import org.apache.pinot.spi.env.PinotConfiguration;
+
+
+/**
+ * The {@code PartitionSegmentPruner} is the segment pruner that prunes segments based on the partition info from the
+ * QueryContext.
+ */
+public class PartitionsSegmentPruner implements SegmentPruner {
+
+  @Override
+  public void init(PinotConfiguration config) {
+  }
+
+  @Override
+  public boolean isApplicableTo(QueryContext query) {
+    return query.getColumnPartitionMap() != null && !query.getColumnPartitionMap().isEmpty();
+  }
+
+  @Override
+  public List<IndexSegment> prune(List<IndexSegment> segments, QueryContext query) {

Review Comment:
   There are some caveats with using columnPartitionMap with multiple columns with realtime tables. If there's only 1 column in TableConfig::segmentPartitionConfig then a consuming segment assumes the partition for the configured column to be the same as the Kafka partition.
   
   If there is more than 1 column, then the columnPartitionMap for the segment is not set until the segment is committed.
   
   I think this pruner would always skip consuming segments when there's more than 1 column in the TableConfig::segmentPartitionConfig. Can you confirm?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org