You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2022/02/18 06:57:31 UTC

[GitHub] [pinot] kmozaid opened a new pull request #8224: Implemented BoundedColumnValue partition function

kmozaid opened a new pull request #8224:
URL: https://github.com/apache/pinot/pull/8224


   ## Description
   This PR adds a new implementation of `PartitionFunction` which is used to partition segments. The new partition function named `BoundedColumnValue` can be used to partition segments on column values and still generating partitionId of integer type.
   
   Example Usage -
   ```
     "segmentPartitionConfig": {
       "columnPartitionMap": {
         "subject": {
           "functionName": "BoundedColumnValue",
           "functionConfig": {
             "columnValues": "Maths|English|Chemistry"
           }
         }
       }
   ```
   
   PartitionId is generated based on position in columnValues. PartitionId would 1 for Maths, 2 for English and so on.
   PartitionId 0 is reserved for any other subject which are not present in given `columnValues`. The different column values can be specified using pipe (`|`) separation. This additional partition `functionConfig` is persisted in metadata.properties and segment metadata in zookeeper.  Broker can also use this function to prune segments.
   
   ## Upgrade Notes
   Does this PR prevent a zero down-time upgrade? (Assume upgrade order: Controller, Broker, Server, Minion)
   * [ ] No
   
   Does this PR fix a zero-downtime upgrade introduced earlier?
   * [ ] No
   
   Does this PR otherwise need attention when creating release notes? Things to consider:
   - New configuration options
   - Deprecation of configurations
   - Signature changes to public methods/interfaces
   - New plugins added or old plugins removed
   * [ ] Yes, Adds new configuration option and changes to public interfaces.
   ## Release Notes
   Added new partition function called `BoundedColumnValue` to be able to partition segments based on column value.
   
   <!-- If you have a series of commits adding or enabling a feature, then
   add this section only in final commit that marks the feature completed.
   Refer to earlier release notes to see examples of text.
   -->
   ## Documentation
   Yes, will create another PR in pinot-docs repo.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] codecov-commenter edited a comment on pull request #8224: Implemented BoundedColumnValue partition function

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #8224:
URL: https://github.com/apache/pinot/pull/8224#issuecomment-1044093075


   # [Codecov](https://codecov.io/gh/apache/pinot/pull/8224?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#8224](https://codecov.io/gh/apache/pinot/pull/8224?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (c876382) into [master](https://codecov.io/gh/apache/pinot/commit/de1cdf2190cf91e53f947735cd364a5fe6125829?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (de1cdf2) will **decrease** coverage by `56.90%`.
   > The diff coverage is `4.10%`.
   
   > :exclamation: Current head c876382 differs from pull request most recent head c840ebf. Consider uploading reports for the commit c840ebf to get more accurate results
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/pinot/pull/8224/graphs/tree.svg?width=650&height=150&src=pr&token=4ibza2ugkz&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/pinot/pull/8224?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@              Coverage Diff              @@
   ##             master    #8224       +/-   ##
   =============================================
   - Coverage     70.98%   14.08%   -56.91%     
   + Complexity     4313       81     -4232     
   =============================================
     Files          1626     1583       -43     
     Lines         85004    83177     -1827     
     Branches      12791    12599      -192     
   =============================================
   - Hits          60340    11715    -48625     
   - Misses        20514    70589    +50075     
   + Partials       4150      873     -3277     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | integration1 | `?` | |
   | integration2 | `?` | |
   | unittests1 | `?` | |
   | unittests2 | `14.08% <4.10%> (-0.02%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/pinot/pull/8224?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [.../core/realtime/PinotLLCRealtimeSegmentManager.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29udHJvbGxlci9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29udHJvbGxlci9oZWxpeC9jb3JlL3JlYWx0aW1lL1Bpbm90TExDUmVhbHRpbWVTZWdtZW50TWFuYWdlci5qYXZh) | `66.10% <0.00%> (-11.87%)` | :arrow_down: |
   | [...manager/realtime/LLRealtimeSegmentDataManager.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9kYXRhL21hbmFnZXIvcmVhbHRpbWUvTExSZWFsdGltZVNlZ21lbnREYXRhTWFuYWdlci5qYXZh) | `0.00% <0.00%> (-71.49%)` | :arrow_down: |
   | [...processing/partitioner/TableConfigPartitioner.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9zZWdtZW50L3Byb2Nlc3NpbmcvcGFydGl0aW9uZXIvVGFibGVDb25maWdQYXJ0aXRpb25lci5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...ocal/indexsegment/mutable/IntermediateSegment.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9pbmRleHNlZ21lbnQvbXV0YWJsZS9JbnRlcm1lZGlhdGVTZWdtZW50LmphdmE=) | `0.00% <0.00%> (-70.07%)` | :arrow_down: |
   | [...local/indexsegment/mutable/MutableSegmentImpl.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9pbmRleHNlZ21lbnQvbXV0YWJsZS9NdXRhYmxlU2VnbWVudEltcGwuamF2YQ==) | `0.00% <0.00%> (-58.70%)` | :arrow_down: |
   | [...ltime/converter/stats/MutableColumnStatistics.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9yZWFsdGltZS9jb252ZXJ0ZXIvc3RhdHMvTXV0YWJsZUNvbHVtblN0YXRpc3RpY3MuamF2YQ==) | `0.00% <0.00%> (-55.00%)` | :arrow_down: |
   | [...verter/stats/MutableNoDictionaryColStatistics.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9yZWFsdGltZS9jb252ZXJ0ZXIvc3RhdHMvTXV0YWJsZU5vRGljdGlvbmFyeUNvbFN0YXRpc3RpY3MuamF2YQ==) | `0.00% <0.00%> (-45.00%)` | :arrow_down: |
   | [...ment/creator/impl/SegmentColumnarIndexCreator.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9zZWdtZW50L2NyZWF0b3IvaW1wbC9TZWdtZW50Q29sdW1uYXJJbmRleENyZWF0b3IuamF2YQ==) | `0.00% <0.00%> (-86.33%)` | :arrow_down: |
   | [.../impl/stats/AbstractColumnStatisticsCollector.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9zZWdtZW50L2NyZWF0b3IvaW1wbC9zdGF0cy9BYnN0cmFjdENvbHVtblN0YXRpc3RpY3NDb2xsZWN0b3IuamF2YQ==) | `0.00% <0.00%> (-92.50%)` | :arrow_down: |
   | [.../loader/defaultcolumn/DefaultColumnStatistics.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9zZWdtZW50L2luZGV4L2xvYWRlci9kZWZhdWx0Y29sdW1uL0RlZmF1bHRDb2x1bW5TdGF0aXN0aWNzLmphdmE=) | `0.00% <0.00%> (-76.00%)` | :arrow_down: |
   | ... and [1318 more](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/pinot/pull/8224?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/pinot/pull/8224?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [de1cdf2...c840ebf](https://codecov.io/gh/apache/pinot/pull/8224?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] kmozaid commented on a change in pull request #8224: Implemented BoundedColumnValue partition function

Posted by GitBox <gi...@apache.org>.
kmozaid commented on a change in pull request #8224:
URL: https://github.com/apache/pinot/pull/8224#discussion_r811671607



##########
File path: pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/partition/metadata/ColumnPartitionMetadata.java
##########
@@ -52,11 +57,14 @@
    * @param functionName Name of the partition function
    * @param numPartitions Number of total partitions for this column
    * @param partitions Set of partitions the column contains
+   * @param functionConfig Configuration required by partition function.
    */
-  public ColumnPartitionMetadata(String functionName, int numPartitions, Set<Integer> partitions) {
+  public ColumnPartitionMetadata(String functionName, int numPartitions, Set<Integer> partitions,
+                                 Map<String, String> functionConfig) {

Review comment:
       @Jackie-Jiang  Should I another constructor for `ColumnPartitionMetadata` class as well? `functionConfig` is always passed when column metadata properties is read.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] kmozaid commented on a change in pull request #8224: Implemented BoundedColumnValue partition function

Posted by GitBox <gi...@apache.org>.
kmozaid commented on a change in pull request #8224:
URL: https://github.com/apache/pinot/pull/8224#discussion_r811731096



##########
File path: pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/index/metadata/ColumnMetadataImpl.java
##########
@@ -279,8 +284,26 @@ public static ColumnMetadataImpl fromPropertiesConfiguration(String column, Prop
     String partitionFunctionName = config.getString(Column.getKeyFor(column, Column.PARTITION_FUNCTION), null);
     if (partitionFunctionName != null) {
       int numPartitions = config.getInt(Column.getKeyFor(column, Column.NUM_PARTITIONS));
+      Map<String, String> partitionFunctionConfigMap = null;
+      Configuration partitionFunctionConfig = config.subset(Column.getKeyFor(column, Column.PARTITION_FUNCTION_CONFIG));
+      if (!partitionFunctionConfig.isEmpty()) {
+        partitionFunctionConfigMap = new HashMap<>();
+        Iterator<String> partitionFunctionConfigKeysIter = partitionFunctionConfig.getKeys();
+        while (partitionFunctionConfigKeysIter.hasNext()) {
+          String functionConfigKey = partitionFunctionConfigKeysIter.next();
+          Object functionConfigValueObj = partitionFunctionConfig.getProperty(functionConfigKey);
+          String valueStr;
+          if (functionConfigValueObj instanceof List) {

Review comment:
       When `,` is used as column value separator, the value read from properties is of type List, hence this handling.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] kmozaid commented on a change in pull request #8224: Implemented BoundedColumnValue partition function

Posted by GitBox <gi...@apache.org>.
kmozaid commented on a change in pull request #8224:
URL: https://github.com/apache/pinot/pull/8224#discussion_r814430261



##########
File path: pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/creator/StatsCollectorConfig.java
##########
@@ -76,6 +77,10 @@ public int getNumPartitions(String column) {
         : SegmentPartitionConfig.INVALID_NUM_PARTITIONS;
   }
 
+  public Map<String, String> getFunctionConfig(String column) {

Review comment:
       done.

##########
File path: pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/index/metadata/ColumnMetadataImpl.java
##########
@@ -279,8 +284,26 @@ public static ColumnMetadataImpl fromPropertiesConfiguration(String column, Prop
     String partitionFunctionName = config.getString(Column.getKeyFor(column, Column.PARTITION_FUNCTION), null);
     if (partitionFunctionName != null) {
       int numPartitions = config.getInt(Column.getKeyFor(column, Column.NUM_PARTITIONS));
+      Map<String, String> partitionFunctionConfigMap = null;
+      Configuration partitionFunctionConfig = config.subset(Column.getKeyFor(column, Column.PARTITION_FUNCTION_CONFIG));
+      if (!partitionFunctionConfig.isEmpty()) {
+        partitionFunctionConfigMap = new HashMap<>();
+        Iterator<String> partitionFunctionConfigKeysIter = partitionFunctionConfig.getKeys();
+        while (partitionFunctionConfigKeysIter.hasNext()) {
+          String functionConfigKey = partitionFunctionConfigKeysIter.next();
+          Object functionConfigValueObj = partitionFunctionConfig.getProperty(functionConfigKey);
+          String valueStr;
+          if (functionConfigValueObj instanceof List) {

Review comment:
       done.

##########
File path: pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/partition/BoundedColumnValuePartitionFunction.java
##########
@@ -0,0 +1,96 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.segment.spi.partition;
+
+import com.google.common.base.Preconditions;
+import java.util.Map;
+
+
+/**
+ * Implementation of {@link PartitionFunction} which partitions based configured column values.
+ *
+ * "columnPartitionMap": {
+ *   "subject": {
+ *     "functionName": "BoundedColumnValue",
+ *     "functionConfig": {
+ *       "columnValues": "Maths|English|Chemistry"

Review comment:
       done.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] kmozaid commented on a change in pull request #8224: Implemented BoundedColumnValue partition function

Posted by GitBox <gi...@apache.org>.
kmozaid commented on a change in pull request #8224:
URL: https://github.com/apache/pinot/pull/8224#discussion_r814430875



##########
File path: pinot-spi/src/main/java/org/apache/pinot/spi/config/table/ColumnPartitionConfig.java
##########
@@ -21,20 +21,29 @@
 import com.fasterxml.jackson.annotation.JsonCreator;
 import com.fasterxml.jackson.annotation.JsonProperty;
 import com.google.common.base.Preconditions;
+import java.util.Map;
+import javax.annotation.Nullable;
 import org.apache.pinot.spi.config.BaseJsonConfig;
 
 
 public class ColumnPartitionConfig extends BaseJsonConfig {
   private final String _functionName;
   private final int _numPartitions;
+  private final Map<String, String> _functionConfig;
+
+  public ColumnPartitionConfig(String functionName, int numPartitions) {
+    this(functionName, numPartitions, null);
+  }
 
   @JsonCreator
   public ColumnPartitionConfig(@JsonProperty(value = "functionName", required = true) String functionName,
-      @JsonProperty(value = "numPartitions", required = true) int numPartitions) {
+      @JsonProperty(value = "numPartitions", required = true) int numPartitions,
+      @JsonProperty(value = "functionConfig") Map<String, String> functionConfig) {

Review comment:
       done.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] kmozaid commented on a change in pull request #8224: Implemented BoundedColumnValue partition function

Posted by GitBox <gi...@apache.org>.
kmozaid commented on a change in pull request #8224:
URL: https://github.com/apache/pinot/pull/8224#discussion_r811673150



##########
File path: pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/partition/BoundedColumnValuePartitionFunction.java
##########
@@ -0,0 +1,92 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.segment.spi.partition;
+
+import com.google.common.base.Preconditions;
+import java.util.Map;
+
+
+/**
+ * Implementation of {@link PartitionFunction} which partitions based configured column values.
+ *
+ * "columnPartitionMap": {
+ *   "subject": {
+ *     "functionName": "BoundedColumnValue",
+ *     "functionConfig": {
+ *       "columnValues": "Maths|English|Chemistry"
+ *     }
+ *   }
+ * }
+ * With this partition config on column "subject", partitionId would be 1 for Maths, 2 for English and so on.
+ * partitionId would be "0" for all other values which is not configured but may occur.
+ */
+public class BoundedColumnValuePartitionFunction implements PartitionFunction {
+  private static final int DEFAULT_PARTITION_ID = 0;
+  private static final String NAME = "BoundedColumnValue";
+  private static final String COLUMN_VALUES = "columnValues";
+  private final Map<String, String> _functionConfig;
+  private final String[] _values;

Review comment:
       I kept it array because arrays size would not be much and HashMap would only add extra memory overhead. I can update it to use Map if you think that would be faster even with small size of array :)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] kmozaid commented on pull request #8224: Implemented BoundedColumnValue partition function

Posted by GitBox <gi...@apache.org>.
kmozaid commented on pull request #8224:
URL: https://github.com/apache/pinot/pull/8224#issuecomment-1044436343


   > Hi @kmozaid thanks for taking the time to make this contribution. Can you explain what problem this solves? Is it because you already have a partitioning and you want to maintain locality within partitions?
   
   Hi @richardstartin , We have a table where data is being ingested from multiple sources. (these multiple sources pushes data to same kafka topic). Data is kept for 5 days in realtime table and then moved offline table by minion task. We want to keep data from these sources in separate segments for offline table. There is a column which identifies the source. `BoundedColumnValue` partition function provides capability to keep data from different sources in respective partitioned segments. Later if we want to backfill the data of just one source, then we will be able to do so because we would know what are the segments for given source and replace them by backfill. The main use case is to be able to backfill data of particular source. This is also discussed in slack thread - https://apache-pinot.slack.com/archives/CDRCA57FC/p1643286670255700
   
   <img width="982" alt="image" src="https://user-images.githubusercontent.com/8354145/154681296-cf689529-26fc-4d1f-a983-70e18db4f4bc.png">
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] richardstartin commented on pull request #8224: Implemented BoundedColumnValue partition function

Posted by GitBox <gi...@apache.org>.
richardstartin commented on pull request #8224:
URL: https://github.com/apache/pinot/pull/8224#issuecomment-1044376381


   Hi @kmozaid thanks for taking the time to make this contribution. Can you explain what problem this solves? Is it because  you already have a partitioning and you want to maintain locality within partitions?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] codecov-commenter commented on pull request #8224: Implemented BoundedColumnValue partition function

Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on pull request #8224:
URL: https://github.com/apache/pinot/pull/8224#issuecomment-1044093075


   # [Codecov](https://codecov.io/gh/apache/pinot/pull/8224?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#8224](https://codecov.io/gh/apache/pinot/pull/8224?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (bec3162) into [master](https://codecov.io/gh/apache/pinot/commit/de1cdf2190cf91e53f947735cd364a5fe6125829?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (de1cdf2) will **decrease** coverage by `42.18%`.
   > The diff coverage is `7.04%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/pinot/pull/8224/graphs/tree.svg?width=650&height=150&src=pr&token=4ibza2ugkz&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/pinot/pull/8224?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@              Coverage Diff              @@
   ##             master    #8224       +/-   ##
   =============================================
   - Coverage     70.98%   28.80%   -42.19%     
   =============================================
     Files          1626     1616       -10     
     Lines         85004    84704      -300     
     Branches      12791    12765       -26     
   =============================================
   - Hits          60340    24395    -35945     
   - Misses        20514    58075    +37561     
   + Partials       4150     2234     -1916     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | integration1 | `28.80% <7.04%> (+0.10%)` | :arrow_up: |
   | integration2 | `?` | |
   | unittests1 | `?` | |
   | unittests2 | `?` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/pinot/pull/8224?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...ot/controller/helix/core/util/ZKMetadataUtils.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29udHJvbGxlci9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29udHJvbGxlci9oZWxpeC9jb3JlL3V0aWwvWktNZXRhZGF0YVV0aWxzLmphdmE=) | `67.85% <0.00%> (-32.15%)` | :arrow_down: |
   | [...processing/partitioner/TableConfigPartitioner.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9zZWdtZW50L3Byb2Nlc3NpbmcvcGFydGl0aW9uZXIvVGFibGVDb25maWdQYXJ0aXRpb25lci5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...ocal/indexsegment/mutable/IntermediateSegment.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9pbmRleHNlZ21lbnQvbXV0YWJsZS9JbnRlcm1lZGlhdGVTZWdtZW50LmphdmE=) | `0.00% <0.00%> (-70.07%)` | :arrow_down: |
   | [...local/indexsegment/mutable/MutableSegmentImpl.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9pbmRleHNlZ21lbnQvbXV0YWJsZS9NdXRhYmxlU2VnbWVudEltcGwuamF2YQ==) | `0.00% <0.00%> (-58.70%)` | :arrow_down: |
   | [...ltime/converter/stats/MutableColumnStatistics.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9yZWFsdGltZS9jb252ZXJ0ZXIvc3RhdHMvTXV0YWJsZUNvbHVtblN0YXRpc3RpY3MuamF2YQ==) | `0.00% <0.00%> (-55.00%)` | :arrow_down: |
   | [...verter/stats/MutableNoDictionaryColStatistics.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9yZWFsdGltZS9jb252ZXJ0ZXIvc3RhdHMvTXV0YWJsZU5vRGljdGlvbmFyeUNvbFN0YXRpc3RpY3MuamF2YQ==) | `0.00% <0.00%> (-45.00%)` | :arrow_down: |
   | [...ment/creator/impl/SegmentColumnarIndexCreator.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9zZWdtZW50L2NyZWF0b3IvaW1wbC9TZWdtZW50Q29sdW1uYXJJbmRleENyZWF0b3IuamF2YQ==) | `0.00% <0.00%> (-86.33%)` | :arrow_down: |
   | [.../impl/stats/AbstractColumnStatisticsCollector.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9zZWdtZW50L2NyZWF0b3IvaW1wbC9zdGF0cy9BYnN0cmFjdENvbHVtblN0YXRpc3RpY3NDb2xsZWN0b3IuamF2YQ==) | `0.00% <0.00%> (-92.50%)` | :arrow_down: |
   | [.../loader/defaultcolumn/DefaultColumnStatistics.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9zZWdtZW50L2luZGV4L2xvYWRlci9kZWZhdWx0Y29sdW1uL0RlZmF1bHRDb2x1bW5TdGF0aXN0aWNzLmphdmE=) | `0.00% <0.00%> (-76.00%)` | :arrow_down: |
   | [...java/org/apache/pinot/segment/spi/V1Constants.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1zcGkvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL3Bpbm90L3NlZ21lbnQvc3BpL1YxQ29uc3RhbnRzLmphdmE=) | `0.00% <ø> (-14.29%)` | :arrow_down: |
   | ... and [1192 more](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/pinot/pull/8224?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/pinot/pull/8224?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [de1cdf2...bec3162](https://codecov.io/gh/apache/pinot/pull/8224?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] Jackie-Jiang commented on a change in pull request #8224: Implemented BoundedColumnValue partition function

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang commented on a change in pull request #8224:
URL: https://github.com/apache/pinot/pull/8224#discussion_r811505110



##########
File path: pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/creator/ColumnStatistics.java
##########
@@ -98,5 +99,7 @@ default int getMaxRowLengthInBytes() {
 
   int getNumPartitions();
 
+  Map<String, String> getFunctionConfig();

Review comment:
       ```suggestion
     Map<String, String> getPartitionFunctionConfig();
   ```

##########
File path: pinot-spi/src/main/java/org/apache/pinot/spi/config/table/ColumnPartitionConfig.java
##########
@@ -46,6 +51,15 @@ public String getFunctionName() {
     return _functionName;
   }
 
+  /**
+   * Returns the partition function configuration for the column.
+   *
+   * @return Partition function configuration.
+   */
+  public Map<String, String> getFunctionConfig() {

Review comment:
       Annotate it with `@Nullabel`
   ```suggestion
     @Nullable
     public Map<String, String> getFunctionConfig() {
   ```

##########
File path: pinot-broker/src/main/java/org/apache/pinot/broker/routing/segmentpruner/PartitionSegmentPruner.java
##########
@@ -126,7 +126,8 @@ private PartitionInfo extractPartitionInfoFromSegmentZKMetadataZNRecord(String s
     }
 
     return new PartitionInfo(PartitionFunctionFactory.getPartitionFunction(columnPartitionMetadata.getFunctionName(),
-        columnPartitionMetadata.getNumPartitions()), columnPartitionMetadata.getPartitions());
+        columnPartitionMetadata.getNumPartitions(), columnPartitionMetadata.getFunctionConfig()),
+            columnPartitionMetadata.getPartitions());

Review comment:
       (code format) Can you please setup the [Pinot Style](https://docs.pinot.apache.org/developers/developers-and-contributors/code-setup#intellij) and reformat the changes?

##########
File path: pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/partition/BoundedColumnValuePartitionFunction.java
##########
@@ -0,0 +1,92 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.segment.spi.partition;
+
+import com.google.common.base.Preconditions;
+import java.util.Map;
+
+
+/**
+ * Implementation of {@link PartitionFunction} which partitions based configured column values.
+ *
+ * "columnPartitionMap": {
+ *   "subject": {
+ *     "functionName": "BoundedColumnValue",
+ *     "functionConfig": {
+ *       "columnValues": "Maths|English|Chemistry"
+ *     }
+ *   }
+ * }
+ * With this partition config on column "subject", partitionId would be 1 for Maths, 2 for English and so on.
+ * partitionId would be "0" for all other values which is not configured but may occur.
+ */
+public class BoundedColumnValuePartitionFunction implements PartitionFunction {
+  private static final int DEFAULT_PARTITION_ID = 0;
+  private static final String NAME = "BoundedColumnValue";
+  private static final String COLUMN_VALUES = "columnValues";
+  private final Map<String, String> _functionConfig;
+  private final String[] _values;
+  private final int _noOfValues;
+
+  public BoundedColumnValuePartitionFunction(Map<String, String> functionConfig) {
+    Preconditions.checkArgument(functionConfig != null && functionConfig.size() > 0,
+        "functionConfig should be present, specified", functionConfig);
+    Preconditions.checkState(functionConfig.get(COLUMN_VALUES) != null, "columnValues must be configured");
+    _functionConfig = functionConfig;
+    _values = functionConfig.get(COLUMN_VALUES).split("\\|");
+    _noOfValues = _values.length;
+  }
+
+  @Override
+  public int getPartition(Object valueIn) {
+    String value = valueIn instanceof String ? (String) valueIn : String.valueOf(valueIn);

Review comment:
       (minor) Same as `value.toString()`

##########
File path: pinot-spi/src/main/java/org/apache/pinot/spi/config/table/ColumnPartitionConfig.java
##########
@@ -21,20 +21,25 @@
 import com.fasterxml.jackson.annotation.JsonCreator;
 import com.fasterxml.jackson.annotation.JsonProperty;
 import com.google.common.base.Preconditions;
+import java.util.Map;
 import org.apache.pinot.spi.config.BaseJsonConfig;
 
 
 public class ColumnPartitionConfig extends BaseJsonConfig {
   private final String _functionName;
   private final int _numPartitions;
+  private final Map<String, String> _functionConfig;
 
   @JsonCreator
   public ColumnPartitionConfig(@JsonProperty(value = "functionName", required = true) String functionName,
-      @JsonProperty(value = "numPartitions", required = true) int numPartitions) {
+      @JsonProperty(value = "numPartitions", required = false) int numPartitions,

Review comment:
       Suggest not changing this to optional because it is used for segment management. Within the `BoundedColumnValuePartitionFunction`, we can validate whether it matches the column values provided

##########
File path: pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/partition/BoundedColumnValuePartitionFunction.java
##########
@@ -0,0 +1,92 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.segment.spi.partition;
+
+import com.google.common.base.Preconditions;
+import java.util.Map;
+
+
+/**
+ * Implementation of {@link PartitionFunction} which partitions based configured column values.
+ *
+ * "columnPartitionMap": {
+ *   "subject": {
+ *     "functionName": "BoundedColumnValue",
+ *     "functionConfig": {
+ *       "columnValues": "Maths|English|Chemistry"
+ *     }
+ *   }
+ * }
+ * With this partition config on column "subject", partitionId would be 1 for Maths, 2 for English and so on.
+ * partitionId would be "0" for all other values which is not configured but may occur.
+ */
+public class BoundedColumnValuePartitionFunction implements PartitionFunction {
+  private static final int DEFAULT_PARTITION_ID = 0;
+  private static final String NAME = "BoundedColumnValue";
+  private static final String COLUMN_VALUES = "columnValues";
+  private final Map<String, String> _functionConfig;
+  private final String[] _values;
+  private final int _noOfValues;
+
+  public BoundedColumnValuePartitionFunction(Map<String, String> functionConfig) {
+    Preconditions.checkArgument(functionConfig != null && functionConfig.size() > 0,
+        "functionConfig should be present, specified", functionConfig);
+    Preconditions.checkState(functionConfig.get(COLUMN_VALUES) != null, "columnValues must be configured");
+    _functionConfig = functionConfig;
+    _values = functionConfig.get(COLUMN_VALUES).split("\\|");

Review comment:
       Suggest making the delimiter configurable to support values with `|`

##########
File path: pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/partition/BoundedColumnValuePartitionFunction.java
##########
@@ -0,0 +1,92 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.segment.spi.partition;
+
+import com.google.common.base.Preconditions;
+import java.util.Map;
+
+
+/**
+ * Implementation of {@link PartitionFunction} which partitions based configured column values.
+ *
+ * "columnPartitionMap": {
+ *   "subject": {
+ *     "functionName": "BoundedColumnValue",
+ *     "functionConfig": {
+ *       "columnValues": "Maths|English|Chemistry"
+ *     }
+ *   }
+ * }
+ * With this partition config on column "subject", partitionId would be 1 for Maths, 2 for English and so on.
+ * partitionId would be "0" for all other values which is not configured but may occur.
+ */
+public class BoundedColumnValuePartitionFunction implements PartitionFunction {
+  private static final int DEFAULT_PARTITION_ID = 0;
+  private static final String NAME = "BoundedColumnValue";
+  private static final String COLUMN_VALUES = "columnValues";
+  private final Map<String, String> _functionConfig;
+  private final String[] _values;

Review comment:
       Make it a map to accelerate the lookup?

##########
File path: pinot-spi/src/main/java/org/apache/pinot/spi/config/table/ColumnPartitionConfig.java
##########
@@ -21,20 +21,25 @@
 import com.fasterxml.jackson.annotation.JsonCreator;
 import com.fasterxml.jackson.annotation.JsonProperty;
 import com.google.common.base.Preconditions;
+import java.util.Map;
 import org.apache.pinot.spi.config.BaseJsonConfig;
 
 
 public class ColumnPartitionConfig extends BaseJsonConfig {
   private final String _functionName;
   private final int _numPartitions;
+  private final Map<String, String> _functionConfig;
 
   @JsonCreator
   public ColumnPartitionConfig(@JsonProperty(value = "functionName", required = true) String functionName,

Review comment:
       Let's add another constructor without `functionConfig`, so that we don't need to change the existing usage of this method

##########
File path: pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/partition/BoundedColumnValuePartitionFunction.java
##########
@@ -0,0 +1,92 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.segment.spi.partition;
+
+import com.google.common.base.Preconditions;
+import java.util.Map;
+
+
+/**
+ * Implementation of {@link PartitionFunction} which partitions based configured column values.
+ *
+ * "columnPartitionMap": {
+ *   "subject": {
+ *     "functionName": "BoundedColumnValue",
+ *     "functionConfig": {
+ *       "columnValues": "Maths|English|Chemistry"
+ *     }
+ *   }
+ * }
+ * With this partition config on column "subject", partitionId would be 1 for Maths, 2 for English and so on.

Review comment:
       Should we consider putting other values as the last partition in case all values are already configured, and no value goes into partition 0?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] kmozaid commented on a change in pull request #8224: Implemented BoundedColumnValue partition function

Posted by GitBox <gi...@apache.org>.
kmozaid commented on a change in pull request #8224:
URL: https://github.com/apache/pinot/pull/8224#discussion_r811665339



##########
File path: pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/creator/ColumnStatistics.java
##########
@@ -98,5 +99,7 @@ default int getMaxRowLengthInBytes() {
 
   int getNumPartitions();
 
+  Map<String, String> getFunctionConfig();

Review comment:
       done.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] codecov-commenter edited a comment on pull request #8224: Implemented BoundedColumnValue partition function

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #8224:
URL: https://github.com/apache/pinot/pull/8224#issuecomment-1044093075


   # [Codecov](https://codecov.io/gh/apache/pinot/pull/8224?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#8224](https://codecov.io/gh/apache/pinot/pull/8224?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (f2f131f) into [master](https://codecov.io/gh/apache/pinot/commit/fa0db64facb0f3102ebfe9052fdd83db9f532ce7?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (fa0db64) will **decrease** coverage by `49.88%`.
   > The diff coverage is `3.89%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/pinot/pull/8224/graphs/tree.svg?width=650&height=150&src=pr&token=4ibza2ugkz&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/pinot/pull/8224?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@              Coverage Diff              @@
   ##             master    #8224       +/-   ##
   =============================================
   - Coverage     63.98%   14.10%   -49.89%     
   + Complexity     4239       81     -4158     
   =============================================
     Files          1584     1586        +2     
     Lines         83250    83394      +144     
     Branches      12608    12640       +32     
   =============================================
   - Hits          53268    11761    -41507     
   - Misses        26144    70756    +44612     
   + Partials       3838      877     -2961     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | unittests1 | `?` | |
   | unittests2 | `14.10% <3.89%> (+0.01%)` | :arrow_up: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/pinot/pull/8224?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [.../core/realtime/PinotLLCRealtimeSegmentManager.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29udHJvbGxlci9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29udHJvbGxlci9oZWxpeC9jb3JlL3JlYWx0aW1lL1Bpbm90TExDUmVhbHRpbWVTZWdtZW50TWFuYWdlci5qYXZh) | `66.10% <0.00%> (ø)` | |
   | [...manager/realtime/LLRealtimeSegmentDataManager.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9kYXRhL21hbmFnZXIvcmVhbHRpbWUvTExSZWFsdGltZVNlZ21lbnREYXRhTWFuYWdlci5qYXZh) | `0.00% <0.00%> (-45.45%)` | :arrow_down: |
   | [...processing/partitioner/TableConfigPartitioner.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9zZWdtZW50L3Byb2Nlc3NpbmcvcGFydGl0aW9uZXIvVGFibGVDb25maWdQYXJ0aXRpb25lci5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...ocal/indexsegment/mutable/IntermediateSegment.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9pbmRleHNlZ21lbnQvbXV0YWJsZS9JbnRlcm1lZGlhdGVTZWdtZW50LmphdmE=) | `0.00% <0.00%> (-70.07%)` | :arrow_down: |
   | [...ltime/converter/stats/MutableColumnStatistics.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9yZWFsdGltZS9jb252ZXJ0ZXIvc3RhdHMvTXV0YWJsZUNvbHVtblN0YXRpc3RpY3MuamF2YQ==) | `0.00% <0.00%> (-55.00%)` | :arrow_down: |
   | [...verter/stats/MutableNoDictionaryColStatistics.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9yZWFsdGltZS9jb252ZXJ0ZXIvc3RhdHMvTXV0YWJsZU5vRGljdGlvbmFyeUNvbFN0YXRpc3RpY3MuamF2YQ==) | `0.00% <0.00%> (-45.00%)` | :arrow_down: |
   | [...ment/creator/impl/SegmentColumnarIndexCreator.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9zZWdtZW50L2NyZWF0b3IvaW1wbC9TZWdtZW50Q29sdW1uYXJJbmRleENyZWF0b3IuamF2YQ==) | `0.00% <0.00%> (-86.33%)` | :arrow_down: |
   | [.../impl/stats/AbstractColumnStatisticsCollector.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9zZWdtZW50L2NyZWF0b3IvaW1wbC9zdGF0cy9BYnN0cmFjdENvbHVtblN0YXRpc3RpY3NDb2xsZWN0b3IuamF2YQ==) | `0.00% <0.00%> (-92.50%)` | :arrow_down: |
   | [.../loader/defaultcolumn/DefaultColumnStatistics.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9zZWdtZW50L2luZGV4L2xvYWRlci9kZWZhdWx0Y29sdW1uL0RlZmF1bHRDb2x1bW5TdGF0aXN0aWNzLmphdmE=) | `0.00% <0.00%> (-76.00%)` | :arrow_down: |
   | [...java/org/apache/pinot/segment/spi/V1Constants.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1zcGkvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL3Bpbm90L3NlZ21lbnQvc3BpL1YxQ29uc3RhbnRzLmphdmE=) | `0.00% <ø> (-14.29%)` | :arrow_down: |
   | ... and [1110 more](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/pinot/pull/8224?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/pinot/pull/8224?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [fa0db64...f2f131f](https://codecov.io/gh/apache/pinot/pull/8224?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] kmozaid commented on a change in pull request #8224: Implemented BoundedColumnValue partition function

Posted by GitBox <gi...@apache.org>.
kmozaid commented on a change in pull request #8224:
URL: https://github.com/apache/pinot/pull/8224#discussion_r811665777



##########
File path: pinot-spi/src/main/java/org/apache/pinot/spi/config/table/ColumnPartitionConfig.java
##########
@@ -21,20 +21,25 @@
 import com.fasterxml.jackson.annotation.JsonCreator;
 import com.fasterxml.jackson.annotation.JsonProperty;
 import com.google.common.base.Preconditions;
+import java.util.Map;
 import org.apache.pinot.spi.config.BaseJsonConfig;
 
 
 public class ColumnPartitionConfig extends BaseJsonConfig {
   private final String _functionName;
   private final int _numPartitions;
+  private final Map<String, String> _functionConfig;
 
   @JsonCreator
   public ColumnPartitionConfig(@JsonProperty(value = "functionName", required = true) String functionName,

Review comment:
       done. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] kmozaid commented on a change in pull request #8224: Implemented BoundedColumnValue partition function

Posted by GitBox <gi...@apache.org>.
kmozaid commented on a change in pull request #8224:
URL: https://github.com/apache/pinot/pull/8224#discussion_r811669564



##########
File path: pinot-spi/src/main/java/org/apache/pinot/spi/config/table/ColumnPartitionConfig.java
##########
@@ -46,6 +51,15 @@ public String getFunctionName() {
     return _functionName;
   }
 
+  /**
+   * Returns the partition function configuration for the column.
+   *
+   * @return Partition function configuration.
+   */
+  public Map<String, String> getFunctionConfig() {

Review comment:
       done.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] codecov-commenter edited a comment on pull request #8224: Implemented BoundedColumnValue partition function

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #8224:
URL: https://github.com/apache/pinot/pull/8224#issuecomment-1044093075


   # [Codecov](https://codecov.io/gh/apache/pinot/pull/8224?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#8224](https://codecov.io/gh/apache/pinot/pull/8224?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (f2f131f) into [master](https://codecov.io/gh/apache/pinot/commit/fa0db64facb0f3102ebfe9052fdd83db9f532ce7?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (fa0db64) will **increase** coverage by `5.54%`.
   > The diff coverage is `80.51%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/pinot/pull/8224/graphs/tree.svg?width=650&height=150&src=pr&token=4ibza2ugkz&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/pinot/pull/8224?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #8224      +/-   ##
   ============================================
   + Coverage     63.98%   69.53%   +5.54%     
   - Complexity     4239     4241       +2     
   ============================================
     Files          1584     1631      +47     
     Lines         83250    85276    +2026     
     Branches      12608    12844     +236     
   ============================================
   + Hits          53268    59293    +6025     
   + Misses        26144    21847    -4297     
   - Partials       3838     4136     +298     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | integration2 | `27.57% <3.89%> (?)` | |
   | unittests1 | `66.97% <79.16%> (+0.03%)` | :arrow_up: |
   | unittests2 | `14.10% <3.89%> (+0.01%)` | :arrow_up: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/pinot/pull/8224?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...ocal/indexsegment/mutable/IntermediateSegment.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9pbmRleHNlZ21lbnQvbXV0YWJsZS9JbnRlcm1lZGlhdGVTZWdtZW50LmphdmE=) | `69.62% <0.00%> (-0.45%)` | :arrow_down: |
   | [...ltime/converter/stats/MutableColumnStatistics.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9yZWFsdGltZS9jb252ZXJ0ZXIvc3RhdHMvTXV0YWJsZUNvbHVtblN0YXRpc3RpY3MuamF2YQ==) | `53.22% <0.00%> (-1.78%)` | :arrow_down: |
   | [...verter/stats/MutableNoDictionaryColStatistics.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9yZWFsdGltZS9jb252ZXJ0ZXIvc3RhdHMvTXV0YWJsZU5vRGljdGlvbmFyeUNvbFN0YXRpc3RpY3MuamF2YQ==) | `40.90% <0.00%> (-4.10%)` | :arrow_down: |
   | [.../loader/defaultcolumn/DefaultColumnStatistics.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9zZWdtZW50L2luZGV4L2xvYWRlci9kZWZhdWx0Y29sdW1uL0RlZmF1bHRDb2x1bW5TdGF0aXN0aWNzLmphdmE=) | `73.07% <0.00%> (-2.93%)` | :arrow_down: |
   | [...java/org/apache/pinot/segment/spi/V1Constants.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1zcGkvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL3Bpbm90L3NlZ21lbnQvc3BpL1YxQ29uc3RhbnRzLmphdmE=) | `14.28% <ø> (ø)` | |
   | [...he/pinot/segment/spi/creator/ColumnStatistics.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1zcGkvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL3Bpbm90L3NlZ21lbnQvc3BpL2NyZWF0b3IvQ29sdW1uU3RhdGlzdGljcy5qYXZh) | `100.00% <ø> (ø)` | |
   | [.../core/realtime/PinotLLCRealtimeSegmentManager.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29udHJvbGxlci9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29udHJvbGxlci9oZWxpeC9jb3JlL3JlYWx0aW1lL1Bpbm90TExDUmVhbHRpbWVTZWdtZW50TWFuYWdlci5qYXZh) | `75.25% <50.00%> (+9.15%)` | :arrow_up: |
   | [...partition/BoundedColumnValuePartitionFunction.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1zcGkvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL3Bpbm90L3NlZ21lbnQvc3BpL3BhcnRpdGlvbi9Cb3VuZGVkQ29sdW1uVmFsdWVQYXJ0aXRpb25GdW5jdGlvbi5qYXZh) | `70.58% <70.58%> (ø)` | |
   | [...pi/partition/metadata/ColumnPartitionMetadata.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1zcGkvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL3Bpbm90L3NlZ21lbnQvc3BpL3BhcnRpdGlvbi9tZXRhZGF0YS9Db2x1bW5QYXJ0aXRpb25NZXRhZGF0YS5qYXZh) | `88.00% <83.33%> (+1.95%)` | :arrow_up: |
   | [.../routing/segmentpruner/PartitionSegmentPruner.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtYnJva2VyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9icm9rZXIvcm91dGluZy9zZWdtZW50cHJ1bmVyL1BhcnRpdGlvblNlZ21lbnRQcnVuZXIuamF2YQ==) | `61.24% <100.00%> (+0.30%)` | :arrow_up: |
   | ... and [361 more](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/pinot/pull/8224?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/pinot/pull/8224?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [fa0db64...f2f131f](https://codecov.io/gh/apache/pinot/pull/8224?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] Jackie-Jiang commented on a change in pull request #8224: Implemented BoundedColumnValue partition function

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang commented on a change in pull request #8224:
URL: https://github.com/apache/pinot/pull/8224#discussion_r812408932



##########
File path: pinot-spi/src/main/java/org/apache/pinot/spi/config/table/ColumnPartitionConfig.java
##########
@@ -21,20 +21,29 @@
 import com.fasterxml.jackson.annotation.JsonCreator;
 import com.fasterxml.jackson.annotation.JsonProperty;
 import com.google.common.base.Preconditions;
+import java.util.Map;
+import javax.annotation.Nullable;
 import org.apache.pinot.spi.config.BaseJsonConfig;
 
 
 public class ColumnPartitionConfig extends BaseJsonConfig {
   private final String _functionName;
   private final int _numPartitions;
+  private final Map<String, String> _functionConfig;
+
+  public ColumnPartitionConfig(String functionName, int numPartitions) {
+    this(functionName, numPartitions, null);
+  }
 
   @JsonCreator
   public ColumnPartitionConfig(@JsonProperty(value = "functionName", required = true) String functionName,
-      @JsonProperty(value = "numPartitions", required = true) int numPartitions) {
+      @JsonProperty(value = "numPartitions", required = true) int numPartitions,
+      @JsonProperty(value = "functionConfig") Map<String, String> functionConfig) {

Review comment:
       (minor)
   ```suggestion
         @JsonProperty(value = "functionConfig") @Nullable Map<String, String> functionConfig) {
   ```

##########
File path: pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/partition/PartitionFunction.java
##########
@@ -49,4 +50,8 @@
    * @return Number of possible partitions.
    */
   int getNumPartitions();
+
+  default Map<String, String> getFunctionConfig() {

Review comment:
       ```suggestion
     @Nullable
     default Map<String, String> getFunctionConfig() {
   ```

##########
File path: pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/creator/StatsCollectorConfig.java
##########
@@ -76,6 +77,10 @@ public int getNumPartitions(String column) {
         : SegmentPartitionConfig.INVALID_NUM_PARTITIONS;
   }
 
+  public Map<String, String> getFunctionConfig(String column) {

Review comment:
       ```suggestion
     @Nullable
     public Map<String, String> getFunctionConfig(String column) {
   ```

##########
File path: pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/partition/BoundedColumnValuePartitionFunction.java
##########
@@ -0,0 +1,92 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.segment.spi.partition;
+
+import com.google.common.base.Preconditions;
+import java.util.Map;
+
+
+/**
+ * Implementation of {@link PartitionFunction} which partitions based configured column values.
+ *
+ * "columnPartitionMap": {
+ *   "subject": {
+ *     "functionName": "BoundedColumnValue",
+ *     "functionConfig": {
+ *       "columnValues": "Maths|English|Chemistry"
+ *     }
+ *   }
+ * }
+ * With this partition config on column "subject", partitionId would be 1 for Maths, 2 for English and so on.
+ * partitionId would be "0" for all other values which is not configured but may occur.
+ */
+public class BoundedColumnValuePartitionFunction implements PartitionFunction {
+  private static final int DEFAULT_PARTITION_ID = 0;
+  private static final String NAME = "BoundedColumnValue";
+  private static final String COLUMN_VALUES = "columnValues";
+  private final Map<String, String> _functionConfig;
+  private final String[] _values;

Review comment:
       Sounds good, let's keep array then

##########
File path: pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/partition/PartitionFunctionFactory.java
##########
@@ -60,13 +60,15 @@ private PartitionFunctionFactory() {
    *
    * @param functionName Name of partition function
    * @param numPartitions Number of partitions.
+   * @param functionConfig The function configuration for given function.
    * @return Partition function
    */
   // TODO: introduce a way to inject custom partition function
   // a custom partition function could be used in the realtime stream partitioning or offline segment partitioning.
   // The PartitionFunctionFactory should be able to support these default implementations, as well as instantiate
   // based on config
-  public static PartitionFunction getPartitionFunction(String functionName, int numPartitions) {
+  public static PartitionFunction getPartitionFunction(String functionName, int numPartitions,
+                                                       Map<String, String> functionConfig) {

Review comment:
       Please reformat the file with [Pinot Style](https://docs.pinot.apache.org/developers/developers-and-contributors/code-setup#intellij)
   ```suggestion
         @Nullable Map<String, String> functionConfig) {
   ```

##########
File path: pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/partition/metadata/ColumnPartitionMetadata.java
##########
@@ -52,11 +57,14 @@
    * @param functionName Name of the partition function
    * @param numPartitions Number of total partitions for this column
    * @param partitions Set of partitions the column contains
+   * @param functionConfig Configuration required by partition function.
    */
-  public ColumnPartitionMetadata(String functionName, int numPartitions, Set<Integer> partitions) {
+  public ColumnPartitionMetadata(String functionName, int numPartitions, Set<Integer> partitions,
+                                 Map<String, String> functionConfig) {

Review comment:
       (code format)
   ```suggestion
         @Nullable Map<String, String> functionConfig) {
   ```

##########
File path: pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/partition/BoundedColumnValuePartitionFunction.java
##########
@@ -0,0 +1,96 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.segment.spi.partition;
+
+import com.google.common.base.Preconditions;
+import java.util.Map;
+
+
+/**
+ * Implementation of {@link PartitionFunction} which partitions based configured column values.
+ *
+ * "columnPartitionMap": {
+ *   "subject": {
+ *     "functionName": "BoundedColumnValue",
+ *     "functionConfig": {
+ *       "columnValues": "Maths|English|Chemistry"
+ *     }
+ *   }
+ * }
+ * With this partition config on column "subject", partitionId would be 1 for Maths, 2 for English and so on.
+ * partitionId would be "0" for all other values which is not configured but may occur.
+ */
+public class BoundedColumnValuePartitionFunction implements PartitionFunction {
+  private static final int DEFAULT_PARTITION_ID = 0;
+  private static final String NAME = "BoundedColumnValue";
+  private static final String COLUMN_VALUES = "columnValues";
+  private static final String COLUMN_VALUES_DELIMITER = "columnValuesDelimiter";
+  private final int _numPartitions;
+  private final Map<String, String> _functionConfig;
+  private final String[] _values;
+
+  public BoundedColumnValuePartitionFunction(int numPartitions, Map<String, String> functionConfig) {
+    Preconditions.checkArgument(functionConfig != null && functionConfig.size() > 0,
+        "'functionConfig' should be present, specified", functionConfig);
+    Preconditions.checkState(functionConfig.get(COLUMN_VALUES) != null, "columnValues must be configured");
+    Preconditions.checkState(functionConfig.get(COLUMN_VALUES_DELIMITER) != null,
+        "'columnValuesDelimiter' must be configured");
+    _functionConfig = functionConfig;
+    _values = functionConfig.get(COLUMN_VALUES).split(functionConfig.get(COLUMN_VALUES_DELIMITER));
+    Preconditions.checkState(_values.length == numPartitions,

Review comment:
       `numPartitions` should be `values.length + 1`

##########
File path: pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/partition/metadata/ColumnPartitionMetadata.java
##########
@@ -52,11 +57,14 @@
    * @param functionName Name of the partition function
    * @param numPartitions Number of total partitions for this column
    * @param partitions Set of partitions the column contains
+   * @param functionConfig Configuration required by partition function.
    */
-  public ColumnPartitionMetadata(String functionName, int numPartitions, Set<Integer> partitions) {
+  public ColumnPartitionMetadata(String functionName, int numPartitions, Set<Integer> partitions,
+                                 Map<String, String> functionConfig) {

Review comment:
       Either way is fine since `functionConfig` is always passed in non-testing code

##########
File path: pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/partition/BoundedColumnValuePartitionFunction.java
##########
@@ -0,0 +1,96 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.segment.spi.partition;
+
+import com.google.common.base.Preconditions;
+import java.util.Map;
+
+
+/**
+ * Implementation of {@link PartitionFunction} which partitions based configured column values.
+ *
+ * "columnPartitionMap": {
+ *   "subject": {
+ *     "functionName": "BoundedColumnValue",
+ *     "functionConfig": {
+ *       "columnValues": "Maths|English|Chemistry"
+ *     }
+ *   }
+ * }
+ * With this partition config on column "subject", partitionId would be 1 for Maths, 2 for English and so on.
+ * partitionId would be "0" for all other values which is not configured but may occur.
+ */
+public class BoundedColumnValuePartitionFunction implements PartitionFunction {
+  private static final int DEFAULT_PARTITION_ID = 0;
+  private static final String NAME = "BoundedColumnValue";
+  private static final String COLUMN_VALUES = "columnValues";
+  private static final String COLUMN_VALUES_DELIMITER = "columnValuesDelimiter";
+  private final int _numPartitions;
+  private final Map<String, String> _functionConfig;
+  private final String[] _values;
+
+  public BoundedColumnValuePartitionFunction(int numPartitions, Map<String, String> functionConfig) {
+    Preconditions.checkArgument(functionConfig != null && functionConfig.size() > 0,
+        "'functionConfig' should be present, specified", functionConfig);
+    Preconditions.checkState(functionConfig.get(COLUMN_VALUES) != null, "columnValues must be configured");
+    Preconditions.checkState(functionConfig.get(COLUMN_VALUES_DELIMITER) != null,
+        "'columnValuesDelimiter' must be configured");
+    _functionConfig = functionConfig;
+    _values = functionConfig.get(COLUMN_VALUES).split(functionConfig.get(COLUMN_VALUES_DELIMITER));

Review comment:
       Let's use `StringUtils.split()` to avoid regex matching. This way we don't need to escape special characters in delimiter, and also get better performance.

##########
File path: pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/index/metadata/ColumnMetadataImpl.java
##########
@@ -279,8 +284,26 @@ public static ColumnMetadataImpl fromPropertiesConfiguration(String column, Prop
     String partitionFunctionName = config.getString(Column.getKeyFor(column, Column.PARTITION_FUNCTION), null);
     if (partitionFunctionName != null) {
       int numPartitions = config.getInt(Column.getKeyFor(column, Column.NUM_PARTITIONS));
+      Map<String, String> partitionFunctionConfigMap = null;
+      Configuration partitionFunctionConfig = config.subset(Column.getKeyFor(column, Column.PARTITION_FUNCTION_CONFIG));
+      if (!partitionFunctionConfig.isEmpty()) {
+        partitionFunctionConfigMap = new HashMap<>();
+        Iterator<String> partitionFunctionConfigKeysIter = partitionFunctionConfig.getKeys();
+        while (partitionFunctionConfigKeysIter.hasNext()) {
+          String functionConfigKey = partitionFunctionConfigKeysIter.next();
+          Object functionConfigValueObj = partitionFunctionConfig.getProperty(functionConfigKey);
+          String valueStr;
+          if (functionConfigValueObj instanceof List) {

Review comment:
       Let's add some comment here to describe this

##########
File path: pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/partition/BoundedColumnValuePartitionFunction.java
##########
@@ -0,0 +1,96 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.segment.spi.partition;
+
+import com.google.common.base.Preconditions;
+import java.util.Map;
+
+
+/**
+ * Implementation of {@link PartitionFunction} which partitions based configured column values.
+ *
+ * "columnPartitionMap": {
+ *   "subject": {
+ *     "functionName": "BoundedColumnValue",
+ *     "functionConfig": {
+ *       "columnValues": "Maths|English|Chemistry"

Review comment:
       Update the example to include `numPartitions` and `columnValuesDelimiter`




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] Jackie-Jiang merged pull request #8224: Implemented BoundedColumnValue partition function

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang merged pull request #8224:
URL: https://github.com/apache/pinot/pull/8224


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] kmozaid commented on a change in pull request #8224: Implemented BoundedColumnValue partition function

Posted by GitBox <gi...@apache.org>.
kmozaid commented on a change in pull request #8224:
URL: https://github.com/apache/pinot/pull/8224#discussion_r814430680



##########
File path: pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/partition/PartitionFunctionFactory.java
##########
@@ -60,13 +60,15 @@ private PartitionFunctionFactory() {
    *
    * @param functionName Name of partition function
    * @param numPartitions Number of partitions.
+   * @param functionConfig The function configuration for given function.
    * @return Partition function
    */
   // TODO: introduce a way to inject custom partition function
   // a custom partition function could be used in the realtime stream partitioning or offline segment partitioning.
   // The PartitionFunctionFactory should be able to support these default implementations, as well as instantiate
   // based on config
-  public static PartitionFunction getPartitionFunction(String functionName, int numPartitions) {
+  public static PartitionFunction getPartitionFunction(String functionName, int numPartitions,
+                                                       Map<String, String> functionConfig) {

Review comment:
       done.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] kmozaid commented on a change in pull request #8224: Implemented BoundedColumnValue partition function

Posted by GitBox <gi...@apache.org>.
kmozaid commented on a change in pull request #8224:
URL: https://github.com/apache/pinot/pull/8224#discussion_r814430771



##########
File path: pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/partition/metadata/ColumnPartitionMetadata.java
##########
@@ -52,11 +57,14 @@
    * @param functionName Name of the partition function
    * @param numPartitions Number of total partitions for this column
    * @param partitions Set of partitions the column contains
+   * @param functionConfig Configuration required by partition function.
    */
-  public ColumnPartitionMetadata(String functionName, int numPartitions, Set<Integer> partitions) {
+  public ColumnPartitionMetadata(String functionName, int numPartitions, Set<Integer> partitions,
+                                 Map<String, String> functionConfig) {

Review comment:
       ok.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] kmozaid commented on pull request #8224: Implemented BoundedColumnValue partition function

Posted by GitBox <gi...@apache.org>.
kmozaid commented on pull request #8224:
URL: https://github.com/apache/pinot/pull/8224#issuecomment-1050503769


   Thanks @Jackie-Jiang and @richardstartin for your prompt reviews, appreciate it!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] codecov-commenter edited a comment on pull request #8224: Implemented BoundedColumnValue partition function

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #8224:
URL: https://github.com/apache/pinot/pull/8224#issuecomment-1044093075


   # [Codecov](https://codecov.io/gh/apache/pinot/pull/8224?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#8224](https://codecov.io/gh/apache/pinot/pull/8224?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (c876382) into [master](https://codecov.io/gh/apache/pinot/commit/de1cdf2190cf91e53f947735cd364a5fe6125829?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (de1cdf2) will **decrease** coverage by `6.70%`.
   > The diff coverage is `58.90%`.
   
   > :exclamation: Current head c876382 differs from pull request most recent head c840ebf. Consider uploading reports for the commit c840ebf to get more accurate results
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/pinot/pull/8224/graphs/tree.svg?width=650&height=150&src=pr&token=4ibza2ugkz&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/pinot/pull/8224?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #8224      +/-   ##
   ============================================
   - Coverage     70.98%   64.28%   -6.71%     
   - Complexity     4313     4315       +2     
   ============================================
     Files          1626     1583      -43     
     Lines         85004    83177    -1827     
     Branches      12791    12599     -192     
   ============================================
   - Hits          60340    53467    -6873     
   - Misses        20514    25865    +5351     
   + Partials       4150     3845     -305     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | integration1 | `?` | |
   | integration2 | `?` | |
   | unittests1 | `67.32% <58.82%> (-0.02%)` | :arrow_down: |
   | unittests2 | `14.08% <4.10%> (-0.02%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/pinot/pull/8224?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [.../core/realtime/PinotLLCRealtimeSegmentManager.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29udHJvbGxlci9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29udHJvbGxlci9oZWxpeC9jb3JlL3JlYWx0aW1lL1Bpbm90TExDUmVhbHRpbWVTZWdtZW50TWFuYWdlci5qYXZh) | `66.10% <0.00%> (-11.87%)` | :arrow_down: |
   | [...manager/realtime/LLRealtimeSegmentDataManager.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9kYXRhL21hbmFnZXIvcmVhbHRpbWUvTExSZWFsdGltZVNlZ21lbnREYXRhTWFuYWdlci5qYXZh) | `45.44% <0.00%> (-26.05%)` | :arrow_down: |
   | [...ocal/indexsegment/mutable/IntermediateSegment.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9pbmRleHNlZ21lbnQvbXV0YWJsZS9JbnRlcm1lZGlhdGVTZWdtZW50LmphdmE=) | `69.62% <0.00%> (-0.45%)` | :arrow_down: |
   | [...local/indexsegment/mutable/MutableSegmentImpl.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9pbmRleHNlZ21lbnQvbXV0YWJsZS9NdXRhYmxlU2VnbWVudEltcGwuamF2YQ==) | `58.69% <0.00%> (ø)` | |
   | [...ltime/converter/stats/MutableColumnStatistics.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9yZWFsdGltZS9jb252ZXJ0ZXIvc3RhdHMvTXV0YWJsZUNvbHVtblN0YXRpc3RpY3MuamF2YQ==) | `53.22% <0.00%> (-1.78%)` | :arrow_down: |
   | [...verter/stats/MutableNoDictionaryColStatistics.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9yZWFsdGltZS9jb252ZXJ0ZXIvc3RhdHMvTXV0YWJsZU5vRGljdGlvbmFyeUNvbFN0YXRpc3RpY3MuamF2YQ==) | `40.90% <0.00%> (-4.10%)` | :arrow_down: |
   | [...ment/creator/impl/SegmentColumnarIndexCreator.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9zZWdtZW50L2NyZWF0b3IvaW1wbC9TZWdtZW50Q29sdW1uYXJJbmRleENyZWF0b3IuamF2YQ==) | `85.18% <0.00%> (-1.15%)` | :arrow_down: |
   | [.../loader/defaultcolumn/DefaultColumnStatistics.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9zZWdtZW50L2luZGV4L2xvYWRlci9kZWZhdWx0Y29sdW1uL0RlZmF1bHRDb2x1bW5TdGF0aXN0aWNzLmphdmE=) | `73.07% <0.00%> (-2.93%)` | :arrow_down: |
   | [...java/org/apache/pinot/segment/spi/V1Constants.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1zcGkvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL3Bpbm90L3NlZ21lbnQvc3BpL1YxQ29uc3RhbnRzLmphdmE=) | `14.28% <ø> (ø)` | |
   | [...he/pinot/segment/spi/creator/ColumnStatistics.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1zcGkvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL3Bpbm90L3NlZ21lbnQvc3BpL2NyZWF0b3IvQ29sdW1uU3RhdGlzdGljcy5qYXZh) | `100.00% <ø> (ø)` | |
   | ... and [392 more](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/pinot/pull/8224?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/pinot/pull/8224?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [de1cdf2...c840ebf](https://codecov.io/gh/apache/pinot/pull/8224?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] codecov-commenter edited a comment on pull request #8224: Implemented BoundedColumnValue partition function

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #8224:
URL: https://github.com/apache/pinot/pull/8224#issuecomment-1044093075


   # [Codecov](https://codecov.io/gh/apache/pinot/pull/8224?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#8224](https://codecov.io/gh/apache/pinot/pull/8224?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (bec3162) into [master](https://codecov.io/gh/apache/pinot/commit/de1cdf2190cf91e53f947735cd364a5fe6125829?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (de1cdf2) will **decrease** coverage by `40.36%`.
   > The diff coverage is `8.45%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/pinot/pull/8224/graphs/tree.svg?width=650&height=150&src=pr&token=4ibza2ugkz&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/pinot/pull/8224?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@              Coverage Diff              @@
   ##             master    #8224       +/-   ##
   =============================================
   - Coverage     70.98%   30.62%   -40.37%     
   =============================================
     Files          1626     1616       -10     
     Lines         85004    84704      -300     
     Branches      12791    12765       -26     
   =============================================
   - Hits          60340    25937    -34403     
   - Misses        20514    56444    +35930     
   + Partials       4150     2323     -1827     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | integration1 | `28.80% <7.04%> (+0.10%)` | :arrow_up: |
   | integration2 | `27.62% <4.22%> (+0.05%)` | :arrow_up: |
   | unittests1 | `?` | |
   | unittests2 | `?` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/pinot/pull/8224?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...processing/partitioner/TableConfigPartitioner.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9zZWdtZW50L3Byb2Nlc3NpbmcvcGFydGl0aW9uZXIvVGFibGVDb25maWdQYXJ0aXRpb25lci5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...ocal/indexsegment/mutable/IntermediateSegment.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9pbmRleHNlZ21lbnQvbXV0YWJsZS9JbnRlcm1lZGlhdGVTZWdtZW50LmphdmE=) | `0.00% <0.00%> (-70.07%)` | :arrow_down: |
   | [...local/indexsegment/mutable/MutableSegmentImpl.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9pbmRleHNlZ21lbnQvbXV0YWJsZS9NdXRhYmxlU2VnbWVudEltcGwuamF2YQ==) | `0.00% <0.00%> (-58.70%)` | :arrow_down: |
   | [...ltime/converter/stats/MutableColumnStatistics.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9yZWFsdGltZS9jb252ZXJ0ZXIvc3RhdHMvTXV0YWJsZUNvbHVtblN0YXRpc3RpY3MuamF2YQ==) | `0.00% <0.00%> (-55.00%)` | :arrow_down: |
   | [...verter/stats/MutableNoDictionaryColStatistics.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9yZWFsdGltZS9jb252ZXJ0ZXIvc3RhdHMvTXV0YWJsZU5vRGljdGlvbmFyeUNvbFN0YXRpc3RpY3MuamF2YQ==) | `0.00% <0.00%> (-45.00%)` | :arrow_down: |
   | [...ment/creator/impl/SegmentColumnarIndexCreator.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9zZWdtZW50L2NyZWF0b3IvaW1wbC9TZWdtZW50Q29sdW1uYXJJbmRleENyZWF0b3IuamF2YQ==) | `0.00% <0.00%> (-86.33%)` | :arrow_down: |
   | [.../impl/stats/AbstractColumnStatisticsCollector.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9zZWdtZW50L2NyZWF0b3IvaW1wbC9zdGF0cy9BYnN0cmFjdENvbHVtblN0YXRpc3RpY3NDb2xsZWN0b3IuamF2YQ==) | `0.00% <0.00%> (-92.50%)` | :arrow_down: |
   | [.../loader/defaultcolumn/DefaultColumnStatistics.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9zZWdtZW50L2luZGV4L2xvYWRlci9kZWZhdWx0Y29sdW1uL0RlZmF1bHRDb2x1bW5TdGF0aXN0aWNzLmphdmE=) | `0.00% <0.00%> (-76.00%)` | :arrow_down: |
   | [...java/org/apache/pinot/segment/spi/V1Constants.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1zcGkvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL3Bpbm90L3NlZ21lbnQvc3BpL1YxQ29uc3RhbnRzLmphdmE=) | `0.00% <ø> (-14.29%)` | :arrow_down: |
   | [...t/segment/spi/creator/ColumnIndexCreationInfo.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1zcGkvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL3Bpbm90L3NlZ21lbnQvc3BpL2NyZWF0b3IvQ29sdW1uSW5kZXhDcmVhdGlvbkluZm8uamF2YQ==) | `0.00% <0.00%> (-96.30%)` | :arrow_down: |
   | ... and [1141 more](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/pinot/pull/8224?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/pinot/pull/8224?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [de1cdf2...bec3162](https://codecov.io/gh/apache/pinot/pull/8224?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] kmozaid commented on a change in pull request #8224: Implemented BoundedColumnValue partition function

Posted by GitBox <gi...@apache.org>.
kmozaid commented on a change in pull request #8224:
URL: https://github.com/apache/pinot/pull/8224#discussion_r811677377



##########
File path: pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/partition/BoundedColumnValuePartitionFunction.java
##########
@@ -0,0 +1,92 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.segment.spi.partition;
+
+import com.google.common.base.Preconditions;
+import java.util.Map;
+
+
+/**
+ * Implementation of {@link PartitionFunction} which partitions based configured column values.
+ *
+ * "columnPartitionMap": {
+ *   "subject": {
+ *     "functionName": "BoundedColumnValue",
+ *     "functionConfig": {
+ *       "columnValues": "Maths|English|Chemistry"
+ *     }
+ *   }
+ * }
+ * With this partition config on column "subject", partitionId would be 1 for Maths, 2 for English and so on.

Review comment:
       I wanted to keep a fixed value of partitionId for unknown column values. When putting other/unknown values as last partition, the partition id would different every time partition config is changed to add/remove another column value.
   
   When backfilling data for a particular source/tenant (in our example, a subject), we can easily ignore segments which has partitionId set to 0 in segment metadata because we know for sure that it could have data for many different subjects.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] kmozaid commented on a change in pull request #8224: Implemented BoundedColumnValue partition function

Posted by GitBox <gi...@apache.org>.
kmozaid commented on a change in pull request #8224:
URL: https://github.com/apache/pinot/pull/8224#discussion_r814430478



##########
File path: pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/partition/BoundedColumnValuePartitionFunction.java
##########
@@ -0,0 +1,96 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.segment.spi.partition;
+
+import com.google.common.base.Preconditions;
+import java.util.Map;
+
+
+/**
+ * Implementation of {@link PartitionFunction} which partitions based configured column values.
+ *
+ * "columnPartitionMap": {
+ *   "subject": {
+ *     "functionName": "BoundedColumnValue",
+ *     "functionConfig": {
+ *       "columnValues": "Maths|English|Chemistry"
+ *     }
+ *   }
+ * }
+ * With this partition config on column "subject", partitionId would be 1 for Maths, 2 for English and so on.
+ * partitionId would be "0" for all other values which is not configured but may occur.
+ */
+public class BoundedColumnValuePartitionFunction implements PartitionFunction {
+  private static final int DEFAULT_PARTITION_ID = 0;
+  private static final String NAME = "BoundedColumnValue";
+  private static final String COLUMN_VALUES = "columnValues";
+  private static final String COLUMN_VALUES_DELIMITER = "columnValuesDelimiter";
+  private final int _numPartitions;
+  private final Map<String, String> _functionConfig;
+  private final String[] _values;
+
+  public BoundedColumnValuePartitionFunction(int numPartitions, Map<String, String> functionConfig) {
+    Preconditions.checkArgument(functionConfig != null && functionConfig.size() > 0,
+        "'functionConfig' should be present, specified", functionConfig);
+    Preconditions.checkState(functionConfig.get(COLUMN_VALUES) != null, "columnValues must be configured");
+    Preconditions.checkState(functionConfig.get(COLUMN_VALUES_DELIMITER) != null,
+        "'columnValuesDelimiter' must be configured");
+    _functionConfig = functionConfig;
+    _values = functionConfig.get(COLUMN_VALUES).split(functionConfig.get(COLUMN_VALUES_DELIMITER));

Review comment:
       done.

##########
File path: pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/partition/BoundedColumnValuePartitionFunction.java
##########
@@ -0,0 +1,96 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.segment.spi.partition;
+
+import com.google.common.base.Preconditions;
+import java.util.Map;
+
+
+/**
+ * Implementation of {@link PartitionFunction} which partitions based configured column values.
+ *
+ * "columnPartitionMap": {
+ *   "subject": {
+ *     "functionName": "BoundedColumnValue",
+ *     "functionConfig": {
+ *       "columnValues": "Maths|English|Chemistry"
+ *     }
+ *   }
+ * }
+ * With this partition config on column "subject", partitionId would be 1 for Maths, 2 for English and so on.
+ * partitionId would be "0" for all other values which is not configured but may occur.
+ */
+public class BoundedColumnValuePartitionFunction implements PartitionFunction {
+  private static final int DEFAULT_PARTITION_ID = 0;
+  private static final String NAME = "BoundedColumnValue";
+  private static final String COLUMN_VALUES = "columnValues";
+  private static final String COLUMN_VALUES_DELIMITER = "columnValuesDelimiter";
+  private final int _numPartitions;
+  private final Map<String, String> _functionConfig;
+  private final String[] _values;
+
+  public BoundedColumnValuePartitionFunction(int numPartitions, Map<String, String> functionConfig) {
+    Preconditions.checkArgument(functionConfig != null && functionConfig.size() > 0,
+        "'functionConfig' should be present, specified", functionConfig);
+    Preconditions.checkState(functionConfig.get(COLUMN_VALUES) != null, "columnValues must be configured");
+    Preconditions.checkState(functionConfig.get(COLUMN_VALUES_DELIMITER) != null,
+        "'columnValuesDelimiter' must be configured");
+    _functionConfig = functionConfig;
+    _values = functionConfig.get(COLUMN_VALUES).split(functionConfig.get(COLUMN_VALUES_DELIMITER));
+    Preconditions.checkState(_values.length == numPartitions,

Review comment:
       changed.

##########
File path: pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/partition/PartitionFunction.java
##########
@@ -49,4 +50,8 @@
    * @return Number of possible partitions.
    */
   int getNumPartitions();
+
+  default Map<String, String> getFunctionConfig() {

Review comment:
       done.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] kmozaid edited a comment on pull request #8224: Implemented BoundedColumnValue partition function

Posted by GitBox <gi...@apache.org>.
kmozaid edited a comment on pull request #8224:
URL: https://github.com/apache/pinot/pull/8224#issuecomment-1044436343


   > Hi @kmozaid thanks for taking the time to make this contribution. Can you explain what problem this solves? Is it because you already have a partitioning and you want to maintain locality within partitions?
   
   Hi @richardstartin , We have a table where data is being ingested from multiple sources. (these multiple sources pushes data to same kafka topic). Data is kept for 5 days in realtime table and then moved offline table by minion task. We want to keep data from these sources in separate segments for offline table. There is a column which identifies the source. `BoundedColumnValue` partition function provides capability to keep data from different sources in respective partitioned segments. Later if we want to backfill the data of just one source, then we will be able to do so because we would know what are the segments for given source and replace them by backfill. The main use case is to be able to backfill data of particular source. This is also discussed in slack thread - https://apache-pinot.slack.com/archives/CDRCA57FC/p1643286670255700
   
   <img width="982" alt="image" src="https://user-images.githubusercontent.com/8354145/154681296-cf689529-26fc-4d1f-a983-70e18db4f4bc.png">
   
   Just a note regarding the image - The partitionId mentioned in image are source1, source2 and source 3 although they would be integer value based on the their position in configured `columnValues` attributes. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] kmozaid commented on a change in pull request #8224: Implemented BoundedColumnValue partition function

Posted by GitBox <gi...@apache.org>.
kmozaid commented on a change in pull request #8224:
URL: https://github.com/apache/pinot/pull/8224#discussion_r811667117



##########
File path: pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/partition/BoundedColumnValuePartitionFunction.java
##########
@@ -0,0 +1,92 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.segment.spi.partition;
+
+import com.google.common.base.Preconditions;
+import java.util.Map;
+
+
+/**
+ * Implementation of {@link PartitionFunction} which partitions based configured column values.
+ *
+ * "columnPartitionMap": {
+ *   "subject": {
+ *     "functionName": "BoundedColumnValue",
+ *     "functionConfig": {
+ *       "columnValues": "Maths|English|Chemistry"
+ *     }
+ *   }
+ * }
+ * With this partition config on column "subject", partitionId would be 1 for Maths, 2 for English and so on.
+ * partitionId would be "0" for all other values which is not configured but may occur.
+ */
+public class BoundedColumnValuePartitionFunction implements PartitionFunction {
+  private static final int DEFAULT_PARTITION_ID = 0;
+  private static final String NAME = "BoundedColumnValue";
+  private static final String COLUMN_VALUES = "columnValues";
+  private final Map<String, String> _functionConfig;
+  private final String[] _values;
+  private final int _noOfValues;
+
+  public BoundedColumnValuePartitionFunction(Map<String, String> functionConfig) {
+    Preconditions.checkArgument(functionConfig != null && functionConfig.size() > 0,
+        "functionConfig should be present, specified", functionConfig);
+    Preconditions.checkState(functionConfig.get(COLUMN_VALUES) != null, "columnValues must be configured");
+    _functionConfig = functionConfig;
+    _values = functionConfig.get(COLUMN_VALUES).split("\\|");
+    _noOfValues = _values.length;
+  }
+
+  @Override
+  public int getPartition(Object valueIn) {
+    String value = valueIn instanceof String ? (String) valueIn : String.valueOf(valueIn);

Review comment:
       I read somewhere that casting is cheaper than toString but there was no backing theory :D Anyway thanks, I have changed it. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] kmozaid commented on a change in pull request #8224: Implemented BoundedColumnValue partition function

Posted by GitBox <gi...@apache.org>.
kmozaid commented on a change in pull request #8224:
URL: https://github.com/apache/pinot/pull/8224#discussion_r811665907



##########
File path: pinot-broker/src/main/java/org/apache/pinot/broker/routing/segmentpruner/PartitionSegmentPruner.java
##########
@@ -126,7 +126,8 @@ private PartitionInfo extractPartitionInfoFromSegmentZKMetadataZNRecord(String s
     }
 
     return new PartitionInfo(PartitionFunctionFactory.getPartitionFunction(columnPartitionMetadata.getFunctionName(),
-        columnPartitionMetadata.getNumPartitions()), columnPartitionMetadata.getPartitions());
+        columnPartitionMetadata.getNumPartitions(), columnPartitionMetadata.getFunctionConfig()),
+            columnPartitionMetadata.getPartitions());

Review comment:
       done.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] richardstartin commented on a change in pull request #8224: Implemented BoundedColumnValue partition function

Posted by GitBox <gi...@apache.org>.
richardstartin commented on a change in pull request #8224:
URL: https://github.com/apache/pinot/pull/8224#discussion_r811859445



##########
File path: pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/partition/BoundedColumnValuePartitionFunction.java
##########
@@ -0,0 +1,92 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.segment.spi.partition;
+
+import com.google.common.base.Preconditions;
+import java.util.Map;
+
+
+/**
+ * Implementation of {@link PartitionFunction} which partitions based configured column values.
+ *
+ * "columnPartitionMap": {
+ *   "subject": {
+ *     "functionName": "BoundedColumnValue",
+ *     "functionConfig": {
+ *       "columnValues": "Maths|English|Chemistry"
+ *     }
+ *   }
+ * }
+ * With this partition config on column "subject", partitionId would be 1 for Maths, 2 for English and so on.
+ * partitionId would be "0" for all other values which is not configured but may occur.
+ */
+public class BoundedColumnValuePartitionFunction implements PartitionFunction {
+  private static final int DEFAULT_PARTITION_ID = 0;
+  private static final String NAME = "BoundedColumnValue";
+  private static final String COLUMN_VALUES = "columnValues";
+  private final Map<String, String> _functionConfig;
+  private final String[] _values;

Review comment:
       Under ~10 values linear search in an array tends to be faster anyway (much more than 10 values when equality is cheaper to verify than strings)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] kmozaid commented on a change in pull request #8224: Implemented BoundedColumnValue partition function

Posted by GitBox <gi...@apache.org>.
kmozaid commented on a change in pull request #8224:
URL: https://github.com/apache/pinot/pull/8224#discussion_r811666049



##########
File path: pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/partition/BoundedColumnValuePartitionFunction.java
##########
@@ -0,0 +1,92 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.segment.spi.partition;
+
+import com.google.common.base.Preconditions;
+import java.util.Map;
+
+
+/**
+ * Implementation of {@link PartitionFunction} which partitions based configured column values.
+ *
+ * "columnPartitionMap": {
+ *   "subject": {
+ *     "functionName": "BoundedColumnValue",
+ *     "functionConfig": {
+ *       "columnValues": "Maths|English|Chemistry"
+ *     }
+ *   }
+ * }
+ * With this partition config on column "subject", partitionId would be 1 for Maths, 2 for English and so on.
+ * partitionId would be "0" for all other values which is not configured but may occur.
+ */
+public class BoundedColumnValuePartitionFunction implements PartitionFunction {
+  private static final int DEFAULT_PARTITION_ID = 0;
+  private static final String NAME = "BoundedColumnValue";
+  private static final String COLUMN_VALUES = "columnValues";
+  private final Map<String, String> _functionConfig;
+  private final String[] _values;
+  private final int _noOfValues;
+
+  public BoundedColumnValuePartitionFunction(Map<String, String> functionConfig) {
+    Preconditions.checkArgument(functionConfig != null && functionConfig.size() > 0,
+        "functionConfig should be present, specified", functionConfig);
+    Preconditions.checkState(functionConfig.get(COLUMN_VALUES) != null, "columnValues must be configured");
+    _functionConfig = functionConfig;
+    _values = functionConfig.get(COLUMN_VALUES).split("\\|");

Review comment:
       done.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] kmozaid commented on a change in pull request #8224: Implemented BoundedColumnValue partition function

Posted by GitBox <gi...@apache.org>.
kmozaid commented on a change in pull request #8224:
URL: https://github.com/apache/pinot/pull/8224#discussion_r811665597



##########
File path: pinot-spi/src/main/java/org/apache/pinot/spi/config/table/ColumnPartitionConfig.java
##########
@@ -21,20 +21,25 @@
 import com.fasterxml.jackson.annotation.JsonCreator;
 import com.fasterxml.jackson.annotation.JsonProperty;
 import com.google.common.base.Preconditions;
+import java.util.Map;
 import org.apache.pinot.spi.config.BaseJsonConfig;
 
 
 public class ColumnPartitionConfig extends BaseJsonConfig {
   private final String _functionName;
   private final int _numPartitions;
+  private final Map<String, String> _functionConfig;
 
   @JsonCreator
   public ColumnPartitionConfig(@JsonProperty(value = "functionName", required = true) String functionName,
-      @JsonProperty(value = "numPartitions", required = true) int numPartitions) {
+      @JsonProperty(value = "numPartitions", required = false) int numPartitions,

Review comment:
       ok thanks, done.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] kmozaid commented on a change in pull request #8224: Implemented BoundedColumnValue partition function

Posted by GitBox <gi...@apache.org>.
kmozaid commented on a change in pull request #8224:
URL: https://github.com/apache/pinot/pull/8224#discussion_r811677377



##########
File path: pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/partition/BoundedColumnValuePartitionFunction.java
##########
@@ -0,0 +1,92 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.segment.spi.partition;
+
+import com.google.common.base.Preconditions;
+import java.util.Map;
+
+
+/**
+ * Implementation of {@link PartitionFunction} which partitions based configured column values.
+ *
+ * "columnPartitionMap": {
+ *   "subject": {
+ *     "functionName": "BoundedColumnValue",
+ *     "functionConfig": {
+ *       "columnValues": "Maths|English|Chemistry"
+ *     }
+ *   }
+ * }
+ * With this partition config on column "subject", partitionId would be 1 for Maths, 2 for English and so on.

Review comment:
       I wanted to keep a fixed value of partitionId for unknown column values. When putting other/unknown values as last partition, the partition id would different every time partition config is changed to add/remove another column value.
   
   When backfilling data for a particular source/tenant (in our example, a subject), we can easily ignore segments which has partitionId set to 0 in segment metadata because we know for sure that it will have data for many different subjects.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] codecov-commenter edited a comment on pull request #8224: Implemented BoundedColumnValue partition function

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #8224:
URL: https://github.com/apache/pinot/pull/8224#issuecomment-1044093075


   # [Codecov](https://codecov.io/gh/apache/pinot/pull/8224?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#8224](https://codecov.io/gh/apache/pinot/pull/8224?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (695d215) into [master](https://codecov.io/gh/apache/pinot/commit/fa0db64facb0f3102ebfe9052fdd83db9f532ce7?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (fa0db64) will **decrease** coverage by `49.88%`.
   > The diff coverage is `3.84%`.
   
   > :exclamation: Current head 695d215 differs from pull request most recent head 6a56b0a. Consider uploading reports for the commit 6a56b0a to get more accurate results
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/pinot/pull/8224/graphs/tree.svg?width=650&height=150&src=pr&token=4ibza2ugkz&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/pinot/pull/8224?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@              Coverage Diff              @@
   ##             master    #8224       +/-   ##
   =============================================
   - Coverage     63.98%   14.10%   -49.89%     
   + Complexity     4239       81     -4158     
   =============================================
     Files          1584     1586        +2     
     Lines         83250    83312       +62     
     Branches      12608    12622       +14     
   =============================================
   - Hits          53268    11747    -41521     
   - Misses        26144    70686    +44542     
   + Partials       3838      879     -2959     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | unittests1 | `?` | |
   | unittests2 | `14.10% <3.84%> (+<0.01%)` | :arrow_up: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/pinot/pull/8224?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [.../core/realtime/PinotLLCRealtimeSegmentManager.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29udHJvbGxlci9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29udHJvbGxlci9oZWxpeC9jb3JlL3JlYWx0aW1lL1Bpbm90TExDUmVhbHRpbWVTZWdtZW50TWFuYWdlci5qYXZh) | `66.10% <0.00%> (ø)` | |
   | [...manager/realtime/LLRealtimeSegmentDataManager.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9kYXRhL21hbmFnZXIvcmVhbHRpbWUvTExSZWFsdGltZVNlZ21lbnREYXRhTWFuYWdlci5qYXZh) | `0.00% <0.00%> (-45.45%)` | :arrow_down: |
   | [...processing/partitioner/TableConfigPartitioner.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9zZWdtZW50L3Byb2Nlc3NpbmcvcGFydGl0aW9uZXIvVGFibGVDb25maWdQYXJ0aXRpb25lci5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...ocal/indexsegment/mutable/IntermediateSegment.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9pbmRleHNlZ21lbnQvbXV0YWJsZS9JbnRlcm1lZGlhdGVTZWdtZW50LmphdmE=) | `0.00% <0.00%> (-70.07%)` | :arrow_down: |
   | [...ltime/converter/stats/MutableColumnStatistics.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9yZWFsdGltZS9jb252ZXJ0ZXIvc3RhdHMvTXV0YWJsZUNvbHVtblN0YXRpc3RpY3MuamF2YQ==) | `0.00% <0.00%> (-55.00%)` | :arrow_down: |
   | [...verter/stats/MutableNoDictionaryColStatistics.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9yZWFsdGltZS9jb252ZXJ0ZXIvc3RhdHMvTXV0YWJsZU5vRGljdGlvbmFyeUNvbFN0YXRpc3RpY3MuamF2YQ==) | `0.00% <0.00%> (-45.00%)` | :arrow_down: |
   | [...ment/creator/impl/SegmentColumnarIndexCreator.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9zZWdtZW50L2NyZWF0b3IvaW1wbC9TZWdtZW50Q29sdW1uYXJJbmRleENyZWF0b3IuamF2YQ==) | `0.00% <0.00%> (-86.33%)` | :arrow_down: |
   | [.../impl/stats/AbstractColumnStatisticsCollector.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9zZWdtZW50L2NyZWF0b3IvaW1wbC9zdGF0cy9BYnN0cmFjdENvbHVtblN0YXRpc3RpY3NDb2xsZWN0b3IuamF2YQ==) | `0.00% <0.00%> (-92.50%)` | :arrow_down: |
   | [.../loader/defaultcolumn/DefaultColumnStatistics.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9zZWdtZW50L2luZGV4L2xvYWRlci9kZWZhdWx0Y29sdW1uL0RlZmF1bHRDb2x1bW5TdGF0aXN0aWNzLmphdmE=) | `0.00% <0.00%> (-76.00%)` | :arrow_down: |
   | [...java/org/apache/pinot/segment/spi/V1Constants.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1zcGkvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL3Bpbm90L3NlZ21lbnQvc3BpL1YxQ29uc3RhbnRzLmphdmE=) | `0.00% <ø> (-14.29%)` | :arrow_down: |
   | ... and [1108 more](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/pinot/pull/8224?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/pinot/pull/8224?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [fa0db64...6a56b0a](https://codecov.io/gh/apache/pinot/pull/8224?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] codecov-commenter edited a comment on pull request #8224: Implemented BoundedColumnValue partition function

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #8224:
URL: https://github.com/apache/pinot/pull/8224#issuecomment-1044093075


   # [Codecov](https://codecov.io/gh/apache/pinot/pull/8224?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#8224](https://codecov.io/gh/apache/pinot/pull/8224?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (f2f131f) into [master](https://codecov.io/gh/apache/pinot/commit/fa0db64facb0f3102ebfe9052fdd83db9f532ce7?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (fa0db64) will **increase** coverage by `0.04%`.
   > The diff coverage is `77.92%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/pinot/pull/8224/graphs/tree.svg?width=650&height=150&src=pr&token=4ibza2ugkz&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/pinot/pull/8224?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #8224      +/-   ##
   ============================================
   + Coverage     63.98%   64.03%   +0.04%     
   - Complexity     4239     4241       +2     
   ============================================
     Files          1584     1586       +2     
     Lines         83250    83394     +144     
     Branches      12608    12640      +32     
   ============================================
   + Hits          53268    53399     +131     
   - Misses        26144    26154      +10     
   - Partials       3838     3841       +3     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | unittests1 | `66.97% <79.16%> (+0.03%)` | :arrow_up: |
   | unittests2 | `14.10% <3.89%> (+0.01%)` | :arrow_up: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/pinot/pull/8224?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [.../core/realtime/PinotLLCRealtimeSegmentManager.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29udHJvbGxlci9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29udHJvbGxlci9oZWxpeC9jb3JlL3JlYWx0aW1lL1Bpbm90TExDUmVhbHRpbWVTZWdtZW50TWFuYWdlci5qYXZh) | `66.10% <0.00%> (ø)` | |
   | [...manager/realtime/LLRealtimeSegmentDataManager.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9kYXRhL21hbmFnZXIvcmVhbHRpbWUvTExSZWFsdGltZVNlZ21lbnREYXRhTWFuYWdlci5qYXZh) | `45.44% <0.00%> (ø)` | |
   | [...ocal/indexsegment/mutable/IntermediateSegment.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9pbmRleHNlZ21lbnQvbXV0YWJsZS9JbnRlcm1lZGlhdGVTZWdtZW50LmphdmE=) | `69.62% <0.00%> (-0.45%)` | :arrow_down: |
   | [...ltime/converter/stats/MutableColumnStatistics.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9yZWFsdGltZS9jb252ZXJ0ZXIvc3RhdHMvTXV0YWJsZUNvbHVtblN0YXRpc3RpY3MuamF2YQ==) | `53.22% <0.00%> (-1.78%)` | :arrow_down: |
   | [...verter/stats/MutableNoDictionaryColStatistics.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9yZWFsdGltZS9jb252ZXJ0ZXIvc3RhdHMvTXV0YWJsZU5vRGljdGlvbmFyeUNvbFN0YXRpc3RpY3MuamF2YQ==) | `40.90% <0.00%> (-4.10%)` | :arrow_down: |
   | [.../loader/defaultcolumn/DefaultColumnStatistics.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1sb2NhbC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3Qvc2VnbWVudC9sb2NhbC9zZWdtZW50L2luZGV4L2xvYWRlci9kZWZhdWx0Y29sdW1uL0RlZmF1bHRDb2x1bW5TdGF0aXN0aWNzLmphdmE=) | `73.07% <0.00%> (-2.93%)` | :arrow_down: |
   | [...java/org/apache/pinot/segment/spi/V1Constants.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1zcGkvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL3Bpbm90L3NlZ21lbnQvc3BpL1YxQ29uc3RhbnRzLmphdmE=) | `14.28% <ø> (ø)` | |
   | [...he/pinot/segment/spi/creator/ColumnStatistics.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1zcGkvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL3Bpbm90L3NlZ21lbnQvc3BpL2NyZWF0b3IvQ29sdW1uU3RhdGlzdGljcy5qYXZh) | `100.00% <ø> (ø)` | |
   | [...partition/BoundedColumnValuePartitionFunction.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1zcGkvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL3Bpbm90L3NlZ21lbnQvc3BpL3BhcnRpdGlvbi9Cb3VuZGVkQ29sdW1uVmFsdWVQYXJ0aXRpb25GdW5jdGlvbi5qYXZh) | `70.58% <70.58%> (ø)` | |
   | [...pi/partition/metadata/ColumnPartitionMetadata.java](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc2VnbWVudC1zcGkvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL3Bpbm90L3NlZ21lbnQvc3BpL3BhcnRpdGlvbi9tZXRhZGF0YS9Db2x1bW5QYXJ0aXRpb25NZXRhZGF0YS5qYXZh) | `88.00% <83.33%> (+1.95%)` | :arrow_up: |
   | ... and [32 more](https://codecov.io/gh/apache/pinot/pull/8224/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/pinot/pull/8224?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/pinot/pull/8224?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [fa0db64...f2f131f](https://codecov.io/gh/apache/pinot/pull/8224?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org