Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2020/03/16 19:04:41 UTC

[GitHub] [incubator-pinot] kishoreg commented on a change in pull request #5147: Support default star-tree

kishoreg commented on a change in pull request #5147: Support default star-tree
URL: https://github.com/apache/incubator-pinot/pull/5147#discussion_r393247188
 
 

 ##########
 File path: pinot-core/src/main/java/org/apache/pinot/core/startree/v2/builder/StarTreeV2BuilderConfig.java
 ##########
 @@ -58,6 +69,73 @@ public static StarTreeV2BuilderConfig fromIndexConfig(StarTreeIndexConfig indexC
     return builder.build();
   }
 
+  /**
+   * Generates default config based on the segment metadata.
+   * <ul>
+   *   <li>
+   *     All dictionary-encoded single-value dimensions (including date-time columns) with cardinality smaller or equal
+   *     to the threshold will be included in the split order, sorted by their cardinality in descending order
+   *   </li>
+   *   <li>Time column (if exists and dictionary-encoded) will be appended to the split order as the last element</li>
+   *   <li>Use COUNT(*) and SUM for all numeric metrics as function column pairs</li>
+   *   <li>Use default value for max leaf records</li>
+   * </ul>
+   */
+  public static StarTreeV2BuilderConfig generateDefaultConfig(SegmentMetadataImpl segmentMetadata) {
+    Schema schema = segmentMetadata.getSchema();
+    List<ColumnMetadata> dimensionColumnMetadataList = new ArrayList<>();
+    String timeColumn = null;
+    List<String> numericMetrics = new ArrayList<>();
+
+    for (FieldSpec fieldSpec : schema.getAllFieldSpecs()) {
+      if (!fieldSpec.isSingleValueField() || fieldSpec.isVirtualColumn()) {
+        continue;
+      }
+      String column = fieldSpec.getName();
+      switch (fieldSpec.getFieldType()) {
+        case DIMENSION:
+        case DATE_TIME:
+          ColumnMetadata columnMetadata = segmentMetadata.getColumnMetadataFor(column);
 
 Review comment:
   Why are we checking the cardinality threshold for DATE_TIME columns but not for the time column?
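
   To make the question concrete, here is a minimal, self-contained sketch of the split-order rule as the Javadoc above describes it. It is not the PR's code: `SplitOrderSketch`, `Column`, `Kind`, the `thresholdOnTime` flag, and the threshold of 10,000 are all hypothetical names and values chosen for illustration, and the real builder reads `SegmentMetadataImpl` and `ColumnMetadata` instead. With `thresholdOnTime` set to `false` the time column is always appended last; with `true` it is subject to the same cardinality check as the other columns.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in types for illustration; none of these are Pinot classes.
public class SplitOrderSketch {
  public enum Kind { DIMENSION, DATE_TIME, TIME }

  public static final class Column {
    final String name;
    final Kind kind;
    final int cardinality;

    public Column(String name, Kind kind, int cardinality) {
      this.name = name;
      this.kind = kind;
      this.cardinality = cardinality;
    }
  }

  // Builds a split order: dimension and date-time columns whose cardinality is
  // within the threshold, sorted by cardinality descending, then the time column.
  // thresholdOnTime is an assumed switch that applies the same check to the time column.
  public static List<String> splitOrder(List<Column> columns, int threshold, boolean thresholdOnTime) {
    List<Column> selected = new ArrayList<>();
    String timeColumn = null;
    for (Column column : columns) {
      if (column.kind == Kind.TIME) {
        if (!thresholdOnTime || column.cardinality <= threshold) {
          timeColumn = column.name;
        }
      } else if (column.cardinality <= threshold) {
        selected.add(column);
      }
    }
    selected.sort(Comparator.comparingInt((Column c) -> c.cardinality).reversed());

    List<String> order = new ArrayList<>();
    for (Column column : selected) {
      order.add(column.name);
    }
    if (timeColumn != null) {
      order.add(timeColumn); // time column always goes last
    }
    return order;
  }

  public static void main(String[] args) {
    List<Column> columns = Arrays.asList(
        new Column("country", Kind.DIMENSION, 200),
        new Column("userId", Kind.DIMENSION, 5_000_000),
        new Column("eventDate", Kind.DATE_TIME, 365),
        new Column("ts", Kind.TIME, 50_000));

    System.out.println(splitOrder(columns, 10_000, false)); // [eventDate, country, ts]
    System.out.println(splitOrder(columns, 10_000, true));  // [eventDate, country]
  }
}
```

   In this example the high-cardinality `ts` column still ends up in the split order unless the threshold is applied to it as well, which is the asymmetry the question is about.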
