Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/07/06 22:15:27 UTC

[GitHub] [iceberg] szehon-ho commented on a diff in pull request #5215: Core: Update MetricsConfig to use a default for first 32 columns

szehon-ho commented on code in PR #5215:
URL: https://github.com/apache/iceberg/pull/5215#discussion_r915289730


##########
core/src/main/java/org/apache/iceberg/MetricsConfig.java:
##########
@@ -195,6 +198,16 @@ private static MetricsMode sortedColumnDefaultMode(MetricsMode defaultMode) {
     }
   }
 
+  private static MetricsMode parseMode(String modeString, MetricsMode fallback, String context) {
+    try {
+      return MetricsModes.fromString(modeString);
+    } catch (IllegalArgumentException err) {
+      // User override was invalid, log the error and use the default
+      LOG.warn("Ignoring invalid metrics mode ({}): {}", context, modeString, err);

Review Comment:
   Will this actually log err? It looks like the parameterized version of LOG.warn.



##########
core/src/main/java/org/apache/iceberg/TableProperties.java:
##########
@@ -267,6 +267,10 @@ private TableProperties() {
   public static final String METADATA_DELETE_AFTER_COMMIT_ENABLED = "write.metadata.delete-after-commit.enabled";
   public static final boolean METADATA_DELETE_AFTER_COMMIT_ENABLED_DEFAULT = false;
 
+  public static final String METRICS_MAX_INFERRED_COLUMN_DEFAULTS =

Review Comment:
   I agree with adding this table property. I initially made one, but it was taken out during the discussions. It is indeed a somewhat confusing config, but I don't see a better option.



##########
core/src/main/java/org/apache/iceberg/MetricsConfig.java:
##########
@@ -136,50 +131,58 @@ public static MetricsConfig forPositionDelete(Table table) {
   }
 
   /**
-   * Generate a MetricsConfig for all columns based on overrides, sortOrder, and defaultMode.
+   * Generate a MetricsConfig for all columns based on overrides, schema, and sort order.
+   *
    * @param props will be read for metrics overrides (write.metadata.metrics.column.*) and default
    *              (write.metadata.metrics.default)
+   * @param schema table schema
    * @param order sort order columns, will be promoted to truncate(16)
-   * @param defaultMode default, if not set by user property
    * @return metrics configuration
    */
-  private static MetricsConfig from(Map<String, String> props, SortOrder order, String defaultMode) {
+  private static MetricsConfig from(Map<String, String> props, Schema schema, SortOrder order) {
+    int maxInferredDefaultColumns = PropertyUtil.propertyAsInt(props,

Review Comment:
   Add a precondition that it's >= 0?
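
A minimal sketch of the kind of check being asked for, using only the JDK. The property key, the default of 32 (taken from the PR title), and the class and method names are assumptions for illustration; they are not the PR's actual code, which would more likely use the project's own precondition and property helpers. The motivation is that a negative limit would otherwise reach a later subList(0, limit) call and fail there with a less helpful error.

```java
import java.util.HashMap;
import java.util.Map;

final class MaxInferredColumnsSketch {
  // Hypothetical property key and default, used only for this illustration.
  static final String MAX_INFERRED_KEY = "example.metrics.max-inferred-column-defaults";
  static final int MAX_INFERRED_DEFAULT = 32;

  private MaxInferredColumnsSketch() {}

  /** Reads the limit from the properties and rejects negative values up front. */
  static int maxInferredColumns(Map<String, String> props) {
    String value = props.get(MAX_INFERRED_KEY);
    int limit = value == null ? MAX_INFERRED_DEFAULT : Integer.parseInt(value);
    if (limit < 0) {
      // Fail early with a message that names the offending property.
      throw new IllegalArgumentException(
          "Invalid value for " + MAX_INFERRED_KEY + ": " + limit + " (must be >= 0)");
    }
    return limit;
  }

  public static void main(String[] args) {
    Map<String, String> props = new HashMap<>();
    System.out.println(maxInferredColumns(props)); // falls back to the default

    props.put(MAX_INFERRED_KEY, "-1");
    try {
      maxInferredColumns(props);
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

Zero is allowed here on the assumption that it is a meaningful setting (no columns receive the inferred default); only negative values are rejected.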



##########
core/src/main/java/org/apache/iceberg/MetricsConfig.java:
##########
@@ -136,50 +131,58 @@ public static MetricsConfig forPositionDelete(Table table) {
   }
 
   /**
-   * Generate a MetricsConfig for all columns based on overrides, sortOrder, and defaultMode.
+   * Generate a MetricsConfig for all columns based on overrides, schema, and sort order.
+   *
    * @param props will be read for metrics overrides (write.metadata.metrics.column.*) and default
    *              (write.metadata.metrics.default)
+   * @param schema table schema
    * @param order sort order columns, will be promoted to truncate(16)
-   * @param defaultMode default, if not set by user property
    * @return metrics configuration
    */
-  private static MetricsConfig from(Map<String, String> props, SortOrder order, String defaultMode) {
+  private static MetricsConfig from(Map<String, String> props, Schema schema, SortOrder order) {
+    int maxInferredDefaultColumns = PropertyUtil.propertyAsInt(props,
+        TableProperties.METRICS_MAX_INFERRED_COLUMN_DEFAULTS,
+        TableProperties.METRICS_MAX_INFERRED_COLUMN_DEFAULTS_DEFAULT);
     Map<String, MetricsMode> columnModes = Maps.newHashMap();
 
     // Handle user override of default mode
-    MetricsMode finalDefaultMode;
-    String defaultModeAsString = props.getOrDefault(DEFAULT_WRITE_METRICS_MODE, defaultMode);
-    try {
-      finalDefaultMode = MetricsModes.fromString(defaultModeAsString);
-    } catch (IllegalArgumentException err) {
-      // User override was invalid, log the error and use the default
-      LOG.warn("Ignoring invalid default metrics mode: {}", defaultModeAsString, err);
-      finalDefaultMode = MetricsModes.fromString(defaultMode);
+    MetricsMode defaultMode;
+    String configuredDefault = props.get(DEFAULT_WRITE_METRICS_MODE);
+    if (configuredDefault != null) {
+      // a user-configured default mode is applied for all columns
+      defaultMode = parseMode(configuredDefault, DEFAULT_MODE, "default");
+
+    } else if (schema == null || schema.columns().size() <= maxInferredDefaultColumns) {
+      // there are less than the inferred limit, so the default is used everywhere
+      defaultMode = DEFAULT_MODE;
+
+    } else {
+      // a inferred default mode is applied to the first few columns, up to the limit
+      Schema subSchema = new Schema(schema.columns().subList(0, maxInferredDefaultColumns));

Review Comment:
   OK, so we may still have metrics for more than 32 fields (for example, if the chosen 32 columns are nested).
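
To make the point concrete, here is a small self-contained sketch. It uses a hypothetical Field class as a stand-in for schema fields (it is not Iceberg's Schema or Types API) and assumes that metrics are tracked per leaf field. Taking the first N top-level columns with subList can then cover more than N leaf fields when some of those columns are structs.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class NestedColumnCountSketch {
  /** Hypothetical stand-in for a schema field: a name plus child fields. */
  static final class Field {
    final String name;
    final List<Field> children;

    Field(String name, Field... children) {
      this.name = name;
      this.children =
          children.length == 0 ? Collections.<Field>emptyList() : Arrays.asList(children);
    }
  }

  private NestedColumnCountSketch() {}

  /** Counts leaf fields; a field with no children is treated as a primitive leaf. */
  static int countLeaves(List<Field> fields) {
    int count = 0;
    for (Field field : fields) {
      count += field.children.isEmpty() ? 1 : countLeaves(field.children);
    }
    return count;
  }

  /** Mirrors the subList(0, limit) step: keeps only the first top-level columns. */
  static List<Field> firstTopLevel(List<Field> columns, int limit) {
    return columns.subList(0, Math.min(limit, columns.size()));
  }

  /** Three top-level columns, one of which is a struct with three leaf fields. */
  static List<Field> sampleColumns() {
    return Arrays.asList(
        new Field("id"),
        new Field("location", new Field("lat"), new Field("lon"), new Field("alt")),
        new Field("name"));
  }

  public static void main(String[] args) {
    List<Field> selected = firstTopLevel(sampleColumns(), 2);
    // Two top-level columns are selected, but they contain four leaf fields.
    System.out.println(selected.size() + " columns, " + countLeaves(selected) + " leaf fields");
  }
}
```

With a limit of 2, the selection contains id and location, which together hold four leaf fields, so the limit bounds top-level columns rather than the number of fields that receive metrics.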



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

