Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/02/17 19:50:52 UTC

[GitHub] [iceberg] aokolnychyi commented on a change in pull request #2240: Auto promote sorted column metrics to full

aokolnychyi commented on a change in pull request #2240:
URL: https://github.com/apache/iceberg/pull/2240#discussion_r577899216



##########
File path: core/src/main/java/org/apache/iceberg/TableProperties.java
##########
@@ -120,6 +120,8 @@ private TableProperties() {
   public static final String METRICS_MODE_COLUMN_CONF_PREFIX = "write.metadata.metrics.column.";
   public static final String DEFAULT_WRITE_METRICS_MODE = "write.metadata.metrics.default";
   public static final String DEFAULT_WRITE_METRICS_MODE_DEFAULT = "truncate(16)";
+  public static final String SORTED_COL_DEFAULT_METRICS_MODE = "write.metadata.sorted.metrics.default";

Review comment:
       Actually, I think we should simplify it a bit. The use case I was talking about is when the user configures the default metrics mode as `none` or `counts` but creates a table with a sort order. In that case, we should promote the metrics mode for sort columns to at least `truncate(16)` unless the user sets a mode for those columns explicitly. Promoting to `full` is probably too dangerous because the values may be too long. I guess `truncate(16)` is a reasonable default for sort columns, and truncation only applies to string and binary columns. Longs/integers are not affected.
   
   Internally, we may want to change the default value to `counts` instead of `truncate(16)` for tables with many columns, as we have a lot of tables with 100+ columns and we don't want to write a ton of unnecessary metadata. But I am not sure that the community wants to do the same.
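
   To make the proposed rule concrete, here is a rough sketch of how the mode for a single column could be resolved. It is not the actual Iceberg implementation: the class `SortColumnMetricsSketch`, the method `resolveMode`, and the constant `SORT_COLUMN_FLOOR` are hypothetical names, modes are handled as plain strings, and the sketch only promotes when the table default is `none` or `counts` (it does not compare truncation lengths). The two property keys and the `truncate(16)` default are the ones shown in the diff above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

public class SortColumnMetricsSketch {
  // Property keys and default value as they appear in TableProperties in the diff above.
  static final String COLUMN_PREFIX = "write.metadata.metrics.column.";
  static final String DEFAULT_MODE_PROP = "write.metadata.metrics.default";
  static final String DEFAULT_MODE = "truncate(16)";

  // Hypothetical constant: the minimum mode a sort column is promoted to.
  static final String SORT_COLUMN_FLOOR = "truncate(16)";

  // Resolves the metrics mode for one column (modes are plain strings in this sketch).
  static String resolveMode(Map<String, String> props, Set<String> sortColumns, String column) {
    // 1. An explicit per-column setting always wins, even for sort columns.
    String explicit = props.get(COLUMN_PREFIX + column);
    if (explicit != null) {
      return explicit;
    }

    // 2. Otherwise start from the table-wide default.
    String tableDefault = props.getOrDefault(DEFAULT_MODE_PROP, DEFAULT_MODE);

    // 3. Promote sort columns only when the default would drop their bounds.
    boolean dropsBounds =
        tableDefault.equalsIgnoreCase("none") || tableDefault.equalsIgnoreCase("counts");
    if (dropsBounds && sortColumns.contains(column)) {
      return SORT_COLUMN_FLOOR;
    }

    return tableDefault;
  }

  public static void main(String[] args) {
    Map<String, String> props = new HashMap<>();
    props.put(DEFAULT_MODE_PROP, "counts");
    props.put(COLUMN_PREFIX + "event_time", "full");

    // Hypothetical column names; the table is sorted by event_time and category.
    Set<String> sortColumns = Set.of("event_time", "category");

    System.out.println(resolveMode(props, sortColumns, "event_time")); // explicit setting kept
    System.out.println(resolveMode(props, sortColumns, "category"));   // promoted sort column
    System.out.println(resolveMode(props, sortColumns, "payload"));    // table default
  }
}
```

   With the table default set to `counts`, only the sort column without an explicit setting is promoted; every other column keeps whatever the user configured.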



