You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/11/30 02:23:29 UTC

[GitHub] [iceberg] chenjunjiedada opened a new pull request, #6313: Flink: use correct metric config for position deletes

chenjunjiedada opened a new pull request, #6313:
URL: https://github.com/apache/iceberg/pull/6313

   `FlinkAppenderFactory` uses the wrong metric configs for position deletes, leading to ineffective filtering.  Refactoring the whole write path to `FileWriterFactory` could fix this while we are using `FileAppenderFactory` for a long time. Not sure what other benefits can bring if we change that. So here is a simple fix.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] stevenzwu commented on a diff in pull request #6313: Flink: use correct metric config for position deletes

Posted by GitBox <gi...@apache.org>.
stevenzwu commented on code in PR #6313:
URL: https://github.com/apache/iceberg/pull/6313#discussion_r1040423889


##########
core/src/main/java/org/apache/iceberg/MetricsConfig.java:
##########
@@ -107,6 +107,30 @@ public static MetricsConfig forPositionDelete(Table table) {
     return new MetricsConfig(columnModes.build(), defaultMode);
   }
 
+  /**
+   * Creates a metrics config for a position delete file.
+   *
+   * @param props table configuration
+   */
+  public static MetricsConfig forPositionDelete(Map<String, String> props) {

Review Comment:
   @chenjunjiedada hmm. I think we should revert the last commit. if we are going to pass in the `Table` object, it should be non-null. Then we can consistently use `forTable` everywhere.
   
   I would also be open with the `forProperties` approach, as you were saying that schema and sort order are for data rows/files.  I had some reservations because `MetricConfig` deprecated `forProperties` for data rows/files. It is a little weird to introduce it back for position delete. Flink can use `forPositionDelete(Table table)` API by passing a valid `SerializableTable` object to `FlinkAppenderFactory`. With `Table` object, we can remove the `Schema` and `PartitionSpec` from the current `FlinkAppenderFactory` constructor



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] stevenzwu commented on a diff in pull request #6313: Flink: use correct metric config for position deletes

Posted by GitBox <gi...@apache.org>.
stevenzwu commented on code in PR #6313:
URL: https://github.com/apache/iceberg/pull/6313#discussion_r1041205774


##########
core/src/main/java/org/apache/iceberg/MetricsConfig.java:
##########
@@ -107,6 +107,30 @@ public static MetricsConfig forPositionDelete(Table table) {
     return new MetricsConfig(columnModes.build(), defaultMode);
   }
 
+  /**
+   * Creates a metrics config for a position delete file.
+   *
+   * @param props table configuration
+   */
+  public static MetricsConfig forPositionDelete(Map<String, String> props) {

Review Comment:
   BTW, Anton had a previous attempt for moving Flink to `SerializableTable` for Flink source/reader. https://github.com/apache/iceberg/pull/2987.
   
   We can revive that effort as a separate PR? this can be merged after that.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] stevenzwu commented on a diff in pull request #6313: Flink: use correct metric config for position deletes

Posted by GitBox <gi...@apache.org>.
stevenzwu commented on code in PR #6313:
URL: https://github.com/apache/iceberg/pull/6313#discussion_r1040423889


##########
core/src/main/java/org/apache/iceberg/MetricsConfig.java:
##########
@@ -107,6 +107,30 @@ public static MetricsConfig forPositionDelete(Table table) {
     return new MetricsConfig(columnModes.build(), defaultMode);
   }
 
+  /**
+   * Creates a metrics config for a position delete file.
+   *
+   * @param props table configuration
+   */
+  public static MetricsConfig forPositionDelete(Map<String, String> props) {

Review Comment:
   @chenjunjiedada hmm. I think we should revert the last commit. if we are going to pass in the `Table` object, it should be non-null. Then we can consistently use `forTable` everywhere.
   
   I would also be open with the `forProperties` approach, as you were saying that schema and sort order are for data rows/files.  I had some reservations because `MetricConfig` deprecated `forProperties` for data rows/files. It is a little weird to introduce it back for position delete. Flink can use `forPositionDelete(Table table)` API by passing a valid `Table` object to `FlinkAppenderFactory`. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] stevenzwu commented on a diff in pull request #6313: Flink: use correct metric config for position deletes

Posted by GitBox <gi...@apache.org>.
stevenzwu commented on code in PR #6313:
URL: https://github.com/apache/iceberg/pull/6313#discussion_r1040425132


##########
flink/v1.15/flink/src/main/java/org/apache/iceberg/flink/sink/FlinkAppenderFactory.java:
##########
@@ -160,7 +184,8 @@ public EqualityDeleteWriter<RowData> newEqDeleteWriter(
         eqDeleteRowSchema,
         "Equality delete row schema shouldn't be null when creating equality-delete writer");
 
-    MetricsConfig metricsConfig = MetricsConfig.fromProperties(props);
+    MetricsConfig metricsConfig =
+        table != null ? MetricsConfig.forTable(table) : MetricsConfig.fromProperties(props);

Review Comment:
   this actually makes a stronger case that we should pass in a valid `Table` object to the `FlinkAppenderFactory` as I mentioned in the other comment. we should also use `forTable` for position delete although it doesn't need schema or SortOrder as you said.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] chenjunjiedada commented on pull request #6313: Flink: use correct metric config for position deletes

Posted by GitBox <gi...@apache.org>.
chenjunjiedada commented on PR #6313:
URL: https://github.com/apache/iceberg/pull/6313#issuecomment-1331564600

   @stevenzwu @hililiwei You guys may have an interest in this.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] stevenzwu merged pull request #6313: Flink: use correct metric config for position deletes

Posted by GitBox <gi...@apache.org>.
stevenzwu merged PR #6313:
URL: https://github.com/apache/iceberg/pull/6313


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] stevenzwu commented on a diff in pull request #6313: Flink: use correct metric config for position deletes

Posted by GitBox <gi...@apache.org>.
stevenzwu commented on code in PR #6313:
URL: https://github.com/apache/iceberg/pull/6313#discussion_r1038640162


##########
core/src/main/java/org/apache/iceberg/MetricsConfig.java:
##########
@@ -107,6 +107,30 @@ public static MetricsConfig forPositionDelete(Table table) {
     return new MetricsConfig(columnModes.build(), defaultMode);
   }
 
+  /**
+   * Creates a metrics config for a position delete file.
+   *
+   * @param props table configuration
+   */
+  public static MetricsConfig forPositionDelete(Map<String, String> props) {

Review Comment:
   @chenjunjiedada what do you think of using the existing API? I know it would require changes in `FlinkAppenderFactory` to pass in the `Table` object.
   ```
   public static MetricsConfig forPositionDelete(Table table) {
   ```
   
   The benefit is that position delete file will get the consistent behavior as data file. unless we are saying position delete has its own schema, shared code path doesn't really make sense.
   ```
     private static MetricsConfig from(Map<String, String> props, Schema schema, SortOrder order) {
       ...
     }
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] chenjunjiedada commented on a diff in pull request #6313: Flink: use correct metric config for position deletes

Posted by GitBox <gi...@apache.org>.
chenjunjiedada commented on code in PR #6313:
URL: https://github.com/apache/iceberg/pull/6313#discussion_r1045778493


##########
core/src/main/java/org/apache/iceberg/MetricsConfig.java:
##########
@@ -107,6 +107,30 @@ public static MetricsConfig forPositionDelete(Table table) {
     return new MetricsConfig(columnModes.build(), defaultMode);
   }
 
+  /**
+   * Creates a metrics config for a position delete file.
+   *
+   * @param props table configuration
+   */
+  public static MetricsConfig forPositionDelete(Map<String, String> props) {

Review Comment:
   @stevenzwu , That one (https://github.com/apache/iceberg/pull/6407) is for source. This PR is for the sink side. Let me take a look at the sink as well.  Maybe eventually we will change back to adopt `FileWriterFactory`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] stevenzwu commented on a diff in pull request #6313: Flink: use correct metric config for position deletes

Posted by GitBox <gi...@apache.org>.
stevenzwu commented on code in PR #6313:
URL: https://github.com/apache/iceberg/pull/6313#discussion_r1035513997


##########
core/src/main/java/org/apache/iceberg/MetricsConfig.java:
##########
@@ -107,6 +107,30 @@ public static MetricsConfig forPositionDelete(Table table) {
     return new MetricsConfig(columnModes.build(), defaultMode);
   }
 
+  /**
+   * Creates a metrics config for a position delete file.
+   *
+   * @param props table configuration
+   */
+  public static MetricsConfig forPositionDelete(Map<String, String> props) {
+    ImmutableMap.Builder<String, MetricsMode> columnModes = ImmutableMap.builder();
+
+    columnModes.put(MetadataColumns.DELETE_FILE_PATH.name(), MetricsModes.Full.get());
+    columnModes.put(MetadataColumns.DELETE_FILE_POS.name(), MetricsModes.Full.get());
+
+    MetricsConfig tableConfig = from(props, null, null);

Review Comment:
   since this is not `forTable` method, it shouldn't be called `tableConfig`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] stevenzwu commented on a diff in pull request #6313: Flink: use correct metric config for position deletes

Posted by GitBox <gi...@apache.org>.
stevenzwu commented on code in PR #6313:
URL: https://github.com/apache/iceberg/pull/6313#discussion_r1046608479


##########
flink/v1.16/flink/src/main/java/org/apache/iceberg/flink/sink/FlinkAppenderFactory.java:
##########
@@ -216,7 +241,8 @@ public EqualityDeleteWriter<RowData> newEqDeleteWriter(
   @Override
   public PositionDeleteWriter<RowData> newPosDeleteWriter(
       EncryptedOutputFile outputFile, FileFormat format, StructLike partition) {
-    MetricsConfig metricsConfig = MetricsConfig.fromProperties(props);
+    MetricsConfig metricsConfig =
+        table != null ? MetricsConfig.forTable(table) : MetricsConfig.fromProperties(props);

Review Comment:
   this should be `forPositionDelete`



##########
flink/v1.16/flink/src/main/java/org/apache/iceberg/flink/sink/FlinkAppenderFactory.java:
##########
@@ -99,7 +122,8 @@ private RowType lazyPosDeleteFlinkSchema() {
 
   @Override
   public FileAppender<RowData> newAppender(OutputFile outputFile, FileFormat format) {
-    MetricsConfig metricsConfig = MetricsConfig.fromProperties(props);
+    MetricsConfig metricsConfig =
+        table != null ? MetricsConfig.forTable(table) : MetricsConfig.fromProperties(props);

Review Comment:
   `MetricsConfig.fromProperties` is deprecated. I am wondering if we should ensure `table` is never null? That would require changing the current 2 constructors (technically breaking) without adding a new constructor. but this class should be an `@Internal` class. what do you think? @pvary @hililiwei @chenjunjiedada 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] chenjunjiedada commented on a diff in pull request #6313: Flink: use correct metric config for position deletes

Posted by GitBox <gi...@apache.org>.
chenjunjiedada commented on code in PR #6313:
URL: https://github.com/apache/iceberg/pull/6313#discussion_r1041674918


##########
core/src/main/java/org/apache/iceberg/MetricsConfig.java:
##########
@@ -107,6 +107,30 @@ public static MetricsConfig forPositionDelete(Table table) {
     return new MetricsConfig(columnModes.build(), defaultMode);
   }
 
+  /**
+   * Creates a metrics config for a position delete file.
+   *
+   * @param props table configuration
+   */
+  public static MetricsConfig forPositionDelete(Map<String, String> props) {

Review Comment:
   Let me take a look at that one fisrtly.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] stevenzwu commented on a diff in pull request #6313: Flink: use correct metric config for position deletes

Posted by GitBox <gi...@apache.org>.
stevenzwu commented on code in PR #6313:
URL: https://github.com/apache/iceberg/pull/6313#discussion_r1035511519


##########
core/src/main/java/org/apache/iceberg/MetricsConfig.java:
##########
@@ -107,6 +107,30 @@ public static MetricsConfig forPositionDelete(Table table) {
     return new MetricsConfig(columnModes.build(), defaultMode);
   }
 
+  /**
+   * Creates a metrics config for a position delete file.
+   *
+   * @param props table configuration
+   */
+  public static MetricsConfig forPositionDelete(Map<String, String> props) {

Review Comment:
   should we use the method from line 91?
   ```
    public static MetricsConfig forPositionDelete(Table table) {
   ```
   
   Flink write properties allows override of table properties for write (e.g. compression configs). Do we have such use cases for metrics config?
   
   I am asking because the current `fromProperties` method (for data files) is deprecated in favor of the method `forTable`. Looks like deprecation is from this PR.
   ```
     /**
      * Creates a metrics config from table configuration.
      *
      * @param props table configuration
      * @deprecated use {@link MetricsConfig#forTable(Table)}
      */
     @Deprecated
     public static MetricsConfig fromProperties(Map<String, String> props) {
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] stevenzwu commented on a diff in pull request #6313: Flink: use correct metric config for position deletes

Posted by GitBox <gi...@apache.org>.
stevenzwu commented on code in PR #6313:
URL: https://github.com/apache/iceberg/pull/6313#discussion_r1046609770


##########
flink/v1.16/flink/src/main/java/org/apache/iceberg/flink/sink/FlinkAppenderFactory.java:
##########
@@ -99,7 +122,8 @@ private RowType lazyPosDeleteFlinkSchema() {
 
   @Override
   public FileAppender<RowData> newAppender(OutputFile outputFile, FileFormat format) {
-    MetricsConfig metricsConfig = MetricsConfig.fromProperties(props);
+    MetricsConfig metricsConfig =
+        table != null ? MetricsConfig.forTable(table) : MetricsConfig.fromProperties(props);

Review Comment:
   `MetricsConfig.fromProperties` is deprecated. I am wondering if we should just ensure `table` is never null? That would require changing the current 2 constructors (technically breaking) without adding a new constructor. but this class should be an `@Internal` class. what do you think? @pvary @hililiwei @chenjunjiedada 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] stevenzwu commented on a diff in pull request #6313: Flink: use correct metric config for position deletes

Posted by GitBox <gi...@apache.org>.
stevenzwu commented on code in PR #6313:
URL: https://github.com/apache/iceberg/pull/6313#discussion_r1041205774


##########
core/src/main/java/org/apache/iceberg/MetricsConfig.java:
##########
@@ -107,6 +107,30 @@ public static MetricsConfig forPositionDelete(Table table) {
     return new MetricsConfig(columnModes.build(), defaultMode);
   }
 
+  /**
+   * Creates a metrics config for a position delete file.
+   *
+   * @param props table configuration
+   */
+  public static MetricsConfig forPositionDelete(Map<String, String> props) {

Review Comment:
   BTW, Anton had a previous attempt for moving Flink to `SerializableTable` for Flink source/reader



##########
core/src/main/java/org/apache/iceberg/MetricsConfig.java:
##########
@@ -107,6 +107,30 @@ public static MetricsConfig forPositionDelete(Table table) {
     return new MetricsConfig(columnModes.build(), defaultMode);
   }
 
+  /**
+   * Creates a metrics config for a position delete file.
+   *
+   * @param props table configuration
+   */
+  public static MetricsConfig forPositionDelete(Map<String, String> props) {

Review Comment:
   BTW, Anton had a previous attempt for moving Flink to `SerializableTable` for Flink source/reader. https://github.com/apache/iceberg/pull/2987



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] chenjunjiedada commented on a diff in pull request #6313: Flink: use correct metric config for position deletes

Posted by GitBox <gi...@apache.org>.
chenjunjiedada commented on code in PR #6313:
URL: https://github.com/apache/iceberg/pull/6313#discussion_r1038691073


##########
core/src/main/java/org/apache/iceberg/MetricsConfig.java:
##########
@@ -107,6 +107,30 @@ public static MetricsConfig forPositionDelete(Table table) {
     return new MetricsConfig(columnModes.build(), defaultMode);
   }
 
+  /**
+   * Creates a metrics config for a position delete file.
+   *
+   * @param props table configuration
+   */
+  public static MetricsConfig forPositionDelete(Map<String, String> props) {

Review Comment:
   Actually, positions delete does NOT share the same behavior as data files, `SortedPosDeleteWriter` sorts the content.
   
   What do you mean by has its own schema? Position deletes can have two schemas, one is [path, pos] and another is [path, pos, row]. Currently, it writes [path, pos] in `BaseTaskWriter`. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] stevenzwu commented on a diff in pull request #6313: Flink: use correct metric config for position deletes

Posted by GitBox <gi...@apache.org>.
stevenzwu commented on code in PR #6313:
URL: https://github.com/apache/iceberg/pull/6313#discussion_r1035511519


##########
core/src/main/java/org/apache/iceberg/MetricsConfig.java:
##########
@@ -107,6 +107,30 @@ public static MetricsConfig forPositionDelete(Table table) {
     return new MetricsConfig(columnModes.build(), defaultMode);
   }
 
+  /**
+   * Creates a metrics config for a position delete file.
+   *
+   * @param props table configuration
+   */
+  public static MetricsConfig forPositionDelete(Map<String, String> props) {

Review Comment:
   should we use the method from line 91?
   ```
    public static MetricsConfig forPositionDelete(Table table) {
   ```
   
   Flink write properties allows override of table properties for write (e.g. compression configs). Do we have such use cases for metrics config?
   
   I am asking because the current `fromProperties` method (for data files) is deprecated in favor of the method `forTable`. Looks like deprecation is from PR #2240 by @szehon-ho .
   ```
     /**
      * Creates a metrics config from table configuration.
      *
      * @param props table configuration
      * @deprecated use {@link MetricsConfig#forTable(Table)}
      */
     @Deprecated
     public static MetricsConfig fromProperties(Map<String, String> props) {
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] chenjunjiedada commented on a diff in pull request #6313: Flink: use correct metric config for position deletes

Posted by GitBox <gi...@apache.org>.
chenjunjiedada commented on code in PR #6313:
URL: https://github.com/apache/iceberg/pull/6313#discussion_r1035466304


##########
build.gradle:
##########
@@ -111,7 +111,7 @@ subprojects {
     revapi {
       oldGroup = project.group
       oldName = project.name
-      oldVersion = "1.1.0"
+      oldVersion = "1.0.0"

Review Comment:
   Just for build, will change back if the master pass the build.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] chenjunjiedada commented on a diff in pull request #6313: Flink: use correct metric config for position deletes

Posted by GitBox <gi...@apache.org>.
chenjunjiedada commented on code in PR #6313:
URL: https://github.com/apache/iceberg/pull/6313#discussion_r1040421576


##########
flink/v1.15/flink/src/main/java/org/apache/iceberg/flink/sink/FlinkAppenderFactory.java:
##########
@@ -160,7 +184,8 @@ public EqualityDeleteWriter<RowData> newEqDeleteWriter(
         eqDeleteRowSchema,
         "Equality delete row schema shouldn't be null when creating equality-delete writer");
 
-    MetricsConfig metricsConfig = MetricsConfig.fromProperties(props);
+    MetricsConfig metricsConfig =
+        table != null ? MetricsConfig.forTable(table) : MetricsConfig.fromProperties(props);

Review Comment:
   The data row in equality delete should share the same metrics config as the table, so I change this as well.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] chenjunjiedada commented on a diff in pull request #6313: Flink: use correct metric config for position deletes

Posted by GitBox <gi...@apache.org>.
chenjunjiedada commented on code in PR #6313:
URL: https://github.com/apache/iceberg/pull/6313#discussion_r1040420644


##########
core/src/main/java/org/apache/iceberg/MetricsConfig.java:
##########
@@ -107,6 +107,30 @@ public static MetricsConfig forPositionDelete(Table table) {
     return new MetricsConfig(columnModes.build(), defaultMode);
   }
 
+  /**
+   * Creates a metrics config for a position delete file.
+   *
+   * @param props table configuration
+   */
+  public static MetricsConfig forPositionDelete(Map<String, String> props) {

Review Comment:
   @stevenzwu I recheck the metric config, the schema and sort order are for the data row. Change to use `forTable` now.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] stevenzwu commented on a diff in pull request #6313: Flink: use correct metric config for position deletes

Posted by GitBox <gi...@apache.org>.
stevenzwu commented on code in PR #6313:
URL: https://github.com/apache/iceberg/pull/6313#discussion_r1035529726


##########
core/src/main/java/org/apache/iceberg/MetricsConfig.java:
##########
@@ -107,6 +107,30 @@ public static MetricsConfig forPositionDelete(Table table) {
     return new MetricsConfig(columnModes.build(), defaultMode);
   }
 
+  /**
+   * Creates a metrics config for a position delete file.
+   *
+   * @param props table configuration
+   */
+  public static MetricsConfig forPositionDelete(Map<String, String> props) {

Review Comment:
   At least, we can avoid code duplication by extracting common code btw these two methods, if we do need this new variant. 
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] stevenzwu commented on a diff in pull request #6313: Flink: use correct metric config for position deletes

Posted by GitBox <gi...@apache.org>.
stevenzwu commented on code in PR #6313:
URL: https://github.com/apache/iceberg/pull/6313#discussion_r1035511519


##########
core/src/main/java/org/apache/iceberg/MetricsConfig.java:
##########
@@ -107,6 +107,30 @@ public static MetricsConfig forPositionDelete(Table table) {
     return new MetricsConfig(columnModes.build(), defaultMode);
   }
 
+  /**
+   * Creates a metrics config for a position delete file.
+   *
+   * @param props table configuration
+   */
+  public static MetricsConfig forPositionDelete(Map<String, String> props) {

Review Comment:
   should we use the method from line 91?
   ```
    public static MetricsConfig forPositionDelete(Table table) {
   ```
   
   Flink write properties allows override of table properties for write (e.g. compression configs). Do we have such use cases for metrics config?
   
   I am asking because the current `fromProperties` method (for data files) is deprecated in favor of the method `forTable`.
   ```
     /**
      * Creates a metrics config from table configuration.
      *
      * @param props table configuration
      * @deprecated use {@link MetricsConfig#forTable(Table)}
      */
     @Deprecated
     public static MetricsConfig fromProperties(Map<String, String> props) {
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] chenjunjiedada commented on a diff in pull request #6313: Flink: use correct metric config for position deletes

Posted by GitBox <gi...@apache.org>.
chenjunjiedada commented on code in PR #6313:
URL: https://github.com/apache/iceberg/pull/6313#discussion_r1035537907


##########
core/src/main/java/org/apache/iceberg/MetricsConfig.java:
##########
@@ -107,6 +107,30 @@ public static MetricsConfig forPositionDelete(Table table) {
     return new MetricsConfig(columnModes.build(), defaultMode);
   }
 
+  /**
+   * Creates a metrics config for a position delete file.
+   *
+   * @param props table configuration
+   */
+  public static MetricsConfig forPositionDelete(Map<String, String> props) {

Review Comment:
   Extraction done. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] stevenzwu commented on pull request #6313: Flink: use correct metric config for position deletes

Posted by GitBox <gi...@apache.org>.
stevenzwu commented on PR #6313:
URL: https://github.com/apache/iceberg/pull/6313#issuecomment-1351737859

   thanks @chenjunjiedada for the contribution


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org