Posted to issues@iceberg.apache.org by "dpaani (via GitHub)" <gi...@apache.org> on 2023/05/11 14:32:43 UTC

[GitHub] [iceberg] dpaani opened a new pull request, #7585: Adding support for custom partition spec during rewrite

dpaani opened a new pull request, #7585:
URL: https://github.com/apache/iceberg/pull/7585

   Ref: https://github.com/apache/iceberg/issues/7557


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #7585: Adding support for custom partition spec during rewrite

Posted by "RussellSpitzer (via GitHub)" <gi...@apache.org>.
RussellSpitzer commented on code in PR #7585:
URL: https://github.com/apache/iceberg/pull/7585#discussion_r1192712144


##########
spark/v3.4/spark/src/test/java/org/apache/iceberg/spark/actions/TestRewriteDataFilesAction.java:
##########
@@ -1384,6 +1384,99 @@ public void testRewriteJobOrderFilesDesc() {
     Assert.assertNotEquals("Number of files order should not be ascending", actual, expected);
   }
 
+  @Test
+  public void testZOrderSortPartitionEvolution() {
+    int originalFiles = 20;
+    Table table = createTable(originalFiles);
+    shouldHaveLastCommitUnsorted(table, "c2");
+    shouldHaveFiles(table, originalFiles);
+
+    Stream.of(Expressions.bucket("c1", 2), Expressions.bucket("c2", 4))
+        .forEach(expr -> table.updateSpec().addField(expr).commit());
+
+    long dataSizeBefore = testDataSize(table);
+
+    RewriteDataFiles.Result result =
+        basicRewrite(table)
+            .zOrder("c2", "c3")
+            .option(
+                SortStrategy.MAX_FILE_SIZE_BYTES,
+                Integer.toString((averageFileSize(table) / 2) + 2))
+            // Divide files in 2
+            .option(
+                RewriteDataFiles.TARGET_FILE_SIZE_BYTES,
+                Integer.toString(averageFileSize(table) / 2))
+            .option(SortStrategy.MIN_INPUT_FILES, "1")
+            .execute();
+
+    Assert.assertEquals("Should have 1 fileGroups", 1, result.rewriteResults().size());
+    assertThat(result.rewrittenBytesCount()).isEqualTo(dataSizeBefore);
+  }
+
+  @Test
+  public void testRewriteWithDifferentOutputSpecIds() {
+    Table table = createTable(10);
+    shouldHaveFiles(table, 10);
+
+    table.updateProperties().set("write.spark.fanout.enabled", "true").commit();

Review Comment:
   Why is this set? 





[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #7585: Adding support for custom partition spec during rewrite

Posted by "RussellSpitzer (via GitHub)" <gi...@apache.org>.
RussellSpitzer commented on code in PR #7585:
URL: https://github.com/apache/iceberg/pull/7585#discussion_r1204808311


##########
spark/v3.4/spark/src/test/java/org/apache/iceberg/spark/actions/TestRewriteDataFilesAction.java:
##########
@@ -1384,6 +1384,104 @@ public void testRewriteJobOrderFilesDesc() {
     Assert.assertNotEquals("Number of files order should not be ascending", actual, expected);
   }
 
+  @Test
+  public void testZOrderSortPartitionEvolution() {
+    int originalFiles = 20;
+    Table table = createTable(originalFiles);
+    shouldHaveLastCommitUnsorted(table, "c2");
+    shouldHaveFiles(table, originalFiles);
+
+    table
+        .updateSpec()
+        .addField(Expressions.bucket("c1", 2))
+        .addField(Expressions.bucket("c2", 2))
+        .commit();
+
+    long dataSizeBefore = testDataSize(table);
+
+    RewriteDataFiles.Result result =
+        basicRewrite(table)
+            .zOrder("c2", "c3")
+            .option(
+                SortStrategy.MAX_FILE_SIZE_BYTES,
+                Integer.toString((averageFileSize(table) / 2) + 2))
+            // Divide files in 2
+            .option(
+                RewriteDataFiles.TARGET_FILE_SIZE_BYTES,
+                Integer.toString(averageFileSize(table) / 2))
+            .option(SortStrategy.MIN_INPUT_FILES, "1")
+            .execute();
+
+    Assert.assertEquals("Should have 1 fileGroups", 1, result.rewriteResults().size());
+    assertThat(result.rewrittenBytesCount()).isEqualTo(dataSizeBefore);
+  }
+
+  @Test
+  public void testRewriteWithDifferentOutputSpecIds() {
+    Table table = createTable(10);
+    shouldHaveFiles(table, 10);
+
+    // simulate multiple partition specs with different commit
+    table.updateSpec().addField(Expressions.truncate("c2", 2)).commit();
+    table.updateSpec().addField(Expressions.bucket("c3", 2)).commit();
+
+    performRewriteAndAssertForAllTableSpecs(table, "bin-pack");
+    performRewriteAndAssertForAllTableSpecs(table, "sort");
+    performRewriteAndAssertForAllTableSpecs(table, "zOrder");
+  }
+
+  private void performRewriteAndAssertForAllTableSpecs(Table table, String strategy) {

Review Comment:
   I still think we are trying to hide too much in helper functions here. The function is difficult to read and assumes a lot about the current state of the system. Can we try to limit the state implied? For example, asserting that table.specs() has size 3 is a bit odd.





[GitHub] [iceberg] dpaani commented on pull request #7585: Adding support for custom partition spec during rewrite

Posted by "dpaani (via GitHub)" <gi...@apache.org>.
dpaani commented on PR #7585:
URL: https://github.com/apache/iceberg/pull/7585#issuecomment-1552162099

   Hi @RussellSpitzer, could you please re-check this PR? I have addressed your earlier review comments.




[GitHub] [iceberg] dpaani commented on a diff in pull request #7585: Adding support for custom partition spec during rewrite

Posted by "dpaani (via GitHub)" <gi...@apache.org>.
dpaani commented on code in PR #7585:
URL: https://github.com/apache/iceberg/pull/7585#discussion_r1194195459


##########
core/src/main/java/org/apache/iceberg/actions/SizeBasedFileRewriter.java:
##########
@@ -103,13 +103,16 @@
 
   private static final long SPLIT_OVERHEAD = 5 * 1024;
 
+  public static final String OUTPUT_SPEC_ID = "output-spec-id";

Review Comment:
   Added javadoc.





[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #7585: Adding support for custom partition spec during rewrite

Posted by "RussellSpitzer (via GitHub)" <gi...@apache.org>.
RussellSpitzer commented on code in PR #7585:
URL: https://github.com/apache/iceberg/pull/7585#discussion_r1204802749


##########
spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/actions/SparkShufflingDataRewriter.java:
##########
@@ -122,11 +136,12 @@ private Dataset<Row> transformPlan(Dataset<Row> df, Function<LogicalPlan, Logica
   }
 
   private org.apache.iceberg.SortOrder outputSortOrder(List<FileScanTask> group) {
-    boolean includePartitionColumns = !group.get(0).spec().equals(table().spec());
+    PartitionSpec spec = table().specs().get(outputSpecId());
+    boolean includePartitionColumns = !group.get(0).spec().equals(spec);

Review Comment:
   Let's rename this variable, since I think the original name is confusing here. Maybe just call it requiresRepartitioning.





[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #7585: Adding support for custom partition spec during rewrite

Posted by "RussellSpitzer (via GitHub)" <gi...@apache.org>.
RussellSpitzer commented on code in PR #7585:
URL: https://github.com/apache/iceberg/pull/7585#discussion_r1192713437


##########
spark/v3.4/spark/src/test/java/org/apache/iceberg/spark/actions/TestRewriteDataFilesAction.java:
##########
@@ -1384,6 +1384,99 @@ public void testRewriteJobOrderFilesDesc() {
     Assert.assertNotEquals("Number of files order should not be ascending", actual, expected);
   }
 
+  @Test
+  public void testZOrderSortPartitionEvolution() {
+    int originalFiles = 20;
+    Table table = createTable(originalFiles);
+    shouldHaveLastCommitUnsorted(table, "c2");
+    shouldHaveFiles(table, originalFiles);
+
+    Stream.of(Expressions.bucket("c1", 2), Expressions.bucket("c2", 4))
+        .forEach(expr -> table.updateSpec().addField(expr).commit());
+
+    long dataSizeBefore = testDataSize(table);
+
+    RewriteDataFiles.Result result =
+        basicRewrite(table)
+            .zOrder("c2", "c3")
+            .option(
+                SortStrategy.MAX_FILE_SIZE_BYTES,
+                Integer.toString((averageFileSize(table) / 2) + 2))
+            // Divide files in 2
+            .option(
+                RewriteDataFiles.TARGET_FILE_SIZE_BYTES,
+                Integer.toString(averageFileSize(table) / 2))
+            .option(SortStrategy.MIN_INPUT_FILES, "1")
+            .execute();
+
+    Assert.assertEquals("Should have 1 fileGroups", 1, result.rewriteResults().size());
+    assertThat(result.rewrittenBytesCount()).isEqualTo(dataSizeBefore);
+  }
+
+  @Test
+  public void testRewriteWithDifferentOutputSpecIds() {
+    Table table = createTable(10);
+    shouldHaveFiles(table, 10);
+
+    table.updateProperties().set("write.spark.fanout.enabled", "true").commit();
+    table.replaceSortOrder().asc("c1").asc("c2").commit();
+
+    Stream.of(Expressions.bucket("c1", 2), Expressions.bucket("c2", 4))
+        .forEach(expr -> table.updateSpec().addField(expr).commit());
+
+    performRewriteAndAssertForAllTableSpecs(table, "bin-pack");
+    performRewriteAndAssertForAllTableSpecs(table, "sort");
+    performRewriteAndAssertForAllTableSpecs(table, "zOrder");
+  }
+
+  private void performRewriteAndAssertForAllTableSpecs(Table table, String strategy) {
+    assertThat(strategy).isIn("bin-pack", "sort", "zOrder");

Review Comment:
   This seems unnecessary. Isn't it just checking that the lines directly above are correct?





[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #7585: Adding support for custom partition spec during rewrite

Posted by "RussellSpitzer (via GitHub)" <gi...@apache.org>.
RussellSpitzer commented on code in PR #7585:
URL: https://github.com/apache/iceberg/pull/7585#discussion_r1204805379


##########
spark/v3.4/spark/src/test/java/org/apache/iceberg/spark/actions/TestRewriteDataFilesAction.java:
##########
@@ -1384,6 +1384,104 @@ public void testRewriteJobOrderFilesDesc() {
     Assert.assertNotEquals("Number of files order should not be ascending", actual, expected);
   }
 
+  @Test
+  public void testZOrderSortPartitionEvolution() {
+    int originalFiles = 20;
+    Table table = createTable(originalFiles);
+    shouldHaveLastCommitUnsorted(table, "c2");
+    shouldHaveFiles(table, originalFiles);
+
+    table
+        .updateSpec()
+        .addField(Expressions.bucket("c1", 2))
+        .addField(Expressions.bucket("c2", 2))
+        .commit();
+
+    long dataSizeBefore = testDataSize(table);
+
+    RewriteDataFiles.Result result =
+        basicRewrite(table)
+            .zOrder("c2", "c3")
+            .option(
+                SortStrategy.MAX_FILE_SIZE_BYTES,
+                Integer.toString((averageFileSize(table) / 2) + 2))
+            // Divide files in 2
+            .option(
+                RewriteDataFiles.TARGET_FILE_SIZE_BYTES,
+                Integer.toString(averageFileSize(table) / 2))
+            .option(SortStrategy.MIN_INPUT_FILES, "1")
+            .execute();
+
+    Assert.assertEquals("Should have 1 fileGroups", 1, result.rewriteResults().size());
+    assertThat(result.rewrittenBytesCount()).isEqualTo(dataSizeBefore);
+  }
+
+  @Test
+  public void testRewriteWithDifferentOutputSpecIds() {
+    Table table = createTable(10);
+    shouldHaveFiles(table, 10);
+
+    // simulate multiple partition specs with different commit
+    table.updateSpec().addField(Expressions.truncate("c2", 2)).commit();
+    table.updateSpec().addField(Expressions.bucket("c3", 2)).commit();
+
+    performRewriteAndAssertForAllTableSpecs(table, "bin-pack");
+    performRewriteAndAssertForAllTableSpecs(table, "sort");
+    performRewriteAndAssertForAllTableSpecs(table, "zOrder");
+  }
+
+  private void performRewriteAndAssertForAllTableSpecs(Table table, String strategy) {
+    assertThat(table.specs()).hasSize(3);
+
+    table
+        .specs()
+        .entrySet()
+        .forEach(
+            specEntry -> {
+              long dataSize = testDataSize(table);
+              long count = currentData().size();
+
+              RewriteDataFiles.Result result =
+                  executeRewriteStrategy(table, specEntry.getKey(), strategy);
+              assertThat(dataSize).isEqualTo(result.rewrittenBytesCount());
+
+              long postRewriteCount = currentData().size();
+              assertThat(postRewriteCount).isEqualTo(count);
+
+              assertSpecIdFromDataFiles(specEntry, currentDataFiles(table));
+            });
+  }
+
+  private RewriteDataFiles.Result executeRewriteStrategy(
+      Table table, Integer outputSpecId, String strategy) {
+
+    RewriteDataFiles rewriteDataFiles =
+        basicRewrite(table)
+            .option(SparkWriteOptions.OUTPUT_SPEC_ID, String.valueOf(outputSpecId))
+            .option(BinPackStrategy.REWRITE_ALL, "true");
+
+    RewriteDataFiles.Result result = null;
+    if (strategy.equals("bin-pack")) {
+      result = rewriteDataFiles.binPack().execute();
+    } else if (strategy.equals("sort")) {
+      result =
+          rewriteDataFiles
+              .sort(SortOrder.builderFor(table.schema()).asc("c2").asc("c3").build())
+              .execute();
+    } else if (strategy.equals("zOrder")) {
+      result = rewriteDataFiles.zOrder("c2", "c3").execute();
+    }
+    return result;
+  }
+
+  private void assertSpecIdFromDataFiles(
+      Map.Entry<Integer, PartitionSpec> specEntry, List<DataFile> filesPostRewriteAll) {
+    List<Integer> specIds =
+        filesPostRewriteAll.stream().map(DataFile::specId).distinct().collect(Collectors.toList());
+
+    assertThat(specIds).containsOnlyOnce(specEntry.getKey());

Review Comment:
   I'm not sure this is the right check. It will be true for any specId that exists, since we called "distinct". In that case it should just be "contains()". But that's also probably not right here, because we expect there to be only a single element. Let's make sure this is testing exactly what we want.
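
A minimal sketch of the concern above, written in plain Java so it runs without Iceberg or AssertJ on the classpath. The class `SpecIdCheckSketch` and its methods are hypothetical stand-ins: the integer lists play the role of the `DataFile::specId` values, and the two predicates only approximate what `containsOnlyOnce` and an exact-match assertion would report.

```java
import java.util.List;
import java.util.stream.Collectors;

final class SpecIdCheckSketch {
  // Hypothetical stand-in for collecting DataFile::specId values after a rewrite.
  static List<Integer> distinctSpecIds(List<Integer> fileSpecIds) {
    return fileSpecIds.stream().distinct().collect(Collectors.toList());
  }

  // Weak check: after distinct(), any id that is present appears exactly once,
  // so this passes even when files written with other spec ids remain.
  static boolean containsOnlyOnce(List<Integer> distinctIds, int expected) {
    return distinctIds.stream().filter(id -> id == expected).count() == 1;
  }

  // Strict check: the expected spec id is the only one present.
  static boolean containsExactly(List<Integer> distinctIds, int expected) {
    return distinctIds.size() == 1 && distinctIds.get(0) == expected;
  }
}
```

For files with spec ids `[0, 2, 2, 0]` and an expected id of `2`, the weak check passes while the strict one fails. With AssertJ, the stricter form would presumably be `assertThat(specIds).containsExactly(specEntry.getKey())`, though whether that is the intended check is for the PR author to confirm.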





[GitHub] [iceberg] dpaani commented on a diff in pull request #7585: Adding support for custom partition spec during rewrite

Posted by "dpaani (via GitHub)" <gi...@apache.org>.
dpaani commented on code in PR #7585:
URL: https://github.com/apache/iceberg/pull/7585#discussion_r1194198647


##########
spark/v3.4/spark/src/test/java/org/apache/iceberg/spark/actions/TestRewriteDataFilesAction.java:
##########
@@ -1384,6 +1384,99 @@ public void testRewriteJobOrderFilesDesc() {
     Assert.assertNotEquals("Number of files order should not be ascending", actual, expected);
   }
 
+  @Test
+  public void testZOrderSortPartitionEvolution() {
+    int originalFiles = 20;
+    Table table = createTable(originalFiles);
+    shouldHaveLastCommitUnsorted(table, "c2");
+    shouldHaveFiles(table, originalFiles);
+
+    Stream.of(Expressions.bucket("c1", 2), Expressions.bucket("c2", 4))
+        .forEach(expr -> table.updateSpec().addField(expr).commit());
+
+    long dataSizeBefore = testDataSize(table);
+
+    RewriteDataFiles.Result result =
+        basicRewrite(table)
+            .zOrder("c2", "c3")
+            .option(
+                SortStrategy.MAX_FILE_SIZE_BYTES,
+                Integer.toString((averageFileSize(table) / 2) + 2))
+            // Divide files in 2
+            .option(
+                RewriteDataFiles.TARGET_FILE_SIZE_BYTES,
+                Integer.toString(averageFileSize(table) / 2))
+            .option(SortStrategy.MIN_INPUT_FILES, "1")
+            .execute();
+
+    Assert.assertEquals("Should have 1 fileGroups", 1, result.rewriteResults().size());
+    assertThat(result.rewrittenBytesCount()).isEqualTo(dataSizeBefore);
+  }
+
+  @Test
+  public void testRewriteWithDifferentOutputSpecIds() {
+    Table table = createTable(10);
+    shouldHaveFiles(table, 10);
+
+    table.updateProperties().set("write.spark.fanout.enabled", "true").commit();
+    table.replaceSortOrder().asc("c1").asc("c2").commit();

Review Comment:
   Since the test was originally written for the sort rewrite, I had added a sort order. I have now removed it and pass the sort order as a parameter during the sort rewrite.





[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #7585: Adding support for custom partition spec during rewrite

Posted by "RussellSpitzer (via GitHub)" <gi...@apache.org>.
RussellSpitzer commented on code in PR #7585:
URL: https://github.com/apache/iceberg/pull/7585#discussion_r1204630847


##########
api/src/main/java/org/apache/iceberg/actions/RewriteDataFiles.java:
##########
@@ -120,6 +120,14 @@
 
   String REWRITE_JOB_ORDER_DEFAULT = RewriteJobOrder.NONE.orderName();
 
+  /**
+   * Constant representing the target output partition specification ID.
+   *
+   * <p>This ID is used by the file rewriter during the rewrite operation to identify the specific
+   * output partition.

Review Comment:
   partition -> "partition spec. This allows rewriting files into a partitioning which is not the current schema's partitioning. When not specified the current partitioning is used."
   
   Or something like that?





[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #7585: Adding support for custom partition spec during rewrite

Posted by "RussellSpitzer (via GitHub)" <gi...@apache.org>.
RussellSpitzer commented on code in PR #7585:
URL: https://github.com/apache/iceberg/pull/7585#discussion_r1192709295


##########
spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/actions/SparkShufflingDataRewriter.java:
##########
@@ -67,6 +69,10 @@ protected SparkShufflingDataRewriter(SparkSession spark, Table table) {
 
   protected abstract org.apache.iceberg.SortOrder sortOrder();
 
+  protected Schema schema() {

Review Comment:
   Can we name this something sort specific? sortSchema? I want to be clear that this is not necessarily the output schema or the input schema. Javadoc may help





[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #7585: Adding support for custom partition spec during rewrite

Posted by "RussellSpitzer (via GitHub)" <gi...@apache.org>.
RussellSpitzer commented on code in PR #7585:
URL: https://github.com/apache/iceberg/pull/7585#discussion_r1192710705


##########
spark/v3.4/spark/src/test/java/org/apache/iceberg/spark/actions/TestRewriteDataFilesAction.java:
##########
@@ -1384,6 +1384,99 @@ public void testRewriteJobOrderFilesDesc() {
     Assert.assertNotEquals("Number of files order should not be ascending", actual, expected);
   }
 
+  @Test
+  public void testZOrderSortPartitionEvolution() {
+    int originalFiles = 20;
+    Table table = createTable(originalFiles);
+    shouldHaveLastCommitUnsorted(table, "c2");
+    shouldHaveFiles(table, originalFiles);
+
+    Stream.of(Expressions.bucket("c1", 2), Expressions.bucket("c2", 4))

Review Comment:
   Any reason why we are adding this in two separate commits? It should repro with a single update, correct?





[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #7585: Adding support for custom partition spec during rewrite

Posted by "RussellSpitzer (via GitHub)" <gi...@apache.org>.
RussellSpitzer commented on code in PR #7585:
URL: https://github.com/apache/iceberg/pull/7585#discussion_r1192706550


##########
core/src/main/java/org/apache/iceberg/actions/SizeBasedFileRewriter.java:
##########
@@ -103,13 +103,16 @@
 
   private static final long SPLIT_OVERHEAD = 5 * 1024;
 
+  public static final String OUTPUT_SPEC_ID = "output-spec-id";

Review Comment:
   Needs Javadoc; see the options above, such as MAX_FILE_GROUP_SIZE.
   Also, this property is probably not SizeBasedFileRewriter-specific. I think it should live in RewriteDataFiles, since it applies to all possible future rewrites.





[GitHub] [iceberg] dpaani commented on a diff in pull request #7585: Adding support for custom partition spec during rewrite

Posted by "dpaani (via GitHub)" <gi...@apache.org>.
dpaani commented on code in PR #7585:
URL: https://github.com/apache/iceberg/pull/7585#discussion_r1194196706


##########
spark/v3.4/spark/src/test/java/org/apache/iceberg/spark/actions/TestRewriteDataFilesAction.java:
##########
@@ -1384,6 +1384,99 @@ public void testRewriteJobOrderFilesDesc() {
     Assert.assertNotEquals("Number of files order should not be ascending", actual, expected);
   }
 
+  @Test
+  public void testZOrderSortPartitionEvolution() {
+    int originalFiles = 20;
+    Table table = createTable(originalFiles);
+    shouldHaveLastCommitUnsorted(table, "c2");
+    shouldHaveFiles(table, originalFiles);
+
+    Stream.of(Expressions.bucket("c1", 2), Expressions.bucket("c2", 4))

Review Comment:
   I wanted to have 3 different partition specs to test.
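
To illustrate why separate commits matter for that goal, here is a toy model in plain Java. It is not the Iceberg API: `SpecHistorySketch` is a hypothetical class in which each `commit(...)` call stands in for one `table.updateSpec()...commit()`, and it assumes the test table starts with a single unpartitioned spec.

```java
import java.util.ArrayList;
import java.util.List;

final class SpecHistorySketch {
  // Each element is the field list of one partition spec, oldest first.
  private final List<List<String>> specs = new ArrayList<>();

  SpecHistorySketch() {
    specs.add(List.of()); // assumed initial state: one unpartitioned spec
  }

  // Models one spec update commit: the new spec extends the latest one
  // with every field added in that single update.
  void commit(String... addedFields) {
    List<String> next = new ArrayList<>(specs.get(specs.size() - 1));
    next.addAll(List.of(addedFields));
    specs.add(next);
  }

  int specCount() {
    return specs.size();
  }
}
```

In this model, two commits of one field each leave three specs in the history, while a single commit that adds both fields leaves two.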





[GitHub] [iceberg] dpaani commented on a diff in pull request #7585: Adding support for custom partition spec during rewrite

Posted by "dpaani (via GitHub)" <gi...@apache.org>.
dpaani commented on code in PR #7585:
URL: https://github.com/apache/iceberg/pull/7585#discussion_r1194195792


##########
spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/actions/SparkShufflingDataRewriter.java:
##########
@@ -67,6 +69,10 @@ protected SparkShufflingDataRewriter(SparkSession spark, Table table) {
 
   protected abstract org.apache.iceberg.SortOrder sortOrder();
 
+  protected Schema schema() {

Review Comment:
   Renamed to sortSchema and added Javadoc.





[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #7585: Adding support for custom partition spec during rewrite

Posted by "RussellSpitzer (via GitHub)" <gi...@apache.org>.
RussellSpitzer commented on code in PR #7585:
URL: https://github.com/apache/iceberg/pull/7585#discussion_r1204749594


##########
core/src/main/java/org/apache/iceberg/actions/SizeBasedFileRewriter.java:
##########
@@ -250,6 +252,11 @@ protected long writeMaxFileSize() {
     return (long) (targetFileSize + ((maxFileSize - targetFileSize) * 0.5));
   }
 
+  /** Output spec id rewriter to use. Default to current spec id */

Review Comment:
   Never mind, I think it probably makes more sense here, although the Javadoc comment is not accurate. This code only returns the outputSpecId saved by this class; the default logic is in the other function.
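
A rough sketch of the split being described, in plain Java. Only the option key `"output-spec-id"` comes from the diff; the class `OutputSpecIdSketch`, its constructor arguments, and the `init` method are assumptions made for illustration and are not the PR's actual code. The point is that the Javadoc about defaulting belongs on the method that resolves the option, not on the accessor.

```java
import java.util.Map;
import java.util.Set;

final class OutputSpecIdSketch {
  static final String OUTPUT_SPEC_ID = "output-spec-id";

  private final int currentSpecId;          // hypothetical: the table's current spec id
  private final Set<Integer> knownSpecIds;  // hypothetical: ids of all specs in the table
  private int outputSpecId;

  OutputSpecIdSketch(int currentSpecId, Set<Integer> knownSpecIds) {
    this.currentSpecId = currentSpecId;
    this.knownSpecIds = knownSpecIds;
    this.outputSpecId = currentSpecId;
  }

  /** Resolves the option; the default to the current spec id is applied here. */
  void init(Map<String, String> options) {
    String value = options.get(OUTPUT_SPEC_ID);
    int specId = value == null ? currentSpecId : Integer.parseInt(value);
    if (!knownSpecIds.contains(specId)) {
      throw new IllegalArgumentException("Unknown output spec id: " + specId);
    }
    this.outputSpecId = specId;
  }

  /** Returns the output spec id resolved by init; no default logic lives here. */
  int outputSpecId() {
    return outputSpecId;
  }
}
```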





[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #7585: Adding support for custom partition spec during rewrite

Posted by "RussellSpitzer (via GitHub)" <gi...@apache.org>.
RussellSpitzer commented on code in PR #7585:
URL: https://github.com/apache/iceberg/pull/7585#discussion_r1204806850


##########
spark/v3.4/spark/src/test/java/org/apache/iceberg/spark/actions/TestRewriteDataFilesAction.java:
##########
@@ -1384,6 +1384,104 @@ public void testRewriteJobOrderFilesDesc() {
     Assert.assertNotEquals("Number of files order should not be ascending", actual, expected);
   }
 
+  @Test
+  public void testZOrderSortPartitionEvolution() {
+    int originalFiles = 20;
+    Table table = createTable(originalFiles);
+    shouldHaveLastCommitUnsorted(table, "c2");
+    shouldHaveFiles(table, originalFiles);
+
+    table
+        .updateSpec()
+        .addField(Expressions.bucket("c1", 2))
+        .addField(Expressions.bucket("c2", 2))
+        .commit();
+
+    long dataSizeBefore = testDataSize(table);
+
+    RewriteDataFiles.Result result =
+        basicRewrite(table)
+            .zOrder("c2", "c3")
+            .option(
+                SortStrategy.MAX_FILE_SIZE_BYTES,
+                Integer.toString((averageFileSize(table) / 2) + 2))
+            // Divide files in 2
+            .option(
+                RewriteDataFiles.TARGET_FILE_SIZE_BYTES,
+                Integer.toString(averageFileSize(table) / 2))
+            .option(SortStrategy.MIN_INPUT_FILES, "1")
+            .execute();
+
+    Assert.assertEquals("Should have 1 fileGroups", 1, result.rewriteResults().size());
+    assertThat(result.rewrittenBytesCount()).isEqualTo(dataSizeBefore);
+  }
+
+  @Test
+  public void testRewriteWithDifferentOutputSpecIds() {
+    Table table = createTable(10);
+    shouldHaveFiles(table, 10);
+
+    // simulate multiple partition specs with different commit
+    table.updateSpec().addField(Expressions.truncate("c2", 2)).commit();
+    table.updateSpec().addField(Expressions.bucket("c3", 2)).commit();
+
+    performRewriteAndAssertForAllTableSpecs(table, "bin-pack");
+    performRewriteAndAssertForAllTableSpecs(table, "sort");
+    performRewriteAndAssertForAllTableSpecs(table, "zOrder");
+  }
+
+  private void performRewriteAndAssertForAllTableSpecs(Table table, String strategy) {
+    assertThat(table.specs()).hasSize(3);
+
+    table
+        .specs()
+        .entrySet()
+        .forEach(
+            specEntry -> {
+              long dataSize = testDataSize(table);
+              long count = currentData().size();
+
+              RewriteDataFiles.Result result =
+                  executeRewriteStrategy(table, specEntry.getKey(), strategy);
+              assertThat(dataSize).isEqualTo(result.rewrittenBytesCount());
+
+              long postRewriteCount = currentData().size();
+              assertThat(postRewriteCount).isEqualTo(count);
+
+              assertSpecIdFromDataFiles(specEntry, currentDataFiles(table));
+            });
+  }
+
+  private RewriteDataFiles.Result executeRewriteStrategy(
+      Table table, Integer outputSpecId, String strategy) {
+
+    RewriteDataFiles rewriteDataFiles =
+        basicRewrite(table)
+            .option(SparkWriteOptions.OUTPUT_SPEC_ID, String.valueOf(outputSpecId))
+            .option(BinPackStrategy.REWRITE_ALL, "true");
+
+    RewriteDataFiles.Result result = null;
+    if (strategy.equals("bin-pack")) {
+      result = rewriteDataFiles.binPack().execute();
+    } else if (strategy.equals("sort")) {
+      result =
+          rewriteDataFiles
+              .sort(SortOrder.builderFor(table.schema()).asc("c2").asc("c3").build())
+              .execute();
+    } else if (strategy.equals("zOrder")) {
+      result = rewriteDataFiles.zOrder("c2", "c3").execute();
+    }
+    return result;
+  }
+
+  private void assertSpecIdFromDataFiles(

Review Comment:
   It's not clear from the name what this function checks. It probably needs a rename and a defined purpose. At the moment it seems to check whether a single id is used for any data file in a list of data files. We probably want a function that just checks whether only a single partition spec id is used.
   
   The function argument is also confusing: why do we take "specEntry" when we don't use the PartitionSpec?
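
   A rough sketch of the kind of helper described above, for illustration only. `FileRef` is a hypothetical stand-in for Iceberg's `DataFile` (only its `specId()` accessor is mirrored), and the names `SingleSpecIdCheck` and `shouldHaveOnlySpecId` are made up here; none of them come from the PR. The helper takes the expected spec id directly rather than a map entry, and fails unless every file uses exactly that id.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class SingleSpecIdCheck {

  /** Hypothetical stand-in for Iceberg's DataFile; only the spec id matters here. */
  static final class FileRef {
    private final String path;
    private final int specId;

    FileRef(String path, int specId) {
      this.path = path;
      this.specId = specId;
    }

    String path() {
      return path;
    }

    int specId() {
      return specId;
    }
  }

  /** Collects the distinct partition spec ids used by the given files. */
  static Set<Integer> distinctSpecIds(List<FileRef> files) {
    return files.stream().map(FileRef::specId).collect(Collectors.toSet());
  }

  /**
   * Fails unless the files use exactly one spec id and it is the expected one.
   * An empty list also fails, since no file uses the expected id.
   */
  static void shouldHaveOnlySpecId(List<FileRef> files, int expectedSpecId) {
    Set<Integer> specIds = distinctSpecIds(files);
    if (!specIds.equals(Collections.singleton(expectedSpecId))) {
      throw new AssertionError(
          "Expected only spec id " + expectedSpecId + " but found " + specIds);
    }
  }

  public static void main(String[] args) {
    List<FileRef> rewritten =
        Arrays.asList(new FileRef("a.parquet", 2), new FileRef("b.parquet", 2));
    // Passes silently because both files were written with spec id 2.
    shouldHaveOnlySpecId(rewritten, 2);
  }
}
```

   In the actual test, the list would presumably come from `currentDataFiles(table)` and the expected id from the spec id passed to the rewrite.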





[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #7585: Adding support for custom partition spec during rewrite

Posted by "RussellSpitzer (via GitHub)" <gi...@apache.org>.
RussellSpitzer commented on code in PR #7585:
URL: https://github.com/apache/iceberg/pull/7585#discussion_r1204808629


##########
spark/v3.4/spark/src/test/java/org/apache/iceberg/spark/actions/TestRewriteDataFilesAction.java:
##########
@@ -1384,6 +1384,104 @@ public void testRewriteJobOrderFilesDesc() {
     Assert.assertNotEquals("Number of files order should not be ascending", actual, expected);
   }
 
+  @Test
+  public void testZOrderSortPartitionEvolution() {
+    int originalFiles = 20;
+    Table table = createTable(originalFiles);
+    shouldHaveLastCommitUnsorted(table, "c2");
+    shouldHaveFiles(table, originalFiles);
+
+    table
+        .updateSpec()
+        .addField(Expressions.bucket("c1", 2))
+        .addField(Expressions.bucket("c2", 2))
+        .commit();
+
+    long dataSizeBefore = testDataSize(table);
+
+    RewriteDataFiles.Result result =
+        basicRewrite(table)
+            .zOrder("c2", "c3")
+            .option(
+                SortStrategy.MAX_FILE_SIZE_BYTES,
+                Integer.toString((averageFileSize(table) / 2) + 2))
+            // Divide files in 2
+            .option(
+                RewriteDataFiles.TARGET_FILE_SIZE_BYTES,
+                Integer.toString(averageFileSize(table) / 2))
+            .option(SortStrategy.MIN_INPUT_FILES, "1")
+            .execute();
+
+    Assert.assertEquals("Should have 1 fileGroups", 1, result.rewriteResults().size());
+    assertThat(result.rewrittenBytesCount()).isEqualTo(dataSizeBefore);
+  }
+
+  @Test
+  public void testRewriteWithDifferentOutputSpecIds() {
+    Table table = createTable(10);

Review Comment:
   I think we should just break this out into 3 tests: one for binpack, one for sort, and one for zorder.





[GitHub] [iceberg] dpaani commented on a diff in pull request #7585: Adding support for custom partition spec during rewrite

Posted by "dpaani (via GitHub)" <gi...@apache.org>.
dpaani commented on code in PR #7585:
URL: https://github.com/apache/iceberg/pull/7585#discussion_r1194197056


##########
spark/v3.4/spark/src/test/java/org/apache/iceberg/spark/actions/TestRewriteDataFilesAction.java:
##########
@@ -1384,6 +1384,99 @@ public void testRewriteJobOrderFilesDesc() {
     Assert.assertNotEquals("Number of files order should not be ascending", actual, expected);
   }
 
+  @Test
+  public void testZOrderSortPartitionEvolution() {
+    int originalFiles = 20;
+    Table table = createTable(originalFiles);
+    shouldHaveLastCommitUnsorted(table, "c2");
+    shouldHaveFiles(table, originalFiles);
+
+    Stream.of(Expressions.bucket("c1", 2), Expressions.bucket("c2", 4))
+        .forEach(expr -> table.updateSpec().addField(expr).commit());
+
+    long dataSizeBefore = testDataSize(table);
+
+    RewriteDataFiles.Result result =
+        basicRewrite(table)
+            .zOrder("c2", "c3")
+            .option(
+                SortStrategy.MAX_FILE_SIZE_BYTES,
+                Integer.toString((averageFileSize(table) / 2) + 2))
+            // Divide files in 2
+            .option(
+                RewriteDataFiles.TARGET_FILE_SIZE_BYTES,
+                Integer.toString(averageFileSize(table) / 2))
+            .option(SortStrategy.MIN_INPUT_FILES, "1")
+            .execute();
+
+    Assert.assertEquals("Should have 1 fileGroups", 1, result.rewriteResults().size());
+    assertThat(result.rewrittenBytesCount()).isEqualTo(dataSizeBefore);
+  }
+
+  @Test
+  public void testRewriteWithDifferentOutputSpecIds() {
+    Table table = createTable(10);
+    shouldHaveFiles(table, 10);
+
+    table.updateProperties().set("write.spark.fanout.enabled", "true").commit();

Review Comment:
   Not required. Removed it.





[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #7585: Adding support for custom partition spec during rewrite

Posted by "RussellSpitzer (via GitHub)" <gi...@apache.org>.
RussellSpitzer commented on code in PR #7585:
URL: https://github.com/apache/iceberg/pull/7585#discussion_r1204752874


##########
spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/actions/SparkShufflingDataRewriter.java:
##########
@@ -67,6 +69,17 @@ protected SparkShufflingDataRewriter(SparkSession spark, Table table) {
 
   protected abstract org.apache.iceberg.SortOrder sortOrder();
 
+  /**
+   * Retrieves and returns the schema for the rewrite using the current table schema.

Review Comment:
   I think this should be rewritten. We don't need the explanation that it can be overridden and I think we should mention "The schema with all columns required for correctly sorting the table. This may include additional computed columns which are not written to the table but are used for sorting."





[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #7585: Adding support for custom partition spec during rewrite

Posted by "RussellSpitzer (via GitHub)" <gi...@apache.org>.
RussellSpitzer commented on code in PR #7585:
URL: https://github.com/apache/iceberg/pull/7585#discussion_r1204748497


##########
core/src/main/java/org/apache/iceberg/actions/SizeBasedFileRewriter.java:
##########
@@ -250,6 +252,11 @@ protected long writeMaxFileSize() {
     return (long) (targetFileSize + ((maxFileSize - targetFileSize) * 0.5));
   }
 
+  /** Output spec id rewriter to use. Default to current spec id */

Review Comment:
   nit: why is this function here and not with the outputSpecId(config) method?
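
   For context, a minimal sketch of the behavior the Javadoc describes: read the output spec id from the rewrite options and fall back to the table's current spec id when it is not set. Everything here is an assumption made for illustration: the option key string `"output-spec-id"` (the diff only shows the constant `SparkWriteOptions.OUTPUT_SPEC_ID`), the class name `OutputSpecIdResolver`, and the check against known spec ids. It is not the code under review.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class OutputSpecIdResolver {

  // Assumed option key; the diff only references SparkWriteOptions.OUTPUT_SPEC_ID.
  static final String OUTPUT_SPEC_ID = "output-spec-id";

  /**
   * Returns the spec id to write with: the configured one if present,
   * otherwise the table's current spec id. The validation against known
   * spec ids is an assumption added for this sketch.
   */
  static int outputSpecId(
      Map<String, String> options, int currentSpecId, Set<Integer> knownSpecIds) {
    String value = options.get(OUTPUT_SPEC_ID);
    if (value == null) {
      return currentSpecId;
    }

    int specId = Integer.parseInt(value);
    if (!knownSpecIds.contains(specId)) {
      throw new IllegalArgumentException(
          "Cannot use output spec id " + specId + ": table has no spec with that id");
    }
    return specId;
  }

  public static void main(String[] args) {
    Set<Integer> specs = new HashSet<>(Arrays.asList(0, 1, 2));
    Map<String, String> options = new HashMap<>();

    // No option set: falls back to the current spec id.
    System.out.println(outputSpecId(options, 2, specs)); // prints 2

    // Option set: the configured spec id wins.
    options.put(OUTPUT_SPEC_ID, "1");
    System.out.println(outputSpecId(options, 2, specs)); // prints 1
  }
}
```

   Keeping the accessor next to the method that parses the option would put the default and the lookup in one place.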





[GitHub] [iceberg] dpaani commented on pull request #7585: Adding support for custom partition spec during rewrite

Posted by "dpaani (via GitHub)" <gi...@apache.org>.
dpaani commented on PR #7585:
URL: https://github.com/apache/iceberg/pull/7585#issuecomment-1544102352

   Hi @RussellSpitzer @gustavoatt, could you please review this PR? TIA
   




[GitHub] [iceberg] dpaani commented on a diff in pull request #7585: Adding support for custom partition spec during rewrite

Posted by "dpaani (via GitHub)" <gi...@apache.org>.
dpaani commented on code in PR #7585:
URL: https://github.com/apache/iceberg/pull/7585#discussion_r1194198895


##########
spark/v3.4/spark/src/test/java/org/apache/iceberg/spark/actions/TestRewriteDataFilesAction.java:
##########
@@ -1384,6 +1384,99 @@ public void testRewriteJobOrderFilesDesc() {
     Assert.assertNotEquals("Number of files order should not be ascending", actual, expected);
   }
 
+  @Test
+  public void testZOrderSortPartitionEvolution() {
+    int originalFiles = 20;
+    Table table = createTable(originalFiles);
+    shouldHaveLastCommitUnsorted(table, "c2");
+    shouldHaveFiles(table, originalFiles);
+
+    Stream.of(Expressions.bucket("c1", 2), Expressions.bucket("c2", 4))
+        .forEach(expr -> table.updateSpec().addField(expr).commit());
+
+    long dataSizeBefore = testDataSize(table);
+
+    RewriteDataFiles.Result result =
+        basicRewrite(table)
+            .zOrder("c2", "c3")
+            .option(
+                SortStrategy.MAX_FILE_SIZE_BYTES,
+                Integer.toString((averageFileSize(table) / 2) + 2))
+            // Divide files in 2
+            .option(
+                RewriteDataFiles.TARGET_FILE_SIZE_BYTES,
+                Integer.toString(averageFileSize(table) / 2))
+            .option(SortStrategy.MIN_INPUT_FILES, "1")
+            .execute();
+
+    Assert.assertEquals("Should have 1 fileGroups", 1, result.rewriteResults().size());
+    assertThat(result.rewrittenBytesCount()).isEqualTo(dataSizeBefore);
+  }
+
+  @Test
+  public void testRewriteWithDifferentOutputSpecIds() {
+    Table table = createTable(10);
+    shouldHaveFiles(table, 10);
+
+    table.updateProperties().set("write.spark.fanout.enabled", "true").commit();
+    table.replaceSortOrder().asc("c1").asc("c2").commit();
+
+    Stream.of(Expressions.bucket("c1", 2), Expressions.bucket("c2", 4))
+        .forEach(expr -> table.updateSpec().addField(expr).commit());
+
+    performRewriteAndAssertForAllTableSpecs(table, "bin-pack");
+    performRewriteAndAssertForAllTableSpecs(table, "sort");
+    performRewriteAndAssertForAllTableSpecs(table, "zOrder");
+  }
+
+  private void performRewriteAndAssertForAllTableSpecs(Table table, String strategy) {
+    assertThat(strategy).isIn("bin-pack", "sort", "zOrder");

Review Comment:
   Agreed. Removed it.





[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #7585: Adding support for custom partition spec during rewrite

Posted by "RussellSpitzer (via GitHub)" <gi...@apache.org>.
RussellSpitzer commented on code in PR #7585:
URL: https://github.com/apache/iceberg/pull/7585#discussion_r1192712614


##########
spark/v3.4/spark/src/test/java/org/apache/iceberg/spark/actions/TestRewriteDataFilesAction.java:
##########
@@ -1384,6 +1384,99 @@ public void testRewriteJobOrderFilesDesc() {
     Assert.assertNotEquals("Number of files order should not be ascending", actual, expected);
   }
 
+  @Test
+  public void testZOrderSortPartitionEvolution() {
+    int originalFiles = 20;
+    Table table = createTable(originalFiles);
+    shouldHaveLastCommitUnsorted(table, "c2");
+    shouldHaveFiles(table, originalFiles);
+
+    Stream.of(Expressions.bucket("c1", 2), Expressions.bucket("c2", 4))
+        .forEach(expr -> table.updateSpec().addField(expr).commit());
+
+    long dataSizeBefore = testDataSize(table);
+
+    RewriteDataFiles.Result result =
+        basicRewrite(table)
+            .zOrder("c2", "c3")
+            .option(
+                SortStrategy.MAX_FILE_SIZE_BYTES,
+                Integer.toString((averageFileSize(table) / 2) + 2))
+            // Divide files in 2
+            .option(
+                RewriteDataFiles.TARGET_FILE_SIZE_BYTES,
+                Integer.toString(averageFileSize(table) / 2))
+            .option(SortStrategy.MIN_INPUT_FILES, "1")
+            .execute();
+
+    Assert.assertEquals("Should have 1 fileGroups", 1, result.rewriteResults().size());
+    assertThat(result.rewrittenBytesCount()).isEqualTo(dataSizeBefore);
+  }
+
+  @Test
+  public void testRewriteWithDifferentOutputSpecIds() {
+    Table table = createTable(10);
+    shouldHaveFiles(table, 10);
+
+    table.updateProperties().set("write.spark.fanout.enabled", "true").commit();
+    table.replaceSortOrder().asc("c1").asc("c2").commit();

Review Comment:
   Why are we configuring sort order?


