You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by "nastra (via GitHub)" <gi...@apache.org> on 2023/05/10 09:04:11 UTC

[GitHub] [iceberg] nastra commented on a diff in pull request #7361: API, Core, Spark:Add file groups failure in rewrite result

nastra commented on code in PR #7361:
URL: https://github.com/apache/iceberg/pull/7361#discussion_r1189564974


##########
api/src/main/java/org/apache/iceberg/actions/RewriteDataFiles.java:
##########
@@ -197,6 +199,10 @@ default int rewrittenDataFilesCount() {
     default long rewrittenBytesCount() {
       return rewriteResults().stream().mapToLong(FileGroupRewriteResult::rewrittenBytesCount).sum();
     }
+
+    default int failedDataFilesCount() {

Review Comment:
   this should have `@Value.Default`



##########
core/src/main/java/org/apache/iceberg/actions/BaseRewriteDataFilesResult.java:
##########
@@ -30,12 +31,21 @@
 public class BaseRewriteDataFilesResult implements Result {
   private final List<FileGroupRewriteResult> rewriteResults;
 
-  public BaseRewriteDataFilesResult(List<FileGroupRewriteResult> rewriteResults) {
+  private final List<FileGroupFailureResult> rewriteFailures;

Review Comment:
   this class is deprecated and not used anymore, so no need to add changes to it. Not modifying this class also ensures that no API-breaking changes are introduced (as indicated by RevAPI)



##########
api/src/main/java/org/apache/iceberg/actions/RewriteDataFiles.java:
##########
@@ -181,6 +181,8 @@ default RewriteDataFiles zOrder(String... columns) {
   interface Result {
     List<FileGroupRewriteResult> rewriteResults();
 
+    List<FileGroupFailureResult> rewriteFailures();

Review Comment:
   to avoid introducing API-breaking changes, what you'd rather want to do is 
   ```
       @Value.Default
       default List<FileGroupFailureResult> rewriteFailures() {
         return ImmutableList.of();
       }
   ```



##########
spark/v3.4/spark/src/test/java/org/apache/iceberg/spark/actions/TestRewriteDataFilesAction.java:
##########
@@ -756,6 +756,8 @@ public void testPartialProgressWithRewriteFailure() {
     RewriteDataFiles.Result result = spyRewrite.execute();
 
     Assert.assertEquals("Should have 7 fileGroups", result.rewriteResults().size(), 7);
+    Assert.assertEquals("Should have 3 failed fileGroups", result.rewriteFailures().size(), 3);

Review Comment:
   I think it's better to change this to 
   ```
       assertThat(result.rewriteResults()).hasSize(7);
       assertThat(result.rewriteFailures()).hasSize(3);
       assertThat(result.failedDataFilesCount()).isEqualTo(6);
   ```
   
   the advantage here is that the content of `result.rewriteResults()` / `result.rewriteFailures()` will be shown if the assertion ever fails, which makes debugging a lot easier.
   
   Could you please do the same changes further below in L801ff



##########
.palantir/revapi.yml:
##########
@@ -426,6 +426,12 @@ acceptedBreaks:
     - code: "java.field.removedWithConstant"
       old: "field org.apache.iceberg.TableProperties.HMS_TABLE_OWNER"
       justification: "Removing deprecations for 1.3.0"
+    - code: "java.method.numberOfParametersChanged"

Review Comment:
   @waltczhang can you please revert all changes to the `revapi.yml`? Those shouldn't be necessary anymore once my other comments are applied



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org