Posted to issues@iceberg.apache.org by "dpaani (via GitHub)" <gi...@apache.org> on 2023/05/11 15:02:32 UTC

[GitHub] [iceberg] dpaani opened a new issue, #7586: Rewrite with zORDER results in 'Cannot find field 'ICEZVALUE' in struct' Error

dpaani opened a new issue, #7586:
URL: https://github.com/apache/iceberg/issues/7586

   ### Apache Iceberg version
   
   1.2.1 (latest release)
   
   ### Query engine
   
   Spark
   
   ### Please describe the bug 🐞
   
   When executing the following test case, which rewrites data files with Z-order after the partition spec has been evolved, Spark throws an error:
   
   ```
     @Test
     public void testZOrderSortPartitionEvolution() {
       int originalFiles = 20;
       Table table = createTable(originalFiles);
       shouldHaveLastCommitUnsorted(table, "c2");
       shouldHaveFiles(table, originalFiles);
   
       Stream.of(Expressions.bucket("c1", 2), Expressions.bucket("c2", 4))
               .forEach(expr -> table.updateSpec().addField(expr).commit());
   
       long dataSizeBefore = testDataSize(table);
   
       RewriteDataFiles.Result result =
               basicRewrite(table)
                       .zOrder("c2", "c3")
                       .option(
                               SortStrategy.MAX_FILE_SIZE_BYTES,
                               Integer.toString((averageFileSize(table) / 2) + 2))
                       // Divide files in 2
                       .option(
                               RewriteDataFiles.TARGET_FILE_SIZE_BYTES,
                               Integer.toString(averageFileSize(table) / 2))
                       .option(SortStrategy.MIN_INPUT_FILES, "1")
                       .execute();
   
       Assert.assertEquals("Should have 1 fileGroups", 1, result.rewriteResults().size());
       assertThat(result.rewrittenBytesCount()).isEqualTo(dataSizeBefore);
     }
   ```
   
   The error is as follows:
   ```
   Cannot find field 'ICEZVALUE' in struct: struct<1: c1: optional int, 2: c2: optional string, 3: c3: optional string>
   org.apache.iceberg.exceptions.ValidationException: Cannot find field 'ICEZVALUE' in struct: struct<1: c1: optional int, 2: c2: optional string, 3: c3: optional string>
   	at app//org.apache.iceberg.exceptions.ValidationException.check(ValidationException.java:49)
   	at app//org.apache.iceberg.expressions.NamedReference.bind(NamedReference.java:45)
   	at app//org.apache.iceberg.expressions.NamedReference.bind(NamedReference.java:26)
   	at app//org.apache.iceberg.SortOrder$Builder.addSortField(SortOrder.java:249)
   	at app//org.apache.iceberg.SortOrder$Builder.sortBy(SortOrder.java:227)
   	at app//org.apache.iceberg.util.CopySortOrderFields.field(CopySortOrderFields.java:36)
   	at app//org.apache.iceberg.util.CopySortOrderFields.field(CopySortOrderFields.java:27)
   	at app//org.apache.iceberg.transforms.SortOrderVisitor.visit(SortOrderVisitor.java:76)
   	at app//org.apache.iceberg.util.SortOrderUtil.buildSortOrder(SortOrderUtil.java:104)
   	at app//org.apache.iceberg.util.SortOrderUtil.buildSortOrder(SortOrderUtil.java:46)
   	at app//org.apache.iceberg.spark.actions.SparkShufflingDataRewriter.outputSortOrder(SparkShufflingDataRewriter.java:130)
   	at app//org.apache.iceberg.spark.actions.SparkShufflingDataRewriter.sortFunction(SparkShufflingDataRewriter.java:110)
   	at app//org.apache.iceberg.spark.actions.SparkShufflingDataRewriter.doRewrite(SparkShufflingDataRewriter.java:96)
   	at app//org.apache.iceberg.spark.actions.SparkSizeBasedDataRewriter.rewrite(SparkSizeBasedDataRewriter.java:58)
   	at app//org.apache.iceberg.spark.actions.RewriteDataFilesSparkAction.lambda$rewriteFiles$2(RewriteDataFilesSparkAction.java:243)
   	at app//org.apache.iceberg.spark.actions.BaseSparkAction.withJobGroupInfo(BaseSparkAction.java:137)
   	at app//org.apache.iceberg.spark.actions.RewriteDataFilesSparkAction.rewriteFiles(RewriteDataFilesSparkAction.java:241)
   	at app//org.apache.iceberg.spark.actions.RewriteDataFilesSparkAction.lambda$doExecute$4(RewriteDataFilesSparkAction.java:285)
   	at app//org.apache.iceberg.util.Tasks$Builder.runTaskWithRetry(Tasks.java:413)
   	at app//org.apache.iceberg.util.Tasks$Builder.access$300(Tasks.java:69)
   	at app//org.apache.iceberg.util.Tasks$Builder$1.run(Tasks.java:315)
   	at java.base@11.0.4/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
   	at java.base@11.0.4/java.util.concurrent.FutureTask.run(FutureTask.java:264)
   	at java.base@11.0.4/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
   	at java.base@11.0.4/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
   	at java.base@11.0.4/java.lang.Thread.run(Thread.java:834)
   
   ```
   
   This issue also persists in Spark version 3.3.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


Re: [I] Rewrite with zORDER results in 'Cannot find field 'ICEZVALUE' in struct' Error [iceberg]

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on issue #7586:
URL: https://github.com/apache/iceberg/issues/7586#issuecomment-1800646075

   This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in the next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible.




Re: [I] Rewrite with zORDER results in 'Cannot find field 'ICEZVALUE' in struct' Error [iceberg]

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] closed issue #7586: Rewrite with zORDER results in 'Cannot find field 'ICEZVALUE' in struct' Error
URL: https://github.com/apache/iceberg/issues/7586




[GitHub] [iceberg] RussellSpitzer commented on issue #7586: Rewrite with zORDER results in 'Cannot find field 'ICEZVALUE' in struct' Error

Posted by "RussellSpitzer (via GitHub)" <gi...@apache.org>.
RussellSpitzer commented on issue #7586:
URL: https://github.com/apache/iceberg/issues/7586#issuecomment-1544179907

   The error is due to partition evolution. When the input files were not written with the output partition spec, the SparkShufflingDataRewriter attempts a partition-based shuffle in addition to the requested sort order.
   
   https://github.com/apache/iceberg/blob/master/spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/actions/SparkShufflingDataRewriter.java#L120
   
   The problem is that for Z-order rewrites the SortOrder() is on a column that does not exist in "table()", because it is the computed Z value (the 'ICEZVALUE' field named in the error). This causes the sort order builder to fail. Instead, it needs to use the schema of the table plus the Z-order column to create the correct output order.
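
The explanation above can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions: it does not use Iceberg's classes, and `ZOrderSortOrderSketch`, `bindSortOrder`, `outputSortOrder`, and `withZColumn` are hypothetical names invented for illustration. A schema is modeled as a set of column names, and "binding" a sort order simply checks that every sort field resolves against that set. The column names and the 'ICEZVALUE' field come from the report; everything else is assumed.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Simplified, hypothetical model of the failure; none of these types are Iceberg classes.
public class ZOrderSortOrderSketch {

  // Name of the computed Z-value column, taken from the error message in the report.
  static final String Z_COLUMN = "ICEZVALUE";

  // Stand-in for binding sort fields to a schema: every field name must resolve.
  static List<String> bindSortOrder(Set<String> schemaColumns, List<String> sortFields) {
    for (String field : sortFields) {
      if (!schemaColumns.contains(field)) {
        throw new IllegalArgumentException(
            "Cannot find field '" + field + "' in struct: " + schemaColumns);
      }
    }
    return new ArrayList<>(sortFields);
  }

  // Stand-in for building the output order after partition evolution:
  // partition source columns first, then the requested sort fields.
  static List<String> outputSortOrder(
      Set<String> schemaColumns, List<String> partitionColumns, List<String> sortFields) {
    List<String> combined = new ArrayList<>(partitionColumns);
    for (String field : sortFields) {
      if (!combined.contains(field)) {
        combined.add(field);
      }
    }
    return bindSortOrder(schemaColumns, combined);
  }

  // The suggested direction of the fix: table columns plus the computed Z column.
  static Set<String> withZColumn(Set<String> tableColumns) {
    Set<String> extended = new LinkedHashSet<>(tableColumns);
    extended.add(Z_COLUMN);
    return extended;
  }

  public static void main(String[] args) {
    Set<String> tableColumns = new LinkedHashSet<>(List.of("c1", "c2", "c3"));
    List<String> partitionColumns = List.of("c1", "c2"); // sources of the bucket fields
    List<String> zOrderSort = List.of(Z_COLUMN);

    try {
      // Binding against the table schema alone fails, as in the report.
      outputSortOrder(tableColumns, partitionColumns, zOrderSort);
    } catch (IllegalArgumentException e) {
      System.out.println("Table schema only: " + e.getMessage());
    }

    // Binding against the table schema plus the Z column succeeds.
    System.out.println(
        "Table schema + Z column: "
            + outputSortOrder(withZColumn(tableColumns), partitionColumns, zOrderSort));
  }
}
```

In this model the first call fails because the combined order mentions a column the table schema does not contain, and the second succeeds once the schema is extended with the computed column. How the actual fix should construct that extended schema inside the rewriter is not specified in this thread.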




Re: [I] Rewrite with zORDER results in 'Cannot find field 'ICEZVALUE' in struct' Error [iceberg]

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on issue #7586:
URL: https://github.com/apache/iceberg/issues/7586#issuecomment-1821890805

   This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'.

