Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/09/01 08:45:32 UTC

[GitHub] [iceberg] Fokko commented on a diff in pull request #5665: Core: Correctly project the partition fields

Fokko commented on code in PR #5665:
URL: https://github.com/apache/iceberg/pull/5665#discussion_r960384482


##########
spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/SparkDataFile.java:
##########
@@ -84,10 +93,31 @@ public SparkDataFile(Types.StructType type, StructType sparkType) {
     sortOrderIdPosition = positions.get("sort_order_id");
   }
 
+  private void wrapPartitionSpec(GenericRowWithSchema specRow) {
+    // We get all the partition fields, but want to project to the current one
+    StructType wrappedPartitionStruct = specRow.schema();
+
+    if (!wrappedPartitionStruct.equals(currentWrappedPartitionStruct)) {
+      this.currentWrappedPartitionStruct = wrappedPartitionStruct;
+
+      // The original IDs are lost in translation, therefore we apply the ones that we know

Review Comment:
   Thanks @aokolnychyi for chiming in here, much appreciated. I'm digging into this code for the first time, so bear with me. Wouldn't it make more sense to pass in either all the historical specs (in the case of the metadata column) or just the latest spec, depending on which one is needed? Passing in a tuple of all the specs and then projecting only the ones we need doesn't feel like a scalable approach.
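
   To make the concern concrete, here is a rough sketch of the "pass everything, then project" pattern in plain Java. It deliberately uses no Iceberg or Spark types: `PartitionProjection`, `project`, and the field names are hypothetical and only stand in for the union partition struct and the current spec's fields. It matches fields by name, which is an assumption for illustration and not necessarily what the PR does.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: plain Java stand-ins, not Iceberg or Spark classes.
public class PartitionProjection {

  // Projects a row laid out by the union of all partition fields down to the
  // fields of a single spec, matching by field name. A spec field that is
  // missing from the union layout is projected as null.
  public static Object[] project(
      List<String> unionFields, Object[] unionRow, List<String> specFields) {
    Object[] projected = new Object[specFields.size()];
    for (int i = 0; i < specFields.size(); i++) {
      int pos = unionFields.indexOf(specFields.get(i));
      projected[i] = pos >= 0 ? unionRow[pos] : null;
    }
    return projected;
  }

  public static void main(String[] args) {
    // Union of the partition fields across all (made-up) historical specs.
    List<String> unionFields = Arrays.asList("ts_day", "id_bucket", "category");
    Object[] unionRow = {19236, 7, null};

    // The current spec only uses two of those fields.
    List<String> currentSpec = Arrays.asList("ts_day", "id_bucket");
    System.out.println(Arrays.toString(project(unionFields, unionRow, currentSpec)));
  }
}
```

   Every row carries the full union of fields even though only a subset is read, which is the part that doesn't feel scalable to me. Passing in just the struct for the spec that is actually needed would avoid the per-row lookup.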



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


For additional commands, e-mail: issues-help@iceberg.apache.org