Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/07/26 19:06:24 UTC

[GitHub] [iceberg] RussellSpitzer edited a comment on issue #2783: Metadata Table Empty Projection -Unknown type for int field. Type name: java.lang.String

RussellSpitzer edited a comment on issue #2783:
URL: https://github.com/apache/iceberg/issues/2783#issuecomment-886952345


   So I think I tracked this down. The basic issue is that Spark 3.1 correctly prunes nested structs and Spark 3.0 does not. You may wonder: if Spark 3.1 correctly prunes nested structs, why is this an issue?
   
   The issue is that we end up reading only 2 fields out of our metadata tables and correctly present them. But our code that creates the UnsafeProjection assumes that if a nested struct is read, then all of its fields are read, so we end up building a projection that requires all columns rather than just the ones we actually extracted. This means we build a broken projection.
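   To make the mismatch concrete, here is a minimal standalone sketch (not Iceberg's actual code; the class name, schema, field names, and values are made up for illustration). The projection is built from the full nested struct type, but the row that was actually read only carries the pruned nested field, so the generated projection reads the wrong ordinal with the wrong type and fails at runtime:
   
   ```java
   import org.apache.spark.sql.catalyst.InternalRow;
   import org.apache.spark.sql.catalyst.expressions.GenericInternalRow;
   import org.apache.spark.sql.catalyst.expressions.UnsafeProjection;
   import org.apache.spark.sql.types.DataTypes;
   import org.apache.spark.sql.types.StructType;
   import org.apache.spark.unsafe.types.UTF8String;
   
   public class ProjectionMismatchSketch {
     public static void main(String[] args) {
       // Full (unpruned) nested struct: what the projection gets built against.
       StructType fileStruct = new StructType()
           .add("snapshot_id", DataTypes.LongType)
           .add("file_path", DataTypes.StringType)
           .add("record_count", DataTypes.LongType);
       StructType fullSchema = new StructType().add("file", fileStruct);
   
       // What was actually read after Spark 3.1's nested pruning: only file_path.
       InternalRow prunedFile = new GenericInternalRow(
           new Object[] {UTF8String.fromString("s3://bucket/data/00000.parquet")});
       InternalRow row = new GenericInternalRow(new Object[] {prunedFile});
   
       // A projection built from the FULL schema expects three nested fields and
       // tries to read ordinal 0 as a long (snapshot_id), but the pruned row
       // holds the string file_path there, so the types and ordinals no longer line up.
       UnsafeProjection projection = UnsafeProjection.create(fullSchema);
       projection.apply(row);  // fails here at runtime
     }
   }
   ```
   
   In this standalone sketch the failure shows up as a cast/bounds error from the generated projection; in the real issue it surfaces as the "Unknown type for int field. Type name: java.lang.String" error from the title.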
   
   
   
   See the RowDataReader projection, which only does top-level pruning:
   https://github.com/apache/iceberg/blob/a79de571860a290f6e96ac562d616c9c6be2071e/spark/src/main/java/org/apache/iceberg/spark/source/RowDataReader.java#L208-L211
   
   If we never prune columns out of the struct, this is fine; if we do, then we have a problem.
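   A rough sketch of that distinction (a hypothetical helper, not the linked code): a top-level-only prune keeps or drops whole columns, so any nested struct it keeps still carries all of its fields, and a projection built from that type keeps expecting them even after Spark 3.1 has pruned them out of the actual read:
   
   ```java
   import java.util.Arrays;
   import java.util.HashSet;
   import java.util.Set;
   import org.apache.spark.sql.types.DataTypes;
   import org.apache.spark.sql.types.StructField;
   import org.apache.spark.sql.types.StructType;
   
   public class TopLevelPruneSketch {
   
     // Hypothetical illustration of top-level-only pruning: keep or drop whole
     // top-level columns, leaving the types of kept nested structs untouched.
     static StructType pruneTopLevel(StructType full, Set<String> keep) {
       StructType pruned = new StructType();
       for (StructField field : full.fields()) {
         if (keep.contains(field.name())) {
           pruned = pruned.add(field);  // nested struct type copied as-is
         }
       }
       return pruned;
     }
   
     public static void main(String[] args) {
       StructType fileStruct = new StructType()
           .add("file_path", DataTypes.StringType)
           .add("record_count", DataTypes.LongType);
       StructType full = new StructType()
           .add("status", DataTypes.IntegerType)
           .add("file", fileStruct);
   
       // Even if the query only needs file.file_path, a top-level prune that
       // keeps "file" keeps the whole struct, record_count included.
       StructType pruned = pruneTopLevel(full, new HashSet<>(Arrays.asList("file")));
       System.out.println(pruned.catalogString());
       // prints: struct<file:struct<file_path:string,record_count:bigint>>
     }
   }
   ```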


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.


