Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/03/30 20:17:49 UTC

[GitHub] [iceberg] szehon-ho commented on pull request #1744: Fix Avro Pruning Bugs with ManifestEntries Table

szehon-ho commented on pull request #1744:
URL: https://github.com/apache/iceberg/pull/1744#issuecomment-810549378


   Yes, as we discussed offline, I ended up debugging the same thing as @RussellSpitzer while hitting https://github.com/apache/iceberg/issues/1378 . My thought was that changing the PruneColumns or BuildAvroProjection behaviour would be a bit involved (and maybe not right), so I took a different approach, which I put up for reference: https://github.com/apache/iceberg/pull/2395
   
   It's a poor workaround for the problem: it adds a non-empty struct when reading the metadata (manifest) Entries and All-Entries tables. It seems to work for most cases. I just put it up as an option in case there is no other good fix.
   
   Of course, if there is a proper fix, I'd be interested to follow and help wherever I can, though I heard from Russell that it is not very easy :)
   
   In any case, I would love to see a solution to the bug. We were trying to write an analytics job that calculates how much data lands in a table per time period. The plan was to aggregate the entries table and join it with the snapshot table (which has the timestamp), but it seems this bug prevents any aggregates on the entries table without ugly workarounds.
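
   For reference, here is a minimal sketch of the aggregation we had in mind, written in plain Python over in-memory rows rather than against a real table. The sample rows and the function name `data_landed_per_day` are hypothetical. The column names (`status`, `snapshot_id`, `data_file.record_count`, `data_file.file_size_in_bytes`, `committed_at`) are what I understand the entries and snapshots metadata tables to expose, so treat them as assumptions. The real job would run as an engine query, which is what this bug gets in the way of.

```python
from collections import defaultdict
from datetime import datetime, timezone

# Manifest entry status codes as I understand them from the Iceberg spec:
# 0 = existing, 1 = added, 2 = deleted.
ADDED = 1

# Hypothetical stand-ins for rows of the snapshots metadata table.
snapshots = [
    {"snapshot_id": 1, "committed_at": datetime(2021, 3, 29, 10, 0, tzinfo=timezone.utc)},
    {"snapshot_id": 2, "committed_at": datetime(2021, 3, 29, 18, 30, tzinfo=timezone.utc)},
    {"snapshot_id": 3, "committed_at": datetime(2021, 3, 30, 9, 15, tzinfo=timezone.utc)},
]

# Hypothetical stand-ins for rows of the entries metadata table.
entries = [
    {"status": 1, "snapshot_id": 1, "data_file": {"record_count": 100, "file_size_in_bytes": 4096}},
    {"status": 1, "snapshot_id": 1, "data_file": {"record_count": 50, "file_size_in_bytes": 2048}},
    {"status": 1, "snapshot_id": 2, "data_file": {"record_count": 25, "file_size_in_bytes": 1024}},
    {"status": 2, "snapshot_id": 2, "data_file": {"record_count": 100, "file_size_in_bytes": 4096}},
    {"status": 1, "snapshot_id": 3, "data_file": {"record_count": 10, "file_size_in_bytes": 512}},
    {"status": 0, "snapshot_id": 3, "data_file": {"record_count": 50, "file_size_in_bytes": 2048}},
]


def data_landed_per_day(entries, snapshots):
    """Join entries to snapshots on snapshot_id and sum added data per UTC day."""
    committed_at = {s["snapshot_id"]: s["committed_at"] for s in snapshots}
    totals = defaultdict(lambda: {"records": 0, "bytes": 0})
    for entry in entries:
        # Only newly added files count as data that "landed".
        if entry["status"] != ADDED:
            continue
        ts = committed_at.get(entry["snapshot_id"])
        if ts is None:
            # Inner-join semantics: skip entries whose snapshot is unknown.
            continue
        day = ts.date().isoformat()
        totals[day]["records"] += entry["data_file"]["record_count"]
        totals[day]["bytes"] += entry["data_file"]["file_size_in_bytes"]
    return dict(totals)


if __name__ == "__main__":
    for day, total in sorted(data_landed_per_day(entries, snapshots).items()):
        print(day, total["records"], total["bytes"])
```

   The sketch counts only entries with status "added", so rewritten or deleted files are not counted as new data. Whether that is the right definition of "data landed" depends on the job.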
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org