Posted to issues@iceberg.apache.org by "singhpk234 (via GitHub)" <gi...@apache.org> on 2023/04/29 01:20:00 UTC

[GitHub] [iceberg] singhpk234 commented on issue #7463: Spark: inconsistency in rewrite data and summary

singhpk234 commented on issue #7463:
URL: https://github.com/apache/iceberg/issues/7463#issuecomment-1528384960

   As far as I understand, @Fokko, this is caused by the following issue:
   - https://github.com/apache/iceberg/issues/4127
   
   RewriteDataFiles runs with 'use-starting-sequence-number'=true (the default!).
   Example:
   A partition has a data file with sequence number 1
   and an applicable delete file with sequence number 2.
   RewriteDataFiles is run on this partition.
   New data files are created with sequence number 2.
   The now-invalid delete file with sequence number 2 will not be removed.
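
   The example above can be sketched as a toy model. This is not Iceberg code: the names `ContentFile`, `rewrite` and `droppable_deletes` are hypothetical, and the cleanup rule (a delete file is dropped only when its sequence number is lower than the smallest sequence number of any live data file) is a simplifying assumption about how stale delete files get pruned.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContentFile:
    path: str
    seq: int  # sequence number assigned to the file


def rewrite(table_seq, use_starting_sequence_number):
    """Toy compaction: replace all data files with one new data file.

    Assumption: with the option enabled, the new file keeps the sequence
    number the table had when the rewrite started; otherwise it gets the
    next sequence number.
    """
    new_seq = table_seq if use_starting_sequence_number else table_seq + 1
    return [ContentFile("compacted.parquet", new_seq)]


def droppable_deletes(data_files, delete_files):
    """Assumed cleanup rule: a delete file can be dropped only if it is
    older than every live data file."""
    min_data_seq = min(f.seq for f in data_files)
    return [d for d in delete_files if d.seq < min_data_seq]


# State from the example: data file at sequence number 1,
# delete file at sequence number 2, so the table is at sequence number 2.
deletes = [ContentFile("delete-2.parquet", 2)]

with_option = rewrite(table_seq=2, use_starting_sequence_number=True)
without_option = rewrite(table_seq=2, use_starting_sequence_number=False)

print(droppable_deletes(with_option, deletes))     # nothing is droppable
print(droppable_deletes(without_option, deletes))  # the delete file is droppable
```

   Under these assumptions the rewritten data file shares sequence number 2 with the delete file, so the delete file is never older than all live data files and stays in the table metadata.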
   
   @szehon-ho's proposal also mentions this: https://docs.google.com/document/d/11d-cIUR_89kRsMmWnEoxXGZCvp7L4TUmPJqUC60zB5M/edit
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
For additional commands, e-mail: issues-help@iceberg.apache.org