Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/11/16 02:47:40 UTC

[GitHub] [iceberg] wypoon commented on issue #6042: Add delete file information to partitions table

wypoon commented on issue #6042:
URL: https://github.com/apache/iceberg/issues/6042#issuecomment-1316220819

   I'm trying to understand the proposed behavior.
   To go back to @ajantha-bhat's example: Suppose you have a partition `{A}` with `record_count`=6 and `file_count`=2 (3 records in each file). Suppose you now delete 3 records in one file. I understand that `pos_delete_file_count` will be 1 and `pos_delete_record_count` will be 3. But what about `record_count` and `file_count`? Will `file_count` be 3 (that is, is it supposed to be the total number of files, counting delete files along with data files)? And what will `record_count` be? When is it possible to correctly compute the `record_count` using metadata alone (without applying delete files)?
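
To make the two readings concrete, here is a minimal sketch of the arithmetic for partition `{A}`. It is not Iceberg code: `PartitionStats` and its methods are hypothetical names used only for illustration, and the "live" record estimate assumes every position delete targets a distinct row that is still present, which may not hold in general.

```python
from dataclasses import dataclass


@dataclass
class PartitionStats:
    """Hypothetical per-partition counters, not an Iceberg class."""
    data_file_count: int
    data_record_count: int
    pos_delete_file_count: int
    pos_delete_record_count: int

    def file_count_including_deletes(self) -> int:
        # Reading 1: file_count counts data files and delete files together.
        return self.data_file_count + self.pos_delete_file_count

    def live_record_estimate(self) -> int:
        # Assumption: each position delete removes one distinct, still-live row.
        # If two delete files reference the same row, this undercounts.
        return self.data_record_count - self.pos_delete_record_count


# Partition {A}: 2 data files with 3 records each, then 3 rows deleted
# from one file through a single position delete file.
a = PartitionStats(
    data_file_count=2,
    data_record_count=6,
    pos_delete_file_count=1,
    pos_delete_record_count=3,
)

print("file_count (data files only):", a.data_file_count)
print("file_count (including delete files):", a.file_count_including_deletes())
print("record_count (ignoring deletes):", a.data_record_count)
print("record_count (estimated after deletes):", a.live_record_estimate())
```

Under these assumptions `file_count` is either 2 or 3, and `record_count` is either 6 or 3, depending on which reading the proposal intends.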
   
   Another example: Suppose you have two partitions `{A}` and `{B}`. Let's say `record_count`=1000 and `file_count`=1 for partition `{B}`. Suppose you rename `B` to `C` using `UPDATE <table> SET <partition column> = 'C' WHERE <partition column> = 'B'` with merge-on-read, resulting in 1 delete file and 1 new pure data file. If you currently do a `SELECT * FROM <table>.partitions`, you will get an entry for each of `{A}`, `{B}` and `{C}`. What should the behavior be? Should there be an entry for `{B}`, and if so, what should be shown for it? And what should be shown for `{C}`?
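
The rename case can be sketched the same way. This is only an illustration, not how Iceberg builds the partitions table: the `files` list and `summarize` function are hypothetical, partition `{A}` is left out because no counts were given for it, and it is assumed that the single delete file belongs to partition `{B}` and holds 1000 position deletes (one per original row).

```python
from collections import defaultdict

# Hypothetical file listing after the UPDATE: (partition, content, record_count).
# Assumption: the delete file is tracked under partition B and covers all 1000 rows.
files = [
    ("B", "data", 1000),
    ("B", "position_deletes", 1000),
    ("C", "data", 1000),
]


def summarize(file_entries):
    """Aggregate per-partition counters, keeping data and delete files separate."""
    stats = defaultdict(lambda: {
        "record_count": 0,
        "file_count": 0,
        "pos_delete_record_count": 0,
        "pos_delete_file_count": 0,
    })
    for partition, content, records in file_entries:
        row = stats[partition]
        if content == "data":
            row["record_count"] += records
            row["file_count"] += 1
        else:
            row["pos_delete_record_count"] += records
            row["pos_delete_file_count"] += 1
    return dict(stats)


summary = summarize(files)
for partition in sorted(summary):
    print(partition, summary[partition])
```

With this aggregation `{B}` still appears with `record_count`=1000 and `file_count`=1 even though, under the stated assumption, none of its rows are live, which is the situation the question above is asking about.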


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org