Posted to issues@iceberg.apache.org by "cgpoh (via GitHub)" <gi...@apache.org> on 2023/04/20 15:22:36 UTC

[GitHub] [iceberg] cgpoh commented on issue #7383: After running ExpireSnapshots, metadata json and avro files still not deleted

cgpoh commented on issue #7383:
URL: https://github.com/apache/iceberg/issues/7383#issuecomment-1516526142

   > Manifest files are removed when there are no longer any snapshots referring to them, not when they are too old. For example, a manifest file might be 10 days old, but the current Snapshot may still refer to that file.
   > 
   > The JSON files are different: they are not actually tracked or removed by expire snapshots, at least if I remember correctly. They could be removed by remove orphan files though, which would remove all metadata.json files not listed in the current metadata.json.
   
   Thanks for the reply! Could you help me understand this: every update to the table creates a new snapshot. Let’s say I’m committing to the table every 2 minutes. To keep the number of manifest files small, should I expire snapshots older than the current snapshot time minus 4 minutes?
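
To make the arithmetic in the question concrete, here is a minimal sketch in plain Python. It does not call any Iceberg API. The function name `snapshots_to_expire`, the strict "older than" comparison, and the rule that the newest snapshot is always kept are assumptions made for illustration only.

```python
from datetime import datetime, timedelta


def snapshots_to_expire(snapshot_times, now, max_age):
    """Return snapshot timestamps strictly older than now - max_age.

    Assumption for this sketch: the newest snapshot is never expired,
    even if it falls before the cutoff.
    """
    cutoff = now - max_age
    newest = max(snapshot_times)
    return [t for t in snapshot_times if t < cutoff and t != newest]


# Hypothetical timeline: one commit every 2 minutes, six commits in total.
now = datetime(2023, 4, 20, 15, 22)
commits = [now - timedelta(minutes=2 * i) for i in range(6)]

# Expire everything older than "current snapshot time minus 4 minutes".
expired = snapshots_to_expire(commits, now, timedelta(minutes=4))
kept = [t for t in commits if t not in expired]

print("expired:", len(expired), "kept:", len(kept))
```

Under these assumptions, a 4-minute window with 2-minute commits keeps the snapshots from 0, 2, and 4 minutes ago. Whether the manifest count actually shrinks still depends on which manifests the surviving snapshots refer to.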
   
   Another question is about the deleteorphan action: we can only use Spark to run it, correct? I can’t find any Flink or table API that provides a deleteorphan action.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
For additional commands, e-mail: issues-help@iceberg.apache.org