Posted to issues@iceberg.apache.org by "steveloughran (via GitHub)" <gi...@apache.org> on 2023/03/18 12:32:40 UTC

[GitHub] [iceberg] steveloughran commented on pull request #7127: [Core][Spark] Improve DeleteOrphanFiles action to return additional details of deleted orphan files

steveloughran commented on PR #7127:
URL: https://github.com/apache/iceberg/pull/7127#issuecomment-1474837901

   > One of the common causes of delete failure in the public cloud is hitting the API quotas and unnecessary re-runs of the delete action
   
   Not really. What page size are you using when issuing delete requests to S3? Each object, even in a bulk delete, is one Write IOP; if you send a full 1000-entry list then under certain conditions the sleep and retry of that request becomes its own thundering herd. Since [HADOOP-16823](https://issues.apache.org/jira/browse/HADOOP-16823) we've had a default page size of 200 and nobody complains about 503-triggered deletion failures, even when partition rebalancing operations massively reduce IO capacity. If you are seeing problems in the S3A connector, then complain there. If you are seeing it in your own code, why not fix it there rather than expose the failure to apps?
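
   To make the page-size point concrete, here is a minimal sketch (not part of the original comment) of issuing bulk deletes in small pages with the AWS SDK v2. The `PagedDelete` class, `deleteInPages` helper and `PAGE_SIZE` constant are illustrative assumptions, not S3A or Iceberg code.

   ```java
   import java.util.List;
   import java.util.stream.Collectors;

   import software.amazon.awssdk.services.s3.S3Client;
   import software.amazon.awssdk.services.s3.model.Delete;
   import software.amazon.awssdk.services.s3.model.DeleteObjectsRequest;
   import software.amazon.awssdk.services.s3.model.ObjectIdentifier;

   public class PagedDelete {

     // Illustrative assumption: a small page keeps a throttled request cheap to
     // retry; 200 matches the S3A default mentioned above, it is not a magic number.
     private static final int PAGE_SIZE = 200;

     static void deleteInPages(S3Client s3, String bucket, List<String> keys) {
       for (int start = 0; start < keys.size(); start += PAGE_SIZE) {
         int end = Math.min(start + PAGE_SIZE, keys.size());
         List<ObjectIdentifier> page = keys.subList(start, end).stream()
             .map(key -> ObjectIdentifier.builder().key(key).build())
             .collect(Collectors.toList());
         // Each object in the bulk request is still one Write IOP, but a 503 on a
         // 200-entry page is far cheaper to back off and retry than on a 1000-entry one.
         s3.deleteObjects(DeleteObjectsRequest.builder()
             .bucket(bucket)
             .delete(Delete.builder().objects(page).build())
             .build());
       }
     }
   }
   ```

   This trade-off is what the S3A connector already handles internally via its `fs.s3a.bulk.delete.page.size` option, so applications going through S3A get the smaller pages without writing any retry code of their own.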


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

