Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/06/12 21:25:39 UTC

[GitHub] [iceberg] rdblue commented on pull request #4792: Python: Add BotoFileIO, a FileIO that wraps boto3

rdblue commented on PR #4792:
URL: https://github.com/apache/iceberg/pull/4792#issuecomment-1153296951

   @samredai, to review this, I started looking more closely at the boto3 API. The API that you're using doesn't appear to be a streaming API, and streaming is what we typically want so that we can avoid things like buffering whole files in memory before writing them with a single PUT. When I looked into how to use boto3 for streaming reads and streaming writes, I quickly ran into `smart_open`, which appears to do everything that we want.
   
   I think you had an S3FileIO that used smart_open before. Is there a reason not to use that to wrap boto3 now? I think we would be able to avoid maintaining a lot of this code.
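
   To make the buffering concern concrete, here is a minimal sketch of what a streaming write looks like: data is flushed in fixed-size parts as it arrives instead of being collected into one body for a single PUT. This is not the code in this PR and not `smart_open`'s implementation. `ChunkedUploadWriter` and the `upload_part` callback are hypothetical names, with the callback standing in for a real multipart upload call. The 5 MiB default mirrors the S3 minimum size for every part except the last.

```python
import io
from typing import Callable, List


class ChunkedUploadWriter(io.RawIOBase):
    """Hypothetical writer that emits fixed-size parts instead of buffering a whole file."""

    def __init__(self, upload_part: Callable[[int, bytes], None],
                 part_size: int = 5 * 1024 * 1024):
        super().__init__()
        self._upload_part = upload_part  # assumption: stand-in for a multipart upload call
        self._part_size = part_size
        self._buffer = bytearray()
        self._part_number = 0

    def writable(self) -> bool:
        return True

    def write(self, data) -> int:
        self._buffer.extend(data)
        # Send every complete part right away so memory use stays close to part_size.
        while len(self._buffer) >= self._part_size:
            self._flush_part(self._part_size)
        return len(data)

    def _flush_part(self, size: int) -> None:
        self._part_number += 1
        self._upload_part(self._part_number, bytes(self._buffer[:size]))
        del self._buffer[:size]

    def close(self) -> None:
        if not self.closed:
            # The final part may be smaller than part_size.
            if self._buffer:
                self._flush_part(len(self._buffer))
            super().close()


# Usage with an in-memory list standing in for the remote store.
parts: List[bytes] = []
with ChunkedUploadWriter(lambda number, body: parts.append(body), part_size=4) as out:
    out.write(b"abcdef")
    out.write(b"ghij")
```

   With a part size of 4 bytes, the writer never holds more than one part plus the most recent write before flushing, which is the property a non-streaming API gives up.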


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

