Posted to github@arrow.apache.org by "Samrose-Ahmed (via GitHub)" <gi...@apache.org> on 2023/03/02 04:41:38 UTC

[GitHub] [arrow-rs] Samrose-Ahmed opened a new issue, #3784: object_store: Why does builder take bucket?

Samrose-Ahmed opened a new issue, #3784:
URL: https://github.com/apache/arrow-rs/issues/3784

   **Which part is this question about**
   
   object_store
   
   **Describe your question**
   
   It is a poor API design to require the bucket to be set in the builder per client. That is not how e.g. an S3 client works; there, the bucket is just a runtime argument.
   
   This makes it very hard to write code where you're dealing with dynamic S3 sources you don't know ahead of time. One is forced to create a lazy hashmap of buckets to ObjectStores, which is needlessly complicated for a simple task. With e.g. the AWS SDK Rust client, one would simply create one client and pass the bucket in as an argument at runtime.
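
To make the workaround described above concrete, here is a minimal sketch of such a lazy map from bucket name to store. `BucketStore` and `StoreCache` are hypothetical stand-ins written against the standard library only; they are not types from `object_store`, and a real builder would also need a region and credentials.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for a store that is bound to one bucket when built.
#[derive(Debug)]
struct BucketStore {
    bucket: String,
}

impl BucketStore {
    /// Stand-in for a per-bucket builder call (assumption: a real builder
    /// would also carry region and credentials).
    fn build(bucket: &str) -> Self {
        BucketStore { bucket: bucket.to_string() }
    }
}

/// Lazily creates one store per bucket and hands out the same one afterwards.
#[derive(Default)]
struct StoreCache {
    stores: Mutex<HashMap<String, Arc<BucketStore>>>,
}

impl StoreCache {
    fn get(&self, bucket: &str) -> Arc<BucketStore> {
        let mut stores = self.stores.lock().unwrap();
        // Build the store only the first time this bucket is requested.
        Arc::clone(
            stores
                .entry(bucket.to_string())
                .or_insert_with(|| Arc::new(BucketStore::build(bucket))),
        )
    }
}

fn main() {
    let cache = StoreCache::default();
    let a = cache.get("logs");
    let b = cache.get("logs");
    let c = cache.get("metrics");
    assert!(Arc::ptr_eq(&a, &b)); // same bucket: the cached store is reused
    assert!(!Arc::ptr_eq(&a, &c)); // different bucket: a separate store
    println!("{} {}", a.bucket, c.bucket);
}
```

This is the extra layer being objected to: the bucket only becomes known at runtime, so every call site has to go through the cache first.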
   
   **Additional context**
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] tustvold commented on issue #3784: object_store: Why does builder take bucket?

Posted by "tustvold (via GitHub)" <gi...@apache.org>.
tustvold commented on issue #3784:
URL: https://github.com/apache/arrow-rs/issues/3784#issuecomment-1451884876

   There are a couple of reasons, but the most compelling is that the abstraction can then be mapped onto stores that don't have a similar namespacing concept, e.g. local filesystems. It allows the crate to expose a pure key-value store interface, without complicating it with buckets, regions, etc.
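
A rough illustration of that argument, under stated assumptions: the `KeyValueStore` trait and both implementations below are hypothetical, in-memory stand-ins, not the `ObjectStore` trait or any real backend. They only show that once the bucket (or root directory) is fixed at construction, callers need nothing but a key.

```rust
use std::collections::HashMap;

/// Hypothetical key-value interface: callers only ever supply a key.
trait KeyValueStore {
    fn put(&mut self, key: &str, value: Vec<u8>);
    fn get(&self, key: &str) -> Option<&[u8]>;
}

/// Stand-in for a cloud store: the bucket is fixed when the store is built.
struct BucketBacked {
    bucket: String,
    objects: HashMap<String, Vec<u8>>,
}

/// Stand-in for a local filesystem store: it has a root directory, no bucket.
struct LocalBacked {
    root: String,
    files: HashMap<String, Vec<u8>>,
}

impl BucketBacked {
    fn new(bucket: &str) -> Self {
        BucketBacked { bucket: bucket.to_string(), objects: HashMap::new() }
    }
}

impl LocalBacked {
    fn new(root: &str) -> Self {
        LocalBacked { root: root.to_string(), files: HashMap::new() }
    }
}

impl KeyValueStore for BucketBacked {
    fn put(&mut self, key: &str, value: Vec<u8>) {
        // The bucket is folded into the location here, not at the call site.
        self.objects.insert(format!("s3://{}/{}", self.bucket, key), value);
    }
    fn get(&self, key: &str) -> Option<&[u8]> {
        self.objects
            .get(&format!("s3://{}/{}", self.bucket, key))
            .map(|v| v.as_slice())
    }
}

impl KeyValueStore for LocalBacked {
    fn put(&mut self, key: &str, value: Vec<u8>) {
        self.files.insert(format!("{}/{}", self.root, key), value);
    }
    fn get(&self, key: &str) -> Option<&[u8]> {
        self.files
            .get(&format!("{}/{}", self.root, key))
            .map(|v| v.as_slice())
    }
}

/// Backend-agnostic code: no bucket, region, or root appears in the signature.
fn roundtrip(store: &mut dyn KeyValueStore, key: &str) -> bool {
    store.put(key, b"hello".to_vec());
    store.get(key) == Some(&b"hello"[..])
}

fn main() {
    let mut s3 = BucketBacked::new("my-bucket");
    let mut local = LocalBacked::new("/tmp/data");
    assert!(roundtrip(&mut s3, "a/b.txt"));
    assert!(roundtrip(&mut local, "a/b.txt"));
}
```

If `put` and `get` took a bucket argument instead, `LocalBacked` would have to either ignore it or invent a meaning for it.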
   
   > This makes it very hard to write code where you're dealing with dynamic S3 sources
   
   I accept that if solely dealing with S3 buckets in the same region with the same credentials, this is an overhead. Perhaps we could look to upstream something similar to DataFusion's [`ObjectStoreRegistry`](https://docs.rs/datafusion/latest/datafusion/datasource/object_store/struct.ObjectStoreRegistry.html) and [`ObjectStoreProvider`](https://docs.rs/datafusion/latest/datafusion/datasource/object_store/trait.ObjectStoreProvider.html).
   
   > It is a poor API design to require the bucket to be set in the builder per client
   
   Are you proposing including the bucket name as an argument for every function in the `ObjectStore` API? What would this map to for LocalFileSystem?




[GitHub] [arrow-rs] tustvold closed issue #3784: object_store: Why does builder take bucket?

Posted by "tustvold (via GitHub)" <gi...@apache.org>.
tustvold closed issue #3784: object_store: Why does builder take bucket?
URL: https://github.com/apache/arrow-rs/issues/3784

