Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/05/08 11:23:03 UTC

[GitHub] [arrow-datafusion] Cheappie commented on issue #2445: ObjectStore Directory Semantics

Cheappie commented on issue #2445:
URL: https://github.com/apache/arrow-datafusion/issues/2445#issuecomment-1120399875

   In my case, the existing design of the ObjectStore interface forced me to re-engineer ListingTable just to provide yet another way of listing a data source.
   
   From my perspective, it might be beneficial to push the information about the data source down from TableProvider to ObjectStore. An ObjectStore for a local file system would then combine the data (table) location with the strategy for listing that kind of storage. As a result, the listing methods in ObjectStore could drop the concept of a path as the way to access data.
   
   ObjectStore could then offer a more generic interface with two methods:
   * list(filters)
     * query filters should be available in the ObjectStore list method, so that anyone can provide their own predicate pushdown algorithm
   * file_reader(sized_file)
   
   Such an interface should allow us to provide any kind of listing approach (dir, glob, etc.). What do you think?
   
   It's not a necessity, but the last component bound to a path is SizedFile. Outside of ObjectStore it could be treated as an abstract blob with characteristics such as `size`, because only ObjectStore should know how to access it via `file_reader`.
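
A minimal sketch of how the two-method interface described above might look, using only the Rust standard library. This is not DataFusion's actual `ObjectStore` trait: the `Filter` alias (a plain predicate over a file name standing in for real query filters), the private `key` field, and the `LocalDirStore` type are all hypothetical names chosen for illustration, and the real trait is async. The point is only to show a store that owns its location and listing strategy, so callers never pass a path and see `SizedFile` as an opaque handle with a `size`.

```rust
use std::fs::{self, File};
use std::io::{self, Read};
use std::path::PathBuf;

/// Opaque handle to a stored object. Callers can read `size`, but the
/// location (`key`) stays private to the module that defines the store.
#[derive(Debug, Clone)]
pub struct SizedFile {
    pub size: u64,
    key: PathBuf,
}

/// Hypothetical stand-in for a pushed-down query filter: a predicate
/// over the file name. Real filters would be query expressions.
pub type Filter = Box<dyn Fn(&str) -> bool>;

/// The proposed two-method interface: no path argument anywhere.
pub trait ObjectStore {
    fn list(&self, filters: &[Filter]) -> io::Result<Vec<SizedFile>>;
    fn file_reader(&self, file: &SizedFile) -> io::Result<Box<dyn Read>>;
}

/// A local-file-system store that combines the table location (`root`)
/// with one listing strategy (a flat, non-recursive directory listing).
pub struct LocalDirStore {
    root: PathBuf,
}

impl LocalDirStore {
    pub fn new(root: impl Into<PathBuf>) -> Self {
        Self { root: root.into() }
    }
}

impl ObjectStore for LocalDirStore {
    fn list(&self, filters: &[Filter]) -> io::Result<Vec<SizedFile>> {
        let mut files = Vec::new();
        for entry in fs::read_dir(&self.root)? {
            let entry = entry?;
            let meta = entry.metadata()?;
            if !meta.is_file() {
                continue; // skip subdirectories and other non-files
            }
            let name = entry.file_name().to_string_lossy().into_owned();
            // Keep the file only if every filter accepts it.
            if filters.iter().all(|keep| keep(name.as_str())) {
                files.push(SizedFile { size: meta.len(), key: entry.path() });
            }
        }
        // read_dir order is platform-dependent, so sort for stable output.
        files.sort_by(|a, b| a.key.cmp(&b.key));
        Ok(files)
    }

    fn file_reader(&self, file: &SizedFile) -> io::Result<Box<dyn Read>> {
        // Only the store resolves the handle back to a concrete location.
        let reader: Box<dyn Read> = Box::new(File::open(&file.key)?);
        Ok(reader)
    }
}
```

Under these assumptions, a glob-based or partition-aware store would be another implementation of the same trait, and a caller such as a table provider would only ever call `list` with its filters and then `file_reader` on the returned handles.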

