Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/04/20 20:17:30 UTC

[GitHub] [arrow] westonpace commented on pull request #12625: ARROW-15587: [C++] Add support for all options specified by substrait::ReadRel::LocalFiles::FileOrFiles

westonpace commented on PR #12625:
URL: https://github.com/apache/arrow/pull/12625#issuecomment-1104419660

   Originally my goal was to avoid reimplementing the glob logic.  However, at this point, I think we have to do that for Windows anyway, so it might not add as much complexity as I was concerned about.
   
   I'm not entirely sure I agree that local filesystems and remote filesystems would implement glob in exactly the same way.  For example, if the glob is `/foo/bar*.txt`, then a remote filesystem would probably issue a prefix request for `/foo/bar` and filter from there, which might be more efficient than a directory-crawling approach that would issue a prefix request for `/foo` and then filter from there.
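
To make the difference concrete, here is a rough Python sketch of the two strategies against an in-memory stand-in for an object store. Everything in it (`FakeObjectStore`, `list_prefix`, `glob_by_prefix`, `glob_by_crawl`) is a hypothetical name made up for illustration, not an Arrow or cloud SDK API, and it only handles `*` and `?` with the wildcard in the last path component.

```python
import re


def literal_prefix(pattern):
    """Return the leading part of the pattern that contains no wildcard."""
    for i, ch in enumerate(pattern):
        if ch in "*?":
            return pattern[:i]
    return pattern


def glob_to_regex(pattern):
    """Translate a glob to a regex; '*' and '?' never match '/'."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append("[^/]*")
        elif ch == "?":
            parts.append("[^/]")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))


class FakeObjectStore:
    """Hypothetical flat key store that only supports prefix listing."""

    def __init__(self, keys):
        self.keys = sorted(keys)
        self.returned = 0  # total keys sent back by list requests

    def list_prefix(self, prefix):
        hits = [k for k in self.keys if k.startswith(prefix)]
        self.returned += len(hits)
        return hits


def glob_by_prefix(store, pattern):
    # List with the longest literal prefix (e.g. "/foo/bar"), then filter.
    rx = glob_to_regex(pattern)
    candidates = store.list_prefix(literal_prefix(pattern))
    return [k for k in candidates if rx.fullmatch(k)]


def glob_by_crawl(store, pattern):
    # List the parent directory (e.g. "/foo/"), then filter.
    rx = glob_to_regex(pattern)
    directory = pattern.rsplit("/", 1)[0] + "/"
    return [k for k in store.list_prefix(directory) if rx.fullmatch(k)]


KEYS = [
    "/foo/bar1.txt",
    "/foo/bar2.txt",
    "/foo/baz.txt",
    "/foo/other.csv",
    "/foo/bar/nested.txt",
    "/qux/bar.txt",
]

prefix_store = FakeObjectStore(KEYS)
crawl_store = FakeObjectStore(KEYS)
prefix_result = glob_by_prefix(prefix_store, "/foo/bar*.txt")
crawl_result = glob_by_crawl(crawl_store, "/foo/bar*.txt")
```

Both calls return the same two files; the only difference is how many keys the store has to send back before filtering (`prefix_store.returned` versus `crawl_store.returned`).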
   
   However, I agree that it would be much simpler to have a user utility and not modify the filesystem API.  It would also give us glob support for all filesystems immediately instead of waiting for support to be added one by one.  I also don't think the performance difference between the two approaches would matter in most cases.
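
As a rough illustration of the utility approach, the Python sketch below globs on top of nothing more than a "list the direct children of this directory" callback, so it would work with any filesystem that can list a directory. The `glob` function, the `list_dir` callback, and the `TREE` fixture are hypothetical names for this example only, not the API proposed in this PR.

```python
import fnmatch


def glob(list_dir, pattern):
    """Expand an absolute glob using only a directory-listing callback.

    list_dir(path) must return (name, is_dir) pairs for the direct
    children of path. Matching is done one path component at a time,
    so wildcards never cross a '/'.
    """
    components = [c for c in pattern.split("/") if c]
    matches = [""]  # paths matched so far; "" stands for the root
    for depth, component in enumerate(components):
        is_last = depth == len(components) - 1
        next_matches = []
        for base in matches:
            for name, is_dir in list_dir(base or "/"):
                if not fnmatch.fnmatchcase(name, component):
                    continue
                # Intermediate components must be directories to descend.
                if is_last or is_dir:
                    next_matches.append(base + "/" + name)
        matches = next_matches
    return sorted(matches)


# In-memory stand-in for a filesystem (assumption for the example).
TREE = {
    "/": [("foo", True), ("qux", True)],
    "/foo": [
        ("bar1.txt", False),
        ("bar2.txt", False),
        ("baz.txt", False),
        ("bar", True),
    ],
    "/foo/bar": [("nested.txt", False)],
    "/qux": [("bar.txt", False)],
}


def list_dir(path):
    return TREE.get(path, [])


single_dir = glob(list_dir, "/foo/bar*.txt")
multi_dir = glob(list_dir, "/*/bar*.txt")
```

The cost is one listing per matched directory, which is the directory-crawling behavior described above; a filesystem-specific implementation could still do better, but this version needs no changes to the filesystem interface.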
   
   So yes, I think Antoine is right.  Apologies for not suggesting this approach sooner.

