Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/05/17 17:04:22 UTC

[GitHub] [arrow-datafusion] tustvold opened a new issue, #2562: File URI Scheme Interpretation

tustvold opened a new issue, #2562:
URL: https://github.com/apache/arrow-datafusion/issues/2562

   **Is your feature request related to a problem or challenge? Please describe what you are trying to do.**
   
   The file URI scheme implemented in `ObjectStoreRegistry` does not appear to follow the [specification](https://en.wikipedia.org/wiki/File_URI_scheme), which is defined in [RFC 8089](https://www.rfc-editor.org/rfc/rfc8089) on top of the generic URI syntax described in [RFC 3986](https://www.rfc-editor.org/rfc/rfc3986).
   
   In particular:
   
   * It does not handle the host component
   * It accepts non-absolute paths
   
   **Describe the solution you'd like**
   
   `file://` URIs should be handled in a spec-compliant way. Unfortunately, doing only this would prevent the use of plain local paths, which would likely be annoying for users.
   
   I would therefore propose that we special-case URIs without a scheme and canonicalise them within `ObjectStoreRegistry`.
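
   The proposal above could be sketched roughly as follows. This is only an illustration under stated assumptions, not DataFusion's implementation: `canonicalize_location` is a hypothetical helper, it assumes Unix-style paths, it performs no percent-encoding, and it does not normalise `..` segments.

   ```rust
   use std::path::{Path, PathBuf};

   /// Hypothetical helper (not part of DataFusion): turn user input into a
   /// scheme-qualified location string.
   fn canonicalize_location(input: &str, cwd: &Path) -> Result<String, String> {
       match input.split_once("://") {
           // Explicit file URI: validate the host and require an absolute path.
           Some(("file", rest)) => {
               // The authority (host) ends at the first '/'.
               let slash = rest
                   .find('/')
                   .ok_or_else(|| format!("file URI has no absolute path: {}", input))?;
               let (host, path) = rest.split_at(slash);
               // Only an empty host or "localhost" refers to the local machine.
               if !(host.is_empty() || host.eq_ignore_ascii_case("localhost")) {
                   return Err(format!("unsupported host in file URI: {}", host));
               }
               Ok(format!("file://{}", path))
           }
           // Any other scheme is passed through untouched.
           Some(_) => Ok(input.to_string()),
           // No scheme: treat the input as a local path and make it absolute.
           None => {
               let path = Path::new(input);
               let absolute: PathBuf = if path.is_absolute() {
                   path.to_path_buf()
               } else {
                   cwd.join(path)
               };
               Ok(format!("file://{}", absolute.display()))
           }
       }
   }
   ```

   With this sketch, `data/x.csv` resolved against `/home/u` becomes `file:///home/u/data/x.csv`, whereas `file://data/x.csv` is rejected because `data` is parsed as the host component.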
   
   **Describe alternatives you've considered**
   
   We could leave the current behaviour as it is.
   
   **Additional context**
   
   I encountered this whilst working on #2489, as the `object_store` crate purposefully does not handle relative paths.
   
   Thoughts, @yahoNanJing @thinkharderdev?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on issue #2562: File URI Scheme Interpretation

Posted by GitBox <gi...@apache.org>.
alamb commented on issue #2562:
URL: https://github.com/apache/arrow-datafusion/issues/2562#issuecomment-1129124407

   cc @andygrove and @mingmwang




[GitHub] [arrow-datafusion] thinkharderdev commented on issue #2562: File URI Scheme Interpretation

Posted by GitBox <gi...@apache.org>.
thinkharderdev commented on issue #2562:
URL: https://github.com/apache/arrow-datafusion/issues/2562#issuecomment-1129114399

   Makes sense. This also came up in https://github.com/apache/arrow-datafusion/issues/2546, where we had a regression when we stripped the scheme from file URIs. We actually use the scheme to resolve the correct `ObjectStore` when deserializing plans in Ballista.
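
   To illustrate why stripping the scheme breaks resolution, here is a minimal sketch of a scheme-keyed lookup. `SchemeRegistry` and the store descriptions are hypothetical stand-ins, not the actual `ObjectStoreRegistry` API.

   ```rust
   use std::collections::HashMap;

   /// Hypothetical stand-in for a registry keyed by URI scheme.
   struct SchemeRegistry {
       stores: HashMap<String, String>,
   }

   impl SchemeRegistry {
       fn new() -> Self {
           let mut stores = HashMap::new();
           stores.insert("file".to_string(), "local filesystem store".to_string());
           stores.insert("s3".to_string(), "S3 store".to_string());
           Self { stores }
       }

       /// Look up a store by the scheme in front of "://".
       /// Returns None when the URI has no scheme or the scheme is unknown.
       fn resolve(&self, uri: &str) -> Option<&str> {
           let (scheme, _) = uri.split_once("://")?;
           self.stores.get(scheme).map(|s| s.as_str())
       }
   }
   ```

   In this sketch, `s3://bucket/key.parquet` resolves to the S3 entry, but once the scheme is stripped, `bucket/key.parquet` carries nothing to select a store by.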




[GitHub] [arrow-datafusion] tustvold closed issue #2562: File URI Scheme Interpretation

Posted by GitBox <gi...@apache.org>.
tustvold closed issue #2562: File URI Scheme Interpretation
URL: https://github.com/apache/arrow-datafusion/issues/2562

