Posted to jira@arrow.apache.org by "Joris Van den Bossche (Jira)" <ji...@apache.org> on 2021/09/21 11:48:00 UTC
[jira] [Commented] (ARROW-7102) [Python] Make filesystems compatible with fsspec
[ https://issues.apache.org/jira/browse/ARROW-7102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17418059#comment-17418059 ]
Joris Van den Bossche commented on ARROW-7102:
----------------------------------------------
An fsspec wrapper around a pyarrow filesystem has now been added to {{fsspec}} (https://github.com/intake/filesystem_spec/pull/754, https://github.com/intake/filesystem_spec/issues/663). I will document this in the pyarrow docs. After that, I think this issue and its last remaining child issue, ARROW-8780, can be closed, since the other direction (wrapping an fsspec filesystem in pyarrow) is already supported.
> [Python] Make filesystems compatible with fsspec
> ------------------------------------------------
>
> Key: ARROW-7102
> URL: https://issues.apache.org/jira/browse/ARROW-7102
> Project: Apache Arrow
> Issue Type: Improvement
> Components: Python
> Reporter: Tom Augspurger
> Priority: Major
> Labels: dataset-dask-integration, filesystem
>
> Update: regarding compatibility with {{fsspec}}, there are two directions of wrapping possible:
> * Make an {{fsspec}} wrapper for {{pyarrow.fs}} (-> tracked in ARROW-8780; this ensures {{pyarrow.fs}} filesystems can be used where {{fsspec}} filesystems are expected)
> * Make a {{pyarrow.fs}} wrapper for {{fsspec}} (-> tracked in ARROW-8766 + ARROW-9089; this ensures {{fsspec}} filesystems can be used where {{pyarrow.fs}} filesystems are expected)
> ----
> [fsspec|https://filesystem-spec.readthedocs.io/en/latest] defines a common API for a variety of filesystem implementations. I'm proposing an FSSpecWrapper, similar to S3FSWrapper, that works with any fsspec implementation.
>
> Right now, pyarrow has a pyarrow.filesystem.S3FSWrapper, which is specific to s3fs ([https://github.com/apache/arrow/blob/21ad7ac1162eab188a1e15923fb1de5b795337ec/python/pyarrow/filesystem.py#L320]). This implementation could be removed entirely once an FSSpecWrapper is done, or kept as an alias if it's part of the public API.
>
> This is related to ARROW-3717, which requested a GCSFSWrapper for working with Google Cloud Storage.