Posted to jira@arrow.apache.org by "Joris Van den Bossche (Jira)" <ji...@apache.org> on 2021/04/09 14:20:00 UTC

[jira] [Commented] (ARROW-12299) S3FileSystem is an Unrecognized filesystem

    [ https://issues.apache.org/jira/browse/ARROW-12299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17318025#comment-17318025 ] 

Joris Van den Bossche commented on ARROW-12299:
-----------------------------------------------

[~samsanders] To use the new filesystems with {{pq.write_to_dataset}}, you also need to opt in to the new dataset implementation by passing {{use_legacy_dataset=False}} (the default is still {{True}}).

That said, we should probably switch to the new implementation automatically when a new-style filesystem is passed (especially since the legacy filesystems are deprecated).

Alternatively, you can use the {{pyarrow.dataset.write_dataset}} interface directly.

> S3FileSystem is an Unrecognized filesystem
> ------------------------------------------
>
>                 Key: ARROW-12299
>                 URL: https://issues.apache.org/jira/browse/ARROW-12299
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 3.0.0
>            Reporter: Samuel Sanders
>            Priority: Major
>
> {code:java}
>     pq.write_to_dataset(pa.concat_tables(pa_tables),
>   File "C:\venv\*\lib\site-packages\pyarrow\parquet.py", line 1914, in write_to_dataset
>     fs, root_path = legacyfs.resolve_filesystem_and_path(root_path, filesystem)
>   File "C:\venv\*\lib\site-packages\pyarrow\filesystem.py", line 474, in resolve_filesystem_and_path
>     filesystem = _ensure_filesystem(filesystem)
>   File "C:\venv\*\lib\site-packages\pyarrow\filesystem.py", line 457, in _ensure_filesystem
>     raise OSError('Unrecognized filesystem: {}'.format(fs_type))
> OSError: Unrecognized filesystem: <class 'pyarrow._s3fs.S3FileSystem'>
> {code}
> Creating the S3FileSystem these two ways produced the above error when invoking parquet.write_to_dataset with filesystem=s3_filesystem:
> {code:java}
>     s3_filesystem = file_system.S3FileSystem(region='us-east-1')
>     s3_filesystem, path = file_system.FileSystem.from_uri("s3://{0}".format(PARQUET_BUCKET))
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)