Posted to jira@arrow.apache.org by "Ian Cook (Jira)" <ji...@apache.org> on 2021/06/25 16:36:00 UTC

[jira] [Updated] (ARROW-10998) [C++] Filesystems: detect if URI is passed where a file path is required and raise informative error

     [ https://issues.apache.org/jira/browse/ARROW-10998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ian Cook updated ARROW-10998:
-----------------------------
    Fix Version/s:     (was: 5.0.0)
                   6.0.0

> [C++] Filesystems: detect if URI is passed where a file path is required and raise informative error
> ----------------------------------------------------------------------------------------------------
>
>                 Key: ARROW-10998
>                 URL: https://issues.apache.org/jira/browse/ARROW-10998
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++, Python
>            Reporter: Joris Van den Bossche
>            Assignee: Ian Cook
>            Priority: Major
>              Labels: filesystem
>             Fix For: 6.0.0
>
>
> Currently, when passing a URI to a filesystem method (except for {{from_uri}}) or to other functions that accept a filesystem object, you can get a rather cryptic error message (e.g. "No response body" for S3, as in the example below).
> Ideally, the filesystem object knows its own "scheme" prefix, so it can detect when a user passes a URI instead of a file path, and we can provide a nicer error message.
> Example with S3:
> {code:python}
> >>> from pyarrow.fs import S3FileSystem
> >>> fs = S3FileSystem(region="us-east-2")
> >>> fs.get_file_info('s3://ursa-labs-taxi-data/2016/01/')
> ...
> OSError: When getting information for key '/ursa-labs-taxi-data/2016/01' in bucket 's3:': AWS Error [code 100]: No response body.
> >>> import pyarrow.parquet as pq
> >>> table = pq.read_table('s3://ursa-labs-taxi-data/2016/01/data.parquet', filesystem=fs)
> ...
> OSError: When getting information for key '/ursa-labs-taxi-data/2016/01/data.parquet' in bucket 's3:': AWS Error [code 100]: No response body.
> {code}
> With a local filesystem, you don't get an error at all; the URI is silently reported as a file that was not found:
> {code:python}
> >>> from pyarrow.fs import LocalFileSystem
> >>> fs = LocalFileSystem()
> >>> fs.get_file_info("file:///home")
> <FileInfo for 'file:///home': type=FileType.NotFound>
> {code}
> cc [~apitrou]
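
For illustration only, the kind of check described above could be sketched in pure Python as follows. This is a minimal sketch under stated assumptions, not the actual Arrow implementation: the helper name {{ensure_not_uri}}, the regular expression, and the wording of the error message are all hypothetical. It only assumes that a URI starts with an RFC 3986 style scheme followed by "://".

```python
import re

# Hypothetical helper, not part of pyarrow.
# Matches an RFC 3986 style scheme (letter, then letters/digits/+/./-)
# immediately followed by "://", e.g. "s3://" or "file://".
_URI_SCHEME = re.compile(r"^([A-Za-z][A-Za-z0-9+.\-]*)://")


def ensure_not_uri(path, fs_name="filesystem"):
    """Return `path` unchanged, or raise ValueError if it looks like a URI."""
    match = _URI_SCHEME.match(path)
    if match:
        raise ValueError(
            f"Expected a {fs_name} path but got a URI with scheme "
            f"{match.group(1)!r}: {path!r}. Pass the path without the "
            f"scheme, or create the filesystem with from_uri instead."
        )
    return path
```

With this sketch, {{ensure_not_uri('s3://ursa-labs-taxi-data/2016/01/')}} raises a {{ValueError}} that names the offending scheme, while a plain path such as {{'ursa-labs-taxi-data/2016/01/'}} or {{'/home'}} is returned unchanged. Windows-style paths like {{'C:/data'}} are not flagged, because the drive letter is not followed by "://".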


