Posted to issues@arrow.apache.org by "Joris Van den Bossche (Jira)" <ji...@apache.org> on 2020/04/01 18:06:00 UTC

[jira] [Commented] (ARROW-8213) [Python][Dataset] Opening a dataset with a local incorrect path gives confusing error message

    [ https://issues.apache.org/jira/browse/ARROW-8213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17073051#comment-17073051 ] 

Joris Van den Bossche commented on ARROW-8213:
----------------------------------------------

Hmm, I would like to avoid that personally (e.g. dask and pandas also support both with the same function, so somewhere we need to detect which of the two it is). I still think we should be able to distinguish those cases within the filesystem code.
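
To illustrate what "distinguishing those cases" could look like, here is a rough pure-Python sketch. It is not the Arrow filesystem code: the helper names classify_source and check_local_path are hypothetical, and the rule that a one-letter scheme is a Windows drive letter is my own assumption for this example.

```python
import errno
import os
from urllib.parse import urlparse


def classify_source(source: str) -> str:
    """Return "uri" or "path" for a user-supplied string (hypothetical helper)."""
    scheme = urlparse(source).scheme
    # Assumption: a one-letter "scheme" is a Windows drive letter (C:\data),
    # so only longer schemes such as "s3" or "file" are treated as URIs.
    if len(scheme) > 1:
        return "uri"
    return "path"


def check_local_path(source: str) -> str:
    """Raise FileNotFoundError for a missing local path; pass URIs through."""
    if classify_source(source) == "path" and not os.path.exists(source):
        # Same error a failed open() would give: "No such file or directory".
        raise FileNotFoundError(errno.ENOENT, os.strerror(errno.ENOENT), source)
    return source
```

With a check like this done up front, a typo such as "data_with_typo.parquet" would surface as a FileNotFoundError naming the path, while a string like "s3://bucket/data.parquet" would still go through the URI handling and any errors from there would propagate unchanged.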



> [Python][Dataset] Opening a dataset with a local incorrect path gives confusing error message
> ---------------------------------------------------------------------------------------------
>
>                 Key: ARROW-8213
>                 URL: https://issues.apache.org/jira/browse/ARROW-8213
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++ - Dataset, Python
>            Reporter: Joris Van den Bossche
>            Priority: Major
>             Fix For: 0.17.0
>
>
> Even after the previous PRs related to local paths (https://github.com/apache/arrow/pull/6643, https://github.com/apache/arrow/pull/6655), I don't think the user experience is optimal when you are working with local files and pass a wrong, non-existent path (e.g. due to a typo).
> Currently, you get this error:
> {code}
> >>> dataset = ds.dataset("data_with_typo.parquet", format="parquet")
> ...
> ArrowInvalid: URI has empty scheme: 'data_with_typo.parquet'
> {code}
> where "URI has empty scheme" is rather confusing for the user in case of a non-existent path.  I think ideally we should raise a "No such file or directory" error.
> I am not fully sure what the best solution is, as {{FileSystem.from_uri}} can also give other errors that we do want to propagate to the user. 
> The most straightforward approach I can think of right now is checking whether "URI has empty scheme" is in the error message and then rewording it, but that's not very clean.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)