Posted to jira@arrow.apache.org by "Antoine Pitrou (Jira)" <ji...@apache.org> on 2021/01/14 14:43:00 UTC
[jira] [Assigned] (ARROW-10264) [C++][Python] Parquet test failing with HadoopFileSystem URI
[ https://issues.apache.org/jira/browse/ARROW-10264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Antoine Pitrou reassigned ARROW-10264:
--------------------------------------
Assignee: Antoine Pitrou
> [C++][Python] Parquet test failing with HadoopFileSystem URI
> ------------------------------------------------------------
>
> Key: ARROW-10264
> URL: https://issues.apache.org/jira/browse/ARROW-10264
> Project: Apache Arrow
> Issue Type: Bug
> Components: Python
> Reporter: Joris Van den Bossche
> Assignee: Antoine Pitrou
> Priority: Major
> Labels: filesystem, hdfs, pull-request-available
> Fix For: 3.0.0
>
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> Follow-up on ARROW-10175. In the HDFS integration tests, there is a test using a URI that fails when using the new filesystem / dataset implementation:
> {code}
> FAILED opt/conda/envs/arrow/lib/python3.7/site-packages/pyarrow/tests/test_hdfs.py::TestLibHdfs::test_read_multiple_parquet_files_with_uri
> {code}
> fails with
> {code}
> pyarrow.lib.ArrowInvalid: Path '/tmp/pyarrow-test-838/multi-parquet-uri-48569714efc74397816722c9c6723191/0.parquet' is not relative to '/user/root'
> {code}
> even though it passes a URI (and not a filesystem object) to {{parquet.read_table}}, which the new filesystem/dataset implementation should be able to handle.
> cc [~apitrou]
--
This message was sent by Atlassian Jira
(v8.3.4#803005)