Posted to jira@arrow.apache.org by "Joris Van den Bossche (Jira)" <ji...@apache.org> on 2022/04/05 13:19:00 UTC
[jira] [Resolved] (ARROW-15982) [Python] parquet.read_table fails to parse home directory path
[ https://issues.apache.org/jira/browse/ARROW-15982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Joris Van den Bossche resolved ARROW-15982.
-------------------------------------------
Fix Version/s: 8.0.0
Resolution: Fixed
Issue resolved by pull request 12675
[https://github.com/apache/arrow/pull/12675]
> [Python] parquet.read_table fails to parse home directory path
> --------------------------------------------------------------
>
> Key: ARROW-15982
> URL: https://issues.apache.org/jira/browse/ARROW-15982
> Project: Apache Arrow
> Issue Type: Bug
> Affects Versions: 7.0.0
> Reporter: Colin Jermain
> Priority: Major
> Labels: pull-request-available
> Fix For: 8.0.0
>
> Time Spent: 1h 10m
> Remaining Estimate: 0h
>
> {{pyarrow.parquet.read_table}} fails to parse a path that contains the home directory shorthand ({{~}}). For example, {{"~/test.parquet"}} raises a {{FileNotFoundError}}, while {{"/home/user/test.parquet"}} reads the file correctly.
> {code:java}
> $ python -c "import pyarrow.parquet; pyarrow.parquet.read_table('~/test.parquet')"
> Traceback (most recent call last):
>   File "<string>", line 1, in <module>
>   File ".../lib/python3.8/site-packages/pyarrow/parquet.py", line 1960, in read_table
>     dataset = _ParquetDatasetV2(
>   File ".../lib/python3.8/site-packages/pyarrow/parquet.py", line 1781, in __init__
>     self._dataset = ds.dataset(path_or_paths, filesystem=filesystem,
>   File ".../lib/python3.8/site-packages/pyarrow/dataset.py", line 667, in dataset
>     return _filesystem_dataset(source, **kwargs)
>   File ".../lib/python3.8/site-packages/pyarrow/dataset.py", line 412, in _filesystem_dataset
>     fs, paths_or_selector = _ensure_single_source(source, filesystem)
>   File ".../lib/python3.8/site-packages/pyarrow/dataset.py", line 388, in _ensure_single_source
>     raise FileNotFoundError(path)
> FileNotFoundError: ~/test.parquet
> {code}
> The fix for this issue should be as simple as applying {{os.path.expanduser}} in the right places.
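For readers on an affected version (7.0.0), the suggestion above can be sketched as a caller-side workaround. This is only an illustration, not necessarily how pull request 12675 implements the fix; the helper name expand_home is hypothetical and not part of pyarrow.

```python
import os


def expand_home(source):
    """Hypothetical helper (not a pyarrow API): expand a leading '~'.

    Strings and os.PathLike objects are passed through os.path.expanduser;
    anything else (for example an open file object) is returned unchanged.
    """
    if isinstance(source, (str, os.PathLike)):
        return os.path.expanduser(os.fspath(source))
    return source


# A path that starts with '~' is rewritten to the current user's home directory.
expanded = expand_home("~/test.parquet")

# A path without '~' is left exactly as it was.
absolute = expand_home("/home/user/test.parquet")
```

With pyarrow installed, the expanded path could then be passed on, for example pyarrow.parquet.read_table(expand_home("~/test.parquet")).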
--
This message was sent by Atlassian Jira
(v8.20.1#820001)