Posted to issues@arrow.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2018/02/27 13:42:00 UTC

[jira] [Commented] (ARROW-2046) [Python] Add support for PEP519 - pathlib and similar objects

    [ https://issues.apache.org/jira/browse/ARROW-2046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16378600#comment-16378600 ] 

ASF GitHub Bot commented on ARROW-2046:
---------------------------------------

pitrou opened a new pull request #1675: ARROW-2046: [Python] Support path-like objects
URL: https://github.com/apache/arrow/pull/1675
 
 
   IO functions accepting string filenames should also accept PEP 519 path objects such as pathlib.Path (on Python 3.6 and later).

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


> [Python] Add support for PEP519 - pathlib and similar objects
> -------------------------------------------------------------
>
>                 Key: ARROW-2046
>                 URL: https://issues.apache.org/jira/browse/ARROW-2046
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: Python
>            Reporter: Victor Uriarte
>            Assignee: Antoine Pitrou
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.9.0
>
>
> Currently `pyarrow` doesn't seem to support reading from or writing to `pathlib.Path` or similar objects. [PEP519|https://www.python.org/dev/peps/pep-0519/] introduced `__fspath__`, which can be used to convert any path-like object to a string.
> [Pandas|https://github.com/pandas-dev/pandas/blob/a9d8e04ab68f688f899b4164bfa1ac868c9c1c64/pandas/io/common.py#L120-L160] has a sample implementation, though I think a simpler version would suffice.
>  
> {code:python}
> import pathlib
> import pandas as pd
> import pyarrow as pa
> import pyarrow.parquet as pq
> df = pd.DataFrame({
>  'Foo': ['A', 'A', 'B', 'B', 'C'],
>  'Bar': ['A1', 'A2', 'B2', 'D3', ''],
> })
> test_dir = pathlib.Path(__file__).parent / 'test'
> test_dir.mkdir(parents=True, exist_ok=True)
> table = pa.Table.from_pandas(df)
> path = test_dir / 'file1.parquet'
> # Doesn't work: a pathlib.Path is passed directly
> pq.write_table(table, path)
> # Works: the path is converted to a string first
> pq.write_table(table, str(path))
> {code}
>  
> [https://github.com/apache/arrow/issues/1522]
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)