Posted to jira@arrow.apache.org by "Joris Van den Bossche (Jira)" <ji...@apache.org> on 2020/10/29 20:33:00 UTC

[jira] [Created] (ARROW-10425) [Python] Support reading (compressed) CSV file from remote file / binary blob

Joris Van den Bossche created ARROW-10425:
---------------------------------------------

             Summary: [Python] Support reading (compressed) CSV file from remote file / binary blob
                 Key: ARROW-10425
                 URL: https://issues.apache.org/jira/browse/ARROW-10425
             Project: Apache Arrow
          Issue Type: Improvement
          Components: Python
            Reporter: Joris Van den Bossche


From https://stackoverflow.com/questions/64588076/how-can-i-read-a-csv-gz-file-with-pyarrow-from-a-file-object

Currently {{pyarrow.csv.read_csv}} happily takes a path to a compressed file and automatically decompresses it, but AFAIK this only works for local paths.

It would be nice to support reading CSV from remote files in general (via a URI or by specifying a filesystem), and in that case also to support compression.

In addition, we could also read a compressed file from a BytesIO / file-like object, but I am not sure we want that (as it would require a keyword to indicate the compression used).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)