Posted to issues@arrow.apache.org by "Carl Boettiger (Jira)" <ji...@apache.org> on 2022/04/07 17:04:00 UTC

[jira] [Created] (ARROW-16144) Write compressed data streams (particularly over S3)

Carl Boettiger created ARROW-16144:
--------------------------------------

             Summary: Write compressed data streams (particularly over S3)
                 Key: ARROW-16144
                 URL: https://issues.apache.org/jira/browse/ARROW-16144
             Project: Apache Arrow
          Issue Type: Improvement
          Components: R
    Affects Versions: 7.0.0
            Reporter: Carl Boettiger


The Python bindings have `CompressedOutputStream`, but I don't see how to do the same on the R side (e.g. with `write_csv_arrow()`). It would be wonderful if we could both read and write compressed streams, particularly for CSV and particularly on remote filesystems, where compression can considerably improve performance.

(For comparison, readr writes a compressed stream automatically based on the extension of the given filename, e.g. `readr::write_csv(data, "file.csv.gz")` or `readr::write_csv(data, "file.csv.xz")`.)



--
This message was sent by Atlassian Jira
(v8.20.1#820001)