Posted to dev@arrow.apache.org by "Grant Nguyen (Jira)" <ji...@apache.org> on 2019/10/21 21:20:00 UTC

[jira] [Created] (ARROW-6960) Add information about zstd/lz4 codec installation and linkages for R users

Grant Nguyen created ARROW-6960:
-----------------------------------

             Summary: Add information about zstd/lz4 codec installation and linkages for R users
                 Key: ARROW-6960
                 URL: https://issues.apache.org/jira/browse/ARROW-6960
             Project: Apache Arrow
          Issue Type: Improvement
          Components: R
    Affects Versions: 0.15.0
         Environment: Windows 10
            Reporter: Grant Nguyen


When I try to write a Parquet file with lz4, zstd, or brotli compression using the R arrow package 0.15.0, the write fails because support for the requested codec was not built (example below).

 
{code:java}
> arrow::write_parquet(payout_strategy, sink = "records_test_lz4.parquet", compression = "lz4")
Error in parquet___arrow___FileWriter__WriteTable(self, table, chunk_size) : 
 Arrow error: IOError: Arrow error: NotImplemented: LZ4 codec support not built{code}
 

I believe the error is raised in [https://github.com/apache/arrow/blob/master/cpp/src/arrow/util/compression.cc#L124-L145], but I am not sure how to call 
{code:java}
install.packages("arrow"){code}
in R so that the ARROW_WITH_ZSTD, ARROW_WITH_LZ4, and ARROW_WITH_BROTLI flags are enabled, or whether I should install zstd separately from arrow and then do something before or after installation to link zstd with arrow. From [https://github.com/apache/arrow/issues/1209], it appears that zstd support has been added to arrow and parquet in general. The [R package readme|https://github.com/apache/arrow/tree/master/r] notes "On macOS and Windows, installing a binary package from CRAN will handle Arrow's C++ dependencies for you", but I get the sense that this does not apply to zstd.

 

Is there guidance on how to enable zstd and the other compression codecs, either before or after installing the R arrow package? Could this be added to the R documentation for future reference?


