Posted to jira@arrow.apache.org by "Antoine Pitrou (Jira)" <ji...@apache.org> on 2021/02/16 13:33:00 UTC

[jira] [Commented] (ARROW-11381) [Rust] [Parquet] LZ4 compressed files written in Rust can't be opened with C++

    [ https://issues.apache.org/jira/browse/ARROW-11381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17285196#comment-17285196 ] 

Antoine Pitrou commented on ARROW-11381:
----------------------------------------

There is no need to hurry on this: the LZ4 format in Parquet is unfortunately unspecified, and the C++ Parquet implementation is itself running into trouble trying to stay compatible with the reference Java implementation ("parquet-mr").
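
For anyone debugging the block vs. frame question raised in the issue description, a rough way to tell the two apart is to check for the LZ4 frame magic number (0x184D2204, stored little-endian) at the start of a compressed buffer. The sketch below is only a heuristic and is not taken from any Arrow or parquet-mr code. The helper name `looks_like_lz4_frame` is hypothetical, and it assumes you have already extracted the compressed page bytes yourself.

```python
import struct

# Magic number that opens every LZ4 frame, per the LZ4 frame format description.
LZ4_FRAME_MAGIC = 0x184D2204


def looks_like_lz4_frame(buf: bytes) -> bool:
    """Return True if buf starts with the LZ4 frame magic number.

    Hypothetical helper for illustration; a False result only means
    "not a frame", since raw LZ4 blocks carry no header of their own.
    """
    if len(buf) < 4:
        return False
    (magic,) = struct.unpack_from("<I", buf, 0)  # little-endian uint32
    return magic == LZ4_FRAME_MAGIC


if __name__ == "__main__":
    # The magic number as it appears on disk: 04 22 4D 18.
    sample = bytes.fromhex("04224d18") + b"payload"
    print(looks_like_lz4_frame(sample))
```

A buffer that fails this check may still be a raw LZ4 block or a block wrapped in some other length-prefixed framing, so this narrows the possibilities rather than identifying the format.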

> [Rust] [Parquet] LZ4 compressed files written in Rust can't be opened with C++
> ------------------------------------------------------------------------------
>
>                 Key: ARROW-11381
>                 URL: https://issues.apache.org/jira/browse/ARROW-11381
>             Project: Apache Arrow
>          Issue Type: Bug
>    Affects Versions: 3.0.0
>            Reporter: Neville Dipale
>            Priority: Major
>
> Parquet files that are written with LZ4 compression cannot be read by pyarrow. It seems that the issue might be the LZ4 block vs. frame format difference, which we're also seeing in ARROW-8767.
> I'll update this JIRA with more info, as I'm struggling to get pyspark running on macOS (Rosetta 2 issues).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)