Posted to issues@arrow.apache.org by "Yosuke Shiro (Jira)" <ji...@apache.org> on 2020/02/27 04:36:00 UTC

[jira] [Resolved] (ARROW-7625) [GLib] Parquet GLib and Red Parquet (Ruby) do not allow specifying compression type

     [ https://issues.apache.org/jira/browse/ARROW-7625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yosuke Shiro resolved ARROW-7625.
---------------------------------
    Fix Version/s: 1.0.0
       Resolution: Fixed

Issue resolved by pull request 6336
[https://github.com/apache/arrow/pull/6336]

> [GLib] Parquet GLib and Red Parquet (Ruby) do not allow specifying compression type
> -----------------------------------------------------------------------------------
>
>                 Key: ARROW-7625
>                 URL: https://issues.apache.org/jira/browse/ARROW-7625
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: GLib
>         Environment: red-arrow 0.15.1
> red-parquet 0.15.1
> libarrow 0.15.1
> libparquet 0.15.1
>            Reporter: Keith Gable
>            Assignee: Yosuke Shiro
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.0.0
>
>          Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> The ArrowFileWriter used by parquet-glib appears to always use the default writer properties ([https://github.com/apache/arrow/blob/master/c_glib/parquet-glib/arrow-file-writer.cpp#L184]) and does not let the user override them. As a consumer of the GLib API from Ruby (red-parquet), I therefore have no way to compress Parquet columns. I can compress the entire file with something like {{t.save('...', format: 'parquet', compression: 'GZIP')}}, but the result is not compatible with most tools, and it is not the correct way to compress a Parquet file.
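
A minimal sketch of why whole-file compression tends to break compatibility, as the report describes. It relies only on the Parquet format's "PAR1" magic bytes at the start and end of a file and on Ruby's standard zlib library. The payload is a stand-in rather than a real Parquet file, and parquet_framed? is a hypothetical helper written for this illustration, not part of red-parquet or parquet-glib.

```ruby
require "zlib"

# Parquet files begin and end with these four bytes.
PARQUET_MAGIC = "PAR1".b

# Hypothetical helper: true when the bytes start and end with the Parquet magic.
def parquet_framed?(bytes)
  bytes.bytesize >= 8 &&
    bytes.byteslice(0, 4) == PARQUET_MAGIC &&
    bytes.byteslice(-4, 4) == PARQUET_MAGIC
end

# Stand-in for a Parquet file: magic, placeholder body, magic.
fake_parquet = PARQUET_MAGIC + ("\x00".b * 64) + PARQUET_MAGIC

# Whole-file compression wraps everything, including the magic bytes, in gzip.
gzipped = Zlib.gzip(fake_parquet)

puts parquet_framed?(fake_parquet)          # true
puts parquet_framed?(gzipped)               # false: starts with the gzip header instead
puts parquet_framed?(Zlib.gunzip(gzipped))  # true again once the outer gzip layer is removed
```

A reader that looks for the Parquet framing would have to strip the outer gzip layer first, whereas column compression configured through the writer properties keeps the file itself a valid Parquet file.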



--
This message was sent by Atlassian Jira
(v8.3.4#803005)