Posted to issues@arrow.apache.org by "Krisztian Szucs (JIRA)" <ji...@apache.org> on 2018/10/16 11:33:00 UTC

[jira] [Resolved] (ARROW-3514) [Python] zlib deflate exception when writing Parquet file

     [ https://issues.apache.org/jira/browse/ARROW-3514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Krisztian Szucs resolved ARROW-3514.
------------------------------------
       Resolution: Fixed
    Fix Version/s:     (was: 0.10.0)
                   0.12.0

Issue resolved by pull request 2771
[https://github.com/apache/arrow/pull/2771]

> [Python] zlib deflate exception when writing Parquet file
> ---------------------------------------------------------
>
>                 Key: ARROW-3514
>                 URL: https://issues.apache.org/jira/browse/ARROW-3514
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++, Python
>    Affects Versions: 0.11.0
>         Environment: Amazon Linux, CentOS 7, Ubuntu 16.04, zlib 1.2.7/1.2.8, CPython 3.6.
>            Reporter: Adam Machanic
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.12.0
>
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> The Python code below raises an exception with pyarrow 0.11.0, but not with 0.10.0.
> I was able to reproduce the issue on Amazon Linux, CentOS 7, and Ubuntu 16.04, but not on Windows 7.
> The Amazon Linux and CentOS machines are both running zlib 1.2.7, and the Ubuntu machine is running zlib 1.2.8.
> Tested with CPython 3.6 in all cases.
> {code:python}
> import io
> import pyarrow
> from pyarrow import parquet
> tbl = pyarrow.Table.from_arrays([pyarrow.array(['abc', 'def'])], ['some_col'])
> f = io.BytesIO()
> parquet.write_table(tbl, f, compression='gzip')
> {code}
> It fails with the following exception:
> {code}
> Traceback (most recent call last):
>   File "test_pyarrow.py", line 8, in <module>
>     parquet.write_table(tbl, f, compression='gzip')
>   File "/home/adam/anaconda3/lib/python3.6/site-packages/pyarrow/parquet.py", line 1125, in write_table
>     writer.write_table(table, row_group_size=row_group_size)
>   File "/home/adam/anaconda3/lib/python3.6/site-packages/pyarrow/parquet.py", line 376, in write_table
>     self.writer.write_table(table, row_group_size=row_group_size)
>   File "pyarrow/_parquet.pyx", line 934, in pyarrow._parquet.ParquetWriter.write_table
>   File "pyarrow/error.pxi", line 83, in pyarrow.lib.check_status
> pyarrow.lib.ArrowIOError: Arrow error: IOError: zlib deflate failed, output buffer too small
> {code}
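
The error message suggests that the output buffer handed to zlib was smaller than the compressed result. As a minimal sketch using only Python's standard zlib module (this is not the Arrow C++ code path, and it makes no claim about what pull request 2771 changed), the following shows why that can happen with tiny inputs like the two short strings above: gzip framing adds a fixed header and trailer, so the compressed form of a few bytes is larger than the input itself. The helper name gzip_compress is hypothetical.

```python
import zlib

# wbits = 16 + MAX_WBITS asks zlib for gzip framing (header + CRC32/size trailer)
GZIP_WBITS = 16 + zlib.MAX_WBITS


def gzip_compress(data: bytes) -> bytes:
    """Compress data with gzip framing via zlib (hypothetical helper)."""
    compressor = zlib.compressobj(
        zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED, GZIP_WBITS
    )
    return compressor.compress(data) + compressor.flush()


payload = b"abcdef"  # roughly the amount of string data in the report
compressed = gzip_compress(payload)

# For very small inputs the gzip output is larger than the input, so an
# output buffer sized from the input length alone would be too small.
print("input bytes:     ", len(payload))
print("compressed bytes:", len(compressed))

# Round-trip to confirm the compressed bytes are valid gzip data.
assert zlib.decompress(compressed, GZIP_WBITS) == payload
```

Any code that preallocates the output buffer therefore needs an upper bound that covers the framing overhead, not just the payload size.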


