Posted to issues@arrow.apache.org by "Marc Bernot (Jira)" <ji...@apache.org> on 2020/02/25 15:00:09 UTC

[jira] [Created] (ARROW-7939) Python crashes when reading parquet file compressed with snappy

Marc Bernot created ARROW-7939:
----------------------------------

             Summary: Python crashes when reading parquet file compressed with snappy
                 Key: ARROW-7939
                 URL: https://issues.apache.org/jira/browse/ARROW-7939
             Project: Apache Arrow
          Issue Type: Bug
          Components: Python
    Affects Versions: 0.16.0
         Environment: Windows 7
                      python 3.6.9
                      pyarrow 0.16 from conda-forge
            Reporter: Marc Bernot


After I installed pyarrow 0.16, reading some parquet files created with pyarrow 0.15.1 made python crash. I narrowed the problem down to the simplest example I could find.

It turns out that some parquet files created with pyarrow 0.16 cannot be read back either. The example below works fine with arrays_ok, but python crashes with arrays_nok.

The same example works fine with 'none', 'gzip' and 'brotli' compression. The problem seems to occur only with snappy.
{code:python}
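import pyarrow as pa
import pyarrow.parquet as pq

arrays_ok = [[0, 1]]      # two values: the round trip succeeds
arrays_nok = [[0, 1, 2]]  # three values: reading back crashes python

# Swap in arrays_ok to see the working case.
table = pa.Table.from_arrays(arrays_nok, names=['a'])
pq.write_table(table, 'foo.parquet', compression='snappy')
pq.read_table('foo.parquet')  # python crashes here with pyarrow 0.16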
{code}
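
Because the crash takes down the whole interpreter, one way to compare the codecs in a single run is to execute each write/read round trip in a child process. The following is a minimal sketch and was not part of the testing described above. The names REPRO_TEMPLATE, run_isolated and check_codecs are hypothetical, and the sketch assumes that a crash in the child shows up as a non-zero exit code.

```python
import subprocess
import sys
import textwrap

# Script run in a child interpreter; {codec!r} is filled in per codec.
# It mirrors the arrays_nok case from the example above.
REPRO_TEMPLATE = textwrap.dedent("""
    import os, tempfile
    import pyarrow as pa
    import pyarrow.parquet as pq

    table = pa.Table.from_arrays([[0, 1, 2]], names=['a'])
    path = os.path.join(tempfile.mkdtemp(), 'foo.parquet')
    pq.write_table(table, path, compression={codec!r})
    pq.read_table(path)
""")


def run_isolated(source):
    """Run Python source in a child interpreter and return its exit code."""
    result = subprocess.run([sys.executable, "-c", source])
    return result.returncode


def check_codecs(codecs=("none", "gzip", "brotli", "snappy")):
    """Map each codec name to the exit code of its round trip (assumed 0 on success)."""
    return {c: run_isolated(REPRO_TEMPLATE.format(codec=c)) for c in codecs}
```

Calling check_codecs() returns a dict keyed by codec name. Under the assumption above, any codec with a non-zero value is one for which the child process failed or crashed, while the parent interpreter keeps running.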


