Posted to jira@arrow.apache.org by "Joris Van den Bossche (Jira)" <ji...@apache.org> on 2021/01/07 14:58:00 UTC

[jira] [Created] (ARROW-11163) [C++][Python] Compressed Feather file written with pyarrow 0.17 not readable in pyarrow 2.0.0+

Joris Van den Bossche created ARROW-11163:
---------------------------------------------

             Summary: [C++][Python] Compressed Feather file written with pyarrow 0.17 not readable in pyarrow 2.0.0+
                 Key: ARROW-11163
                 URL: https://issues.apache.org/jira/browse/ARROW-11163
             Project: Apache Arrow
          Issue Type: Improvement
          Components: C++
            Reporter: Joris Van den Bossche
             Fix For: 3.0.0


Originally from https://stackoverflow.com/questions/65413407/reading-in-feather-file-in-pyarrow-error-arrowinvalid-unrecognized-compressio


Writing with pyarrow 0.17:

{code:python}
In [1]: pa.__version__
Out[1]: '0.17.0'

In [2]: table = pa.table({'a': range(100)})

In [3]: from pyarrow import feather

In [4]: feather.write_feather(table, "test_pa017_explicit.feather", compression="lz4", version=2)

# According to the docstring, this should be equivalent to the call above, but apparently it is not
In [5]: feather.write_feather(table, "test_pa017_default.feather")
{code}
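
To narrow down whether the two files differ in their on-disk container format, one option is to look at the leading magic bytes. The sketch below uses only the standard library and assumes that Feather V1 files start with "FEA1" and Arrow IPC (Feather V2) files start with "ARROW1". The helper name feather_format_version is hypothetical and not part of pyarrow, and the check says nothing about which compression codec (if any) was used.

```python
# Hypothetical helper (not part of pyarrow): guess the Feather container
# version from the first bytes of a file.
FEATHER_V1_MAGIC = b"FEA1"    # legacy Feather V1 container
ARROW_IPC_MAGIC = b"ARROW1"   # Arrow IPC file format, used by Feather V2


def feather_format_version(path):
    """Return 1 or 2 for a recognized Feather file, or None otherwise."""
    with open(path, "rb") as f:
        head = f.read(len(ARROW_IPC_MAGIC))
    if head.startswith(ARROW_IPC_MAGIC):
        return 2
    if head.startswith(FEATHER_V1_MAGIC):
        return 1
    return None


if __name__ == "__main__":
    import os

    # File names taken from the session above; skipped if not present.
    for name in ("test_pa017_explicit.feather", "test_pa017_default.feather"):
        if os.path.exists(name):
            print(name, "-> Feather version", feather_format_version(name))
```

If both files report version 2, the difference is more likely in how the compression is recorded in the file than in the container format itself.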

Reading with pyarrow 1.0.0 works for both files, but reading the explicitly compressed file with master fails (pyarrow 2.0.0 gives the same error):

{code:python}
In [121]: pa.__version__
Out[121]: '3.0.0.dev552+g634f993f4'

In [123]: feather.read_table("test_pa017_default.feather")
Out[123]:
pyarrow.Table
a: int64

In [124]: feather.read_table("test_pa017_explicit.feather")
---------------------------------------------------------------------------
ArrowInvalid                              Traceback (most recent call last)
<ipython-input-124-700e4b059ed5> in <module>
----> 1 feather.read_table("test_pa017_explicit.feather")

~/scipy/repos/arrow/python/pyarrow/feather.py in read_table(source, columns, memory_map)
    238
    239     if columns is None:
--> 240         return reader.read()
    241
    242     column_types = [type(column) for column in columns]

~/scipy/repos/arrow/python/pyarrow/feather.pxi in pyarrow.lib.FeatherReader.read()

~/scipy/repos/arrow/python/pyarrow/error.pxi in pyarrow.lib.check_status()

ArrowInvalid: Unrecognized compression type: LZ4
In ../src/arrow/ipc/reader.cc, line 538, code: (_error_or_value8).status()
In ../src/arrow/ipc/reader.cc, line 594, code: GetCompressionExperimental(message, &compression)
In ../src/arrow/ipc/reader.cc, line 942, code: (_error_or_value23).status()
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)