Posted to jira@arrow.apache.org by "Joris Van den Bossche (Jira)" <ji...@apache.org> on 2021/01/07 14:58:00 UTC

[jira] [Updated] (ARROW-11163) [C++][Python] Compressed Feather file written with pyarrow 0.17 not readable in pyarrow 2.0.0+

     [ https://issues.apache.org/jira/browse/ARROW-11163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joris Van den Bossche updated ARROW-11163:
------------------------------------------
    Issue Type: Bug  (was: Improvement)

> [C++][Python] Compressed Feather file written with pyarrow 0.17 not readable in pyarrow 2.0.0+
> ----------------------------------------------------------------------------------------------
>
>                 Key: ARROW-11163
>                 URL: https://issues.apache.org/jira/browse/ARROW-11163
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++
>            Reporter: Joris Van den Bossche
>            Priority: Major
>             Fix For: 3.0.0
>
>
> Originally from https://stackoverflow.com/questions/65413407/reading-in-feather-file-in-pyarrow-error-arrowinvalid-unrecognized-compressio
> Writing with pyarrow 0.17:
> {code:python}
> In [1]: pa.__version__
> Out[1]: '0.17.0'
> In [2]: table = pa.table({'a': range(100)})
> In [3]: from pyarrow import feather
> In [4]: feather.write_feather(table, "test_pa017_explicit.feather", compression="lz4", version=2)
> # according to docstring, this should do the same, but apparently not
> In [5]: feather.write_feather(table, "test_pa017_default.feather")
> {code}
> Reading with pyarrow 1.0.0 works for both files, but reading the explicitly LZ4-compressed file with master fails (pyarrow 2.0.0 gives the same error):
> {code:python}
> In [121]: pa.__version__
> Out[121]: '3.0.0.dev552+g634f993f4'
> In [123]: feather.read_table("test_pa017_default.feather")
> Out[123]:
> pyarrow.Table
> a: int64
> In [124]: feather.read_table("test_pa017_explicit.feather")
> ---------------------------------------------------------------------------
> ArrowInvalid                              Traceback (most recent call last)
> <ipython-input-124-700e4b059ed5> in <module>
> ----> 1 feather.read_table("test_pa017_explicit.feather")
> ~/scipy/repos/arrow/python/pyarrow/feather.py in read_table(source, columns, memory_map)
>     238
>     239     if columns is None:
> --> 240         return reader.read()
>     241
>     242     column_types = [type(column) for column in columns]
> ~/scipy/repos/arrow/python/pyarrow/feather.pxi in pyarrow.lib.FeatherReader.read()
> ~/scipy/repos/arrow/python/pyarrow/error.pxi in pyarrow.lib.check_status()
> ArrowInvalid: Unrecognized compression type: LZ4
> In ../src/arrow/ipc/reader.cc, line 538, code: (_error_or_value8).status()
> In ../src/arrow/ipc/reader.cc, line 594, code: GetCompressionExperimental(message, &compression)
> In ../src/arrow/ipc/reader.cc, line 942, code: (_error_or_value23).status()
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)