Posted to dev@arrow.apache.org by "Joris Van den Bossche (Jira)" <ji...@apache.org> on 2020/05/19 15:39:00 UTC
[jira] [Created] (ARROW-8860) [C++] Compressed Feather file with struct array roundtrips incorrectly
Joris Van den Bossche created ARROW-8860:
--------------------------------------------
Summary: [C++] Compressed Feather file with struct array roundtrips incorrectly
Key: ARROW-8860
URL: https://issues.apache.org/jira/browse/ARROW-8860
Project: Apache Arrow
Issue Type: Bug
Components: C++
Reporter: Joris Van den Bossche
When a table with a struct-typed column is written to a Feather file with compression enabled (which is the default), the column is read back with garbage values:
{code:python}
>>> import pyarrow as pa
>>> from pyarrow import feather
>>> table = pa.table({'col': pa.StructArray.from_arrays([[0,1,2], [1,2,3]], names=["f1", "f2"])})
>>> table.column("col")
<pyarrow.lib.ChunkedArray object at 0x7f0b0c4d7458>
[
-- is_valid: all not null
-- child 0 type: int64
[
0,
1,
2
]
-- child 1 type: int64
[
1,
2,
3
]
]
# roundtrip through feather
>>> feather.write_feather(table, "test_struct.feather")
>>> table2 = feather.read_table("test_struct.feather")
>>> table2.column("col")
<pyarrow.lib.ChunkedArray object at 0x7f0b0c4d7728>
[
-- is_valid: all not null
-- child 0 type: int64
[
24,
1261641627085906436,
1369095386551025664
]
-- child 1 type: int64
[
24,
1405756815161762308,
281479842103296
]
]
{code}
When the file is written without compression, the column is read back correctly:
{code:python}
>>> feather.write_feather(table, "test_struct.feather", compression="uncompressed")
>>> table2 = feather.read_table("test_struct.feather")
>>> table2.column("col")
<pyarrow.lib.ChunkedArray object at 0x7f0b0e466778>
[
-- is_valid: all not null
-- child 0 type: int64
[
0,
1,
2
]
-- child 1 type: int64
[
1,
2,
3
]
]
{code}
--
This message was sent by Atlassian Jira
(v8.3.4#803005)