Posted to issues@arrow.apache.org by "Joris Van den Bossche (Jira)" <ji...@apache.org> on 2019/09/20 10:39:00 UTC
[jira] [Created] (ARROW-6642) [Python] chained access of ParquetDataset's metadata segfaults
Joris Van den Bossche created ARROW-6642:
--------------------------------------------
Summary: [Python] chained access of ParquetDataset's metadata segfaults
Key: ARROW-6642
URL: https://issues.apache.org/jira/browse/ARROW-6642
Project: Apache Arrow
Issue Type: Bug
Components: Python
Reporter: Joris Van den Bossche
Creating and reading a parquet dataset:
{code}
import pyarrow as pa
import pyarrow.parquet as pq

table = pa.table({'a': [1, 2, 3]})
pq.write_table(table, '__test_statistics_segfault.parquet')
dataset = pq.ParquetDataset('__test_statistics_segfault.parquet')
dataset_piece = dataset.pieces[0]
{code}
If you access the metadata, the row group, and the column step by step, binding each intermediate result to a variable, this works fine:
{code}
meta = dataset_piece.get_metadata()
row = meta.row_group(0)
col = row.column(0)
{code}
However, chaining the same calls in a single expression segfaults:
{code}
dataset_piece.get_metadata().row_group(0).column(0)
{code}
{{dataset_piece.get_metadata().row_group(0)}} on its own still works; the segfault only occurs once {{.column(0)}} is appended to the chain.
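Since a segfault takes down the interpreter, it can help to run the chained access in a child process and inspect the exit code. The following is only a sketch using the standard library; the names {{run_isolated}}, {{describe}} and {{REPRO}} are hypothetical helpers introduced here, not part of pyarrow, and the snippet in {{REPRO}} is simply the code from this report.

```python
import subprocess
import sys

# The reproduction from this report, kept as a string so that it can be
# executed in a separate interpreter (requires pyarrow in that interpreter).
REPRO = """
import pyarrow as pa
import pyarrow.parquet as pq

table = pa.table({'a': [1, 2, 3]})
pq.write_table(table, '__test_statistics_segfault.parquet')
dataset = pq.ParquetDataset('__test_statistics_segfault.parquet')
dataset_piece = dataset.pieces[0]
dataset_piece.get_metadata().row_group(0).column(0)
"""


def run_isolated(snippet, timeout=60):
    """Run `snippet` in a fresh Python process and return its exit code."""
    result = subprocess.run(
        [sys.executable, "-c", snippet],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        timeout=timeout,
    )
    return result.returncode


def describe(returncode):
    """Turn an exit code into a short human-readable status."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, a negative code means the child was killed by that signal
        # (SIGSEGV is usually 11). On Windows a crash shows up as a large
        # positive exit code instead.
        return "killed by signal {}".format(-returncode)
    return "exited with code {}".format(returncode)
```

Calling {{describe(run_isolated(REPRO))}} should then report a signal on an affected POSIX build and "ok" on a build where the chained access works, without crashing the calling session.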
--
This message was sent by Atlassian Jira
(v8.3.4#803005)