Posted to dev@arrow.apache.org by "Dima Ryazanov (JIRA)" <ji...@apache.org> on 2018/05/16 21:43:00 UTC
[jira] [Created] (ARROW-2592) [Python] AssertionError in to_pandas()
Dima Ryazanov created ARROW-2592:
------------------------------------
Summary: [Python] AssertionError in to_pandas()
Key: ARROW-2592
URL: https://issues.apache.org/jira/browse/ARROW-2592
Project: Apache Arrow
Issue Type: Bug
Components: Python
Affects Versions: 0.9.0, 0.8.0
Reporter: Dima Ryazanov
Pyarrow 0.8 and 0.9 both raise an AssertionError in to_pandas() for one of my datasets, which was created using an older version of pyarrow. Repro steps:
{{In [1]: from pyarrow.parquet import ParquetDataset}}
{{In [2]: d = ParquetDataset(['bug.parq'])}}
{{In [3]: t = d.read()}}
{{In [4]: t.to_pandas()}}
{{---------------------------------------------------------------------------}}
{{AssertionError Traceback (most recent call last)}}
{{<ipython-input-4-d17c9e2818f1> in <module>()}}
{{----> 1 t.to_pandas()}}
{{table.pxi in pyarrow.lib.Table.to_pandas()}}
{{~/envs/cli3/lib/python3.6/site-packages/pyarrow/pandas_compat.py in table_to_blockmanager(options, table, memory_pool, nthreads, categories)}}
{{ 529 # There must be the same number of field names and physical names}}
{{ 530 # (fields in the arrow Table)}}
{{--> 531 assert len(logical_index_names) == len(index_columns_set)}}
{{ 532 }}
{{ 533 # It can never be the case in a released version of pyarrow that}}
{{AssertionError: }}
Here's the file: [https://www.dropbox.com/s/oja3khjsc5tycfh/bug.parq]
(I was not able to attach it here due to a "missing token", whatever that means.)
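The failing assertion compares counts derived from the index information stored in the file's pandas metadata, so dumping that metadata may help narrow down the mismatch. Below is a minimal diagnostic sketch, not the pyarrow implementation: summarize_pandas_metadata is a hypothetical helper name, and it assumes the schema metadata carries a JSON document under the b"pandas" key with "index_columns" and "columns" entries.

```python
import json


def summarize_pandas_metadata(schema_metadata, field_names):
    """Summarize the index-related part of a schema's b"pandas" metadata.

    schema_metadata: mapping of bytes keys to bytes values, or None
    field_names: names of the columns physically present in the table
    (Hypothetical helper for illustration; not part of pyarrow.)
    """
    raw = (schema_metadata or {}).get(b"pandas")
    if raw is None:
        return None  # no pandas metadata stored in the schema

    meta = json.loads(raw.decode("utf-8"))
    index_columns = meta.get("index_columns", [])

    # Only plain string entries are treated as physical column names here;
    # any other kind of entry is left out of the checks below.
    named = [c for c in index_columns if isinstance(c, str)]
    present = set(field_names)

    return {
        "index_columns": index_columns,
        "unique_index_columns": len(set(named)),
        "described_columns": len(meta.get("columns", [])),
        "missing_from_table": [c for c in named if c not in present],
    }


# Possible usage against the attached file, assuming a pyarrow version
# that provides pyarrow.parquet.read_schema:
#
#     import pyarrow.parquet as pq
#     schema = pq.read_schema("bug.parq")
#     print(summarize_pandas_metadata(schema.metadata, schema.names))
```

The helper itself only needs the standard library, so it can also be run on metadata copied out of another tool. It reports what the file claims about its index columns and does not by itself establish the cause of the assertion.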
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)