Posted to jira@arrow.apache.org by "Joris Van den Bossche (Jira)" <ji...@apache.org> on 2022/01/19 16:48:00 UTC

[jira] [Updated] (ARROW-15370) [Python] Regression in empty table to_pandas conversion

     [ https://issues.apache.org/jira/browse/ARROW-15370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joris Van den Bossche updated ARROW-15370:
------------------------------------------
    Description: 
Nightly integration tests with kartothek are failing, see e.g. https://github.com/ursacomputing/crossbow/runs/4863725914?check_suite_focus=true

This seems to be a problem on our side, and a recent failure (the builds only started failing today, and I don't see any other differences from the last working build yesterday).

Update with a reproducer:

{code}
In [4]: df = pd.DataFrame({'a': [1, 2], 'b': [0.1, 0.2]})

In [5]: table = pa.table(df)

In [6]: table.schema.empty_table().to_pandas()
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-6-a03ecffc0af8> in <module>
----> 1 table.schema.empty_table().to_pandas()

~/scipy/repos/arrow/python/pyarrow/array.pxi in pyarrow.lib._PandasConvertible.to_pandas()

~/scipy/repos/arrow/python/pyarrow/table.pxi in pyarrow.lib.Table._to_pandas()

~/scipy/repos/arrow/python/pyarrow/pandas_compat.py in table_to_blockmanager(options, table, categories, ignore_metadata, types_mapper)
    790 
    791     axes = [columns, index]
--> 792     return BlockManager(blocks, axes)
    793 
    794 

~/miniconda3/envs/arrow-dev/lib/python3.8/site-packages/pandas/core/internals/managers.py in __init__(self, blocks, axes, verify_integrity)
    912                         pass
    913 
--> 914             self._verify_integrity()
    915 
    916     def _verify_integrity(self) -> None:

~/miniconda3/envs/arrow-dev/lib/python3.8/site-packages/pandas/core/internals/managers.py in _verify_integrity(self)
    919         for block in self.blocks:
    920             if block.shape[1:] != mgr_shape[1:]:
--> 921                 raise construction_error(tot_items, block.shape[1:], self.axes)
    922         if len(self.items) != tot_items:
    923             raise AssertionError(

ValueError: Empty data passed with indices specified.
{code}

  was:
Nightly integration tests with kartothek are failing, see e.g. https://github.com/ursacomputing/crossbow/runs/4863725914?check_suite_focus=true

This seems to be a problem on our side, and a recent failure (the builds only started failing today, and I don't see any other differences from the last working build yesterday).


> [Python] Regression in empty table to_pandas conversion
> -------------------------------------------------------
>
>                 Key: ARROW-15370
>                 URL: https://issues.apache.org/jira/browse/ARROW-15370
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>            Reporter: Joris Van den Bossche
>            Priority: Blocker
>             Fix For: 7.0.0


