Posted to jira@arrow.apache.org by "Alenka Frim (Jira)" <ji...@apache.org> on 2022/01/18 10:28:00 UTC
[jira] [Assigned] (ARROW-9664) [Python] Array/ChunkedArray.to_pandas do not support types_mapper keyword
[ https://issues.apache.org/jira/browse/ARROW-9664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alenka Frim reassigned ARROW-9664:
----------------------------------
Assignee: Alenka Frim
> [Python] Array/ChunkedArray.to_pandas do not support types_mapper keyword
> -------------------------------------------------------------------------
>
> Key: ARROW-9664
> URL: https://issues.apache.org/jira/browse/ARROW-9664
> Project: Apache Arrow
> Issue Type: Bug
> Components: Python
> Affects Versions: 1.0.0
> Environment: pyarrow: 1.0.0
> pandas: 1.0.5
> python: sys.version_info(major=3, minor=8, micro=2, releaselevel='final', serial=0)
> Reporter: Adrien Hoarau
> Assignee: Alenka Frim
> Priority: Minor
>
> Arrow structures (Array, ChunkedArray, Table) accept a types_mapper argument in their to_pandas method. The mapper is called for Table, but does not seem to be called for Array or ChunkedArray:
> {code:python}
> import pandas as pd
> import pyarrow
> data = pd.Series([0, None, 2], dtype=pd.Int32Dtype(), name='foo')
> def convert_types(arrow_type):
>     raise ValueError("Function got called")
> pyarrow.Table.from_pandas(data.to_frame()).to_pandas(types_mapper=convert_types)
> Traceback (most recent call last):
>   File "/home/adrien/.pyenv/versions/complete/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 3331, in run_code
>     exec(code_obj, self.user_global_ns, self.user_ns)
>   File "<ipython-input-6-d1e3e1f45f69>", line 1, in <module>
>     pyarrow.Table.from_pandas(data.to_frame()).to_pandas(types_mapper=convert_types)
>   File "pyarrow/array.pxi", line 715, in pyarrow.lib._PandasConvertible.to_pandas
>   File "pyarrow/table.pxi", line 1565, in pyarrow.lib.Table._to_pandas
>   File "/home/adrien/.pyenv/versions/complete/lib/python3.8/site-packages/pyarrow/pandas_compat.py", line 771, in table_to_blockmanager
>     ext_columns_dtypes = _get_extension_dtypes(
>   File "/home/adrien/.pyenv/versions/complete/lib/python3.8/site-packages/pyarrow/pandas_compat.py", line 840, in _get_extension_dtypes
>     pandas_dtype = types_mapper(typ)
>   File "<ipython-input-5-5a9760e8753f>", line 2, in convert_types
>     raise ValueError("Function got called")
> ValueError: Function got called
> pyarrow.Int32Array.from_pandas(data).to_pandas(types_mapper=convert_types)
> 0    0.0
> 1    NaN
> 2    2.0
> dtype: float64
> {code}
--
This message was sent by Atlassian Jira
(v8.20.1#820001)