Posted to dev@arrow.apache.org by "Benjamin (Jira)" <ji...@apache.org> on 2020/02/19 09:58:00 UTC
[jira] [Created] (ARROW-7883) pyarrow-serialize-pandas-df-with-nullable-integer-type
Benjamin created ARROW-7883:
-------------------------------
Summary: pyarrow-serialize-pandas-df-with-nullable-integer-type
Key: ARROW-7883
URL: https://issues.apache.org/jira/browse/ARROW-7883
Project: Apache Arrow
Issue Type: Bug
Reporter: Benjamin
Serializing a DataFrame that holds an IntegerArray (nullable integer dtype) with the default serialization context fails with the latest version of pandas and pyarrow 0.16.
{code:python}
import pandas as pd
import pyarrow as pa  # version 0.16

# workaround suggested in https://issues.apache.org/jira/browse/ARROW-5379
pd.arrays.IntegerArray.__arrow_array__ = lambda self, type: pa.array(self._data, mask=self._mask, type=type)

df = pd.DataFrame([1, 2])
df = df.convert_dtypes()  # the column becomes nullable Int64

# following https://arrow.apache.org/docs/python/ipc.html#serializing-pandas-objects
context = pa.default_serialization_context()
context.serialize(df)  # raises SerializationCallbackError
{code}
{code}
SerializationCallbackError: pyarrow does not know how to serialize objects of type <class 'pandas.core.arrays.integer.IntegerArray'>
{code}
See also https://stackoverflow.com/q/60285486/2146052