Posted to dev@arrow.apache.org by "Benjamin (Jira)" <ji...@apache.org> on 2020/02/19 09:58:00 UTC

[jira] [Created] (ARROW-7883) pyarrow-serialize-pandas-df-with-nullable-integer-type

Benjamin created ARROW-7883:
-------------------------------

             Summary: pyarrow-serialize-pandas-df-with-nullable-integer-type
                 Key: ARROW-7883
                 URL: https://issues.apache.org/jira/browse/ARROW-7883
             Project: Apache Arrow
          Issue Type: Bug
            Reporter: Benjamin


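Serializing a pandas DataFrame that contains a nullable-integer column (IntegerArray) through pyarrow's default serialization context fails with the latest versions of pandas and pyarrow (0.16), even with the workaround from ARROW-5379 applied: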
{code:python}
import pandas as pd
import pyarrow as pa  # version 0.16

# Workaround suggested in https://issues.apache.org/jira/browse/ARROW-5379:
# teach IntegerArray how to convert itself to an Arrow array.
pd.arrays.IntegerArray.__arrow_array__ = lambda self, type=None: pa.array(
    self._data, mask=self._mask, type=type
)

# convert_dtypes() turns the int64 column into the nullable Int64 dtype.
df = pd.DataFrame([1, 2])
df = df.convert_dtypes()

# Following https://arrow.apache.org/docs/python/ipc.html#serializing-pandas-objects
context = pa.default_serialization_context()
context.serialize(df)  # raises SerializationCallbackError
{code}
{code}
SerializationCallbackError: pyarrow does not know how to serialize objects of type <class 'pandas.core.arrays.integer.IntegerArray'>
{code}
xref https://stackoverflow.com/q/60285486/2146052

--
This message was sent by Atlassian Jira
(v8.3.4#803005)