Posted to jira@arrow.apache.org by "Michael Milton (Jira)" <ji...@apache.org> on 2022/03/02 11:10:00 UTC
[jira] [Created] (ARROW-15826) Allow serializing arbitrary Python objects to parquet
Michael Milton created ARROW-15826:
--------------------------------------
Summary: Allow serializing arbitrary Python objects to parquet
Key: ARROW-15826
URL: https://issues.apache.org/jira/browse/ARROW-15826
Project: Apache Arrow
Issue Type: Improvement
Components: Parquet, Python
Reporter: Michael Milton
{code:python}
import pandas as pd
import pyarrow as pa
class Foo:
    pass
df = pd.DataFrame({"a": [Foo(), Foo(), Foo()], "b": [1, 2, 3]})
table = pa.Table.from_pandas(df)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pyarrow/table.pxi", line 1782, in pyarrow.lib.Table.from_pandas
  File "/home/migwell/miniconda3/lib/python3.9/site-packages/pyarrow/pandas_compat.py", line 594, in dataframe_to_arrays
    arrays = [convert_column(c, f)
  File "/home/migwell/miniconda3/lib/python3.9/site-packages/pyarrow/pandas_compat.py", line 594, in <listcomp>
    arrays = [convert_column(c, f)
  File "/home/migwell/miniconda3/lib/python3.9/site-packages/pyarrow/pandas_compat.py", line 581, in convert_column
    raise e
  File "/home/migwell/miniconda3/lib/python3.9/site-packages/pyarrow/pandas_compat.py", line 575, in convert_column
    result = pa.array(col, type=type_, from_pandas=True, safe=safe)
  File "pyarrow/array.pxi", line 312, in pyarrow.lib.array
  File "pyarrow/array.pxi", line 83, in pyarrow.lib._ndarray_to_array
  File "pyarrow/error.pxi", line 99, in pyarrow.lib.check_status
pyarrow.lib.ArrowInvalid: ('Could not convert <__main__.Foo object at 0x7fc23e38bfd0> with type Foo: did not recognize Python value type when inferring an Arrow data type', 'Conversion failed for column a with type object')
>>>
{code}
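For context, one possible workaround (an assumption added here for illustration, not something the report proposes and not an Arrow feature) is to pickle each object to bytes before conversion, so that the column is inferred as binary instead of failing type inference. The helper names encode_objects, decode_objects and to_table below are hypothetical; pyarrow is imported lazily inside to_table, so the pickling helpers can be used on their own.

```python
import pickle


def encode_objects(objs):
    """Pickle each Python object into a bytes blob."""
    return [pickle.dumps(obj) for obj in objs]


def decode_objects(blobs):
    """Inverse of encode_objects: unpickle each bytes blob."""
    return [pickle.loads(blob) for blob in blobs]


def to_table(df, object_columns):
    """Hypothetical helper: pickle the named columns, then convert to Arrow.

    Assumes `df` is a pandas DataFrame and that pyarrow is installed.
    The pickled columns hold bytes values, which Arrow can store as binary.
    """
    import pyarrow as pa  # imported lazily; only needed for the conversion

    encoded = df.copy()
    for name in object_columns:
        encoded[name] = encode_objects(df[name])
    return pa.Table.from_pandas(encoded)
```

With the DataFrame from the report, this would be called as to_table(df, ["a"]). After reading the data back, decode_objects restores the objects, provided their classes are importable in the reading process. Pickled data should only be loaded from trusted sources, and the resulting Parquet file is opaque to non-Python readers.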