Posted to issues@arrow.apache.org by "Uwe L. Korn (JIRA)" <ji...@apache.org> on 2017/04/15 14:08:41 UTC
[jira] [Commented] (ARROW-823) [Python] Devise a means to serialize
arrays of arbitrary Python objects in Arrow IPC messages
[ https://issues.apache.org/jira/browse/ARROW-823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15969968#comment-15969968 ]
Uwe L. Korn commented on ARROW-823:
-----------------------------------
For me, this issue also depends on how we represent custom/shared Pandas/Python metadata in Arrow and Parquet. While settling that will not fully solve this issue, I think that discussion is one of the prerequisites.
> [Python] Devise a means to serialize arrays of arbitrary Python objects in Arrow IPC messages
> ---------------------------------------------------------------------------------------------
>
> Key: ARROW-823
> URL: https://issues.apache.org/jira/browse/ARROW-823
> Project: Apache Arrow
> Issue Type: New Feature
> Components: Python
> Reporter: Wes McKinney
>
> Practically speaking, this would involve a "custom" logical type that is "pyobject", represented physically as an array of 64-bit pointers. On serialization, this would need to be converted to a BinaryArray containing pickled objects as binary values.
> At the moment, we don't yet have the machinery to deal with "custom" types where the in-memory representation is different from the on-wire representation. This would be a useful use case for working through the design issues.
> Interestingly, if done properly, this would enable other Arrow implementations to manipulate (filter, etc.) serialized Python objects as binary blobs.
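
To make the idea in the description concrete, here is a minimal sketch using only the Python standard library. It is not Arrow's API: the function names (serialize_pyobjects, get_blob, take, deserialize_pyobjects) are hypothetical, and the layout merely imitates a variable-length binary array (a buffer of little-endian int32 offsets plus one contiguous data buffer). Nulls and the validity bitmap are left out. The take function shows how a consumer that knows nothing about Python could still filter the values as opaque blobs.

```python
import pickle
import struct


def serialize_pyobjects(objs):
    """Pickle each object and pack the results into (offsets, data) buffers.

    The offsets buffer holds len(objs) + 1 little-endian int32 values;
    value i occupies data[offsets[i]:offsets[i + 1]].
    """
    offsets = [0]
    chunks = []
    for obj in objs:
        blob = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
        chunks.append(blob)
        offsets.append(offsets[-1] + len(blob))
    offsets_buf = struct.pack("<%di" % len(offsets), *offsets)
    return offsets_buf, b"".join(chunks)


def get_blob(offsets_buf, data_buf, i):
    """Return the raw bytes of value i without unpickling it."""
    start, end = struct.unpack_from("<2i", offsets_buf, 4 * i)
    return data_buf[start:end]


def take(offsets_buf, data_buf, indices):
    """Select values by position, treating them as opaque binary blobs.

    No Python-specific knowledge is needed here, so any implementation
    that understands the offsets/data layout could do the same.
    """
    new_offsets = [0]
    chunks = []
    for i in indices:
        blob = get_blob(offsets_buf, data_buf, i)
        chunks.append(blob)
        new_offsets.append(new_offsets[-1] + len(blob))
    new_offsets_buf = struct.pack("<%di" % len(new_offsets), *new_offsets)
    return new_offsets_buf, b"".join(chunks)


def deserialize_pyobjects(offsets_buf, data_buf):
    """Unpickle every value back into a list of Python objects."""
    count = len(offsets_buf) // 4 - 1
    return [pickle.loads(get_blob(offsets_buf, data_buf, i)) for i in range(count)]


if __name__ == "__main__":
    objs = [{"a": 1}, (1, 2), "text", None]
    offsets_buf, data_buf = serialize_pyobjects(objs)
    picked = take(offsets_buf, data_buf, [2, 0])
    print(deserialize_pyobjects(*picked))
```

Because pickle.loads can execute arbitrary code, the receiving side should only unpickle data from a trusted source; the filtering step itself never needs to unpickle anything.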