Posted to issues@arrow.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2018/02/09 05:47:00 UTC

[jira] [Commented] (ARROW-2121) Consider special casing object arrays in pandas serializers.

    [ https://issues.apache.org/jira/browse/ARROW-2121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16357951#comment-16357951 ] 

ASF GitHub Bot commented on ARROW-2121:
---------------------------------------

robertnishihara opened a new pull request #1581: [WIP] ARROW-2121: [Python] Handle object arrays directly in pandas serializer.
URL: https://github.com/apache/arrow/pull/1581
 
 
   The goal here is to get the best of both the `pandas_serialization_context` (speed at serializing pandas dataframes containing strings and other objects) and the `default_serialization_context` (correctly serializing a large class of numpy object arrays).
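
One way to picture "handling object arrays directly" is a per-column dispatch: columns of plain numbers take a fast buffer-based path, while columns of arbitrary Python objects go to a general-purpose serializer. The sketch below is library-free and is not the code in this PR. Plain lists stand in for DataFrame columns, `array.array` stands in for a typed buffer, `pickle` stands in for the general object serializer, and every function name is hypothetical.

```python
import pickle
from array import array


def serialize_column(values):
    """Serialize one column, special-casing non-numeric ("object") data."""
    if all(type(v) is float for v in values):
        # Fast path (assumed): homogeneous floats become one contiguous buffer.
        return {"kind": "float64", "data": array("d", values).tobytes()}
    # Object path (assumed): hand anything else to a general serializer.
    return {"kind": "object", "data": pickle.dumps(list(values))}


def deserialize_column(payload):
    """Invert serialize_column using the recorded 'kind' tag."""
    if payload["kind"] == "float64":
        buf = array("d")
        buf.frombytes(payload["data"])
        return buf.tolist()
    return pickle.loads(payload["data"])


def serialize_frame(columns):
    """Serialize a dict of column name -> list of values, column by column."""
    return {name: serialize_column(vals) for name, vals in columns.items()}


def deserialize_frame(payload):
    """Rebuild the dict of columns from its serialized form."""
    return {name: deserialize_column(col) for name, col in payload.items()}


# A toy "frame" with one numeric column and one object column.
frame = {"price": [1.5, 2.25, 3.0], "tags": ["a", None, ("x", 1)]}
serialized = serialize_frame(frame)
restored = deserialize_frame(serialized)
```

Dispatching per column means a single serializer can cover both cases, which is the motivation for dropping the separate `pandas_serialization_context` in the TODO list.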
   
   This PR sort of messes up the function `pa.pandas_compat.dataframe_to_serialized_dict`. Is that function just a helper for implementing the custom pandas serializers, or is it intended to be used in other places?
   
   TODO in this PR (assuming you think this approach is reasonable):
   
   - [ ] remove `pandas_serialization_context`
   - [ ] make sure this code path is tested
   - [ ] double check that performance is good
   
   cc @wesm @pcmoritz @devin-petersohn 



> Consider special casing object arrays in pandas serializers.
> ------------------------------------------------------------
>
>                 Key: ARROW-2121
>                 URL: https://issues.apache.org/jira/browse/ARROW-2121
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: Python
>            Reporter: Robert Nishihara
>            Priority: Major
>              Labels: pull-request-available
>



