Posted to issues@arrow.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2018/04/22 10:15:00 UTC

[jira] [Commented] (ARROW-1993) [Python] Add function for determining implied Arrow schema from pandas.DataFrame

    [ https://issues.apache.org/jira/browse/ARROW-1993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16447190#comment-16447190 ] 

ASF GitHub Bot commented on ARROW-1993:
---------------------------------------

keechongtan opened a new pull request #1929: ARROW-1993: [Python] Add function for determining implied Arrow schema from pandas.DataFrame
URL: https://github.com/apache/arrow/pull/1929
 
 
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


> [Python] Add function for determining implied Arrow schema from pandas.DataFrame
> --------------------------------------------------------------------------------
>
>                 Key: ARROW-1993
>                 URL: https://issues.apache.org/jira/browse/ARROW-1993
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: Python
>            Reporter: Wes McKinney
>            Assignee: Uwe L. Korn
>            Priority: Major
>              Labels: beginner, pull-request-available
>             Fix For: 0.10.0
>
>
> Currently the only option is to use {{Table/Array.from_pandas}}, which does significant unnecessary work and allocates memory. If only the schema is of interest, we could do less work and avoid allocating memory.
> We should provide a function, {{pyarrow.Schema.from_pandas}}, that takes a DataFrame as input and returns the corresponding Arrow schema. The functionality for determining the schema is already available in the Python code; at the moment it is just very tightly bound to the conversion infrastructure.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)