Posted to issues@arrow.apache.org by "Neal Richardson (Jira)" <ji...@apache.org> on 2019/11/12 22:31:00 UTC

[jira] [Resolved] (ARROW-7066) [Python] support returning ChunkedArray from __arrow_array__ ?

     [ https://issues.apache.org/jira/browse/ARROW-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Neal Richardson resolved ARROW-7066.
------------------------------------
    Resolution: Fixed

Issue resolved by pull request 5794
[https://github.com/apache/arrow/pull/5794]

> [Python] support returning ChunkedArray from __arrow_array__ ?
> --------------------------------------------------------------
>
>                 Key: ARROW-7066
>                 URL: https://issues.apache.org/jira/browse/ARROW-7066
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: Python
>            Reporter: Joris Van den Bossche
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.0.0
>
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> The {{\_\_arrow_array\_\_}} protocol was added so that custom objects can define how they should be converted to a pyarrow Array (similar to numpy's {{\_\_array\_\_}}). It is also used to support converting pandas DataFrames whose columns use pandas' ExtensionArrays to a pyarrow Table (if the pandas ExtensionArray, such as the nullable integer type, implements this {{\_\_arrow_array\_\_}} method).
> This last use case could also be useful for fletcher (https://github.com/xhochy/fletcher/, a package that implements pandas ExtensionArrays that wrap pyarrow arrays, so they can be stored as is in a pandas DataFrame).  
> However, fletcher stores ChunkedArrays in its ExtensionArrays / the columns of a pandas DataFrame (to have a better mapping with a Table, where the columns also consist of chunked arrays), while we currently require that the return value of {{\_\_arrow_array\_\_}} is a pyarrow.Array.
> So I was wondering: could we relax this constraint and also allow a ChunkedArray as the return value?
> However, this protocol is currently called in the {{pa.array(..)}} function, which probably should keep returning an Array (and not a ChunkedArray in certain cases).
> cc [~uwe]
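
The trade-off raised in the description can be sketched in plain Python. This is only an illustrative model, not pyarrow's implementation: SketchArray, SketchChunkedArray, ChunkedColumn, convert and the combine_chunks flag are all hypothetical stand-ins invented for this example. It contrasts a relaxed converter that passes a chunked result through with a strict one that always flattens to a single array.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SketchArray:
    """Stand-in for a contiguous array (NOT pyarrow.Array)."""
    values: List[int]


@dataclass
class SketchChunkedArray:
    """Stand-in for a chunked array (NOT pyarrow.ChunkedArray)."""
    chunks: List[SketchArray]


class ChunkedColumn:
    """Hypothetical fletcher-like container that stores its data in chunks."""

    def __init__(self, chunks):
        self._data = SketchChunkedArray([SketchArray(list(c)) for c in chunks])

    def __arrow_array__(self, type=None):
        # Relaxed protocol: hand back the chunked data as stored.
        return self._data


def convert(obj, combine_chunks=False):
    """Hypothetical converter that dispatches on the protocol method."""
    if not hasattr(obj, "__arrow_array__"):
        raise TypeError("object does not implement __arrow_array__")
    result = obj.__arrow_array__()
    if not isinstance(result, (SketchArray, SketchChunkedArray)):
        raise TypeError("__arrow_array__ must return an array or chunked array")
    if combine_chunks and isinstance(result, SketchChunkedArray):
        # Strict option: always return one contiguous array (costs a copy).
        flat = [v for chunk in result.chunks for v in chunk.values]
        return SketchArray(flat)
    return result


column = ChunkedColumn([[1, 2], [3]])
relaxed = convert(column)                      # chunk layout is preserved
strict = convert(column, combine_chunks=True)  # chunks are flattened
```

In this model the relaxed path keeps the chunk boundaries, which maps naturally onto Table columns, while the strict path keeps the return type uniform at the cost of copying the values.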



--
This message was sent by Atlassian Jira
(v8.3.4#803005)