Posted to dev@arrow.apache.org by "Paul Balanca (Jira)" <ji...@apache.org> on 2020/03/05 14:15:01 UTC

[jira] [Created] (ARROW-8010) [Python] Fixed size list not convertible to Numpy Array

Paul Balanca created ARROW-8010:
-----------------------------------

             Summary: [Python] Fixed size list not convertible to Numpy Array
                 Key: ARROW-8010
                 URL: https://issues.apache.org/jira/browse/ARROW-8010
             Project: Apache Arrow
          Issue Type: Improvement
          Components: Python
    Affects Versions: 0.16.0
         Environment: Ubuntu 19.10 + python 3.7
            Reporter: Paul Balanca


Fixed size lists of base types (e.g. int, float, ...) are not convertible to a Numpy array.

The following code:
{code:python}
import pyarrow as pa

t = pa.list_(pa.float32(), 2)  # fixed size list of 2 float32 values
arr = pa.array([[1, 2], [3, 4], [5, 6]], type=t)
arr.to_numpy(){code}
raises an Arrow "not implemented" error, as there is no Pandas block equivalent.

It seems reasonable that the conversion to Pandas fails, but I would expect a natural conversion to a Numpy array: according to the Fixed Size List Layout ([https://arrow.apache.org/docs/format/Columnar.html#]), a fixed size list array of a base type could be mapped to a 2-dimensional row-major matrix (e.g. 3x2 in the previous example).

This form of memory representation is quite natural if one wants to use Apache Arrow for in-memory collections of 2D/3D points, where the coordinates should be contiguous in memory.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)