Posted to issues@arrow.apache.org by "Joris Van den Bossche (JIRA)" <ji...@apache.org> on 2019/08/13 13:23:00 UTC

[jira] [Commented] (ARROW-5295) [Python] accept pyarrow values / scalars in constructor functions ?

    [ https://issues.apache.org/jira/browse/ARROW-5295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906190#comment-16906190 ] 

Joris Van den Bossche commented on ARROW-5295:
----------------------------------------------

Additional case (from ARROW-6222): pyarrow Arrays are also not recognized as list-like when converting/inferring a list array:

{code}
In [43]: pa.array([np.array([1, 1]), np.array([2, 2, 2])])
Out[43]: 
<pyarrow.lib.ListArray object at 0x7f258fa9d0a0>
[
  [
    1,
    1
  ],
  [
    2,
    2,
    2
  ]
]

In [44]: pa.array([pa.array([1, 1]), pa.array([2, 2, 2])])
---------------------------------------------------------------------------
ArrowInvalid                              Traceback (most recent call last)
<ipython-input-44-a22e0a500750> in <module>
----> 1 pa.array([pa.array([1, 1]), pa.array([2, 2, 2])])

~/scipy/repos/arrow/python/pyarrow/array.pxi in pyarrow.lib.array()

~/scipy/repos/arrow/python/pyarrow/array.pxi in pyarrow.lib._sequence_to_array()

~/scipy/repos/arrow/python/pyarrow/error.pxi in pyarrow.lib.check_status()

ArrowInvalid: Could not convert [
  1,
  1
] with type pyarrow.lib.Int64Array: did not recognize Python value type when inferring an Arrow data type
{code}

So a list (or array) of numpy arrays works, but a list of pyarrow arrays does not. Again, this is not the most typical use case for pyarrow Arrays, so I am not sure we should add this capability.

(Although we might want to find a general solution for array-like objects (e.g. PyTorch tensors, see ARROW-6222), and such a solution (somehow trying to coerce to a numpy array?) might also cover the case of a list of arrow arrays.)
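
Until something like that exists, a user-side workaround could look roughly like the sketch below. It is only an illustration, not a proposal for the actual implementation: the names to_plain_python and array_from_nested are hypothetical, and the duck typing assumes that array-like objects expose to_pylist() and scalar-like objects expose as_py(), as pyarrow Arrays and scalars do. It converts element by element, so it has none of the fast path discussed in the issue.

```python
def to_plain_python(obj):
    """Recursively replace Arrow-like containers and scalars with plain Python objects.

    Duck-typed on purpose (an assumption of this sketch): anything with
    ``to_pylist`` is treated as array-like, anything with ``as_py`` as
    scalar-like. Lists and tuples are walked recursively.
    """
    if hasattr(obj, "to_pylist"):   # e.g. a pyarrow Array
        return obj.to_pylist()
    if hasattr(obj, "as_py"):       # e.g. a pyarrow scalar / value
        return obj.as_py()
    if isinstance(obj, (list, tuple)):
        return [to_plain_python(item) for item in obj]
    return obj


def array_from_nested(values, arrow_type=None):
    """Hypothetical wrapper: normalise ``values`` first, then call pa.array."""
    import pyarrow as pa  # imported lazily so the helper above has no dependency
    return pa.array(to_plain_python(values), type=arrow_type)
```

With this, array_from_nested([pa.array([1, 1]), pa.array([2, 2, 2])]) hands pa.array the nested list [[1, 1], [2, 2, 2]], which goes through the regular sequence conversion instead of the failing inference path in the traceback above.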

> [Python] accept pyarrow values / scalars in constructor functions ?
> -------------------------------------------------------------------
>
>                 Key: ARROW-5295
>                 URL: https://issues.apache.org/jira/browse/ARROW-5295
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: Python
>            Reporter: Joris Van den Bossche
>            Priority: Major
>
> Currently, functions like {{pyarrow.array}} do not accept pyarrow Arrays, nor their element values:
> {code}
> In [42]: arr = pa.array([1, 2, 3])
> In [43]: pa.array(arr)
> ...
> ArrowInvalid: Could not convert 1 with type pyarrow.lib.Int64Value: did not recognize Python value type when inferring an Arrow data type
> In [44]: pa.array(list(arr))
> ...
> ArrowInvalid: Could not convert 1 with type pyarrow.lib.Int64Value: did not recognize Python value type when inferring an Arrow data type
> {code}
> Do we want to allow / recognize those here? (The first case could even have a fast path, as we would not need to convert element by element.)
> Scalars are not supported either:
> {code}
> In [46]: type(arr.sum())
> Out[46]: pyarrow.lib.Int64Scalar
> In [47]: pa.array([arr.sum()])
> ...
> ArrowInvalid: Could not convert 6 with type pyarrow.lib.Int64Scalar: did not recognize Python value type when inferring an Arrow data type
> {code}
> Other functions do not accept arrow scalars / values either:
> {code}
> In [48]: string = pa.array(['a'])[0]
> In [49]: type(string)
> Out[49]: pyarrow.lib.StringValue
> In [50]: pa.field(string, pa.int64())
> ...
> TypeError: expected bytes, pyarrow.lib.StringValue found
> {code}