Posted to jira@arrow.apache.org by "Joris Van den Bossche (Jira)" <ji...@apache.org> on 2021/02/04 09:54:00 UTC

[jira] [Commented] (ARROW-11487) [Python] Can't create array from Categorical with numpy 1.20

    [ https://issues.apache.org/jira/browse/ARROW-11487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17278722#comment-17278722 ] 

Joris Van den Bossche commented on ARROW-11487:
-----------------------------------------------

Hi [~jseabold], pyarrow releases older than 3.0.0 are not compatible with numpy 1.20, so if you want to use the latest numpy, you also need to upgrade to the latest pyarrow release, 3.0.0.

See ARROW-11450, ARROW-10833 and https://github.com/numpy/numpy/issues/17913 for more details.
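
A minimal sketch of a guard for the version rule above, assuming only the two thresholds named in this comment (numpy 1.20 and pyarrow 3.0.0). The helper names `parse_version` and `needs_pyarrow_upgrade` are hypothetical and are not part of pyarrow or numpy:

```python
import re


def parse_version(version):
    """Return the first three numeric components of a version string as a tuple.

    Hypothetical helper: non-numeric suffixes such as "rc1" or ".dev0" are
    ignored, and missing components are padded with zeros.
    """
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    parts = parts[:3]
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def needs_pyarrow_upgrade(numpy_version, pyarrow_version):
    """Return True when numpy is >= 1.20 and pyarrow is older than 3.0.0.

    Assumption: these thresholds come from the comment above, nothing else.
    """
    numpy_is_new = parse_version(numpy_version) >= (1, 20, 0)
    pyarrow_is_old = parse_version(pyarrow_version) < (3, 0, 0)
    return numpy_is_new and pyarrow_is_old
```

In a real environment the two arguments would typically be `numpy.__version__` and `pyarrow.__version__`. With the reporter's combination (numpy 1.20.0, pyarrow 2.0.0) the check flags that pyarrow needs upgrading.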

> [Python] Can't create array from Categorical with numpy 1.20
> ------------------------------------------------------------
>
>                 Key: ARROW-11487
>                 URL: https://issues.apache.org/jira/browse/ARROW-11487
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 2.0.0
>         Environment: OSX, numpy 1.20.0, pyarrow 2.0, pandas 1.2.1
>            Reporter: Skipper Seabold
>            Priority: Major
>
> Upgraded to numpy 1.20, and some of my pipelines started to fail.
> With numpy 1.20
>  
> {code:java}
> In [16]: pa.lib.array(pd.Categorical(['a', 'b', 'c']))
> ---------------------------------------------------------------------------
> ArrowTypeError                            Traceback (most recent call last)
> <ipython-input-16-f1ab121f9533> in <module>
> ----> 1 pa.lib.array(pd.Categorical(['a', 'b', 'c']))
>
> ~/.miniconda3/lib/python3.7/site-packages/pyarrow/array.pxi in pyarrow.lib.array()
>
> ~/.miniconda3/lib/python3.7/site-packages/pyarrow/array.pxi in pyarrow.lib._codes_to_indices()
>
> ~/.miniconda3/lib/python3.7/site-packages/pyarrow/array.pxi in pyarrow.lib.array()
>
> ~/.miniconda3/lib/python3.7/site-packages/pyarrow/array.pxi in pyarrow.lib._ndarray_to_array()
>
> ~/.miniconda3/lib/python3.7/site-packages/pyarrow/array.pxi in pyarrow.lib._ndarray_to_type()
>
> ~/.miniconda3/lib/python3.7/site-packages/pyarrow/error.pxi in pyarrow.lib.check_status()
>
> ArrowTypeError: Did not pass numpy.dtype object
> {code}
> With numpy 1.19.1
> {code:java}
> In [13]: pa.lib.array(pd.Categorical(['a', 'b', 'c']))
> Out[13]:
> <pyarrow.lib.DictionaryArray object at 0x7febac27ad50>
> -- dictionary:
>   [
>     "a",
>     "b",
>     "c"
>   ]
> -- indices:
>   [
>     0,
>     1,
>     2
>   ]
> {code}
>  


