Posted to issues@arrow.apache.org by "Tom Augspurger (JIRA)" <ji...@apache.org> on 2018/06/02 18:32:00 UTC

[jira] [Commented] (ARROW-2667) [C++/Python] Add pandas-like take method to Array/Column/ChunkedArray

    [ https://issues.apache.org/jira/browse/ARROW-2667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16499142#comment-16499142 ] 

Tom Augspurger commented on ARROW-2667:
---------------------------------------

Note that pandas' `take` is somewhat complicated because it tries to satisfy two APIs simultaneously.

There's the NumPy-style `take` (https://docs.scipy.org/doc/numpy/reference/generated/numpy.take.html), where negative indices count from the end of the array.

And then there's the "pandas"-style `take`, where `-1` is an indicator for missing values, which should be filled with the `na_value` parameter. Other negative numbers are not allowed.

I'm not sure which is more appropriate for Arrow, but wanted to share a bit of background.
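
To make the difference concrete, here is a minimal pure-Python sketch of the two conventions. The helper names `take_numpy_style` and `take_pandas_style` are hypothetical and exist only for illustration; they are not pandas, NumPy, or Arrow APIs, and the real implementations operate on arrays rather than lists.

```python
# Illustrative sketch only: both helpers are hypothetical, not library APIs.

def take_numpy_style(values, indices):
    """Negative indices count from the end, as described for numpy.take."""
    n = len(values)
    result = []
    for i in indices:
        if not -n <= i < n:
            raise IndexError(f"index {i} is out of bounds for length {n}")
        # Python sequences already resolve negative indices from the end.
        result.append(values[i])
    return result


def take_pandas_style(values, indices, na_value=None):
    """-1 marks a missing slot filled with na_value; other negatives are rejected."""
    n = len(values)
    result = []
    for i in indices:
        if i < -1:
            raise ValueError(f"invalid index {i}: only -1 may be negative")
        if i >= n:
            raise IndexError(f"index {i} is out of bounds for length {n}")
        result.append(na_value if i == -1 else values[i])
    return result


values = ["a", "b", "c"]
numpy_like = take_numpy_style(values, [0, -1])    # -1 selects the last element
pandas_like = take_pandas_style(values, [0, -1])  # -1 becomes the missing-value marker
```

With the same indices `[0, -1]`, the first helper returns the last element for `-1`, while the second returns `na_value` in that position, which is why a single `take` cannot silently support both meanings.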

> [C++/Python] Add pandas-like take method to Array/Column/ChunkedArray
> ---------------------------------------------------------------------
>
>                 Key: ARROW-2667
>                 URL: https://issues.apache.org/jira/browse/ARROW-2667
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++, Python
>            Reporter: Uwe L. Korn
>            Priority: Major
>
> We should add a {{take}} method to {{Array/ChunkedArray/Column}} that takes a list of indices and returns a reordered array.
> For reference, see Pandas' interface: https://github.com/pandas-dev/pandas/blob/2cbdd9a2cd19501c98582490e35c5402ae6de941/pandas/core/arrays/base.py#L466


