Posted to jira@arrow.apache.org by "Joris Van den Bossche (Jira)" <ji...@apache.org> on 2021/12/02 14:26:00 UTC
[jira] [Commented] (ARROW-14946) [C++][Python] An operator for finding indices of a value
[ https://issues.apache.org/jira/browse/ARROW-14946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17452435#comment-17452435 ]
Joris Van den Bossche commented on ARROW-14946:
-----------------------------------------------
This is also related to numpy's {{nonzero}} in combination with an equality comparison:
{code}
In [66]: values = np.array([1, 2, 2, 3, 4, 1])
In [67]: np.nonzero(values == 1)
Out[67]: (array([0, 5]),)
{code}
which is also being discussed in ARROW-13035.
For this use case, though, materializing an intermediate boolean array just to extract the indices might add overhead (this might be worth experimenting with).
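To make the possible overhead concrete, here is a minimal sketch in plain Python. It is not the numpy or Arrow implementation, and the helper names {{indices_via_mask}} and {{indices_direct}} are made up for illustration. The first variant builds a full boolean mask before extracting positions, which is roughly what {{np.nonzero(values == 1)}} does; the second collects matching positions in a single pass.

```python
def indices_via_mask(values, target):
    # Step 1: materialize a boolean mask with one entry per input element.
    mask = [v == target for v in values]
    # Step 2: scan the mask and keep the positions that are True.
    return [i for i, hit in enumerate(mask) if hit]


def indices_direct(values, target):
    # Single pass: compare and record positions without an intermediate mask.
    return [i for i, v in enumerate(values) if v == target]


values = [1, 2, 2, 3, 4, 1]
print(indices_via_mask(values, 1))  # [0, 5]
print(indices_direct(values, 1))    # [0, 5]
```

Both variants return the same result; whether skipping the mask is measurably faster in a vectorized kernel is exactly what would need benchmarking.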
---
> This would be a binary vector kernel IMO.
For a scalar right-hand value (as in your example above), the expected behaviour is clear. But would it be limited to scalars? (The expected behaviour for non-scalars is not really obvious to me.)
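As an illustration of why the non-scalar case is ambiguous, here are two readings one could imagine, sketched in plain Python. Neither is proposed in the issue; the function names and the sample inputs are hypothetical.

```python
def indices_elementwise(values, other):
    # Reading 1: compare position by position and keep the positions
    # where both inputs hold the same value. zip() stops at the shorter
    # input; equal lengths are assumed here.
    return [i for i, (v, o) in enumerate(zip(values, other)) if v == o]


def indices_membership(values, value_set):
    # Reading 2: keep the positions whose value occurs anywhere in value_set.
    wanted = set(value_set)
    return [i for i, v in enumerate(values) if v in wanted]


values = [1, 2, 2, 3, 4, 1]
print(indices_elementwise(values, [1, 0, 2, 0, 0, 0]))  # [0, 2]
print(indices_membership(values, [1, 3]))               # [0, 3, 5]
```

The two readings give different results for the same kind of input, so the kernel would have to pick one (or reject non-scalars).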
> [C++][Python] An operator for finding indices of a value
> ---------------------------------------------------------
>
> Key: ARROW-14946
> URL: https://issues.apache.org/jira/browse/ARROW-14946
> Project: Apache Arrow
> Issue Type: New Feature
> Components: C++, Python
> Reporter: Niranda Perera
> Priority: Major
>
> As discussed in this mail thread [1], it would be nice to have a search operator returning the indices of a Value.
> ex:
> {code:python}
> values = pa.array([1, 2, 2, 3, 4, 1])
> indices = find_indices(values, 1) # expected = [0, 5]{code}
> Currently there is an option to get the "first index" of a value using the aggregates.index method. This would be a binary vector kernel IMO.
> This is somewhat similar to `numpy.where` [2] but without a `y` input.
>
> [1] [https://lists.apache.org/thread/o8d4m905fxswcg0qjjx7gj3ql2d582k4]
> [2] https://numpy.org/doc/stable/reference/generated/numpy.where.html