Posted to jira@arrow.apache.org by "Joris Van den Bossche (Jira)" <ji...@apache.org> on 2022/10/11 13:53:00 UTC

[jira] [Commented] (ARROW-17989) [C++] Enable struct_field kernel to accept string field names

    [ https://issues.apache.org/jira/browse/ARROW-17989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17615821#comment-17615821 ] 

Joris Van den Bossche commented on ARROW-17989:
-----------------------------------------------

This seems to partly duplicate, or at least be related to, ARROW-17141 (cc [~rokm] [~lidavidm]), although I don't fully understand why that issue was closed. David mentioned:

> Yeah, if the Python bindings convert names to indices that makes sense.

I suppose if you call "struct_field" directly on an actual StructArray, the binding could indeed do this string name -> index conversion (although it currently does not). But if you use this with expressions, then at the moment of constructing the expression (e.g. with {{pc.struct_field(pc.field("my_struct"), ["my_field"])}}) you don't know the schema and can't convert the field name to an index.

It _seems_ relatively straightforward in the kernel itself to also work with string field names (getting the index would be a {{struct_array->type->GetFieldIndex(name)}} away), unless I am missing some consequences?
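
To make the idea concrete, here is a minimal sketch in plain Python (no Arrow dependency) of what name resolution at evaluation time could look like. Everything in it is hypothetical and for illustration only: the {{StructType}} class is a stand-in rather than the Arrow class, {{resolve_path}} is an invented helper, and {{get_field_index}} is only modeled on my understanding that the C++ {{GetFieldIndex}} returns -1 when a name is missing or ambiguous. The point is that the conversion needs the concrete type, which only exists once the expression is bound to a schema.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple, Union


@dataclass
class StructType:
    """Hypothetical stand-in for a struct type: ordered (name, type) children."""

    # Each child type is either a nested StructType or a leaf type name (str).
    fields: List[Tuple[str, Union["StructType", str]]]

    def get_field_index(self, name: str) -> int:
        # Assumption: return -1 when the name is absent or not unique.
        matches = [i for i, (n, _) in enumerate(self.fields) if n == name]
        return matches[0] if len(matches) == 1 else -1


def resolve_path(struct_type: StructType, path: Sequence[Union[int, str]]) -> List[int]:
    """Convert a path of field names and/or positions into positions only."""
    indices: List[int] = []
    current: Union[StructType, str] = struct_type
    for ref in path:
        if not isinstance(current, StructType):
            raise TypeError(f"cannot look up {ref!r} in non-struct type {current!r}")
        if isinstance(ref, str):
            index = current.get_field_index(ref)
            if index < 0:
                raise KeyError(f"no unique field named {ref!r}")
        else:
            index = ref
            if not 0 <= index < len(current.fields):
                raise IndexError(f"field index {index} out of range")
        indices.append(index)
        # Descend into the selected child for the next path element.
        current = current.fields[index][1]
    return indices


inner = StructType([("x", "int64"), ("y", "float64")])
outer = StructType([("my_field", inner), ("other", "utf8")])

print(resolve_path(outer, ["my_field", "y"]))  # [0, 1]
print(resolve_path(outer, [0, "y"]))           # [0, 1]
```

In this sketch a path may mix names and positions, so existing integer-only callers would keep working unchanged. Whether the real kernel should raise or do something else on a missing or duplicated name is exactly the kind of consequence that would need deciding.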


> [C++] Enable struct_field kernel to accept string field names
> -------------------------------------------------------------
>
>                 Key: ARROW-17989
>                 URL: https://issues.apache.org/jira/browse/ARROW-17989
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>            Reporter: Joris Van den Bossche
>            Priority: Major
>              Labels: compute
>
> Currently the "struct_field" kernel only accepts integer indices for the child fields. From the StructFieldOptions class (https://github.com/apache/arrow/blob/3d7f2f22a0fc441a41b8fa971e11c0f4290ebb24/cpp/src/arrow/compute/api_scalar.h#L283-L285):
> {code}
>   /// The child indices to extract. For instance, to get the 2nd child
>   /// of the 1st child of a struct or union, this would be {0, 1}.
>   std::vector<int> indices;
> {code}
> It would be nice if you could also refer to fields by name in addition to by position.


