Posted to issues@arrow.apache.org by "Wes McKinney (Jira)" <ji...@apache.org> on 2020/05/11 13:19:00 UTC

[jira] [Commented] (ARROW-555) [C++] String algorithm library for StringArray/BinaryArray

    [ https://issues.apache.org/jira/browse/ARROW-555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17104443#comment-17104443 ] 

Wes McKinney commented on ARROW-555:
------------------------------------

Update: I'm in the middle of overhauling the API for implementing new Array functions / kernels, with the goal of making it much easier to add new functions (e.g. generating a string function from an inlineable implementation that computes a single value). I'm working on it right now, so it should be done this month. Once it is, I will probably ask someone from my team to make an initial cut at a precompiled string function set based on the functions that already exist in Gandiva / LLVM codegen, and to add new functions (e.g. from Impala or other SQL engines) that are not yet present. The work need not be monolithic, so as soon as the framework is in place it should be straightforward to add new functions and test them. Adding Python bindings for the new functions should also be easy: all you will need is the name of the function you're calling, so some of the Cython binding boilerplate that exists now should go away.
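
To make the idea concrete, here is a rough sketch of what "generating a string function from a single-value implementation" and "calling a function by name" could look like. It is not the Arrow API: StringColumn, MakeStringFunction, RegisterFunction, CallFunction and the function names "ascii_upper" / "ascii_reverse" are all hypothetical names chosen for illustration, and a plain std::vector<std::string> stands in for a real StringArray (which stores offsets and a contiguous data buffer and also tracks nulls).

```cpp
#include <algorithm>
#include <cctype>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Assumption: a vector of std::string stands in for an Arrow StringArray.
using StringColumn = std::vector<std::string>;
using ArrayFunction = std::function<StringColumn(const StringColumn&)>;

// Lift a single-value implementation into a function over a whole column.
// ScalarOp is any callable taking a const std::string& and returning std::string.
template <typename ScalarOp>
ArrayFunction MakeStringFunction(ScalarOp op) {
  return [op](const StringColumn& input) {
    StringColumn out;
    out.reserve(input.size());
    for (const std::string& value : input) {
      out.push_back(op(value));
    }
    return out;
  };
}

// Hypothetical name -> function registry, so that a binding layer only needs
// to know the name of the function it wants to call.
std::map<std::string, ArrayFunction>& Registry() {
  static std::map<std::string, ArrayFunction> registry;
  return registry;
}

void RegisterFunction(const std::string& name, ArrayFunction fn) {
  Registry()[name] = std::move(fn);
}

StringColumn CallFunction(const std::string& name, const StringColumn& input) {
  auto it = Registry().find(name);
  if (it == Registry().end()) {
    throw std::invalid_argument("unknown function: " + name);
  }
  return it->second(input);
}

// Two single-value implementations; adding a function is just writing one of
// these and registering it.
inline std::string AsciiUpper(const std::string& s) {
  std::string out = s;
  for (char& c : out) {
    c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
  }
  return out;
}

inline std::string AsciiReverse(const std::string& s) {
  std::string out = s;
  std::reverse(out.begin(), out.end());  // byte-wise, so ASCII only
  return out;
}

void RegisterStringFunctions() {
  RegisterFunction("ascii_upper", MakeStringFunction(AsciiUpper));
  RegisterFunction("ascii_reverse", MakeStringFunction(AsciiReverse));
}
```

With this shape, a caller would run RegisterStringFunctions() once and then invoke CallFunction("ascii_upper", column); a language binding would only have to forward the name and the input, which is the kind of boilerplate reduction described above. A real implementation would also need to handle nulls and write into preallocated buffers instead of building one std::string per value.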

> [C++] String algorithm library for StringArray/BinaryArray
> ----------------------------------------------------------
>
>                 Key: ARROW-555
>                 URL: https://issues.apache.org/jira/browse/ARROW-555
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: C++
>            Reporter: Wes McKinney
>            Priority: Major
>              Labels: Analytics
>
> This is a parent JIRA for starting a module for processing strings held in memory in the Arrow format. This will include using the re2 C++ regular expression library and other standard string manipulations (such as those found on Python's string objects).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)