Posted to jira@arrow.apache.org by "Alessandro Molina (Jira)" <ji...@apache.org> on 2022/01/04 14:00:00 UTC

[jira] [Updated] (ARROW-13570) [C++][Compute] Additional scalar ASCII kernels can reuse original offsets buffer

     [ https://issues.apache.org/jira/browse/ARROW-13570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alessandro Molina updated ARROW-13570:
--------------------------------------
    Fix Version/s: 8.0.0
                       (was: 7.0.0)

> [C++][Compute] Additional scalar ASCII kernels can reuse original offsets buffer
> --------------------------------------------------------------------------------
>
>                 Key: ARROW-13570
>                 URL: https://issues.apache.org/jira/browse/ARROW-13570
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>            Reporter: Eduardo Ponce
>            Priority: Major
>             Fix For: 8.0.0
>
>
> Some ASCII scalar string kernels can reuse the input's offsets buffer, so the offsets are not preallocated in the output (these kernels use *MemAllocation::NO_PREALLOCATE* during registration). Currently, only kernels that transform each character independently via [StringDataTransform|https://github.com/apache/arrow/blob/master/cpp/src/arrow/compute/kernels/scalar_string.cc#L590-L631] support the no-preallocation policy. However, other string kernels also leave the length (and therefore the offsets) of each input string unchanged, even though their scalar transforms depend on neighboring characters.
> This issue should extend *StringDataTransform*, or create a variant of it, to accept multiple input transforms so that additional scalar ASCII kernels (e.g., _ascii_title_) can support the *MemAllocation::NO_PREALLOCATE* policy.
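
To illustrate the idea in the description, here is a minimal standalone sketch, not Arrow code. ToyStringArray, IsAsciiAlpha, AsciiUpper, AsciiLower and AsciiTitle are hypothetical names made up for this example, and the title-case rule is a simplified assumption (uppercase a letter that follows a non-letter or starts a string, lowercase otherwise). It shows a transform whose output depends on the neighboring character yet never changes any string's length, so the output can share the input's offsets buffer instead of allocating a new one.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a string array: a shared offsets buffer plus one
// contiguous character buffer. This is NOT the real arrow::StringArray.
struct ToyStringArray {
  std::shared_ptr<const std::vector<int32_t>> offsets;  // size = num_strings + 1
  std::string data;                                     // all strings concatenated
};

inline bool IsAsciiAlpha(char c) {
  return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}
inline char AsciiUpper(char c) {
  return (c >= 'a' && c <= 'z') ? static_cast<char>(c - 32) : c;
}
inline char AsciiLower(char c) {
  return (c >= 'A' && c <= 'Z') ? static_cast<char>(c + 32) : c;
}

// Simplified title-casing (an assumption for this sketch): each output byte
// depends on whether the previous byte in the same string was a letter.
// Every byte maps to exactly one byte, so string lengths are unchanged.
ToyStringArray AsciiTitle(const ToyStringArray& in) {
  assert(in.offsets && !in.offsets->empty());
  ToyStringArray out;
  out.offsets = in.offsets;  // reuse the input offsets: no new offsets buffer
  out.data = in.data;        // same size as the input data buffer
  const std::vector<int32_t>& off = *in.offsets;
  for (std::size_t i = 0; i + 1 < off.size(); ++i) {
    bool prev_alpha = false;  // reset at each string boundary
    for (int32_t j = off[i]; j < off[i + 1]; ++j) {
      const char c = in.data[j];
      out.data[j] = prev_alpha ? AsciiLower(c) : AsciiUpper(c);
      prev_alpha = IsAsciiAlpha(c);
    }
  }
  return out;
}
```

The offsets are still read to find string boundaries (the neighbor state must reset at the start of each string), but they are never written, which is the property that would make a no-preallocation policy applicable.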



--
This message was sent by Atlassian Jira
(v8.20.1#820001)