You are viewing a plain text version of this content. The canonical link for it is here.
Posted to jira@arrow.apache.org by "Eduardo Ponce (Jira)" <ji...@apache.org> on 2022/01/04 17:44:00 UTC

[jira] [Comment Edited] (ARROW-15029) [C++] Split compute/kernels/scalar_string.cc

    [ https://issues.apache.org/jira/browse/ARROW-15029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17468754#comment-17468754 ] 

Eduardo Ponce edited comment on ARROW-15029 at 1/4/22, 5:43 PM:
----------------------------------------------------------------

I agree with separating ASCII and UTF-8 string kernels for the following reasons:
* The symmetric split makes sense and can make it easier to compare corresponding kernels.
* Support for UTF-8 is controlled by CMake variable {{-DARROW_WITH_UTF8PROC}} so this will allow skipping {{scalar_string_utf8.cc}} at the CMake level and not with C++ {{#ifdef}} guards. Or even a single {{#ifdef}} guard for the entire file.
* If additional string encodings are to be added, then they can be placed in their own source files.


was (Author: edponce):
I agree with separating ASCII and UTF-8 string kernels for the following reasons:
* The symmetric split makes sense and can make it easier to compare corresponding kernels.
* Support for UTF-8 is controlled by CMake variable {{-DARROW_WITH_UTF8PROC}} so this will allow skipping {{scalar_string_utf8.cc}} at the CMake level and not with C++ {{#ifdef}}.
* If additional string encodings are to be added, then they can be placed in their own source files.

> [C++] Split compute/kernels/scalar_string.cc
> --------------------------------------------
>
>                 Key: ARROW-15029
>                 URL: https://issues.apache.org/jira/browse/ARROW-15029
>             Project: Apache Arrow
>          Issue Type: Task
>          Components: C++
>            Reporter: Antoine Pitrou
>            Assignee: Jeroen van Straten
>            Priority: Minor
>              Labels: good-first-issue, good-second-issue
>
> {{compute/kernels/scalar_string.cc}}, which defines scalar string kernels, is getting pretty large (and probably long-ish to compile). It would be nice to split it up thematically into 2 or 3 source files. Common utilities may be factored into a {{scalar_string_internal.h}} header, for example.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)