Posted to jira@arrow.apache.org by "Eduardo Ponce (Jira)" <ji...@apache.org> on 2021/09/30 03:11:00 UTC

[jira] [Commented] (ARROW-12714) [C++] String title case kernel

    [ https://issues.apache.org/jira/browse/ARROW-12714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17422508#comment-17422508 ] 

Eduardo Ponce commented on ARROW-12714:
---------------------------------------

Adding the following notes for reference purposes.

Title-casing text is not a trivial task because of the complexities of natural language (acronyms, words that are always capitalized, special cases, non-alphabetic symbols, etc.). There are also no standard rules across libraries, so different libraries can produce different results for the same input. In this PR, we chose to match the rules of Python's `str.title()` for their simplicity. Note, however, that this behavior differs from that of R's *stringr* library (`str_to_title`) when a word begins with a digit.
{code:python}
# R stringr
> str_to_title("1Foo1") # "1foo1"
# Python
>>> "1Foo1".title() # "1Foo1"
{code}
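
As a rough illustration of the rule being matched, the sketch below approximates Python's `str.title()`: a cased character is uppercased when it follows an uncased character (or starts the string) and lowercased otherwise. The helper name `title_like` is hypothetical, this is not the Arrow kernel's implementation, and it is only expected to agree with `str.title()` on ASCII input.

```python
def title_like(text: str) -> str:
    """Approximate str.title() for ASCII input (hypothetical helper, not Arrow code)."""
    out = []
    prev_cased = False  # whether the previous character was a cased letter
    for ch in text:
        # Treat a character as cased if upper- and lowercasing it give different results.
        is_cased = ch.upper() != ch.lower()
        if is_cased:
            # A cased character starts a new "word" only if the previous one was uncased.
            out.append(ch.lower() if prev_cased else ch.upper())
        else:
            out.append(ch)
        prev_cased = is_cased
    return "".join(out)


if __name__ == "__main__":
    for sample in ["1Foo1", "hello wORLD", "they're", "foo-bar"]:
        # Print the sketch's result next to the built-in for comparison.
        print(repr(sample), title_like(sample), sample.title())
```

Because a digit is uncased, the letter after it is treated as the start of a word, which is why "1Foo1" keeps its capital "F" under this rule.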

> [C++] String title case kernel
> ------------------------------
>
>                 Key: ARROW-12714
>                 URL: https://issues.apache.org/jira/browse/ARROW-12714
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: C++
>            Reporter: Ian Cook
>            Assignee: Eduardo Ponce
>            Priority: Major
>              Labels: beginner, pull-request-available
>             Fix For: 6.0.0
>
>          Time Spent: 4h
>  Remaining Estimate: 0h
>
> Capitalizes the first character of each word in the string, like SQL {{initcap}} or Python {{str.title()}}


