Posted to jira@arrow.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2021/09/01 07:16:00 UTC

[jira] [Updated] (ARROW-13810) [C++][Compute] Predicate IsAsciiCharacter allows invalid types and values

     [ https://issues.apache.org/jira/browse/ARROW-13810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated ARROW-13810:
-----------------------------------
    Labels: pull-request-available  (was: )

> [C++][Compute] Predicate IsAsciiCharacter allows invalid types and values
> -------------------------------------------------------------------------
>
>                 Key: ARROW-13810
>                 URL: https://issues.apache.org/jira/browse/ARROW-13810
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>            Reporter: Eduardo Ponce
>            Assignee: Eduardo Ponce
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 6.0.0
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> The *IsAsciiCharacter* predicate takes its input as a template type, but it only checks that the value is less than 128. The check has no lower bound, so a signed type can pass a negative value (a non-ASCII byte) and the predicate returns true. There are three possible solutions, in order of my preference:
>  # Remove the template and make the parameter type uint8_t. All of its callers pass uint8_t, which is also the type used across the ASCII operations. The other string-related utility functions are not templates either.
>  # Constrain the template so that it only accepts unsigned integers.
>  # Add a check that the argument is non-negative. This check adds unnecessary overhead.
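
The problem and the three options above can be sketched roughly as follows. This is a minimal illustration based only on the description in this issue, not the actual Arrow source; the function names (IsAsciiCharacterUnbounded, IsAsciiCharacterU8, IsAsciiCharacterUnsigned, IsAsciiCharacterChecked) are hypothetical and chosen only to tell the variants apart.

```cpp
#include <cstdint>
#include <type_traits>

// Behavior as described in the issue (hypothetical name): the parameter is a
// template type and only the upper bound is checked.
template <typename T>
constexpr bool IsAsciiCharacterUnbounded(T character) {
  return character < 128;
}

// Option 1 (hypothetical name): drop the template and take uint8_t, so a
// negative value cannot be represented in the parameter at all.
constexpr bool IsAsciiCharacterU8(uint8_t character) { return character < 128; }

// Option 2 (hypothetical name): keep the template but reject signed and
// non-integer types at compile time.
template <typename T>
constexpr bool IsAsciiCharacterUnsigned(T character) {
  static_assert(std::is_integral<T>::value && std::is_unsigned<T>::value,
                "IsAsciiCharacterUnsigned requires an unsigned integer type");
  return character < 128;
}

// Option 3 (hypothetical name): keep the template and add a lower-bound check,
// which costs an extra comparison for signed types.
template <typename T>
constexpr bool IsAsciiCharacterChecked(T character) {
  return character >= 0 && character < 128;
}

// A negative signed byte slips through the unbounded check.
static_assert(IsAsciiCharacterUnbounded(static_cast<int8_t>(-1)),
              "upper-bound-only check accepts negative values");
// The same bit pattern seen as uint8_t (0xFF) is correctly rejected.
static_assert(!IsAsciiCharacterU8(static_cast<uint8_t>(0xFF)),
              "0xFF is not ASCII");
static_assert(IsAsciiCharacterU8('A'), "'A' is ASCII");
```

With option 2, a call such as IsAsciiCharacterUnsigned(static_cast<int8_t>(-1)) would fail to compile rather than return a wrong answer, which is the trade-off between options 2 and 3: a compile-time rejection versus a run-time comparison.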


