Posted to jira@arrow.apache.org by "Neal Richardson (Jira)" <ji...@apache.org> on 2021/02/01 19:42:00 UTC

[jira] [Commented] (ARROW-11455) [R] Improve handling of -2^31 in 32-bit integer fields

    [ https://issues.apache.org/jira/browse/ARROW-11455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17276603#comment-17276603 ] 

Neal Richardson commented on ARROW-11455:
-----------------------------------------

We have a function we use to determine whether an int64 can fit in an int32: https://github.com/apache/arrow/blob/master/r/src/array_to_vector.cpp#L1043

But I don't think we can easily use that here, given how it is implemented in the C++ library. We may need to write something custom. cc [~bkietz] [~romainfrancois] 
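
To illustrate what "something custom" could mean, here is a minimal sketch of a range check that treats -2^31 as out of range for an R integer. It is not the code at the link above and does not use any Arrow API: {{FitsInRInteger}} is a hypothetical name, and it takes a plain {{std::vector<int64_t>}} as a stand-in for an Arrow array.

```cpp
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical helper, not part of Arrow: returns true if every value can be
// stored in an R integer vector without colliding with NA_integer_.
bool FitsInRInteger(const std::vector<int64_t>& values) {
  // R reserves INT32_MIN (-2^31) for NA_integer_, so the usable minimum
  // is one higher than the usual int32 minimum.
  constexpr int64_t kRMin =
      static_cast<int64_t>(std::numeric_limits<int32_t>::min()) + 1;
  constexpr int64_t kRMax = std::numeric_limits<int32_t>::max();
  for (int64_t v : values) {
    if (v < kRMin || v > kRMax) return false;
  }
  return true;
}
```

A real version would also need to skip null slots and work on Arrow buffers directly; both are left out of this sketch.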

> [R] Improve handling of -2^31 in 32-bit integer fields
> ------------------------------------------------------
>
>                 Key: ARROW-11455
>                 URL: https://issues.apache.org/jira/browse/ARROW-11455
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: R
>    Affects Versions: 3.0.0
>            Reporter: Ian Cook
>            Assignee: Ian Cook
>            Priority: Major
>
> R’s {{integer}} range is 1 smaller than the normal 32-bit integer range of C++, Java, etc. In R, it’s {{-2^31 + 1}} to {{2^31 - 1}}. Elsewhere, it’s {{-2^31}} to {{2^31 - 1}}. So R's native {{integer}} type cannot represent {{-2^31}} ({{-2147483648}}). I believe this is because R uses {{-2^31}} as the sentinel value to represent {{NA_integer_}}.
> If you run {{-2147483648L}} in R, it converts it to {{numeric}} type and issues a warning:
> {code:java}
> Warning message:
> non-integer value 2147483648L qualified with L; using numeric value 
> {code}
> In the {{arrow}} R package, when a 32-bit integer Arrow field containing the value {{-2147483648}} is converted to an R {{integer}} vector, that value becomes {{NA_integer_}}. No warning is given. Consider whether we should handle this case differently and whether it is feasible to do so without performance regressions. Other possible behaviors might be:
>  * Converting the value to {{NA_integer_}} with a warning
>  * Converting the field to {{bit64::integer64}} with a warning
>  * Converting the field to {{base::numeric}} with a warning
>  * Allowing the user to specify an argument or option to control the behavior
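
The behaviors listed above could be wired together roughly as in the following sketch. All names here ({{Int32MinPolicy}}, {{RTarget}}, {{ConversionPlan}}, {{HasInt32Min}}, {{PlanConversion}}) are hypothetical and none of them exist in Arrow; a {{std::vector<int32_t>}} plus a validity vector stands in for an Arrow int32 array. The extra pass over the values in {{HasInt32Min}} is exactly the cost that would need to be measured for performance regressions.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical policy names mirroring the options in the issue description.
enum class Int32MinPolicy { kNaWithWarning, kToInteger64, kToNumeric };

// Hypothetical names for the R vector type the field would be converted to.
enum class RTarget { kInteger, kInteger64, kNumeric };

struct ConversionPlan {
  RTarget target;  // which R type to produce
  bool warn;       // whether to issue a warning to the user
};

// Returns true if any non-null slot holds INT32_MIN, the bit pattern R uses
// for NA_integer_. Assumes is_valid has the same length as values.
bool HasInt32Min(const std::vector<int32_t>& values,
                 const std::vector<bool>& is_valid) {
  for (std::size_t i = 0; i < values.size(); ++i) {
    if (is_valid[i] && values[i] == std::numeric_limits<int32_t>::min()) {
      return true;
    }
  }
  return false;
}

// Chooses the conversion target. Fields without INT32_MIN convert to a plain
// R integer vector with no warning, as they do today.
ConversionPlan PlanConversion(const std::vector<int32_t>& values,
                              const std::vector<bool>& is_valid,
                              Int32MinPolicy policy) {
  if (!HasInt32Min(values, is_valid)) return {RTarget::kInteger, false};
  switch (policy) {
    case Int32MinPolicy::kToInteger64:
      return {RTarget::kInteger64, true};
    case Int32MinPolicy::kToNumeric:
      return {RTarget::kNumeric, true};
    case Int32MinPolicy::kNaWithWarning:
      break;
  }
  // Keep the integer vector; the value reads as NA_integer_, but now with a warning.
  return {RTarget::kInteger, true};
}
```

The policy argument corresponds to the last option in the list: a user-facing argument or option would select one of the three enum values.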



--
This message was sent by Atlassian Jira
(v8.3.4#803005)