Posted to jira@arrow.apache.org by "Ian Cook (Jira)" <ji...@apache.org> on 2021/10/28 18:34:00 UTC
[jira] [Comment Edited] (ARROW-14509) as_vector() downgrades int64 even when arrow.int64_downcast = TRUE
[ https://issues.apache.org/jira/browse/ARROW-14509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17435617#comment-17435617 ]
Ian Cook edited comment on ARROW-14509 at 10/28/21, 6:33 PM:
-------------------------------------------------------------
[~mcolosimo-p4] the option defaults to {{TRUE}}. To set it to {{FALSE}} within package code, you could use [{{withr::with_options()}}|https://withr.r-lib.org/reference/with_options.html]. Alternatively you could call {{bit64::as.integer64()}} in your package code (as it looks like you're already doing). Do those solutions work sufficiently for you?
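A minimal sketch of the first suggestion, assuming the option is read at the time {{as_vector()}} is called. The helper name {{as_vector_no_downcast()}} is hypothetical and used only for illustration; it is not part of arrow.

```r
library(arrow)

# Hypothetical helper for package code: convert an Arrow Array to an R vector
# with int64 downcasting switched off for the duration of this call only.
# withr::with_options() restores the previous option value on exit.
as_vector_no_downcast <- function(x) {
  withr::with_options(
    list(arrow.int64_downcast = FALSE),
    x$as_vector()
  )
}

# Small values that would otherwise fit in a 32-bit R integer
int64s <- bit64::as.integer64(c(1:10, 101:110, 201:205))
y <- Array$create(int64s)

# Should report "integer64" if the option is honored (assumption, see above)
class(as_vector_no_downcast(y))

# The second suggestion: accept the downcast, then convert back explicitly
yv <- bit64::as.integer64(y$as_vector())
class(yv)
```

Scoping the option with {{withr::with_options()}} avoids changing the user's global options from inside a package.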
was (Author: icook):
[~mcolosimo-p4] the option defaults to {{TRUE}}. To set it to {{FALSE}} within package code, you could use [{{withr::with_options()}}|https://withr.r-lib.org/reference/with_options.html]. Alternatively you could call {{bit64::as.integer64()}} in your package code. Do those solutions work sufficiently for you?
> as_vector() downgrades int64 even when arrow.int64_downcast = TRUE
> ------------------------------------------------------------------
>
> Key: ARROW-14509
> URL: https://issues.apache.org/jira/browse/ARROW-14509
> Project: Apache Arrow
> Issue Type: Bug
> Components: R
> Affects Versions: 5.0.0
> Environment: linux
> Reporter: Marc Colosimo
> Priority: Major
>
> Using as_vector() on a Table or Array whose type is Int64 still downgrades to integer when arrow.int64_downcast = TRUE, unless a value is larger than Int32 can store (the switch actually happens at a somewhat lower value; I am guessing this is R's integer-to-numeric switchover).
> {quote}library(arrow)
> options(arrow.int64_downcast = TRUE)
> int64s <- bit64::as.integer64(c(1:10, 101:110, 201:205))
> y <- Array$create(int64s)
> y$type
> yv <- y$as_vector()
> class(yv)
> int64s <- c(int64s, bit64::as.integer64("68719476735")) # 0xF FFFFFFFF
> y <- Array$create(int64s)
> y$type
> yv <- y$as_vector()
> class(yv)
> {quote}
> Outputs:
> {quote}Int64
> int64
> [1] "integer"
> Int64
> int64
> [1] "integer64"
> {quote}
> This can cause an unexpected overflow:
> {quote}int64s <- bit64::as.integer64(c(1:10, 101:110, 201:205, 268435454, 2147483632)) # 0xFFFFFFE, 0x7FFFFFF0
> cumsum(int64s)
> y <- Array$create(int64s)
> y$type
> yv <- y$as_vector()
> class(yv)
> cumsum(yv)
> {quote}
> as shown by the NA and the warning from the second cumsum:
> {quote}integer64
> [1] 1 3 6 10 15 21
> [7] 28 36 45 55 156 258
> [13] 361 465 570 676 783 891
> [19] 1000 1110 1311 1513 1716 1920
> [25] 2125 268437579 2415921211
> Int64
> int64
> [1] "integer"
> [1] 1 3 6 10 15 21 28
> [8] 36 45 55 156 258 361 465
> [15] 570 676 783 891 1000 1110 1311
> [22] 1513 1716 1920 2125 268437579 NA
> Warning message:
> integer overflow in 'cumsum'; use 'cumsum(as.numeric(.))'
> {quote}
> The actual version is 5.0.0.9000, running under R 3.6.3.