Posted to issues@arrow.apache.org by "Nicola Crane (Jira)" <ji...@apache.org> on 2022/05/03 14:57:00 UTC

[jira] [Created] (ARROW-16447) [R] Integer overflow causes error - (in dplyr we get an NA with a warning)

Nicola Crane created ARROW-16447:
------------------------------------

             Summary: [R] Integer overflow causes error - (in dplyr we get an NA with a warning)
                 Key: ARROW-16447
                 URL: https://issues.apache.org/jira/browse/ARROW-16447
             Project: Apache Arrow
          Issue Type: Bug
          Components: R
            Reporter: Nicola Crane


{code:r}
library(dplyr)
library(arrow)

.input = tibble::tibble(
  x = .Machine$integer.max
)

# in dplyr
.input %>%
  mutate(x2 = x + 6L) %>%
  collect()
#> Warning in x + 6L: NAs produced by integer overflow
#> # A tibble: 1 × 2
#>            x    x2
#>        <int> <int>
#> 1 2147483647    NA

# in arrow (same data converted to an Arrow Table)
.input %>%
  arrow_table() %>%
  mutate(x2 = x + 6L) %>%
  collect()
#> Error in `collect()`:
#> ! Invalid: overflow
#> /home/nic2/arrow/cpp/src/arrow/compute/exec.cc:701  kernel_->exec(kernel_ctx_, batch, &out)
#> /home/nic2/arrow/cpp/src/arrow/compute/exec.cc:642  ExecuteBatch(batch, listener)
#> /home/nic2/arrow/cpp/src/arrow/compute/exec/expression.cc:547  executor->Execute(arguments, &listener)
#> /home/nic2/arrow/cpp/src/arrow/compute/exec/project_node.cc:90  ExecuteScalarExpression(simplified_expr, target, plan()->exec_context())
#> /home/nic2/arrow/cpp/src/arrow/compute/exec/exec_plan.cc:463  iterator_.Next()
#> /home/nic2/arrow/cpp/src/arrow/record_batch.cc:337  ReadNext(&batch)
#> /home/nic2/arrow/cpp/src/arrow/record_batch.cc:351  ToRecordBatches()
{code}
Do we want to return NA on integer overflow, as dplyr does, or keep the error and just give the user a more specific hint in the error message?
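
Until that is decided, one possible workaround is to widen the column before adding. The snippet below is only a sketch, not a proposed fix: it assumes the arrow dplyr bindings translate as.numeric() into a cast to float64, so the addition no longer happens in int32. The name `widened` is made up for illustration, and the result column is a double rather than an integer.

```r
library(dplyr)
library(arrow)

.input <- tibble::tibble(
  x = .Machine$integer.max
)

# Workaround sketch (an assumption, not the project's chosen behaviour):
# cast x to double inside the query so the sum is computed as float64
# instead of int32, which avoids the "Invalid: overflow" error.
widened <- .input %>%
  arrow_table() %>%
  mutate(x2 = as.numeric(x) + 6) %>%
  collect()

# x keeps its integer type; x2 comes back as a double column
widened
```

This sidesteps the error but changes the column type, so it does not answer the question of what `x + 6L` itself should do on overflow.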


