Posted to issues@arrow.apache.org by "Nicola Crane (Jira)" <ji...@apache.org> on 2022/08/22 08:00:05 UTC
[jira] [Created] (ARROW-17490) [R] Differing results in log bindings
Nicola Crane created ARROW-17490:
------------------------------------
Summary: [R] Differing results in log bindings
Key: ARROW-17490
URL: https://issues.apache.org/jira/browse/ARROW-17490
Project: Apache Arrow
Issue Type: Improvement
Components: R
Reporter: Nicola Crane
We get different results from dplyr and from Acero if we call {{log()}} on a column that contains 0, e.g.
{code:r}
``` r
library(arrow)
library(dplyr)
df <- tibble(x = 0:10)
df %>%
  mutate(y = log(x)) %>%
  collect()
#> # A tibble: 11 × 2
#>        x        y
#>    <int>    <dbl>
#>  1     0 -Inf
#>  2     1    0
#>  3     2    0.693
#>  4     3    1.10
#>  5     4    1.39
#>  6     5    1.61
#>  7     6    1.79
#>  8     7    1.95
#>  9     8    2.08
#> 10     9    2.20
#> 11    10    2.30
df %>%
  arrow_table() %>%
  mutate(y = log(x)) %>%
  collect()
#> Error in `collect()`:
#> ! Invalid: logarithm of zero
```
{code}
This is because R defines {{log(0)}} as {{-Inf}}, whereas Acero treats it as an error ({{Invalid: logarithm of zero}}). I'm not sure what the right solution is here; do we want to request the addition of an Acero option to control this behaviour?
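To narrow down where the difference comes from, the following is a minimal sketch that calls the Arrow compute functions directly rather than going through the dplyr bindings. It assumes the installed Arrow build exposes both an unchecked {{ln}} kernel and a checked {{ln_checked}} kernel through {{call_function()}}, and that the {{log()}} binding dispatches to the checked one; neither assumption is confirmed in this issue. The names {{unchecked_result}} and {{checked_result}} are only illustrative.

```r
library(arrow)

# A small double array containing a zero, mirroring the column in the report.
x <- Array$create(c(0, 1, 2))

# Assumption: "ln" is the unchecked natural-log kernel, which is expected to
# follow floating-point conventions for zero instead of raising an error.
unchecked_result <- as.vector(call_function("ln", x))

# Assumption: "ln_checked" is the checked variant, which is expected to fail
# on zero. The error message is captured so the script keeps running.
checked_result <- tryCatch(
  as.vector(call_function("ln_checked", x)),
  error = function(e) conditionMessage(e)
)

print(unchecked_result)
print(checked_result)
```

If the unchecked kernel does return {{-Inf}} for the zero element, one possible direction would be to have the R binding use it (or make the choice configurable) instead of adding a new Acero option, though that would also change how negative inputs are handled and would need to be checked against R's behaviour.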