Posted to jira@arrow.apache.org by "David Li (Jira)" <ji...@apache.org> on 2021/11/19 16:35:00 UTC
[jira] [Commented] (ARROW-14778) [C++] mean on a decimal truncates and does not round
[ https://issues.apache.org/jira/browse/ARROW-14778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17446568#comment-17446568 ]
David Li commented on ARROW-14778:
----------------------------------
Ah, it's because we perform all the computations at the input decimal precision/scale (so only 1 decimal digit here). We could perhaps promote it to the max precision/scale, then round it back down? For example, for decimal128(5, 1), do the computations at decimal128(38, 2) or something, then round back down to (5, 1). I haven't thought this through too much, and this would also apply to many of the other decimal kernels.
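To make the difference concrete, here is a rough sketch in plain Python integer arithmetic. It is not Arrow's actual kernel code. The function names, the `extra_digits` parameter, and the round-half-up rule are assumptions chosen for illustration. Decimals are modelled as unscaled integers plus a scale, so 0.1 at scale 1 is the integer 1.

```python
from decimal import Decimal


def mean_truncating(unscaled, scale):
    """Mean computed entirely at the input scale (hypothetical helper).

    The integer division drops the remainder, so the result is truncated.
    """
    total = sum(unscaled)
    sign = -1 if total < 0 else 1
    quotient = abs(total) // len(unscaled)  # truncates toward zero
    return Decimal(sign * quotient).scaleb(-scale)


def mean_promoted(unscaled, scale, extra_digits=1):
    """Mean computed at a wider scale, then rounded back (hypothetical helper).

    extra_digits is an assumed parameter: how many extra fractional digits
    to carry before rounding half up to the original scale.
    """
    factor = 10 ** extra_digits
    total = sum(unscaled) * factor  # rescale the sum to scale + extra_digits
    sign = -1 if total < 0 else 1
    wide = abs(total) // len(unscaled)  # mean at the wider scale
    quotient, remainder = divmod(wide, factor)
    if 2 * remainder >= factor:  # round half up (away from zero)
        quotient += 1
    return Decimal(sign * quotient).scaleb(-scale)


# 0.1, 0.2, 0.2, 0.2, 0.2 stored as decimal(5, 1): unscaled values at scale 1
values = [1, 2, 2, 2, 2]

print(mean_truncating(values, 1))  # 0.1
print(mean_promoted(values, 1))    # 0.2
```

With the values from the issue, the sum is 0.9 and the count is 5. At scale 1 the division 9 // 5 gives 1, i.e. 0.1. Carrying one extra digit gives 90 // 5 = 18, i.e. 0.18, which rounds to 0.2 at scale 1.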
> [C++] mean on a decimal truncates and does not round
> ----------------------------------------------------
>
> Key: ARROW-14778
> URL: https://issues.apache.org/jira/browse/ARROW-14778
> Project: Apache Arrow
> Issue Type: Bug
> Components: C++
> Reporter: Jonathan Keane
> Priority: Major
> Labels: query-engine
>
> {code}
> library(arrow, warn.conflicts = FALSE)
> library(dplyr, warn.conflicts = FALSE)
> df <- data.frame(
>   x = c(0.1, 0.2, 0.2, 0.2, 0.2)
> )
> tab <- Table$create(df)
> tab %>%
>   summarise(mean(x)) %>%
>   collect()
> #> # A tibble: 1 × 1
> #> `mean(x)`
> #> <dbl>
> #> 1 0.18
> tab %>%
>   summarise(x = mean(x)) %>%
>   mutate(x = cast(x, decimal(5, 1))) %>%
>   collect()
> #> # A tibble: 1 × 1
> #> x
> #> <dbl>
> #> 1 0.2
> tab %>%
>   mutate(x = cast(x, decimal(5, 1))) %>%
>   summarise(x = mean(x)) %>%
>   collect()
> #> # A tibble: 1 × 1
> #> x
> #> <dbl>
> #> 1 0.1
> {code}