Posted to issues@arrow.apache.org by "Dewey Dunnington (Jira)" <ji...@apache.org> on 2021/11/24 17:38:00 UTC
[jira] [Created] (ARROW-14855) [R] build_expr() should check that non-expression inputs have vec_size() == 1L
Dewey Dunnington created ARROW-14855:
----------------------------------------
Summary: [R] build_expr() should check that non-expression inputs have vec_size() == 1L
Key: ARROW-14855
URL: https://issues.apache.org/jira/browse/ARROW-14855
Project: Apache Arrow
Issue Type: Improvement
Components: R
Reporter: Dewey Dunnington
I would like {{build_expr()}} to raise an error so that code like the following does not work, since row order is guaranteed in R but not in Arrow:
{code:R}
# remotes::install_github("apache/arrow/r#11690")
library(arrow, warn.conflicts = FALSE)
library(dplyr, warn.conflicts = FALSE)
record_batch(a = c("something1", "something2")) %>%
  mutate(df_col = data.frame(a, b = c("other1", "other2")))
#> InMemoryDataset (query)
#> a: string
#> df_col: struct<a: string, b: list<item: string>> ({a=a, b=...})
#>
#> See $.data for the source Arrow object
tibble(a = c("something1", "something2")) %>%
  mutate(df_col = data.frame(a, b = c("other1", "other2"))) %>%
  arrow:::arrow_dplyr_query()
#> InMemoryDataset (query)
#> a: string
#> df_col: struct<a: string, b: string>
#>
#> See $.data for the source Arrow object
{code}
The same problem shows up elsewhere, where it produces a confusing error:
{code:R}
record_batch(a = 1:2) %>% mutate(a + 3:4)
#> Error: NotImplemented: Function add_checked has no kernel matching input types (array[int32], scalar[list<item: int32>])
#> /Users/deweydunnington/Desktop/rscratch/arrow/cpp/src/arrow/compute/exec/expression.cc:340 call.function->DispatchBest(&descrs)
{code}
I think we need slightly different rules than {{Scalar$create()}} uses when interpreting user expressions: we want to raise an error for values where {{vctrs::vec_size() != 1}} rather than wrap them in {{list()}} (which changes the type that the user specified).
Relevant section of {{build_expr()}}: <https://github.com/apache/arrow/blob/4b1135ccfd3075a175667c38dc6326865288caf6/r/R/expression.R#L204-L209>
Relevant section of {{Scalar$create()}}: <https://github.com/apache/arrow/blob/4b1135ccfd3075a175667c38dc6326865288caf6/r/R/scalar.R#L75-L83>
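A rough sketch of the kind of check proposed above, for illustration only. The helper name {{check_scalar_input()}} and its {{arg}} parameter are hypothetical and not part of arrow; the sketch assumes that Arrow expressions inherit from the class "Expression" and that the vctrs package is installed.

```r
# Hypothetical helper (not part of arrow): reject non-expression inputs whose
# size is not 1 instead of letting them be wrapped in list().
check_scalar_input <- function(x, arg = "input") {
  # Assumption: Arrow expressions carry the class "Expression"; they refer to
  # columns rather than literal values, so they are passed through unchanged.
  if (inherits(x, "Expression")) {
    return(invisible(x))
  }

  # vec_size() is the number of observations: length for atomic vectors,
  # number of rows for data frames.
  size <- vctrs::vec_size(x)
  if (size != 1L) {
    stop(
      sprintf("`%s` must have size 1, not size %d", arg, size),
      call. = FALSE
    )
  }

  invisible(x)
}

# A size-1 value passes silently.
check_scalar_input(3)

# A size-2 value is rejected up front rather than becoming a list scalar.
msg <- tryCatch(
  check_scalar_input(3:4, arg = "3:4"),
  error = function(e) conditionMessage(e)
)
```

With a check like this in place, {{mutate(a + 3:4)}} could fail early with a message about the input's size instead of the {{add_checked}} kernel-dispatch error shown above.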