Posted to issues@arrow.apache.org by "Andrew C Thomas (Jira)" <ji...@apache.org> on 2022/04/29 20:48:00 UTC
[jira] [Created] (ARROW-16423) R arrow/dplyr: simple join and collect crashes session
Andrew C Thomas created ARROW-16423:
---------------------------------------
Summary: R arrow/dplyr: simple join and collect crashes session
Key: ARROW-16423
URL: https://issues.apache.org/jira/browse/ARROW-16423
Project: Apache Arrow
Issue Type: Bug
Components: R
Affects Versions: 7.0.0
Reporter: Andrew C Thomas
Running an inner-join-style filter on a dataset opened with open_dataset() crashes the R session, but not reliably on the first attempt. It sometimes takes a few tries before the crash occurs.
Reprex follows.
------------------------------------------------------
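library(arrow)
library(dplyr)
library(tidyr)

# Build a 10 x 10 x 10000 row data frame, grouped by A and B.
DataSet <- expand_grid(A = 1:10, B = 1:10, C = 1:10000) %>%
  group_by(A, B)

# write_dataset() partitions by the grouping columns (A, B) by default.
write_dataset(DataSet, "TestBreakData")

# Repeat the join and collect; the crash is intermittent.
for (DoThisUntilItBreaks in 1:100) {
  message(DoThisUntilItBreaks)
  D2 <- open_dataset("TestBreakData") %>%
    inner_join(data.frame(A = 1L, B = 1:5)) %>%
    collect()
}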