Posted to jira@arrow.apache.org by "MvR (Jira)" <ji...@apache.org> on 2020/12/15 11:51:00 UTC
[jira] [Created] (ARROW-10916) gapply fails executing with rbind error
MvR created ARROW-10916:
---------------------------
Summary: gapply fails executing with rbind error
Key: ARROW-10916
URL: https://issues.apache.org/jira/browse/ARROW-10916
Project: Apache Arrow
Issue Type: Bug
Components: R
Affects Versions: 2.0.0
Environment: Databricks runtime 7.3 LTS ML
Reporter: MvR
Attachments: Rerror.log
Executing the following code on Databricks Runtime 7.3 LTS ML fails with an rbind error when Arrow is enabled in the Spark session (spark.sql.execution.arrow.sparkr.enabled = "true"). The same code runs successfully when Arrow is not enabled. The full error message is attached (Rerror.log).
```
library(dplyr)
library(SparkR)

SparkR::sparkR.session(sparkConfig = list(spark.sql.execution.arrow.sparkr.enabled = "true"))

mtcars %>%
  SparkR::as.DataFrame() %>%
  SparkR::gapply(x = .,
                 cols = c("cyl", "vs"),
                 func = function(key, data) {
                   dt <- data[, c("mpg", "qsec")]
                   res <- apply(dt, 2, mean)
                   df <- data.frame(firstGroupKey = key[1],
                                    secondGroupKey = key[2],
                                    mean_mpg = res[1],
                                    mean_cyl = res[2])
                   return(df)
                 },
                 schema = structType(structField("cyl", "double"),
                                     structField("vs", "double"),
                                     structField("mpg_mean", "double"),
                                     structField("qsec_mean", "double"))
  ) %>%
  display()
```
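For anyone trying to narrow this down, the per-group function can be checked locally in base R without a Spark session. The sketch below is not part of the original reproducer: `group_means` is a hypothetical variant of the function above that names its output columns exactly as in the declared schema and avoids the names carried over from `apply()`. Whether this variant avoids the rbind error with Arrow enabled has not been tested here.

```r
# Hypothetical helper (not from the original report): same per-group means,
# but the output columns match the declared schema (cyl, vs, mpg_mean, qsec_mean)
# and no names or row names are inherited from apply().
group_means <- function(key, data) {
  data.frame(cyl = key[[1]],
             vs = key[[2]],
             mpg_mean = mean(data$mpg),
             qsec_mean = mean(data$qsec))
}

# Local check in base R: split mtcars by (cyl, vs), apply the function to each
# group, and row-bind the results the way a driver-side collect would.
groups <- split(mtcars, list(mtcars$cyl, mtcars$vs), drop = TRUE)
results <- lapply(groups, function(g) {
  key <- list(g$cyl[1], g$vs[1])  # assumption: the key is passed as a list
  group_means(key, g)
})
combined <- do.call(rbind, results)
print(combined)
```

If the local result looks right, the same function could be passed as `func = group_means` in the `gapply` call above to see whether the behaviour under Arrow changes.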