Posted to issues@spark.apache.org by "Hossein Falaki (Jira)" <ji...@apache.org> on 2019/11/06 21:24:00 UTC
[jira] [Updated] (SPARK-29777) SparkR::cleanClosure aggressively
removes a function required by user function
[ https://issues.apache.org/jira/browse/SPARK-29777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hossein Falaki updated SPARK-29777:
-----------------------------------
Description:
The following code block reproduces the issue:
{code:java}
library(SparkR)
sparkR.session()
spark_df <- createDataFrame(na.omit(airquality))
cody_local2 <- function(param2) {
  10 + param2
}
cody_local1 <- function(param1) {
  cody_local2(param1)
}
result <- cody_local2(5)
calc_df <- dapplyCollect(spark_df, function(x) {
  cody_local2(20)
  cody_local1(5)
})
print(result)
{code}
We get the following error message:
{code:java}
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 12.0 failed 4 times, most recent failure: Lost task 0.3 in stage 12.0 (TID 27, 10.0.174.239, executor 0): org.apache.spark.SparkException: R computation failed with
Error in cody_local2(param1) : could not find function "cody_local2"
Calls: compute -> computeFunc -> cody_local1
{code}
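One way to diagnose this (a sketch, not from the original report; it relies on the internal helper {{SparkR:::cleanClosure}}, whose behavior may differ across versions) is to inspect which variables survive closure pruning for the outer helper:

{code}
# Sketch: ask cleanClosure to prune the environment of cody_local1 and
# list what it kept. For this bug, "cody_local2" is expected to be
# missing from the captured environment, matching the worker-side error.
cleaned <- SparkR:::cleanClosure(cody_local1)
ls(environment(cleaned))
{code}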
Compare that to this code block, which succeeds:
{code:java}
calc_df <- dapplyCollect(spark_df, function(x) {
  cody_local2(20)
  # cody_local1(5)
})
{code}
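Until the capture analysis is fixed, a possible workaround (a sketch, not part of the original report) is to make the shipped closure self-contained by defining both helpers inside the function passed to {{dapplyCollect}}, so nothing needs to be pulled in transitively:

{code}
# Workaround sketch: define the helpers inside the user function so the
# serialized closure carries its own copy of cody_local2 and cody_local1,
# and cleanClosure does not have to trace the transitive dependency.
calc_df <- dapplyCollect(spark_df, function(x) {
  cody_local2 <- function(param2) {
    10 + param2
  }
  cody_local1 <- function(param1) {
    cody_local2(param1)
  }
  cody_local1(5)
})
{code}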
> SparkR::cleanClosure aggressively removes a function required by user function
> ------------------------------------------------------------------------------
>
> Key: SPARK-29777
> URL: https://issues.apache.org/jira/browse/SPARK-29777
> Project: Spark
> Issue Type: Bug
> Components: SparkR
> Affects Versions: 2.4.4
> Reporter: Hossein Falaki
> Priority: Major
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org