Posted to issues@spark.apache.org by "Jeet (Jira)" <ji...@apache.org> on 2021/03/18 16:53:00 UTC
[jira] [Updated] (SPARK-34791) SparkR throws node stack overflow
[ https://issues.apache.org/jira/browse/SPARK-34791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jeet updated SPARK-34791:
-------------------------
Description:
SparkR throws a "node stack overflow" error when running the code below on R 4.0.2 with Spark 3.0.1.
The same code works on R 3.3.3 with Spark 2.2.1 (and SparkR 2.4.5).
{code:r}
source('sample.R')
myclsr = myclosure_func()
myclsr$get_some_date('2021-01-01')

## spark.lapply throws node stack overflow
result = spark.lapply(c('2021-01-01', '2021-01-02'), function(rdate) {
  source('sample.R')
  another_closure = myclosure_func()
  return(another_closure$get_some_date(rdate))
})
{code}
sample.R:
{code:r}
## util function, which calls itself
getPreviousBusinessDate <- function(asofdate) {
  asdt <- as.Date(asofdate) - 1
  wd <- format(as.Date(asdt), "%A")
  if (wd == "Saturday" | wd == "Sunday") {
    return(getPreviousBusinessDate(asdt))
  }
  return(asdt)
}

## closure which calls util function
myclosure_func = function() {
  myclosure = list()
  get_some_date = function(random_date) {
    return(getPreviousBusinessDate(random_date))
  }
  myclosure$get_some_date = get_some_date
  return(myclosure)
}
{code}
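For what it is worth, the recursion in getPreviousBusinessDate is shallow for real inputs (at most two steps across a weekend), so the overflow does not appear to come from the date logic itself. A hypothetical iterative rewrite of the helper (not part of the reported code) gives the same result while removing the self-call:
{code:r}
## Hypothetical iterative variant of the helper: same behaviour, no recursion.
getPreviousBusinessDate <- function(asofdate) {
  asdt <- as.Date(asofdate) - 1
  while (format(asdt, "%A") %in% c("Saturday", "Sunday")) {
    asdt <- asdt - 1
  }
  return(asdt)
}
{code}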
This seems to be caused by sourcing sample.R twice: once before invoking the Spark session and again within the Spark session.
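A possible workaround (an untested sketch, assuming the overflow comes from the driver-side environment being captured and serialized along with the function passed to spark.lapply): keep the driver environment clean and source sample.R only inside the worker function, into a local environment, so the shipped closure does not drag the globally sourced definitions along:
{code:r}
library(SparkR)
sparkR.session()

dates <- c('2021-01-01', '2021-01-02')

## Sketch: helpers are defined only on the worker, in a local environment,
## so the function handed to spark.lapply captures nothing from the driver.
result <- spark.lapply(dates, function(rdate) {
  source('sample.R', local = TRUE)
  worker_closure <- myclosure_func()
  worker_closure$get_some_date(rdate)
})
{code}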
> SparkR throws node stack overflow
> ---------------------------------
>
> Key: SPARK-34791
> URL: https://issues.apache.org/jira/browse/SPARK-34791
> Project: Spark
> Issue Type: Question
> Components: SparkR
> Affects Versions: 3.0.1
> Reporter: Jeet
> Priority: Major
>
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)