Posted to issues@spark.apache.org by "Rahul Palamuttam (JIRA)" <ji...@apache.org> on 2016/03/03 04:30:18 UTC

[jira] [Created] (SPARK-13634) Assigning spark context to variable results in serialization error

Rahul Palamuttam created SPARK-13634:
----------------------------------------

             Summary: Assigning spark context to variable results in serialization error
                 Key: SPARK-13634
                 URL: https://issues.apache.org/jira/browse/SPARK-13634
             Project: Spark
          Issue Type: Bug
          Components: Spark Shell
            Reporter: Rahul Palamuttam


The following lines of code cause a task serialization error when executed in the spark-shell. Note that the error does not occur when submitting the code as a batch job - via spark-submit.

val temp = 10
val newSC = sc
val newRDD = newSC.parallelize(0 to 100).map(p => p + temp)

For some reason, when temp is pulled into the referencing environment of the closure, so is the SparkContext.
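A commonly suggested shell-side workaround (not a fix for the underlying REPL behaviour) is to copy the captured value into a method-local val, so the closure references only that local and not the REPL line object that also holds the SparkContext. A minimal sketch, assuming the same sc and temp as above (addOffset and localOffset are illustrative names, not part of the original report):

// Illustrative workaround sketch
def addOffset(offset: Int): org.apache.spark.rdd.RDD[Int] = {
  val localOffset = offset  // method-local copy; only this is captured by the closure
  newSC.parallelize(0 to 100).map(p => p + localOffset)
}
val newRDD = addOffset(temp)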

We originally hit this issue in the SciSpark project, when referencing a string variable inside a lambda expression passed to RDD.map(...).

Any insight into how this could be resolved would be appreciated.
While the above code is trivial, SciSpark uses a wrapper around the SparkContext to read from various file formats. We want to keep this class structure and also use it in notebook and shell environments.
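For illustration, one pattern that can keep a context out of serialized closures is to mark the SparkContext field @transient in the wrapper class. A minimal, hypothetical sketch (class and method names are illustrative, not SciSpark's actual API):

import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD

// Illustrative wrapper, not SciSpark's real class: @transient keeps the
// context out of any closure that accidentally captures the wrapper instance.
class ContextWrapper(@transient val sc: SparkContext) extends Serializable {
  def loadRange(n: Int): RDD[Int] = sc.parallelize(0 to n)
}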



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org