Posted to user@spark.apache.org by Nikhil Goyal <no...@gmail.com> on 2020/02/21 19:28:32 UTC

Serialization error when using scala kernel with Jupyter

Hi all,
I am trying to use the Almond Scala kernel to run a Spark session on Jupyter. I
am using Scala version 2.12.8, and I am creating the Spark session with master
set to YARN.
This is the code:

val rdd = spark.sparkContext.parallelize(Seq(1, 2, 4))
rdd.map(x => x + 1).collect()
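
For context, the session is presumably created along these lines (a hypothetical
sketch of the setup described above; the builder call is not shown in the original
message and the app name is a placeholder):

import org.apache.spark.sql.SparkSession

// Hypothetical session setup matching the description above:
// master set to YARN; the app name is a made-up placeholder.
val spark = SparkSession.builder()
  .appName("almond-jupyter-test")
  .master("yarn")
  .getOrCreate()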

Exception:

java.lang.ClassCastException: cannot assign instance of
java.lang.invoke.SerializedLambda to field
org.apache.spark.rdd.MapPartitionsRDD.f of type scala.Function3 in
instance of org.apache.spark.rdd.MapPartitionsRDD


I was wondering if anyone has seen this before.

Thanks
Nikhil

Re: Serialization error when using scala kernel with Jupyter

Posted by "Apostolos N. Papadopoulos" <pa...@csd.auth.gr>.
collect() returns the contents of the RDD back to the driver, so it should be
assigned to a local variable. Where is the local variable in your code?

Try

val result = rdd.map(x => x + 1).collect()
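
For reference, with the input Seq(1, 2, 4) from the original message, a
successful run would produce:

// result: Array[Int] = Array(2, 3, 5)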

regards,

Apostolos




-- 
Apostolos N. Papadopoulos, Associate Professor
Department of Informatics
Aristotle University of Thessaloniki
Thessaloniki, GREECE
tel: ++0030312310991918
email: papadopo@csd.auth.gr
twitter: @papadopoulos_ap
web: http://datalab.csd.auth.gr/~apostol