Posted to user@spark.apache.org by santhoma <sa...@yahoo.com> on 2014/03/24 10:36:45 UTC

Re: Java API - Serialization Issue

I am also facing the same problem. I have implemented Serializable for my
own code, but the exception is thrown from third-party libraries over which
I have no control.

Exception in thread "main" org.apache.spark.SparkException: Job aborted: Task not serializable: java.io.NotSerializableException: (lib class name here)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1028)

Is it mandatory that Serializable be implemented for dependent jars as
well?
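
To make the failure mode concrete, here is a minimal sketch of how this
typically arises (ThirdPartyClient and its process method are hypothetical
stand-ins for the library class named in the exception). The anonymous
function holds a reference to the enclosing Job instance, so Spark tries to
serialize the whole object, including the non-serializable field:

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.function.Function;

public class Job implements java.io.Serializable {

    // hypothetical 3rd party class that does not implement Serializable
    private final ThirdPartyClient client = new ThirdPartyClient();

    public JavaRDD<String> run(JavaRDD<String> lines) {
        return lines.map(new Function<String, String>() {
            public String call(String line) {
                // referencing 'client' captures the enclosing Job, so the
                // task closure drags the non-serializable client with it
                return client.process(line);
            }
        });
    }
}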




Re: Java API - Serialization Issue

Posted by santhoma <sa...@yahoo.com>.
This worked great. Thanks a lot.




Re: Java API - Serialization Issue

Posted by Sourav Chandra <so...@livestream.com>.
I can suggest two things:

1. While creating workers and submitting tasks, make sure you are not
capturing any unwanted external class reference (one that is not used in
the closure and is not serializable); a sketch for this follows the
point-2 example below.
2. If this is ensured and you still get an issue from a 3rd party library,
you can mark that 3rd party reference as transient in your code and define
a private void readObject(ObjectInputStream) method (an instance method,
not static) to re-initialize that particular variable on deserialization.

e.g., for point 2 (in Java):

import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.Serializable;

class MyClass implements Serializable {

    // stand-in type for the 3rd party class that is not serializable
    private transient ThirdPartyRef ref = initRef();

    ....

    private ThirdPartyRef initRef() {
        ref = ....; // build/configure the 3rd party object here
        return ref;
    }

    // must be a private instance method with exactly this signature,
    // otherwise Java serialization silently ignores it
    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject(); // java default serialization for all
                                // other fields
        ref = initRef();        // re-create the transient reference
    }
}
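
On deserialization at the worker, defaultReadObject() restores the
ordinary fields and initRef() rebuilds the transient one locally, so the
3rd party object never needs to cross the wire.

And for point 1, a minimal sketch of keeping unwanted references out of
the closure (the Job class and addPrefix helper are hypothetical, just for
illustration). A function created in a static context has no hidden
reference to an enclosing instance, so only what it explicitly captures
gets serialized:

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.function.Function;

public class Job {

    // built in a static context: the anonymous class has no hidden
    // this$0 field, so only 'prefix' (a plain String) is captured
    static Function<String, String> addPrefix(final String prefix) {
        return new Function<String, String>() {
            public String call(String line) {
                return prefix + line;
            }
        };
    }

    public JavaRDD<String> run(JavaRDD<String> lines) {
        // nothing non-serializable rides along with this task closure
        return lines.map(addPrefix("out: "));
    }
}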

Thanks,
Sourav


--
Sourav Chandra
Senior Software Engineer
sourav.chandra@livestream.com
o: +91 80 4121 8723
m: +91 988 699 3746
skype: sourav.chandra

Livestream
"Ajmera Summit", First Floor, #3/D, 68 Ward, 3rd Cross, 7th C Main, 3rd
Block, Koramangala Industrial Area,
Bangalore 560034
www.livestream.com