Posted to user@spark.apache.org by Peng Cheng <pc...@uow.edu.au> on 2014/11/04 03:38:43 UTC

How to make sure a ClassPath is always shipped to workers?

I have a Spark application that serializes a 'Seq[Page]' object, saves it to
HDFS/S3, and reads it back on another worker to be used elsewhere. The
serialization and deserialization use the same serializer as Spark itself
(obtained from SparkEnv.get.serializer.newInstance()).
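For reference, Spark's default JavaSerializer is a thin wrapper around plain
java.io serialization. A self-contained sketch of the same round-trip, with a
hypothetical Page case class standing in for the real one (which isn't shown
here):

```scala
import java.io._

// Stand-in for the real Page class (hypothetical; the original isn't shown).
case class Page(url: String, body: String)

object RoundTrip {
  // Serialize a Seq[Page] to bytes, as Java serialization would.
  def serialize(pages: Seq[Page]): Array[Byte] = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    out.writeObject(pages)
    out.close()
    bytes.toByteArray
  }

  // Deserialize on the reading side; this is where ClassNotFoundException
  // surfaces if Page is not visible to the resolving classloader.
  def deserialize(data: Array[Byte]): Seq[Page] = {
    val in = new ObjectInputStream(new ByteArrayInputStream(data))
    try in.readObject().asInstanceOf[Seq[Page]] finally in.close()
  }
}
```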

However I sporadically get this error:

java.lang.ClassNotFoundException: org.***.***.Page
	at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:274)
	at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:59)
	at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1612)
	at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
	at scala.collection.immutable.$colon$colon.readObject(List.scala:362)
	at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
	at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:62)

It seems like the Page class wasn't shipped with the jar to the executor,
and all its information was erased at runtime.

The weirdest thing: this error doesn't always happen. Sometimes the old
Seq[Page] is read back properly; sometimes it throws the exception. How
could this happen, and how do I fix it?

Yours Peng



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/How-to-make-sure-a-ClassPath-is-always-shipped-to-workers-tp18018.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org


Re: How to make sure a ClassPath is always shipped to workers?

Posted by Peng Cheng <pc...@uow.edu.au>.
Thanks a lot! Unfortunately this is not my problem: the Page class is already
in the jar that is shipped to every worker. (I've logged into the workers and
unpacked the jar files, and the class file is right there as intended.)
Also, this error only happens sporadically, not every time. The failed task
was sometimes automatically retried on a different worker and succeeded (but
it isn't overcome when retried manually on the same partition), which makes
it hard to catch.
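One cause consistent with these symptoms (jar present on the worker, only
sporadic failures) is deserialization running on a thread whose default
classloader doesn't see the application jar; notably, the stack trace in the
first message shows the lookup going through sun.misc.Launcher$AppClassLoader
rather than Spark's executor classloader. A minimal sketch of a loader-aware
deserializer, assuming the application jar is visible to the classloader you
pass in (the helper names here are made up for illustration):

```scala
import java.io._

// An ObjectInputStream that resolves classes through an explicit
// classloader instead of the JVM default.
class LoaderAwareObjectInputStream(in: InputStream, loader: ClassLoader)
    extends ObjectInputStream(in) {
  override def resolveClass(desc: ObjectStreamClass): Class[_] =
    Class.forName(desc.getName, false, loader)
}

object LoaderAware {
  // Deserialize using, e.g., Thread.currentThread().getContextClassLoader,
  // which on executors should include jars shipped with the application.
  def deserialize[T](data: Array[Byte], loader: ClassLoader): T = {
    val in = new LoaderAwareObjectInputStream(
      new ByteArrayInputStream(data), loader)
    try in.readObject().asInstanceOf[T] finally in.close()
  }
}
```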

Yours Peng





Re: How to make sure a ClassPath is always shipped to workers?

Posted by Akhil Das <ak...@sigmoidanalytics.com>.
You can add your custom jar to SPARK_CLASSPATH inside the spark-env.sh file
and restart the cluster to get it shipped to all the workers. You can also
use the .setJars option to add the jar while creating the SparkContext.
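Concretely, the two options might look like this (the jar path and app name
are placeholders):

```scala
// Option 1: add the jar to SPARK_CLASSPATH in conf/spark-env.sh on each
// node, then restart the cluster:
//   SPARK_CLASSPATH="$SPARK_CLASSPATH:/path/to/your-app.jar"

// Option 2: ship the jar when creating the SparkContext:
import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setAppName("my-app")                  // placeholder name
  .setJars(Seq("/path/to/your-app.jar")) // placeholder path
val sc = new SparkContext(conf)
```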

Thanks
Best Regards

On Tue, Nov 4, 2014 at 8:12 AM, Peng Cheng <pc...@uow.edu.au> wrote:

> Sorry its a timeout duplicate, please remove it
