Posted to user@spark.apache.org by Peter Aberline <pe...@gmail.com> on 2014/09/08 19:03:41 UTC
Spark-submit ClassNotFoundException with JAR!
Hi,
I'm having problems with a ClassNotFoundException using this simple example:
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf
import java.net.URLClassLoader
import scala.util.Marshal
class ClassToRoundTrip(val id: Int) extends scala.Serializable {
}
object RoundTripTester {
  def test(id : Int) : ClassToRoundTrip = {
    // Get the current classpath and output. Can we see simpleapp jar?
    val cl = ClassLoader.getSystemClassLoader
    val urls = cl.asInstanceOf[URLClassLoader].getURLs
    urls.foreach(url => println("Executor classpath is:" + url.getFile))
    // Simply instantiating an instance of object and using it works fine.
    val testObj = new ClassToRoundTrip(id)
    println("testObj.id: " + testObj.id)
    val testObjBytes = Marshal.dump(testObj)
    val testObjRoundTrip =
      Marshal.load[ClassToRoundTrip](testObjBytes) // <<-- ClassNotFoundException here
    testObjRoundTrip
  }
}
object SimpleApp {
  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("Simple Application")
    val sc = new SparkContext(conf)
    val cl = ClassLoader.getSystemClassLoader
    val urls = cl.asInstanceOf[URLClassLoader].getURLs
    urls.foreach(url => println("Driver classpath is: " + url.getFile))
    val data = Array(1, 2, 3, 4, 5)
    val distData = sc.parallelize(data)
    distData.foreach(x=> RoundTripTester.test(x))
  }
}
In local mode, submitting as per the docs throws a ClassNotFoundException
at the Marshal.load call (line 31 of my source file), where the
ClassToRoundTrip object is deserialized. Strangely, the earlier use of the
class (line 28), where it is simply instantiated, is okay:
spark-submit --class "SimpleApp" \
--master local[4] \
target/scala-2.10/simpleapp_2.10-1.0.jar
However, if I add the extra parameters "--driver-class-path" and
"--jars", it works fine in local mode:
spark-submit --class "SimpleApp" \
--master local[4] \
--driver-class-path /home/xxxxxxx/workspace/SimpleApp/target/scala-2.10/simpleapp_2.10-1.0.jar \
--jars /home/xxxxxxx/workspace/SimpleApp/target/scala-2.10/SimpleApp.jar \
target/scala-2.10/simpleapp_2.10-1.0.jar
However, submitting to a local dev master still generates the same exception:
spark-submit --class "SimpleApp" \
--master spark://localhost.localdomain:7077 \
--driver-class-path /home/xxxxxxx/workspace/SimpleApp/target/scala-2.10/simpleapp_2.10-1.0.jar \
--jars /home/xxxxxxx/workspace/SimpleApp/target/scala-2.10/simpleapp_2.10-1.0.jar \
target/scala-2.10/simpleapp_2.10-1.0.jar
I can see from the output that the JAR file is being fetched by the executor.
Logs for one of the executors are here:
stdout: http://pastebin.com/raw.php?i=DQvvGhKm
stderr: http://pastebin.com/raw.php?i=MPZZVa0Q
I'm using Spark 1.0.2. The ClassToRoundTrip class is included in the JAR.
I have a workaround: copying the JAR to each of the machines and
setting the "spark.executor.extraClassPath" parameter, but I would
rather not have to do that.
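As a rough sketch of that workaround, the property can be set on the SparkConf before the SparkContext is created. This is only one possible way to set it, and the JAR path below is a hypothetical placeholder for wherever the JAR was copied on each worker machine; neither detail comes from the setup described above.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object SimpleAppWithExtraClassPath {
  def main(args: Array[String]): Unit = {
    // Hypothetical path: the JAR must already exist at this location
    // on every machine that runs an executor.
    val jarOnWorkers = "/path/on/each/worker/simpleapp_2.10-1.0.jar"

    val conf = new SparkConf()
      .setAppName("Simple Application")
      // Adds the JAR to the classpath of each executor JVM at launch.
      .set("spark.executor.extraClassPath", jarOnWorkers)

    val sc = new SparkContext(conf)
    try {
      // Trivial job, only to show that the context starts with this setting.
      println("Sum: " + sc.parallelize(Array(1, 2, 3, 4, 5)).reduce(_ + _))
    } finally {
      sc.stop()
    }
  }
}
```

The drawback is the one already mentioned: the JAR has to be copied to every machine by hand and kept in sync with each rebuild.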
This is such a simple case that I must be doing something obviously wrong.
Can anyone help?
Thanks
Peter