Posted to user@spark.apache.org by Jan Algermissen <al...@icloud.com> on 2015/08/25 00:00:09 UTC

Strange ClassNotFoundException in spark-shell

Hi,

I am using Spark 1.4 with the Cassandra connector (1.4.0-M1) and I run into a strange error when using the spark shell.

This works:

sc.cassandraTable("events", "bid_events").select("bid","type").take(10).foreach(println)


But as soon as I add a map() (or a filter()) to the chain:

sc.cassandraTable("events", "bid_events").select("bid","type").map(r => r).take(10).foreach(println)


I get the exception below.

The spark-shell call is:

/opt/spark/bin/spark-shell --master spark://xxxxx-1:7077 --conf spark.cassandra.connection.host=$(hostname -i) --driver-class-path $(echo /root/*.jar |sed 's/ /:/g') --jars spark-cassandra-connector-assembly-1.4.0-M1-SNAPSHOT.jar
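
In case it matters: as far as I understand, --driver-class-path only puts jars on the driver's classpath, while only the jars passed via --jars (or added with sc.addJar) are actually shipped to the executors. To see what the shell will distribute, I can print the effective setting from inside the shell (plain SparkConf API, nothing connector-specific):

// Lists the jars Spark distributes to executors; the connector assembly
// should show up here, the --driver-class-path entries will not.
println(sc.getConf.get("spark.jars", "(none)"))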

Can anyone provide ideas on how to approach debugging this?
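
For what it's worth, my own first step would be to take the connector out of the picture entirely (a minimal sketch, run in the same shell session, plain Spark only):

// If this fails with the same ClassNotFoundException: $anonfun$1, then the
// REPL is not shipping its generated closure classes to the executors at
// all, and the Cassandra connector is unrelated.
sc.parallelize(1 to 10).map(x => x).take(10).foreach(println)

If the plain RDD works, at least the REPL class distribution itself is fine and I can focus on the connector setup.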

Jan


15/08/24 23:54:43 INFO DAGScheduler: Job 0 failed: take at <console>:32, took 1.999875 s
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 3, 10.31.39.116): java.lang.ClassNotFoundException: $anonfun$1
	at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:348)
	at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:66)
	at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1613)
	at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1774)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
	at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:69)
	at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:95)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:58)
	at org.apache.spark.scheduler.Task.run(Task.scala:70)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)

Driver stacktrace:
  at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1273)
  at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1264)
  at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1263)
  at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
  at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
  at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1263)
  at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:730)
  at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:730)
  at scala.Option.foreach(Option.scala:257)
  at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:730)
  at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1457)
  at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1418)
  at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)

scala> 