Posted to users@zeppelin.apache.org by moon soo Lee <mo...@apache.org> on 2015/09/09 17:19:04 UTC

Re: ZeppelinContext not found in spark executor classpath

Hi,

Recently https://github.com/apache/incubator-zeppelin/pull/270 was merged
into the master branch, and I believe it solves this problem.
Let me know if it helps.
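If you cannot move to a build that includes that change yet, one workaround
is to make the class visible to the executors yourself by shipping the
Zeppelin Spark interpreter jar with the job, e.g. via the spark.jars
property that already appears in your zeppelin-env.sh. A minimal sketch
(the jar path is a placeholder; point it at the interpreter jar produced
by your own build):

# Ship the Zeppelin Spark interpreter jar with each job so executors can
# resolve ZeppelinContext while deserializing closures (placeholder path):
export ZEPPELIN_JAVA_OPTS="-Dspark.jars=/path/to/zeppelin-spark-<version>.jar"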

Thanks,
moon

On Thu, Aug 20, 2015 at 2:36 AM David Salinas <da...@gmail.com>
wrote:

> Hi,
>
> I have the same error, which is very annoying and seems to be related to
> the issues you have with UDFs.
>
> Here is a reproducible example that fails when the closure is shipped to
> the executors (Zeppelin was built from the head of master with: mvn install
> -DskipTests -Pspark-1.4 -Dspark.version=1.4.1 -Dhadoop.version=2.2.0
> -Dprotobuf.version=2.5.0).
>
> val textFile = sc.textFile("hdfs://somefile.txt")
>
> // A function value is a small standalone object:
> val f = (s: String) => s
> textFile.map(f).count
> // works fine
> // res145: Long = 407
>
> // A def becomes a method on the enclosing REPL line object, so mapping
> // over it captures that object (and everything it references) in the
> // closure:
> def f(s: String) = {
>   s + s
> }
> textFile.map(f).count
>
> // fails ->
>
> org.apache.spark.SparkException: Job aborted due to stage failure: Task 566 in stage 87.0 failed 4 times, most recent failure: Lost task 566.3 in stage 87.0 (TID 43396, XXX.com): java.lang.NoClassDefFoundError: Lorg/apache/zeppelin/spark/ZeppelinContext;
>   at java.lang.Class.getDeclaredFields0(Native Method)
>   at java.lang.Class.privateGetDeclaredFields(Class.java:2583)
>   at java.lang.Class.getDeclaredField(Class.java:2068)
>   at java.io.ObjectStreamClass.getDeclaredSUID(ObjectStreamClass.java:1659)
>   at java.io.ObjectStreamClass.access$700(ObjectStreamClass.java:72)
>   at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:480)
>   at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:468)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.io.ObjectStreamClass.<init>(ObjectStreamClass.java:468)
>   at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:365)
>   at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:602)
>   at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1623)
>   at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518)
>   at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1774)
>   at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
>   at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)
>   at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924)
>   at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
>   at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
>   at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)
>   at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924)
>   at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
>   at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
>   at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)
>   at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924)
>   at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
>   at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
>   at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)
>   at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924)
>   at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
>   at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
>   at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)
>   at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924)
>   at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
>   at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
>   at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)
>   at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924)
>   at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
>   at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
>   at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)
>   at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924)
>   at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
>   at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
>   at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)
>   at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924)
>   at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
>   at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
>   at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)
>   at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924)
>   at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
>   at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
>   at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)
>   at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924)
>   at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
>   at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
>   at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
>   at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:69)
>   at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:95)
>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:58)
>   at org.apache.spark.scheduler.Task.run(Task.scala:70)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.ClassNotFoundException: org.apache.zeppelin.spark.ZeppelinContext
>   at org.apache.spark.repl.ExecutorClassLoader.findClass(ExecutorClassLoader.scala:69)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   ... 64 more
> Caused by: java.lang.ClassNotFoundException: org.apache.zeppelin.spark.ZeppelinContext
>   at java.lang.ClassLoader.findClass(ClassLoader.java:530)
>   at org.apache.spark.util.ParentClassLoader.findClass(ParentClassLoader.scala:26)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at org.apache.spark.util.ParentClassLoader.loadClass(ParentClassLoader.scala:34)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   at org.apache.spark.util.ParentClassLoader.loadClass(ParentClassLoader.scala:30)
>   at org.apache.spark.repl.ExecutorClassLoader.findClass(ExecutorClassLoader.scala:64)
>   ... 66 more
> Driver stacktrace:
>   at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1273)
>   at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1264)
>   at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1263)
>   at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>   at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>   at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1263)
>   at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:730)
>   at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:730)
>   at scala.Option.foreach(Option.scala:236)
>   at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:730)
>   at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1457)
>   at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1418)
>   at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
>
>
> Best,
>
> David
>

Re: ZeppelinContext not found in spark executor classpath

Posted by moon soo Lee <mo...@apache.org>.
Hi David,

With the master branch (0.6.0-SNAPSHOT):
1) Download and configure Spark, and make sure everything works with the
bin/spark-shell command. If you have extra jars, you can add the
spark.files property in SPARK_HOME/conf/spark-defaults.conf.
2) Export SPARK_HOME in the conf/zeppelin-env.sh file.
3) Start Zeppelin and configure master, spark.executor.memory, etc. in the
Interpreter menu.

I think these steps are the easiest way to set up Zeppelin with extra jars.
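For example, steps 1) and 2) might boil down to something like this (the
paths are placeholders, not from this thread):

# in SPARK_HOME/conf/spark-defaults.conf: ship extra jars with each job
spark.files    /path/to/my-extra.jar

# in conf/zeppelin-env.sh: point Zeppelin at that Spark installation
export SPARK_HOME=/path/to/spark

The master URL, spark.executor.memory and the other Spark settings then go
in the Interpreter menu, as in 3).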

Thanks,
moon

On Thu, Sep 17, 2015 at 5:41 PM David Salinas <da...@gmail.com>
wrote:

> Hi,
>
> I have just tested without setting the classpath in the conf, and 1/ now
> works. Could you tell me the best way to set up the classpath/jars now?
>
> Best,
>
> David
>
> On Thu, Sep 17, 2015 at 10:34 AM, David Salinas <david.salinas.pro@gmail.com> wrote:
>
>> Hi,
>>
>> I have tried this example after
>> https://github.com/apache/incubator-zeppelin/pull/270.
>>
>> But it is not working for several reasons:
>>
>> 1/ Zeppelin context is not found (!):
>> val s = z.input("Foo")
>> <console>:21: error: not found: value z
>>
>> 2/ If I include my jar, the classpath is not propagated to the slaves,
>> so the code works only locally (it used to work on the cluster before
>> this change). I guess there is something wrong with the way I set the
>> classpath (which is probably also linked to 1/).
>>
>> I have added this line in zeppelin-env.sh to use one of my jars:
>> export ZEPPELIN_JAVA_OPTS="-Dspark.driver.host=`hostname`
>> -Dspark.mesos.coarse=true -Dspark.executor.memory=20g -Dspark.cores.max=80
>> -Dspark.jars=${SOME_JAR} -cp ${SOME_CLASSPATH_FOR_THE_JAR}"
>>
>> How can one add extra classpath jars with this new version? Could you
>> add ZEPPELIN_EXTRA_JAR and ZEPPELIN_EXTRA_CLASSPATH variables to
>> zeppelin-env.sh so that users can easily add their own code?
>>
>> Best,
>>
>> David
>>
>
>

Re: ZeppelinContext not found in spark executor classpath

Posted by David Salinas <da...@gmail.com>.
Hi,

I have just tested without setting the classpath in the conf, and 1/ now works.
Could you tell me the best way to set up the classpath/jars now?

Best,

David

On Thu, Sep 17, 2015 at 10:34 AM, David Salinas <david.salinas.pro@gmail.com> wrote:

> Hi,
>
> I have tried this example after
> https://github.com/apache/incubator-zeppelin/pull/270.
>
> But it is not working for several reasons:
>
> 1/ Zeppelin context is not found (!):
> val s = z.input("Foo")
> <console>:21: error: not found: value z
>
> 2/ If I include my jar, the classpath is not propagated to the slaves, so
> the code works only locally (it used to work on the cluster before this
> change). I guess there is something wrong with the way I set the classpath
> (which is probably also linked to 1/).
>
> I have added this line in zeppelin-env.sh to use one of my jars:
> export ZEPPELIN_JAVA_OPTS="-Dspark.driver.host=`hostname`
> -Dspark.mesos.coarse=true -Dspark.executor.memory=20g -Dspark.cores.max=80
> -Dspark.jars=${SOME_JAR} -cp ${SOME_CLASSPATH_FOR_THE_JAR}"
>
> How can one add extra classpath jars with this new version? Could you
> add ZEPPELIN_EXTRA_JAR and ZEPPELIN_EXTRA_CLASSPATH variables to
> zeppelin-env.sh so that users can easily add their own code?
>
> Best,
>
> David
>

Re: ZeppelinContext not found in spark executor classpath

Posted by David Salinas <da...@gmail.com>.
Hi,

I have tried this example after
https://github.com/apache/incubator-zeppelin/pull/270.

But it is not working for several reasons:

1/ Zeppelin context is not found (!):
val s = z.input("Foo")
<console>:21: error: not found: value z

2/ If I include my jar, the classpath is not propagated to the slaves, so
the code works only locally (it used to work on the cluster before this
change). I guess there is something wrong with the way I set the classpath
(which is probably also linked to 1/).

I have added this line in zeppelin-env.sh to use one of my jars:
export ZEPPELIN_JAVA_OPTS="-Dspark.driver.host=`hostname`
-Dspark.mesos.coarse=true -Dspark.executor.memory=20g -Dspark.cores.max=80
-Dspark.jars=${SOME_JAR} -cp ${SOME_CLASSPATH_FOR_THE_JAR}"

How can one add extra classpath jars with this new version? Could you add
ZEPPELIN_EXTRA_JAR and ZEPPELIN_EXTRA_CLASSPATH variables to zeppelin-env.sh
so that users can easily add their own code?

Best,

David