Posted to users@zeppelin.apache.org by Han-Cheol Cho <pr...@gmail.com> on 2015/10/28 09:28:25 UTC

interpreter.sh clears ZEPPELIN_CLASSPATH when $SPARK_HOME is set

Hello, Zeppelin mailing list members,


I have installed Zeppelin on a CDH 5.4.3 Hadoop cluster, but I couldn't run
Spark due to a NoClassDefFoundError for HiveConf.

...
15/10/26 19:38:29 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
15/10/26 19:38:30 INFO scheduler.EventLoggingListener: Logging events to hdfs://mycluster/user/spark/applicationHistory/local-1445855908613
15/10/26 19:38:30 ERROR scheduler.Job: Job failed
java.lang.NoClassDefFoundError: org/apache/hadoop/hive/conf/HiveConf
        at java.lang.Class.getDeclaredConstructors0(Native Method)
        at java.lang.Class.privateGetDeclaredConstructors(Class.java:2663)
        at java.lang.Class.getConstructor0(Class.java:3067)
        at java.lang.Class.getConstructor(Class.java:1817)
        at org.apache.zeppelin.spark.SparkInterpreter.getSQLContext(SparkInterpreter.java:210)
        at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:476)
...


The Hive jar files are already added in conf/zeppelin-env.sh by setting the
ZEPPELIN_CLASSPATH variable as follows:

export ZEPPELIN_CLASSPATH="/usr/lib/hive/lib/*"


After digging into the problem for a few hours, I found that bin/interpreter.sh
clears ZEPPELIN_CLASSPATH before appending SPARK_APP_JAR to it:

# set spark related env variables
if [[ "${INTERPRETER_ID}" == "spark" ]]; then
  if [[ -n "${SPARK_HOME}" ]]; then
    export SPARK_SUBMIT="${SPARK_HOME}/bin/spark-submit"
    SPARK_APP_JAR="$(ls ${ZEPPELIN_HOME}/interpreter/spark/zeppelin-spark*.jar)"
    # This will eventually pass SPARK_APP_JAR to the classpath of SparkIMain
    ZEPPELIN_CLASSPATH=${SPARK_APP_JAR}
    ...

Changing it to ZEPPELIN_CLASSPATH+=":${SPARK_APP_JAR}" solves the problem.
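
To illustrate the difference, here is a minimal sketch (the jar file name below
is hypothetical and only for illustration; the Hive path is the one set in
zeppelin-env.sh above):

```
# "=" overwrites ZEPPELIN_CLASSPATH, "+=" appends to it.
ZEPPELIN_CLASSPATH="/usr/lib/hive/lib/*"      # set earlier in conf/zeppelin-env.sh
SPARK_APP_JAR="zeppelin-spark-0.5.5.jar"      # hypothetical jar name for illustration

ZEPPELIN_CLASSPATH=${SPARK_APP_JAR}           # overwrite: the Hive jars are lost
echo "${ZEPPELIN_CLASSPATH}"                  # -> zeppelin-spark-0.5.5.jar

ZEPPELIN_CLASSPATH="/usr/lib/hive/lib/*"      # reset for comparison
ZEPPELIN_CLASSPATH+=":${SPARK_APP_JAR}"       # append: both entries are kept
echo "${ZEPPELIN_CLASSPATH}"                  # -> /usr/lib/hive/lib/*:zeppelin-spark-0.5.5.jar
```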



I wonder whether this is a bug or intentional behavior.


Best wishes,
Han-Cheol


-- 
Han-Cheol Cho (Ph.D)
Homepage: https://sites.google.com/site/priancho/

Re: interpreter.sh clears ZEPPELIN_CLASSPATH when $SPARK_HOME is set

Posted by Mina Lee <mi...@nflabs.com>.
Hi Han-Cheol,
to state the conclusion first: this is expected behavior.
Once you set SPARK_HOME in zeppelin-env.sh, Zeppelin runs Spark via
spark-submit, and spark-submit adds $SPARK_HOME/lib/spark-assembly.*hadoop*.jar
(which includes the `HiveConf` class) to the Spark launch classpath.

In this case, Spark must be installed on the same machine where Zeppelin is
installed. Does your environment satisfy this condition?
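
As a quick check (assuming a Spark 1.x layout where the assembly jar lives
under $SPARK_HOME/lib; adjust the path if your distribution differs), you can
verify this on the Zeppelin host:

```
# Run on the machine where Zeppelin is installed.
echo "SPARK_HOME=${SPARK_HOME}"                    # should point to a local Spark installation
ls "${SPARK_HOME}/bin/spark-submit"                # spark-submit must exist locally
ls "${SPARK_HOME}"/lib/spark-assembly*hadoop*.jar  # the assembly jar that bundles HiveConf
```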

And if you want to add external jars, we recommend editing
$SPARK_HOME/conf/spark-defaults.conf as below:
```
spark.master spark://masterhost:7077
spark.files /path/your.jar
```
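
Applied to the original question, a hedged sketch of how the Hive jars could be
put on the driver classpath instead (spark.driver.extraClassPath is a standard
Spark property, not the one shown above; the /usr/lib/hive/lib path comes from
the earlier message, so adjust it for your environment):

```
# Append a classpath entry to spark-defaults.conf (run on the Zeppelin/Spark host).
cat >> "${SPARK_HOME}/conf/spark-defaults.conf" <<'EOF'
spark.driver.extraClassPath /usr/lib/hive/lib/*
EOF
```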

Hope this helps, and if you have more questions, feel free to ask.


