Posted to users@zeppelin.apache.org by Ben Vogan <be...@shopkick.com> on 2017/04/12 20:16:02 UTC

org.apache.spark.SparkException: Could not parse Master URL: 'yarn'

Hello all,

I am trying to install Zeppelin 0.7.1 on my CDH 5.7 cluster. I have been
following the instructions here:

https://zeppelin.apache.org/docs/0.7.1/install/install.html
https://zeppelin.apache.org/docs/0.7.1/install/configuration.html
https://zeppelin.apache.org/docs/0.7.1/interpreter/spark.html

I copied zeppelin-env.sh.template to zeppelin-env.sh and made the
following changes:
export JAVA_HOME=/usr/java/latest
export MASTER=yarn-client

export ZEPPELIN_LOG_DIR=/var/log/services/zeppelin
export ZEPPELIN_PID_DIR=/services/zeppelin/data
export ZEPPELIN_WAR_TEMPDIR=/services/zeppelin/data/jetty_tmp
export ZEPPELIN_NOTEBOOK_DIR=/services/zeppelin/data/notebooks
export ZEPPELIN_NOTEBOOK_PUBLIC=true

export SPARK_HOME=/opt/cloudera/parcels/CDH/lib/spark
export HADOOP_CONF_DIR=/etc/spark/conf/yarn-conf
export PYSPARK_PYTHON=/usr/lib/python
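
As a sanity check that these paths actually resolve (this assumes the CDH
parcel layout above, run from the Zeppelin install directory):

# source the env file first so the variables below are set
. conf/zeppelin-env.sh
ls "$SPARK_HOME/bin/spark-submit"    # spark-submit must exist under SPARK_HOME
ls "$HADOOP_CONF_DIR/yarn-site.xml"  # YARN client configs must be readable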

I then start Zeppelin, open the UI in my browser, and create a Spark note:

%spark
sqlContext.sql("select 1+1").collect().foreach(println)
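
(With a working SparkContext this should just print [2].)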

Instead, I get this error:

org.apache.spark.SparkException: Could not parse Master URL: 'yarn'
	at org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2746)
	at org.apache.spark.SparkContext.<init>(SparkContext.scala:533)
	at org.apache.zeppelin.spark.SparkInterpreter.createSparkContext_1(SparkInterpreter.java:484)
	at org.apache.zeppelin.spark.SparkInterpreter.createSparkContext(SparkInterpreter.java:382)
	at org.apache.zeppelin.spark.SparkInterpreter.getSparkContext(SparkInterpreter.java:146)
	at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:828)
	at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:70)
	at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:483)
	at org.apache.zeppelin.scheduler.Job.run(Job.java:175)
	at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:139)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)

I specified "yarn-client" as indicated by the instructions so I'm not sure
where it is getting "yarn" from.  In my spark-defaults.conf it
spark.master=yarn-client as well.
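
For reference, a quick way to list every place a master value could be
coming from (run from the Zeppelin install directory; the
spark-defaults.conf path assumes CDH's /etc/spark/conf, and
interpreter.json only exists after Zeppelin's first start):

# case-insensitive so it catches both MASTER=... and "master": "..."
grep -in "master" conf/zeppelin-env.sh conf/interpreter.json
grep -n "spark.master" /etc/spark/conf/spark-defaults.conf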

Help would be greatly appreciated.

Thanks,
-- 
*BENJAMIN VOGAN* | Data Platform Team Lead


Re: org.apache.spark.SparkException: Could not parse Master URL: 'yarn'

Posted by Ben Vogan <be...@shopkick.com>.
I discovered that interpreter.json had "master": "yarn", and this seems
to take precedence over what is in the zeppelin-env.sh file. Changing that
to yarn-client resolved my issue.
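
For anyone else who hits this, the edit amounts to something like the
following (paths assume a stock Zeppelin layout, and the whitespace in the
JSON may differ; the same property can also be changed from the Interpreter
page in the UI):

# point the stored interpreter setting back at yarn-client, then restart
sed -i 's/"master": "yarn"/"master": "yarn-client"/' conf/interpreter.json
bin/zeppelin-daemon.sh restart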

--Ben



-- 
*BENJAMIN VOGAN* | Data Platform Team Lead


Re: org.apache.spark.SparkException: Could not parse Master URL: 'yarn'

Posted by Chaoran Yu <yu...@gmail.com>.
I suspect this is due to not setting SPARK_EXECUTOR_URI.

I’ve run Zeppelin with Spark on Mesos. I ran into a similar exception where Zeppelin was not able to parse the MASTER URL, which is “mesos://leader.mesos:5050” in my case. Then I found out that I had the following setting:
SPARK_EXECUTOR_URI=https://www.apache.org/dist/spark/spark-2.1.0/spark-2.1.0-bin-hadoop2.6.tgz
which is not built for Mesos.

After changing it to the following
SPARK_EXECUTOR_URI=https://downloads.mesosphere.com/spark/assets/spark-2.1.0-bin-2.6.tgz
the exception was gone.

In your case, you might want to look at this page: http://archive-primary.cloudera.com/cdh5/cdh/5/
So I guess something like http://archive-primary.cloudera.com/cdh5/cdh/5/spark-1.6.0-cdh5.7.6.tar.gz should work as a value for SPARK_EXECUTOR_URI.
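
In zeppelin-env.sh that would look something like this (untested; the
exact CDH tarball name is a guess on my part):

export SPARK_EXECUTOR_URI=http://archive-primary.cloudera.com/cdh5/cdh/5/spark-1.6.0-cdh5.7.6.tar.gz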

--
Chaoran Yu
