Posted to users@zeppelin.apache.org by Sourav Mazumder <so...@gmail.com> on 2015/09/23 22:21:22 UTC

Running Zeppelin remotely to submit Spark job in Yarn Cluster mode

Hi,

When I try to run the Spark interpreter in yarn-cluster mode from a remote
machine, I always get an error saying to use spark-submit rather than the
SparkContext.

My Zeppelin process runs on a separate machine, remote from the YARN cluster.

Any idea why this error occurs?

Regards,
Sourav

Re: Running Zeppelin remotely to submit Spark job in Yarn Cluster mode

Posted by moon soo Lee <mo...@apache.org>.
If you're using the master branch, I recommend exporting only SPARK_HOME in
conf/zeppelin-env.sh. Zeppelin will then use the spark-submit command to run
the Spark interpreter, which is supposed to work exactly the same as running
the job with the spark-submit command directly.
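
For reference, a minimal conf/zeppelin-env.sh along these lines might look as
follows (the SPARK_HOME path is an assumption; point it at your own Spark
installation):

# conf/zeppelin-env.sh -- minimal sketch so Zeppelin launches the Spark
# interpreter through spark-submit (path below is illustrative only)
export SPARK_HOME=/usr/iop/current/spark-client
# HADOOP_CONF_DIR may also be needed so spark-submit can find the YARN
# config, as discussed elsewhere in this thread:
export HADOOP_CONF_DIR=/etc/hadoop/conf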

Thanks,
moon

Re: Running Zeppelin remotely to submit Spark job in Yarn Cluster mode

Posted by Sourav Mazumder <so...@gmail.com>.
I could execute the following without any issue:

spark-submit --class org.apache.spark.examples.SparkPi \
  --master yarn-cluster --num-executors 1 --driver-memory 512m \
  --executor-memory 512m --executor-cores 1 lib/spark-examples.jar 10

Regards,
Sourav


Re: Running Zeppelin remotely to submit Spark job in Yarn Cluster mode

Posted by ๏̯͡๏ <ÐΞ€ρ@Ҝ>, de...@gmail.com.
Did you try a test job with yarn-cluster (outside Zeppelin)?

-- 
Deepak

Re: Running Zeppelin remotely to submit Spark job in Yarn Cluster mode

Posted by Sourav Mazumder <so...@gmail.com>.
Yes, I have them set up appropriately.

Where I am lost is this: I can see that the interpreter is running
spark-submit, but at some point it switches to creating a SparkContext.

Maybe, as you rightly mentioned, it is not able to run the driver on the YARN
cluster because of some permission issue. But I'm not able to figure out what
that issue or required configuration is.

Regards,
Sourav


Re: Running Zeppelin remotely to submit Spark job in Yarn Cluster mode

Posted by ๏̯͡๏ <ÐΞ€ρ@Ҝ>, de...@gmail.com.
Do you have these settings configured in zeppelin-env.sh?

export JAVA_HOME=/usr/src/jdk1.7.0_79/

export HADOOP_CONF_DIR=/etc/hadoop/conf

Most likely you do, since you're able to run with yarn-client.


It looks like the issue is that the driver program cannot be run on the
cluster.
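
As a quick sanity check (this assumes a standard client-side YARN config
under the HADOOP_CONF_DIR above), you can confirm the Zeppelin host actually
sees the cluster's ResourceManager:

# print the ResourceManager settings visible from the Zeppelin host
grep -A 1 "yarn.resourcemanager" /etc/hadoop/conf/yarn-site.xml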

-- 
Deepak

Re: Running Zeppelin remotely to submit Spark job in Yarn Cluster mode

Posted by Sourav Mazumder <so...@gmail.com>.
Yes. Spark is installed on the machine where Zeppelin is running.

The location of spark.yarn.jar is very similar to what you have. I'm using
IOP as the distribution, and the directory naming convention is specific to
IOP, which is different from HDP.

And yes, the setup works perfectly fine when I use yarn-client as the master
with the same setup for SPARK_HOME, HADOOP_CONF_DIR, and HADOOP_CLIENT.

Regards,
Sourav


Re: Running Zeppelin remotely to submit Spark job in Yarn Cluster mode

Posted by ๏̯͡๏ <ÐΞ€ρ@Ҝ>, de...@gmail.com.
Is Spark installed on your Zeppelin machine?

I would try these:

master: yarn-client
spark.home: the Spark installation home directory on your Zeppelin server.

Looking at spark.yarn.jar, I see Spark is installed at
/usr/iop/current/spark-thriftserver/. But why is it thriftserver? (I do not
know what that is.)

I have Spark installed (unzipped) on the Zeppelin machine at
/usr/hdp/2.3.1.0-2574/spark/spark/ (it can be any location) and have
spark.yarn.jar set to
/usr/hdp/2.3.1.0-2574/spark/spark/lib/spark-assembly-1.4.1-hadoop2.6.0.jar.





-- 
Deepak

Re: Running Zeppelin remotely to submit Spark job in Yarn Cluster mode

Posted by Sourav Mazumder <so...@gmail.com>.
Hi Deepu,

Here you go.

Regards,
Sourav




*Properties* (name / value):

args
master                          yarn-cluster
spark.app.name                  Zeppelin
spark.cores.max
spark.executor.memory           512m
spark.home
spark.yarn.jar                  /usr/iop/current/spark-thriftserver/lib/spark-assembly.jar
zeppelin.dep.localrepo          local-repo
zeppelin.pyspark.python         python
zeppelin.spark.concurrentSQL    false
zeppelin.spark.maxResult        1000
zeppelin.spark.useHiveContext   true


Re: Running Zeppelin remotely to submit Spark job in Yarn Cluster mode

Posted by ๏̯͡๏ <ÐΞ€ρ@Ҝ>, de...@gmail.com.
Can you share a screenshot of your Spark interpreter settings from the
Zeppelin web interface?

I have the exact same deployment structure, and it runs fine with the right
set of configurations.

-- 
Deepak

Re: Running Zeppelin remotely to submit Spark job in Yarn Cluster mode

Posted by Sourav Mazumder <so...@gmail.com>.
Hi Moon,

I'm using the 0.6 SNAPSHOT, which I built from the latest GitHub source.

I tried setting SPARK_HOME in zeppelin-env.sh. Also, by putting in some debug
statements, I could see that control goes to the appropriate if-else block in
interpreter.sh.

But I get the same error, as follows:

org.apache.spark.SparkException: Detected yarn-cluster mode, but isn't running on a cluster. Deployment to YARN is not supported directly by SparkContext. Please use spark-submit.
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:378)
    at org.apache.zeppelin.spark.SparkInterpreter.createSparkContext(SparkInterpreter.java:339)
    at org.apache.zeppelin.spark.SparkInterpreter.getSparkContext(SparkInterpreter.java:149)
    at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:465)
    at org.apache.zeppelin.interpreter.ClassloaderInterpreter.open(ClassloaderInterpreter.java:74)
    at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:68)
    at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:92)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:276)
    at org.apache.zeppelin.scheduler.Job.run(Job.java:170)
    at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:118)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
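
The if-else block I mean has roughly this shape (a simplified sketch for
illustration; variable names are assumptions, not the actual interpreter.sh):

# simplified sketch of the SPARK_HOME branch in bin/interpreter.sh
if [[ -n "${SPARK_HOME}" ]]; then
  # SPARK_HOME is set: launch the interpreter process via spark-submit,
  # which is the path that supports yarn-cluster deployment
  "${SPARK_HOME}/bin/spark-submit" --class "${ZEPPELIN_SERVER}" ...
else
  # no SPARK_HOME: start the interpreter JVM directly; creating a
  # SparkContext in-process cannot deploy to yarn-cluster
  "${ZEPPELIN_RUNNER}" -cp "${CLASSPATH}" "${ZEPPELIN_SERVER}" ...
fi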

Let me know if you need any other details to figure out what is going on.

Regards,
Sourav


Re: Running Zeppelin remotely to submit Spark job in Yarn Cluster mode

Posted by moon soo Lee <mo...@apache.org>.
Which version of Zeppelin are you using?

The master branch uses the spark-submit command when SPARK_HOME is defined in
conf/zeppelin-env.sh.

If you're not on the master branch, I recommend trying it with SPARK_HOME
defined.

Hope this helps,
moon
