Posted to users@zeppelin.apache.org by Charmee Patel <ch...@gmail.com> on 2015/04/13 16:40:43 UTC

Run Zeppelin on yarn-client mode

Hi,

I am trying to get Zeppelin to work in yarn-client and yarn-submit mode. There
are some conflicting notes on the mailing list about how to get Zeppelin
to work in yarn-client mode, and I have tried a few different things, but none
have worked for me so far.

I have CDH 5.3. Here is what I have done so far:

   1. Built Zeppelin locally (on OS X) using:
      mvn clean package -Pspark-1.2 -Dhadoop.version=2.5.0-cdh5.3.0 -Phadoop-2.4 -DskipTests -Pyarn
   2. Generated a distribution package and deployed it to an edge node of my
      cluster using:
      mvn clean package -P build-distr -DskipTests
   3. At this point, local mode works fine.
   4. To get YARN mode working:
      1. I set the Hadoop conf dir in zeppelin-env.sh and set the
         master to yarn-client. I get an exception in my logs:

         ERROR ({pool-2-thread-5} Job.java[run]:165) - Job failed
         com.nflabs.zeppelin.interpreter.InterpreterException:
         org.apache.thrift.TApplicationException: Internal error processing open

      2. Set SPARK_YARN_JAR but got the same exception as above.
      3. Copied my Spark assembly jar into the interpreter/spark directory,
         but that did not work either.
      4. Also set Spark Yarn Jar and Spark Home in the interpreter UI, but
         that did not work.
      5. Took Spark Yarn Jar out and let Zeppelin use the Spark that comes
         bundled with it. This seemed to work initially, but as soon as I
         call a Spark action (collect/count, etc.) I get this exception:

java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
    at org.apache.spark.rdd.HadoopRDD.getInputFormat(HadoopRDD.scala:182)
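For reference, the configuration in step 4.1 above amounts to something like the following in conf/zeppelin-env.sh. This is a sketch only; the Hadoop conf path is a placeholder, not taken from this thread.

```shell
# conf/zeppelin-env.sh -- sketch only; the path is a placeholder.
# Point Zeppelin at the cluster's Hadoop configuration so the Spark
# interpreter can locate the YARN ResourceManager.
export HADOOP_CONF_DIR=/etc/hadoop/conf

# Run the Spark interpreter against YARN in client mode
# (the setting described as master=yarn-client above).
export MASTER=yarn-client
```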

Any pointers on what I might have missed in configuring Zeppelin to work with
yarn-client mode?

Thanks,
Charmee

Re: Run Zeppelin on yarn-client mode

Posted by Charmee Patel <ch...@gmail.com>.
Hi Ram, Jongyoul,

Many thanks for your responses. I was not able to test your suggestions
earlier.

I tried replacing interpreter/spark/spark*.jar with my custom Spark jar, but
that gives me an exception that the Zeppelin Spark interpreter cannot be
found. It looks like the Zeppelin spark jar is a fat jar that combines Spark
and Zeppelin code in a single jar? I can work around this for now, so it is
okay, but it would be good to know how a custom Spark jar is supposed to work.

By the way, I already had hive-site.xml in the Spark conf directory, but since
it was not being recognized there, I copied it over to Zeppelin's conf and now
it is being used.

Charmee


Re: Run Zeppelin on yarn-client mode

Posted by Ram Venkatesh <rv...@hortonworks.com>.
Hi Charmee,

I have successfully configured spark in yarn-client mode for HDP by setting spark.home and spark.yarn.jar appropriately. I have not validated this against CDH.

For hive metastore access, yes you need to have a hive-site.xml in your zeppelin/conf directory or SPARK_CONF directory.

HTH
Ram
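In sketch form, Ram's two settings correspond to the following Spark interpreter properties, plus making hive-site.xml visible to Zeppelin. All values and paths here are placeholders, not taken from this thread.

```shell
# Sketch only; all paths are placeholders for your own cluster.
# In the Spark interpreter settings (interpreter UI), set:
#   spark.home     = /opt/spark    # Spark distribution Zeppelin should launch
#   spark.yarn.jar = hdfs:///user/spark/spark-assembly.jar
#                                  # assembly jar the YARN containers load
#
# For Hive metastore access, copy hive-site.xml where Zeppelin can see it:
cp /etc/hive/conf/hive-site.xml "$ZEPPELIN_HOME/conf/"
```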



Re: Run Zeppelin on yarn-client mode

Posted by Jongyoul Lee <jo...@gmail.com>.
Hi,

I'm not familiar with custom Spark versions. As mentioned by Kevin, it should
work if you replace interpreter/spark/spark*.jar with your custom spark*.jar.

Regards,
Jongyoul
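On disk, this workaround looks roughly like the following. It is a sketch only: the jar names and paths are placeholders, and, as noted elsewhere in the thread, the bundled jar is a fat jar combining Zeppelin and Spark code, so a plain Spark assembly may not be a drop-in replacement.

```shell
# Sketch only; jar names and paths are placeholders.
cd "$ZEPPELIN_HOME/interpreter/spark"

# Keep the original fat jar in case the replacement fails.
mv spark-1.2.1.jar spark-1.2.1.jar.bak

# Drop in the custom-built Spark assembly, then restart Zeppelin.
cp /opt/spark-custom/lib/spark-assembly-1.2.0-hadoop2.5.0-cdh5.3.0.jar .
"$ZEPPELIN_HOME/bin/zeppelin-daemon.sh" restart
```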



-- 
이종열, Jongyoul Lee, 李宗烈
http://madeng.net

Re: Run Zeppelin on yarn-client mode

Posted by Charmee Patel <ch...@gmail.com>.
One more question on yarn mode.

After setting the Spark Yarn Jar path and Spark Home path in the Spark
interpreter (via the UI), I was expecting Zeppelin to use my own version of
Spark. I don't think that is happening. From Zeppelin, sc.version gives 1.2.1
(which is what is included for CDH 5.3); when I run spark-shell from my custom
Spark lib, the version shows as 1.2.0.

I have set Spark Yarn Jar, Spark Home (from the interpreter UI), and the
Hadoop conf dir (in zeppelin-env.sh). Do I need to do anything else for
Zeppelin to use my Spark jar?

Also, my Hive context does not recognize the databases/tables on the cluster.
Do I need to point to or copy hive-site.xml anywhere in Zeppelin's conf?

Thanks,
Charmee
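One way to see the version mismatch described above is to compare what each Spark on the box reports. This is a sketch only; the paths are placeholders, not taken from this thread.

```shell
# Sketch only; paths are placeholders.
# Version of the custom Spark build on the edge node:
/opt/spark-custom/bin/spark-submit --version

# Jars the Zeppelin Spark interpreter actually loads; if the bundled
# fat jar is still in place, sc.version inside Zeppelin will report
# the bundled version rather than the custom one.
ls "$ZEPPELIN_HOME/interpreter/spark/"
```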




Re: Run Zeppelin on yarn-client mode

Posted by Jongyoul Lee <jo...@gmail.com>.
Good!!



-- 
이종열, Jongyoul Lee, 李宗烈
http://madeng.net

Re: Run Zeppelin on yarn-client mode

Posted by Charmee Patel <ch...@gmail.com>.
Hi,

Thanks.

I had pulled the code from the https://github.com/NFLabs/zeppelin repository
before it had moved to apache/incubator-zeppelin, and I kept pulling from the
same repo. I synced my version with the latest on apache/incubator-zeppelin,
and it is working now.

-Charmee


Re: Run Zeppelin on yarn-client mode

Posted by Jongyoul Lee <jo...@gmail.com>.
Hi,

I don't know exactly which version you are using, because com.nflabs.zeppelin
moved to org.apache.zeppelin. You don't need to set SPARK_YARN_JAR on the
latest master. Could you please check that out? I've tested yarn-client mode
on 2.5.0-cdh5.3.0 with Spark 1.3, but didn't try it with Spark 1.2. My build
script is

mvn clean package -Pspark-1.3 -Dhadoop.version=2.5.0-cdh5.3.0 -Phadoop-2.4
-DskipTests -Pyarn -Pbuild-distr

Regards,
Jongyoul Lee




-- 
이종열, Jongyoul Lee, 李宗烈
http://madeng.net