Posted to user@spark.apache.org by Serega Sheypak <se...@gmail.com> on 2018/03/18 23:19:39 UTC

Run spark 2.2 on yarn as usual java application

Hi, is it even possible to run Spark on YARN as a usual Java application?
I've built a jar using Maven with the spark-yarn dependency, and I manually
populate SparkConf with all Hadoop properties.
SparkContext fails to start with this exception:

   Caused by: java.lang.IllegalStateException: Library directory
   '/hadoop/yarn/local/usercache/root/appcache/application_1521375636129_0022/container_e06_1521375636129_0022_01_000002/assembly/target/scala-2.11/jars'
   does not exist; make sure Spark is built.
           at org.apache.spark.launcher.CommandBuilderUtils.checkState(CommandBuilderUtils.java:260)
           at org.apache.spark.launcher.CommandBuilderUtils.findJarsDir(CommandBuilderUtils.java:359)
           at org.apache.spark.launcher.YarnCommandBuilderUtils$.findJarsDir(YarnCommandBuilderUtils.scala:38)


I took a look at the code, and it has some hardcoded paths and checks for a
specific file layout. I don't follow why :)
Is it possible to bypass such checks?
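
For reference: judging by the stack trace, the failing findJarsDir lookup is
the fallback Spark uses when neither spark.yarn.jars nor spark.yarn.archive
is set, so staging the Spark 2.2 jars on HDFS and pointing spark.yarn.jars at
them should bypass this check. A minimal sketch of such a programmatic
yarn-client launch under that assumption; the HDFS path and hostnames below
are placeholders, not values from this thread:

    import org.apache.spark.SparkConf;
    import org.apache.spark.sql.SparkSession;

    public class YarnClientLaunch {
        public static void main(String[] args) {
            SparkConf conf = new SparkConf()
                .setAppName("yarn-client-from-plain-jvm")
                .setMaster("yarn")
                // Pre-staged Spark jars; with this set, the client should not
                // need to locate a local jars directory. Placeholder path.
                .set("spark.yarn.jars", "hdfs:///user/spark/jars/*.jar")
                // Hadoop properties populated by hand, as described above;
                // spark.hadoop.* entries are forwarded into the Hadoop
                // Configuration. Placeholder hostnames.
                .set("spark.hadoop.fs.defaultFS", "hdfs://namenode:8020")
                .set("spark.hadoop.yarn.resourcemanager.address", "resourcemanager:8032");

            SparkSession spark = SparkSession.builder().config(conf).getOrCreate();
            System.out.println("Started as " + spark.sparkContext().applicationId());
            spark.stop();
        }
    }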

Re: Run spark 2.2 on yarn as usual java application

Posted by Serega Sheypak <se...@gmail.com>.
Hi Jörn, thanks for your reply.
Oozie runs a java action as a single "long-running" MapReduce mapper. This
mapper is responsible for calling the main class; the main class belongs to
the user, and it is this main class that starts the Spark job.
yarn-cluster is not an option for me: I would have to do something special to
manage a "runaway" driver. Imagine I want to kill the Spark job. I can just
kill the Oozie workflow, and it will kill the spawned mapper, with the main
class and the driver inside it.
That won't happen in yarn-cluster mode, since the driver is not running in a
process managed by Oozie.


2018-03-19 13:41 GMT+01:00 Jörn Franke <jo...@gmail.com>:

> Maybe you should rather run it in yarn-cluster mode. yarn-client would
> start the driver on the Oozie server.

Re: Run spark 2.2 on yarn as usual java application

Posted by Jörn Franke <jo...@gmail.com>.
Maybe you should rather run it in yarn-cluster mode. yarn-client would start the driver on the Oozie server.
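
One hedged sketch of how cluster mode could still be driven from plain JVM
code is Spark's launcher API. Note that it shells out to spark-submit under
the hood, so it assumes a Spark distribution is present on the launching
node; the paths and class name below are placeholders:

    import org.apache.spark.launcher.SparkAppHandle;
    import org.apache.spark.launcher.SparkLauncher;

    public class ClusterModeLaunch {
        public static void main(String[] args) throws Exception {
            SparkAppHandle handle = new SparkLauncher()
                .setSparkHome("/opt/spark-2.2.0")           // placeholder
                .setAppResource("hdfs:///apps/my-job.jar")  // placeholder
                .setMainClass("com.example.MySparkJob")     // placeholder
                .setMaster("yarn")
                .setDeployMode("cluster")
                .startApplication();

            // The handle reports state transitions and exposes kill(); once
            // the app is running, killing via the YARN API is the more
            // reliable lever for a cluster-mode driver.
            while (!handle.getState().isFinal()) {
                Thread.sleep(1000L);
            }
            System.out.println("Final state: " + handle.getState());
        }
    }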

> On 19. Mar 2018, at 12:58, Serega Sheypak <se...@gmail.com> wrote:
> 
> I'm trying to run it as an Oozie java action and reduce env dependencies. The only thing I need is a Hadoop Configuration to talk to HDFS and YARN.
> spark-submit is a shell thing; I'm trying to do everything from the JVM.
> The Oozie java action starts a main class which instantiates SparkConf and a session. It works well in local mode but throws an exception when I try to run Spark as yarn-client.

Re: Run spark 2.2 on yarn as usual java application

Posted by Serega Sheypak <se...@gmail.com>.
I'm trying to run it as an Oozie java action and reduce env dependencies. The
only thing I need is a Hadoop Configuration to talk to HDFS and YARN.
spark-submit is a shell thing; I'm trying to do everything from the JVM.
The Oozie java action starts a main class which instantiates SparkConf and a
session. It works well in local mode but throws an exception when I try to
run Spark as yarn-client.
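
To make the symptom concrete, a minimal sketch of what such a main class
might look like; the only difference between the working local run and the
failing YARN run is the master setting (in Spark 2.x the in-process
yarn-client combination is simply master "yarn"):

    import org.apache.spark.sql.SparkSession;

    public class OozieActionMain {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                .appName("oozie-java-action")
                // .master("local[*]")  // works as described above
                .master("yarn")         // fails with the IllegalStateException
                                        // quoted in this thread unless the
                                        // YARN jars are configured
                .getOrCreate();

            spark.range(10).count();    // stand-in for the real job body
            spark.stop();
        }
    }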

Mon, 19 Mar 2018 at 7:16, Jacek Laskowski <ja...@japila.pl>:

> Hi,
>
> What's the deployment process then (if not using spark-submit)? How is the
> AM deployed? Why would you want to skip spark-submit?
>
> Jacek
>

Re: Run spark 2.2 on yarn as usual java application

Posted by Jacek Laskowski <ja...@japila.pl>.
Hi,

What's the deployment process then (if not using spark-submit)? How is the
AM deployed? Why would you want to skip spark-submit?

Jacek

On 19 Mar 2018 00:20, "Serega Sheypak" <se...@gmail.com> wrote:

> Hi, is it even possible to run Spark on YARN as a usual Java application?
> I've built a jar using Maven with the spark-yarn dependency, and I manually
> populate SparkConf with all Hadoop properties.
> SparkContext fails to start with:
>
>    Caused by: java.lang.IllegalStateException: Library directory
>    '/hadoop/yarn/local/usercache/root/appcache/application_1521375636129_0022/container_e06_1521375636129_0022_01_000002/assembly/target/scala-2.11/jars'
>    does not exist; make sure Spark is built.
>
> I took a look at the code, and it has some hardcoded paths and checks for a
> specific file layout. I don't follow why :)
> Is it possible to bypass such checks?