Posted to user@spark.apache.org by Gordon Wang <gw...@gopivotal.com> on 2014/04/22 11:43:19 UTC

Question about running spark on yarn

According to this page, we have to use the Spark assembly jar to submit Spark
apps to a YARN cluster:
http://spark.apache.org/docs/0.9.0/running-on-yarn.html

I checked the assembly jar, and it contains some YARN classes that are pulled
in at compile time. Those YARN classes are not what I want.

My question: is it possible to use other jars to submit a Spark app to a YARN
cluster?
I do not want to use the assembly jar, because the YARN classes it bundles may
shadow the YARN classes already on HADOOP_CLASSPATH. It also means that
whenever the YARN cluster is upgraded, Spark has to be recompiled against the
new version of YARN, even if the YARN APIs stay the same.
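
(For reference, submission in 0.9.0 looks roughly like the following; the app
jar and class names below are placeholders, and the jar named in SPARK_JAR is
the assembly I am asking about.)

    # SPARK_JAR must point at the Spark assembly, which bundles the YARN classes
    SPARK_JAR=./assembly/target/scala-2.10/spark-assembly_2.10-0.9.0-incubating-hadoop2.2.0.jar \
      ./bin/spark-class org.apache.spark.deploy.yarn.Client \
        --jar my-app.jar \
        --class com.example.MyApp \
        --args yarn-standalone \
        --num-workers 3 \
        --worker-memory 2g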

Any help is appreciated! Thanks.

-- 
Regards
Gordon Wang

Re: Question about running spark on yarn

Posted by sa...@cloudera.com.
I currently don't have plans to work on that.

-Sandy


Re: Question about running spark on yarn

Posted by Gordon Wang <gw...@gopivotal.com>.
Thanks, I see. Do you guys have plans to port this to sbt?


-- 
Regards
Gordon Wang

Re: Question about running spark on yarn

Posted by Sandy Ryza <sa...@cloudera.com>.
Right, it only works for Maven.


Re: Question about running spark on yarn

Posted by Gordon Wang <gw...@gopivotal.com>.
Hi Sandy,

Thanks for your reply!

Does this work for sbt?

I checked the commit; it looks like only the Maven build has this option.
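
(For reference, the sbt assembly at this point is driven by environment
variables rather than build profiles, roughly:

    SPARK_HADOOP_VERSION=2.2.0 SPARK_YARN=true sbt/sbt assembly

with the Hadoop version above just illustrative. I don't see a
hadoop-provided equivalent on the sbt side.)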



-- 
Regards
Gordon Wang

Re: Question about running spark on yarn

Posted by Sandy Ryza <sa...@cloudera.com>.
Hi Gordon,

We recently handled this in SPARK-1064.  As of 1.0.0, you'll be able to
pass -Phadoop-provided to Maven and avoid including Hadoop and its
dependencies in the assembly jar.
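
For example, a build along these lines should produce such an assembly (the
yarn profile and the Hadoop version here are illustrative; adjust them for
your cluster):

    # Build the Spark assembly with Hadoop/YARN treated as "provided", so those
    # classes come from the cluster's HADOOP_CLASSPATH at runtime instead
    mvn -Pyarn -Phadoop-provided -Dhadoop.version=2.2.0 -DskipTests clean package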

-Sandy

