Posted to user@spark.apache.org by Emre Sevinc <em...@gmail.com> on 2015/03/23 13:39:08 UTC
Why doesn't the --conf parameter work in yarn-cluster mode (but works
in yarn-client and local)?
Hello,
According to the Spark documentation at
https://spark.apache.org/docs/1.2.1/submitting-applications.html :
--conf: Arbitrary Spark configuration property in key=value format. For
values that contain spaces wrap “key=value” in quotes (as shown).
And indeed, when I use that parameter, in my Spark program I can retrieve
the value of the key by using:
System.getProperty("key");
This works when I test my program locally, and also in yarn-client mode: I
can log the value of the key and see that it matches what I wrote on the
command line. But it returns *null* when I submit the very same program in
*yarn-cluster* mode.
Why can't I retrieve the value of key given as --conf "key=value" when I
submit my Spark application in *yarn-cluster* mode?
Any ideas and/or workarounds?
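[Editor's note: a minimal sketch of the retrieval side being described. The class name ConfCheck and the key name are illustrative, not from the original post; the submission command is assumed to be along the lines of `spark-submit --master yarn-cluster --conf "key=someValue" ... app.jar`.]

```java
// Illustrative sketch of reading an arbitrary --conf key from the driver
// JVM's system properties. In local and yarn-client mode the driver runs
// in the submitting JVM, so the property is visible there; in yarn-cluster
// mode the driver runs remotely and (per this thread) the lookup yields null.
public class ConfCheck {
    // Returns the value of the system property "key", or null if unset.
    public static String readKey() {
        return System.getProperty("key");
    }

    public static void main(String[] args) {
        System.out.println("key = " + readKey());
    }
}
```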
--
Emre Sevinç
http://www.bigindustries.be/
Re: Why doesn't the --conf parameter work in yarn-cluster mode (but
works in yarn-client and local)?
Posted by Emre Sevinc <em...@gmail.com>.
Hello Sandy,
Thank you for your explanation. Then I would at least expect that behavior
to be consistent across local, yarn-client, and yarn-cluster modes, rather
than working in two of them but not in the third.
Kind regards,
Emre Sevinç
http://www.bigindustries.be/
On Tue, Mar 24, 2015 at 4:38 PM, Sandy Ryza <sa...@cloudera.com> wrote:
> Ah, yes, I believe this is because only properties prefixed with "spark"
> get passed on. The purpose of the "--conf" option is to allow passing
> Spark properties to the SparkConf, not to add general key-value pairs to
> the JVM system properties.
>
> -Sandy
Re: Why doesn't the --conf parameter work in yarn-cluster mode (but
works in yarn-client and local)?
Posted by Sandy Ryza <sa...@cloudera.com>.
Ah, yes, I believe this is because only properties prefixed with "spark"
get passed on. The purpose of the "--conf" option is to allow passing
Spark properties to the SparkConf, not to add general key-value pairs to
the JVM system properties.
-Sandy
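[Editor's note: a hypothetical sketch, not Spark's actual implementation, of the filtering behavior Sandy describes: when configuration is forwarded to a remote yarn-cluster driver, only keys starting with "spark." survive. The practical workaround is therefore to rename an arbitrary key, e.g. `--conf "spark.myapp.key=someValue"`, and read it back with `new SparkConf().get("spark.myapp.key")`; the key name spark.myapp.key and the class ConfFilter below are illustrative.]

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the propagation rule described in this thread:
// only "spark."-prefixed entries are forwarded to the remote driver.
public class ConfFilter {
    public static Map<String, String> forwarded(Map<String, String> submitted) {
        Map<String, String> kept = new HashMap<>();
        for (Map.Entry<String, String> e : submitted.entrySet()) {
            if (e.getKey().startsWith("spark.")) {
                kept.put(e.getKey(), e.getValue());
            }
        }
        return kept;
    }
}
```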
On Tue, Mar 24, 2015 at 4:25 AM, Emre Sevinc <em...@gmail.com> wrote:
> Hello Sandy,
>
> Your suggestion does not work when I try it locally:
>
> When I pass
>
> --conf "key=someValue"
>
> and then try to retrieve it like:
>
> SparkConf sparkConf = new SparkConf();
> logger.info("* * * key ~~~> {}", sparkConf.get("key"));
>
> I get
>
> Exception in thread "main" java.util.NoSuchElementException: key
>
> And I think that's expected because the key is an arbitrary one, not
> necessarily a Spark configuration element. This is why I was passing it via
> --conf and retrieving System.getProperty("key") (which worked locally and
> in yarn-client mode, but not in yarn-cluster mode). I'm surprised that I
> can't use it on the cluster when I can use it during local development and
> testing.
>
> Kind regards,
>
> Emre Sevinç
> http://www.bigindustries.be/
Re: Why doesn't the --conf parameter work in yarn-cluster mode (but
works in yarn-client and local)?
Posted by Emre Sevinc <em...@gmail.com>.
Hello Sandy,
Your suggestion does not work when I try it locally:
When I pass
--conf "key=someValue"
and then try to retrieve it like:
SparkConf sparkConf = new SparkConf();
logger.info("* * * key ~~~> {}", sparkConf.get("key"));
I get
Exception in thread "main" java.util.NoSuchElementException: key
And I think that's expected, because the key is an arbitrary one, not
necessarily a Spark configuration property. This is why I was passing it via
--conf and retrieving it with System.getProperty("key") (which worked locally
and in yarn-client mode, but not in yarn-cluster mode). I'm surprised that I
can't use it on the cluster when I can use it during local development and
testing.
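[Editor's note: a minimal stand-in, not SparkConf itself, for the lookup behavior hit here: a strict get(key) throws NoSuchElementException for an unset key, while a defaulted lookup, in the style of SparkConf's get(key, defaultValue), does not. The class name ConfLookup is illustrative.]

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

// Stand-in illustrating strict vs. defaulted configuration lookup.
public class ConfLookup {
    private final Map<String, String> settings = new HashMap<>();

    public void set(String key, String value) {
        settings.put(key, value);
    }

    // Strict lookup: an absent key raises NoSuchElementException,
    // matching the exception seen in this message.
    public String get(String key) {
        String v = settings.get(key);
        if (v == null) {
            throw new NoSuchElementException(key);
        }
        return v;
    }

    // Lenient lookup: an absent key falls back to the supplied default.
    public String get(String key, String defaultValue) {
        String v = settings.get(key);
        return v == null ? defaultValue : v;
    }
}
```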
Kind regards,
Emre Sevinç
http://www.bigindustries.be/
On Mon, Mar 23, 2015 at 6:15 PM, Sandy Ryza <sa...@cloudera.com> wrote:
> Hi Emre,
>
> The --conf property is meant to work with yarn-cluster mode.
> System.getProperty("key") isn't guaranteed, but new SparkConf().get("key")
> should. Does it not?
>
> -Sandy
Re: Why doesn't the --conf parameter work in yarn-cluster mode (but
works in yarn-client and local)?
Posted by Sandy Ryza <sa...@cloudera.com>.
Hi Emre,
The --conf property is meant to work with yarn-cluster mode.
System.getProperty("key") isn't guaranteed, but new SparkConf().get("key")
should. Does it not?
-Sandy