Posted to dev@spark.apache.org by Cody Koeninger <co...@koeninger.org> on 2014/08/07 23:35:56 UTC

Re: replacement for SPARK_JAVA_OPTS

Just wanted to check in on this, see if I should file a bug report
regarding the Mesos argument propagation.


On Thu, Jul 31, 2014 at 8:35 AM, Cody Koeninger <co...@koeninger.org> wrote:

> 1. I've tried with and without escaping the equals sign; it doesn't affect
> the results.
>
> 2. Yeah, exporting SPARK_SUBMIT_OPTS from spark-env.sh works for getting
> system properties set in the local shell (although not for executors).
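>
> Concretely, for the property from my earlier example, that's just this
> line in conf/spark-env.sh:
>
>   export SPARK_SUBMIT_OPTS="-Dfoo.bar.baz=23"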
>
> 3. We're using the default fine-grained Mesos mode, not setting
> spark.mesos.coarse, so it doesn't seem immediately related to that ticket.
> Should I file a bug report?
>
>
> On Thu, Jul 31, 2014 at 1:33 AM, Patrick Wendell <pw...@gmail.com>
> wrote:
>
>> The third issue may be related to this:
>> https://issues.apache.org/jira/browse/SPARK-2022
>>
>> We can take a look at this during the bug fix period for the 1.1
>> release next week. If we come up with a fix we can backport it into
>> the 1.0 branch also.
>>
>> On Wed, Jul 30, 2014 at 11:31 PM, Patrick Wendell <pw...@gmail.com>
>> wrote:
>> > Thanks for digging around here. I think there are a few distinct issues.
>> >
>> > 1. Properties containing the '=' character need to be escaped.
>> > I was able to load properties fine as long as I escaped the '='
>> > character. But maybe we should document this:
>> >
>> > == spark-defaults.conf ==
>> > spark.foo a\=B
>> > == shell ==
>> > scala> sc.getConf.get("spark.foo")
>> > res2: String = a=B
>> >
>> > 2. spark.driver.extraJavaOptions, when set in the properties file,
>> > doesn't affect the driver when running in client mode (always the case
>> > for Mesos). We should probably document this. In this case you need to
>> > either use --driver-java-options or set SPARK_SUBMIT_OPTS.
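>> >
>> > For example, with the property from Cody's report, either of these
>> > should work:
>> >
>> >   $ ./bin/spark-shell --driver-java-options "-Dfoo.bar.baz=23"
>> >
>> >   # or, in conf/spark-env.sh:
>> >   export SPARK_SUBMIT_OPTS="-Dfoo.bar.baz=23"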
>> >
>> > 3. Arguments aren't propagated on Mesos (this might be because of the
>> > other issues, or a separate bug).
>> >
>> > - Patrick
>> >
>> > On Wed, Jul 30, 2014 at 3:10 PM, Cody Koeninger <co...@koeninger.org>
>> > wrote:
>> >> In addition, spark.executor.extraJavaOptions does not seem to behave as I
>> >> would expect; Java arguments don't seem to be propagated to executors.
>> >>
>> >>
>> >> $ cat conf/spark-defaults.conf
>> >>
>> >> spark.master mesos://zk://etl-01.mxstg:2181,etl-02.mxstg:2181,etl-03.mxstg:2181/masters
>> >> spark.executor.extraJavaOptions -Dfoo.bar.baz=23
>> >> spark.driver.extraJavaOptions -Dfoo.bar.baz=23
>> >>
>> >>
>> >> $ ./bin/spark-shell
>> >>
>> >> scala> sc.getConf.get("spark.executor.extraJavaOptions")
>> >> res0: String = -Dfoo.bar.baz=23
>> >>
>> >> scala> sc.parallelize(1 to 100).map{ i => (
>> >>      |  java.net.InetAddress.getLocalHost.getHostName,
>> >>      |  System.getProperty("foo.bar.baz")
>> >>      | )}.collect
>> >>
>> >> res1: Array[(String, String)] = Array((dn-01.mxstg,null),
>> >> (dn-01.mxstg,null), (dn-01.mxstg,null), (dn-01.mxstg,null),
>> >> (dn-01.mxstg,null), (dn-01.mxstg,null), (dn-01.mxstg,null),
>> >> (dn-01.mxstg,null), (dn-01.mxstg,null), (dn-01.mxstg,null),
>> >> (dn-01.mxstg,null), (dn-01.mxstg,null), (dn-02.mxstg,null),
>> >> (dn-02.mxstg,null), ...
>> >>
>> >>
>> >>
>> >> Note that this is a Mesos deployment, although I wouldn't expect that to
>> >> affect the availability of spark.driver.extraJavaOptions in a local
>> >> Spark shell.
>> >>
>> >>
>> >> On Wed, Jul 30, 2014 at 4:18 PM, Cody Koeninger <co...@koeninger.org>
>> >> wrote:
>> >>
>> >>> Either whitespace or an equals sign is valid properties file syntax.
>> >>> Here's an example:
>> >>>
>> >>> $ cat conf/spark-defaults.conf
>> >>> spark.driver.extraJavaOptions -Dfoo.bar.baz=23
>> >>>
>> >>> $ ./bin/spark-shell -v
>> >>> Using properties file: /opt/spark/conf/spark-defaults.conf
>> >>> Adding default property: spark.driver.extraJavaOptions=-Dfoo.bar.baz=23
>> >>>
>> >>>
>> >>> scala>  System.getProperty("foo.bar.baz")
>> >>> res0: String = null
>> >>>
>> >>>
>> >>> If you add double quotes, the resulting string value will have double
>> >>> quotes.
>> >>>
>> >>>
>> >>> $ cat conf/spark-defaults.conf
>> >>> spark.driver.extraJavaOptions "-Dfoo.bar.baz=23"
>> >>>
>> >>> $ ./bin/spark-shell -v
>> >>> Using properties file: /opt/spark/conf/spark-defaults.conf
>> >>> Adding default property: spark.driver.extraJavaOptions="-Dfoo.bar.baz=23"
>> >>>
>> >>> scala>  System.getProperty("foo.bar.baz")
>> >>> res0: String = null
>> >>>
>> >>>
>> >>> Neither one of those affects the issue; the underlying problem in my
>> >>> case seems to be that bin/spark-class uses the SPARK_SUBMIT_OPTS and
>> >>> SPARK_JAVA_OPTS environment variables, but nothing parses
>> >>> spark-defaults.conf before the Java process is started.
>> >>>
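>> >>> Roughly (paraphrasing, not the exact script), spark-class ends up doing
>> >>> something like:
>> >>>
>> >>>   exec "$RUNNER" -cp "$CLASSPATH" $JAVA_OPTS "$@"
>> >>>
>> >>> where JAVA_OPTS is assembled from SPARK_SUBMIT_OPTS / SPARK_JAVA_OPTS,
>> >>> so settings that exist only in spark-defaults.conf never make it onto
>> >>> that command line.
>> >>>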
>> >>> Here's an example of the process running when only spark-defaults.conf
>> >>> is being used:
>> >>>
>> >>> $ ps -ef | grep spark
>> >>>
>> >>> 514       5182  2058  0 21:05 pts/2    00:00:00 bash ./bin/spark-shell -v
>> >>>
>> >>> 514       5189  5182  4 21:05 pts/2    00:00:22 /usr/local/java/bin/java
>> >>> -cp ::/opt/spark/conf:/opt/spark/lib/spark-assembly-1.0.1-hadoop2.3.0-mr1-cdh5.0.2.jar:/etc/hadoop/conf-mx
>> >>> -XX:MaxPermSize=128m -Djava.library.path= -Xms512m -Xmx512m
>> >>> org.apache.spark.deploy.SparkSubmit spark-shell -v --class
>> >>> org.apache.spark.repl.Main
>> >>>
>> >>>
>> >>> Here's an example of it when the command line --driver-java-options is
>> >>> used (and thus things work):
>> >>>
>> >>>
>> >>> $ ps -ef | grep spark
>> >>> 514       5392  2058  0 21:15 pts/2    00:00:00 bash ./bin/spark-shell -v
>> >>> --driver-java-options -Dfoo.bar.baz=23
>> >>>
>> >>> 514       5399  5392 80 21:15 pts/2    00:00:06 /usr/local/java/bin/java
>> >>> -cp ::/opt/spark/conf:/opt/spark/lib/spark-assembly-1.0.1-hadoop2.3.0-mr1-cdh5.0.2.jar:/etc/hadoop/conf-mx
>> >>> -XX:MaxPermSize=128m -Dfoo.bar.baz=23 -Djava.library.path= -Xms512m
>> >>> -Xmx512m org.apache.spark.deploy.SparkSubmit spark-shell -v
>> >>> --driver-java-options -Dfoo.bar.baz=23 --class
>> >>> org.apache.spark.repl.Main
>> >>>
>> >>>
>> >>>
>> >>>
>> >>> On Wed, Jul 30, 2014 at 3:43 PM, Patrick Wendell <pw...@gmail.com>
>> >>> wrote:
>> >>>
>> >>>> Cody - in your example you are using the '=' character, but in our
>> >>>> documentation and tests we use whitespace to separate the key and
>> >>>> value in the defaults file.
>> >>>>
>> >>>> docs: http://spark.apache.org/docs/latest/configuration.html
>> >>>>
>> >>>> spark.driver.extraJavaOptions -Dfoo.bar.baz=23
>> >>>>
>> >>>> I'm not sure if the Java properties file parser will try to interpret
>> >>>> the equals sign. If so, you might need to do this:
>> >>>>
>> >>>> spark.driver.extraJavaOptions "-Dfoo.bar.baz=23"
>> >>>>
>> >>>> Do those work for you?
>> >>>>
>> >>>> On Wed, Jul 30, 2014 at 1:32 PM, Marcelo Vanzin <vanzin@cloudera.com>
>> >>>> wrote:
>> >>>> > Hi Cody,
>> >>>> >
>> >>>> > Could you file a bug for this if there isn't one already?
>> >>>> >
>> >>>> > For system properties SparkSubmit should be able to read those
>> >>>> > settings and do the right thing, but that obviously won't work for
>> >>>> > other JVM options... the current code should work fine in cluster
>> >>>> > mode though, since the driver is a different process. :-)
>> >>>> >
>> >>>> >
>> >>>> > On Wed, Jul 30, 2014 at 1:12 PM, Cody Koeninger <cody@koeninger.org>
>> >>>> > wrote:
>> >>>> >> We were previously using SPARK_JAVA_OPTS to set Java system
>> >>>> >> properties via -D.
>> >>>> >>
>> >>>> >> This was used for properties that varied on a
>> >>>> >> per-deployment-environment basis, but needed to be available in the
>> >>>> >> Spark shell and workers.
>> >>>> >>
>> >>>> >> On upgrading to 1.0, we saw that SPARK_JAVA_OPTS had been deprecated
>> >>>> >> and replaced by spark-defaults.conf and command-line arguments to
>> >>>> >> spark-submit or spark-shell.
>> >>>> >>
>> >>>> >> However, setting spark.driver.extraJavaOptions and
>> >>>> >> spark.executor.extraJavaOptions in spark-defaults.conf is not a
>> >>>> >> replacement for SPARK_JAVA_OPTS:
>> >>>> >>
>> >>>> >>
>> >>>> >> $ cat conf/spark-defaults.conf
>> >>>> >> spark.driver.extraJavaOptions=-Dfoo.bar.baz=23
>> >>>> >>
>> >>>> >> $ ./bin/spark-shell
>> >>>> >>
>> >>>> >> scala> System.getProperty("foo.bar.baz")
>> >>>> >> res0: String = null
>> >>>> >>
>> >>>> >>
>> >>>> >> $ ./bin/spark-shell --driver-java-options "-Dfoo.bar.baz=23"
>> >>>> >>
>> >>>> >> scala> System.getProperty("foo.bar.baz")
>> >>>> >> res0: String = 23
>> >>>> >>
>> >>>> >>
>> >>>> >> Looking through the shell scripts for spark-submit and spark-class,
>> >>>> >> I can see why this is; parsing spark-defaults.conf from bash could
>> >>>> >> be brittle.
>> >>>> >>
>> >>>> >> But from an ergonomic point of view, it's a step back to go from a
>> >>>> >> set-it-and-forget-it configuration in spark-env.sh to requiring
>> >>>> >> command-line arguments.
>> >>>> >>
>> >>>> >> I can solve this with an ad-hoc script to wrap spark-shell with the
>> >>>> >> appropriate arguments, but I wanted to bring the issue up to see if
>> >>>> >> anyone else had run into it, or had any direction for a general
>> >>>> >> solution (beyond parsing Java properties files from bash).
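>> >>>> >>
>> >>>> >> The wrapper is trivial, something like this (a hypothetical sketch,
>> >>>> >> with the property hard-coded per environment):
>> >>>> >>
>> >>>> >>   #!/bin/bash
>> >>>> >>   # pass the deployment-specific system property on every invocation
>> >>>> >>   exec /opt/spark/bin/spark-shell \
>> >>>> >>     --driver-java-options "-Dfoo.bar.baz=23" "$@"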
>> >>>> >
>> >>>> >
>> >>>> >
>> >>>> > --
>> >>>> > Marcelo
>> >>>>
>> >>>
>> >>>
>>
>
>

Re: replacement for SPARK_JAVA_OPTS

Posted by Andrew Or <an...@databricks.com>.
Ah, great to know this is already being fixed. Thanks Patrick, I have
marked my JIRA as a duplicate.


2014-08-07 21:42 GMT-07:00 Patrick Wendell <pw...@gmail.com>:

> Andrew - I think your JIRA may duplicate existing work:
> https://github.com/apache/spark/pull/1513
>
>
> On Thu, Aug 7, 2014 at 7:55 PM, Andrew Or <an...@databricks.com> wrote:
> > @Cody I took a quick glance at the Mesos code and it appears that we
> > currently do not even pass extra Java options to executors except in
> > coarse-grained mode, and even in this mode we do not pass them to executors
> > correctly. I have filed a related JIRA here:
> > https://issues.apache.org/jira/browse/SPARK-2921. This is a somewhat
> > serious limitation and we will try to fix this for 1.1.
> >
> > -Andrew
> >
> >
> > 2014-08-07 19:42 GMT-07:00 Andrew Or <an...@databricks.com>:
> >
> >> Thanks Marcelo, I have moved the changes to a new PR to describe the
> >> problems more clearly: https://github.com/apache/spark/pull/1845
> >>
> >> @Gary Yeah, the goal is to get this into 1.1 as a bug fix.
> >>
> >>
> >> 2014-08-07 17:30 GMT-07:00 Gary Malouf <ma...@gmail.com>:
> >>
> >>> Can this be cherry-picked for 1.1 if everything works out? In my
> >>> opinion, it could be qualified as a bug fix.
> >>>
> >>>
> >>> On Thu, Aug 7, 2014 at 5:47 PM, Marcelo Vanzin <va...@cloudera.com>
> >>> wrote:
> >>>
> >>> > Andrew has been working on a fix:
> >>> > https://github.com/apache/spark/pull/1770
> > >>> spark-class, I
> > >>> >>>> can
> > >>> >>>> >> see why this is; parsing spark-defaults.conf from bash could
> be
> > >>> >>>> brittle.
> > >>> >>>> >>
> > >>> >>>> >> But from an ergonomic point of view, it's a step back to go
> > from a
> > >>> >>>> >> set-it-and-forget-it configuration in spark-env.sh, to
> > requiring
> > >>> >>>> command
> > >>> >>>> >> line arguments.
> > >>> >>>> >>
> > >>> >>>> >> I can solve this with an ad-hoc script to wrap spark-shell
> with
> > >>> the
> > >>> >>>> >> appropriate arguments, but I wanted to bring the issue up to
> > see
> > >>> if
> > >>> >>>> anyone
> > >>> >>>> >> else had run into it,
> > >>> >>>> >> or had any direction for a general solution (beyond parsing
> > java
> > >>> >>>> properties
> > >>> >>>> >> files from bash).
> > >>> >>>> >
> > >>> >>>> >
> > >>> >>>> >
> > >>> >>>> > --
> > >>> >>>> > Marcelo
> > >>> >>>>
> > >>> >>>
> > >>> >>>
> > >>>
> > >>
> > >>
> >
> >
> >
> > --
> > Marcelo
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
> > For additional commands, e-mail: dev-help@spark.apache.org
> >
> >
>

Re: replacement for SPARK_JAVA_OPTS

Posted by Gary Malouf <ma...@gmail.com>.
Can this be cherry-picked for 1.1 if everything works out? In my opinion,
it qualifies as a bug fix.


On Thu, Aug 7, 2014 at 5:47 PM, Marcelo Vanzin <va...@cloudera.com> wrote:

> Andrew has been working on a fix:
> https://github.com/apache/spark/pull/1770
>
> On Thu, Aug 7, 2014 at 2:35 PM, Cody Koeninger <co...@koeninger.org> wrote:
> > Just wanted to check in on this, see if I should file a bug report
> > regarding the mesos argument propagation.

Re: replacement for SPARK_JAVA_OPTS

Posted by Marcelo Vanzin <va...@cloudera.com>.
Andrew has been working on a fix:
https://github.com/apache/spark/pull/1770
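
Once that lands, the expectation (if I've read the thread right) is that
the conf file approach from the start of this thread should just work:

== conf/spark-defaults.conf ==
spark.driver.extraJavaOptions -Dfoo.bar.baz=23

== shell ==
$ ./bin/spark-shell
scala> System.getProperty("foo.bar.baz")
res0: String = 23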

On Thu, Aug 7, 2014 at 2:35 PM, Cody Koeninger <co...@koeninger.org> wrote:
> Just wanted to check in on this, see if I should file a bug report
> regarding the mesos argument propagation.
>
>
> On Thu, Jul 31, 2014 at 8:35 AM, Cody Koeninger <co...@koeninger.org> wrote:
>
>> 1. I've tried with and without escaping the equals sign; it doesn't
>> affect the results.
>>
>> 2. Yeah, exporting SPARK_SUBMIT_OPTS from spark-env.sh works for getting
>> system properties set in the local shell (although not for executors).
>>
>> 3. We're using the default fine-grained mesos mode, not setting
>> spark.mesos.coarse, so it doesn't seem immediately related to that ticket.
>> Should I file a bug report?
>>
>>
>> On Thu, Jul 31, 2014 at 1:33 AM, Patrick Wendell <pw...@gmail.com>
>> wrote:
>>
>>> The third issue may be related to this:
>>> https://issues.apache.org/jira/browse/SPARK-2022
>>>
>>> We can take a look at this during the bug fix period for the 1.1
>>> release next week. If we come up with a fix we can backport it into
>>> the 1.0 branch also.
>>>
>>> On Wed, Jul 30, 2014 at 11:31 PM, Patrick Wendell <pw...@gmail.com>
>>> wrote:
>>> > Thanks for digging around here. I think there are a few distinct issues.
>>> >
>>> > 1. Properties containing the '=' character need to be escaped.
>>> > I was able to load properties fine as long as I escape the '='
>>> > character. But maybe we should document this:
>>> >
>>> > == spark-defaults.conf ==
>>> > spark.foo a\=B
>>> > == shell ==
>>> > scala> sc.getConf.get("spark.foo")
>>> > res2: String = a=B
>>> >
>>> > 2. spark.driver.extraJavaOptions, when set in the properties file,
>>> > doesn't affect the driver when running in client mode (always the case
>>> > for mesos). We should probably document this. In this case you need to
>>> > either use --driver-java-options or set SPARK_SUBMIT_OPTS (see the
>>> > stopgap sketch after this list).
>>> >
>>> > 3. Arguments aren't propagated on Mesos (this might be because of the
>>> > other issues, or a separate bug).
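>>> >
>>> > For item 2, as a stopgap, something like this should work for the
>>> > driver side (untested sketch; foo.bar.baz is just the placeholder
>>> > property from Cody's examples, and spark-env.sh only helps the local
>>> > shell, not the executors):
>>> >
>>> > == conf/spark-env.sh ==
>>> > # bin/spark-class reads this env var when launching SparkSubmit
>>> > export SPARK_SUBMIT_OPTS="$SPARK_SUBMIT_OPTS -Dfoo.bar.baz=23"
>>> >
>>> > == or per invocation ==
>>> > $ ./bin/spark-shell --driver-java-options "-Dfoo.bar.baz=23"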
>>> >
>>> > - Patrick
>>> >
>>> > On Wed, Jul 30, 2014 at 3:10 PM, Cody Koeninger <co...@koeninger.org> wrote:
>>> >> In addition, spark.executor.extraJavaOptions does not seem to behave as I
>>> >> would expect; java arguments don't seem to be propagated to executors.
>>> >>
>>> >>
>>> >> $ cat conf/spark-defaults.conf
>>> >>
>>> >> spark.master mesos://zk://etl-01.mxstg:2181,etl-02.mxstg:2181,etl-03.mxstg:2181/masters
>>> >> spark.executor.extraJavaOptions -Dfoo.bar.baz=23
>>> >> spark.driver.extraJavaOptions -Dfoo.bar.baz=23
>>> >>
>>> >>
>>> >> $ ./bin/spark-shell
>>> >>
>>> >> scala> sc.getConf.get("spark.executor.extraJavaOptions")
>>> >> res0: String = -Dfoo.bar.baz=23
>>> >>
>>> >> scala> sc.parallelize(1 to 100).map{ i => (
>>> >>      |  java.net.InetAddress.getLocalHost.getHostName,
>>> >>      |  System.getProperty("foo.bar.baz")
>>> >>      | )}.collect
>>> >>
>>> >> res1: Array[(String, String)] = Array((dn-01.mxstg,null),
>>> >> (dn-01.mxstg,null), (dn-01.mxstg,null), (dn-01.mxstg,null),
>>> >> (dn-01.mxstg,null), (dn-01.mxstg,null), (dn-01.mxstg,null),
>>> >> (dn-01.mxstg,null), (dn-01.mxstg,null), (dn-01.mxstg,null),
>>> >> (dn-01.mxstg,null), (dn-01.mxstg,null), (dn-02.mxstg,null),
>>> >> (dn-02.mxstg,null), ...
>>> >>
>>> >>
>>> >>
>>> >> Note that this is a mesos deployment, although I wouldn't expect that to
>>> >> affect the availability of spark.driver.extraJavaOptions in a local spark
>>> >> shell.
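>>> >>
>>> >> One way to double-check from outside the shell, assuming ssh access to
>>> >> a worker (dn-01.mxstg is just the hostname from the output above),
>>> >> would be something like:
>>> >>
>>> >> $ ssh dn-01.mxstg "ps -ef | grep '[f]oo.bar.baz'"
>>> >>
>>> >> (the [f] keeps grep from matching itself) which should come back empty
>>> >> if the executor JVMs never got the flag.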
>>> >>
>>> >>
>>> >> On Wed, Jul 30, 2014 at 4:18 PM, Cody Koeninger <co...@koeninger.org> wrote:
>>> >>
>>> >>> Either whitespace or an equals sign is a valid properties file separator.
>>> >>> Here's an example:
>>> >>>
>>> >>> $ cat conf/spark-defaults.conf
>>> >>> spark.driver.extraJavaOptions -Dfoo.bar.baz=23
>>> >>>
>>> >>> $ ./bin/spark-shell -v
>>> >>> Using properties file: /opt/spark/conf/spark-defaults.conf
>>> >>> Adding default property: spark.driver.extraJavaOptions=-Dfoo.bar.baz=23
>>> >>>
>>> >>>
>>> >>> scala>  System.getProperty("foo.bar.baz")
>>> >>> res0: String = null
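>>> >>>
>>> >>> For what it's worth, the stock java.util.Properties parser handles the
>>> >>> whitespace form the same way; a quick illustrative check (not
>>> >>> necessarily what spark-submit uses internally):
>>> >>>
>>> >>> scala> val p = new java.util.Properties
>>> >>> scala> p.load(new java.io.StringReader("spark.driver.extraJavaOptions -Dfoo.bar.baz=23"))
>>> >>> scala> p.getProperty("spark.driver.extraJavaOptions")
>>> >>> res1: String = -Dfoo.bar.baz=23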
>>> >>>
>>> >>>
>>> >>> If you add double quotes, the resulting string value will have double
>>> >>> quotes.
>>> >>>
>>> >>>
>>> >>> $ cat conf/spark-defaults.conf
>>> >>> spark.driver.extraJavaOptions "-Dfoo.bar.baz=23"
>>> >>>
>>> >>> $ ./bin/spark-shell -v
>>> >>> Using properties file: /opt/spark/conf/spark-defaults.conf
>>> >>> Adding default property: spark.driver.extraJavaOptions="-Dfoo.bar.baz=23"
>>> >>>
>>> >>> scala>  System.getProperty("foo.bar.baz")
>>> >>> res0: String = null
>>> >>>
>>> >>>
>>> >>> Neither one of those affects the issue; the underlying problem in my case
>>> >>> seems to be that bin/spark-class uses the SPARK_SUBMIT_OPTS and
>>> >>> SPARK_JAVA_OPTS environment variables, but nothing parses
>>> >>> spark-defaults.conf before the java process is started.
>>> >>>
>>> >>> Here's an example of the process running when only spark-defaults.conf is
>>> >>> being used:
>>> >>>
>>> >>> $ ps -ef | grep spark
>>> >>>
>>> >>> 514       5182  2058  0 21:05 pts/2    00:00:00 bash ./bin/spark-shell -v
>>> >>>
>>> >>> 514       5189  5182  4 21:05 pts/2    00:00:22 /usr/local/java/bin/java
>>> >>> -cp ::/opt/spark/conf:/opt/spark/lib/spark-assembly-1.0.1-hadoop2.3.0-mr1-cdh5.0.2.jar:/etc/hadoop/conf-mx
>>> >>> -XX:MaxPermSize=128m -Djava.library.path= -Xms512m -Xmx512m
>>> >>> org.apache.spark.deploy.SparkSubmit spark-shell -v --class
>>> >>> org.apache.spark.repl.Main
>>> >>>
>>> >>>
>>> >>> Here's an example of it when the command line --driver-java-options is
>>> >>> used (and thus things work):
>>> >>>
>>> >>>
>>> >>> $ ps -ef | grep spark
>>> >>> 514       5392  2058  0 21:15 pts/2    00:00:00 bash ./bin/spark-shell -v
>>> >>> --driver-java-options -Dfoo.bar.baz=23
>>> >>>
>>> >>> 514       5399  5392 80 21:15 pts/2    00:00:06 /usr/local/java/bin/java
>>> >>> -cp ::/opt/spark/conf:/opt/spark/lib/spark-assembly-1.0.1-hadoop2.3.0-mr1-cdh5.0.2.jar:/etc/hadoop/conf-mx
>>> >>> -XX:MaxPermSize=128m -Dfoo.bar.baz=23 -Djava.library.path= -Xms512m
>>> >>> -Xmx512m org.apache.spark.deploy.SparkSubmit spark-shell -v
>>> >>> --driver-java-options -Dfoo.bar.baz=23 --class org.apache.spark.repl.Main
>>> >>>
>>> >>>
>>> >>>
>>> >>>
>>> >>> On Wed, Jul 30, 2014 at 3:43 PM, Patrick Wendell <pw...@gmail.com>
>>> >>> wrote:
>>> >>>
>>> >>>> Cody - in your example you are using the '=' character, but in our
>>> >>>> documentation and tests we use whitespace to separate the key and
>>> >>>> value in the defaults file.
>>> >>>>
>>> >>>> docs: http://spark.apache.org/docs/latest/configuration.html
>>> >>>>
>>> >>>> spark.driver.extraJavaOptions -Dfoo.bar.baz=23
>>> >>>>
>>> >>>> I'm not sure if the Java properties file parser will try to interpret
>>> >>>> the equals sign. If so, you might need to do this:
>>> >>>>
>>> >>>> spark.driver.extraJavaOptions "-Dfoo.bar.baz=23"
>>> >>>>
>>> >>>> Do those work for you?
>>> >>>>
>>> >>>> On Wed, Jul 30, 2014 at 1:32 PM, Marcelo Vanzin <vanzin@cloudera.com> wrote:
>>> >>>> > Hi Cody,
>>> >>>> >
>>> >>>> > Could you file a bug for this if there isn't one already?
>>> >>>> >
>>> >>>> > For system properties SparkSubmit should be able to read those
>>> >>>> > settings and do the right thing, but that obviously won't work for
>>> >>>> > other JVM options... the current code should work fine in cluster mode
>>> >>>> > though, since the driver is a different process. :-)
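>>> >>>> >
>>> >>>> > Rough illustration of that distinction: a -D system property can still
>>> >>>> > be set after the JVM is up, e.g.
>>> >>>> >
>>> >>>> > scala> System.setProperty("foo.bar.baz", "23")
>>> >>>> >
>>> >>>> > but flags like -Xmx or -XX:MaxPermSize are fixed at launch time, which
>>> >>>> > is why an already-running client-mode driver can't pick them up.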
>>> >>>> >
>>> >>>> >
>>> >>>> > On Wed, Jul 30, 2014 at 1:12 PM, Cody Koeninger <cody@koeninger.org> wrote:
>>> >>>> >> We were previously using SPARK_JAVA_OPTS to set java system properties via -D.
>>> >>>> >>
>>> >>>> >> This was used for properties that varied on a per-deployment-environment
>>> >>>> >> basis, but needed to be available in the spark shell and workers.
>>> >>>> >>
>>> >>>> >> On upgrading to 1.0, we saw that SPARK_JAVA_OPTS had been deprecated, and
>>> >>>> >> replaced by spark-defaults.conf and command line arguments to spark-submit
>>> >>>> >> or spark-shell.
>>> >>>> >>
>>> >>>> >> However, setting spark.driver.extraJavaOptions and
>>> >>>> >> spark.executor.extraJavaOptions in spark-defaults.conf is not a
>>> >>>> >> replacement for SPARK_JAVA_OPTS:
>>> >>>> >>
>>> >>>> >>
>>> >>>> >> $ cat conf/spark-defaults.conf
>>> >>>> >> spark.driver.extraJavaOptions=-Dfoo.bar.baz=23
>>> >>>> >>
>>> >>>> >> $ ./bin/spark-shell
>>> >>>> >>
>>> >>>> >> scala> System.getProperty("foo.bar.baz")
>>> >>>> >> res0: String = null
>>> >>>> >>
>>> >>>> >>
>>> >>>> >> $ ./bin/spark-shell --driver-java-options "-Dfoo.bar.baz=23"
>>> >>>> >>
>>> >>>> >> scala> System.getProperty("foo.bar.baz")
>>> >>>> >> res0: String = 23
>>> >>>> >>
>>> >>>> >>
>>> >>>> >> Looking through the shell scripts for spark-submit and spark-class, I can
>>> >>>> >> see why this is; parsing spark-defaults.conf from bash could be brittle.
>>> >>>> >>
>>> >>>> >> But from an ergonomic point of view, it's a step back to go from a
>>> >>>> >> set-it-and-forget-it configuration in spark-env.sh, to requiring command
>>> >>>> >> line arguments.
>>> >>>> >>
>>> >>>> >> I can solve this with an ad-hoc script to wrap spark-shell with the
>>> >>>> >> appropriate arguments, but I wanted to bring the issue up to see if anyone
>>> >>>> >> else had run into it, or had any direction for a general solution
>>> >>>> >> (beyond parsing java properties files from bash).
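>>> >>>> >>
>>> >>>> >> A minimal sketch of such a wrapper, assuming the opts live in a
>>> >>>> >> hypothetical side file /opt/spark/conf/driver-java-opts:
>>> >>>> >>
>>> >>>> >> #!/usr/bin/env bash
>>> >>>> >> # pass per-environment java opts via the flag that is known to work
>>> >>>> >> OPTS=$(cat /opt/spark/conf/driver-java-opts)
>>> >>>> >> exec /opt/spark/bin/spark-shell --driver-java-options "$OPTS" "$@"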
>>> >>>> >
>>> >>>> >
>>> >>>> >
>>> >>>> > --
>>> >>>> > Marcelo
>>> >>>>
>>> >>>
>>> >>>
>>>
>>
>>



-- 
Marcelo

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
For additional commands, e-mail: dev-help@spark.apache.org