Posted to user@spark.apache.org by vonnagy <iv...@vadio.com> on 2016/10/06 16:20:47 UTC

Submit job with driver options in Mesos Cluster mode

I am trying to submit a job to Spark running in a Mesos cluster. We need to
pass custom Java options to the driver and executors for configuration, but
the driver task never includes the options. Here is an example submission:

GC_OPTS="-XX:+UseConcMarkSweepGC 
         -verbose:gc -XX:+PrintGCTimeStamps -Xloggc:$appdir/gc.out 
         -XX:MaxPermSize=512m 
         -XX:+CMSClassUnloadingEnabled " 

EXEC_PARAMS="-Dloglevel=DEBUG -Dkafka.broker-address=${KAFKA_ADDRESS}
-Dredis.master=${REDIS_MASTER} -Dredis.port=${REDIS_PORT}"

spark-submit \ 
  --name client-events-intake \ 
  --class ClientEventsApp \ 
  --deploy-mode cluster \ 
  --driver-java-options "${EXEC_PARAMS} ${GC_OPTS}" \ 
  --conf "spark.ui.killEnabled=true" \ 
  --conf "spark.mesos.coarse=true" \ 
  --conf "spark.driver.extraJavaOptions=${EXEC_PARAMS}" \ 
  --conf "spark.executor.extraJavaOptions=${EXEC_PARAMS}" \ 
  --master mesos://someip:7077 \ 
  --verbose \ 
  some.jar 

When the driver task runs in Mesos, it creates the following command:

sh -c 'cd spark-1*;  bin/spark-submit --name client-events-intake --class
ClientEventsApp --master mesos://someip:5050 --driver-cores 1.0
--driver-memory 512M ../some.jar ' 

There are no options for the driver here, so the driver app blows up
because it can't find the Java options it expects. However, the environment
variables do contain the executor options:

SPARK_EXECUTOR_OPTS -> -Dspark.executor.extraJavaOptions=-Dloglevel=DEBUG
... 

Any help would be great. I know that we can set some "spark.*" settings in
the default configs, but our options are not necessarily Spark-related. This
is not an issue when running the same logic in Spark standalone mode outside
of a Mesos cluster.

Thanks! 



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Submit-job-with-driver-options-in-Mesos-Cluster-mode-tp27853.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscribe@spark.apache.org


Re: Submit job with driver options in Mesos Cluster mode

Posted by Michael Gummelt <mg...@mesosphere.io>.
Can you check if this JIRA is relevant?
https://issues.apache.org/jira/browse/SPARK-2608

If not, can you make a new one?



-- 
Michael Gummelt
Software Engineer
Mesosphere

Re: Submit job with driver options in Mesos Cluster mode

Posted by Rodrick Brown <ro...@orchard-app.com>.
Try setting the values in $SPARK_HOME/conf/spark-defaults.conf 

i.e. 

$ egrep 'spark.(driver|executor).extra' /data/orchard/spark-2.0.1/conf/spark-defaults.conf
spark.executor.extraJavaOptions    -Duser.timezone=UTC -Xloggc:garbage-collector.log
spark.driver.extraJavaOptions      -Duser.timezone=UTC -Xloggc:garbage-collector.log

-- 
Rodrick Brown / DevOps Engineer
+1 917 445 6839 / rodrick@orchardplatform.com
Orchard Platform
101 5th Avenue, 4th Floor, New York, NY 10003
http://www.orchardplatform.com
Orchard Blog <http://www.orchardplatform.com/blog/> | Marketplace Lending Meetup <http://www.meetup.com/Peer-to-Peer-Lending-P2P/>



Re: Submit job with driver options in Mesos Cluster mode

Posted by vonnagy <iv...@vadio.com>.
We were using 1.6, but now we are on 2.0.1. Both versions show the same
issue.

I dove deep into the Spark code and have confirmed that the extra Java
options are *not* added to the process on the executors. At this point, I
believe you have to use spark-defaults.conf to set any values that will be
used. The problem for us is that these extra Java options are not the same
for each job that is submitted, so we can't put the values in
spark-defaults.conf.
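Given that finding, one hedged workaround (a sketch, not something from this thread) is to stop relying on extraJavaOptions and instead pass the per-job `-D` settings as plain application arguments, mirroring them into system properties at the top of `main` so existing code that reads `System.getProperty` keeps working. `ClientEventsApp` is the class name from the submit example above; everything else here is illustrative:

```scala
// Sketch of a workaround: accept "-Dkey=value" strings as ordinary application
// arguments (appended after the jar in spark-submit) and install them as JVM
// system properties before the rest of the app runs.
object ClientEventsApp {
  def main(args: Array[String]): Unit = {
    // Pick out args shaped like -Dkey=value; leave any other args untouched.
    val props = args.collect {
      case a if a.startsWith("-D") && a.contains("=") =>
        val Array(k, v) = a.stripPrefix("-D").split("=", 2)
        k -> v
    }
    // Mirror them into system properties so code using System.getProperty /
    // sys.props sees the same values it would have seen from extraJavaOptions.
    props.foreach { case (k, v) => sys.props(k) = v }

    // ... rest of the application, e.g. sys.props.get("loglevel")
  }
}
```

With this approach the submit line becomes `spark-submit ... some.jar -Dloglevel=DEBUG -Dredis.port=${REDIS_PORT}`, which survives cluster mode because application arguments are forwarded verbatim to the driver.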

Ivan



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Submit-job-with-driver-options-in-Mesos-Cluster-mode-tp27853p27973.html


Re: Submit job with driver options in Mesos Cluster mode

Posted by csakoda <cm...@gmail.com>.
I'm seeing something very similar in my own Mesos/Spark Cluster.

High-level summary: when I use `--deploy-mode cluster`, Java properties that
I pass to my driver via `spark.driver.extraJavaOptions` are not available to
the driver. I've confirmed this by inspecting the output of
`System.getProperties` and the environment variables from within the driver.

I also see SPARK_EXECUTOR_OPTS carrying the values that I wish were available
as Java system properties.
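The inspection described above can be sketched as a tiny, Spark-free main class (`DriverEnvDump` is a hypothetical name); submitting it as the driver class shows exactly which system properties and environment variables actually reached the driver process:

```scala
// Diagnostic sketch: dump the driver JVM's system properties and environment
// so you can check whether values from extraJavaOptions are present.
object DriverEnvDump {
  def main(args: Array[String]): Unit = {
    // System properties (where -D flags would land), sorted for readability.
    sys.props.toSeq.sortBy(_._1).foreach { case (k, v) => println(s"prop $k=$v") }
    // Environment variables (where SPARK_EXECUTOR_OPTS shows up).
    sys.env.toSeq.sortBy(_._1).foreach { case (k, v) => println(s"env  $k=$v") }
  }
}
```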

Did you find anything that helped you either understand or resolve this
issue?  I'm still stuck.

What version of Spark + Mesos are you using? 1.6.1 and
1.0.1-2.0.93.ubuntu1404, respectively, over here.

The only helpful bit I've found is that if I set the properties I care about
in spark-defaults.conf on all the Spark workers, they appear as desired in
the driver's Java properties.




--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Submit-job-with-driver-options-in-Mesos-Cluster-mode-tp27853p27972.html