Posted to issues@spark.apache.org by "Pedro Rodriguez (JIRA)" <ji...@apache.org> on 2014/11/14 18:52:34 UTC

[jira] [Created] (SPARK-4408) Behavior difference between spark-submit conf vs cmd line args

Pedro Rodriguez created SPARK-4408:
--------------------------------------

             Summary: Behavior difference between spark-submit conf vs cmd line args
                 Key: SPARK-4408
                 URL: https://issues.apache.org/jira/browse/SPARK-4408
             Project: Spark
          Issue Type: Bug
          Components: Deploy, Documentation
    Affects Versions: 1.1.0, 1.2.0
            Reporter: Pedro Rodriguez
            Priority: Minor


The behavior of bin/spark-submit appears to differ depending on whether memory settings are passed as command line arguments or set in a configuration file. This looks like either a bug or, at the least, a place where the documentation could be clearer about the difference.

Steps to Replicate:
1. Submit a job with a command similar to:
bin/spark-submit --class nipslda.NipsLda --master local --conf spark.executor.memory=2g --conf spark.driver.memory=2g --verbose ~/Code/nips-lda/target/scala-2.10/nips-lda-assembly-0.1.jar
2. Navigate to the SparkUI.
3. The Environment tab lists driver and executor memory correctly.
4. The Executors tab shows memory as ~260MB (in my case) or the default JVM limit.
5. Write the memory settings to conf/spark-defaults.conf.
6. Run the same command without the memory arguments.
7. The SparkUI Executors tab now shows the configured memory correctly.
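
For step 5, a minimal sketch of writing the two settings to the defaults file. It assumes the file lives at conf/spark-defaults.conf under the Spark directory (SPARK_HOME is an assumption here, falling back to the current directory) and uses the whitespace-separated "key value" form of that file.

```shell
# Assumption: SPARK_HOME points at the Spark directory; otherwise use the current one.
CONF_FILE="${SPARK_HOME:-.}/conf/spark-defaults.conf"
mkdir -p "$(dirname "$CONF_FILE")"

# Append the same two settings that step 1 passed with --conf.
cat >> "$CONF_FILE" <<'EOF'
spark.executor.memory 2g
spark.driver.memory 2g
EOF

# Show the memory settings that are now in the file.
grep 'memory' "$CONF_FILE"
```

With these lines in place, the command from step 1 can be run with both --conf arguments removed, as in step 6.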

The spark-submit script includes this passage in its comments:

# For client mode, the driver will be launched in the same JVM that launches
# SparkSubmit, so we may need to read the properties file for any extra class
# paths, library paths, java options and memory early on. Otherwise, it will
# be too late by the time the driver JVM has started.

Based on this, it seems that the JVM parameters for spark-submit are used for the job itself when in client mode. Effectively, this makes it impossible to change JVM parameters through --conf command line arguments, since the JVM has already been launched by the time they are read.
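
To illustrate the suspected mechanism, here is a simplified, hypothetical sketch of a launcher that has to pick the heap size before the JVM exists. It is not the real bin/spark-submit code: the function name resolve_driver_memory and the 512m default are assumptions for illustration, and the properties-file lookup mentioned in the comment above is left out. The point is only that a setting the launcher does not parse itself cannot affect a JVM it has already sized.

```shell
# Hypothetical stand-in for a launcher script; NOT the real bin/spark-submit.
# It must choose the heap size before starting the JVM, so it can only use
# what it parses itself: a dedicated flag, or else a built-in default.
DEFAULT_DRIVER_MEMORY="512m"   # assumed default, for illustration only

resolve_driver_memory() {
  mem="$DEFAULT_DRIVER_MEMORY"
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --driver-memory)
        # Dedicated flag: visible to the launcher, so it takes effect.
        if [ "$#" -ge 2 ]; then
          mem="$2"
          shift
        fi
        ;;
      --conf)
        # key=value pairs are skipped here; they would only be
        # interpreted later, inside the already-running JVM.
        if [ "$#" -ge 2 ]; then
          shift
        fi
        ;;
    esac
    shift
  done
  echo "$mem"
}

resolve_driver_memory --master local --conf spark.driver.memory=2g   # prints 512m
resolve_driver_memory --master local --driver-memory 2g              # prints 2g
```

If this reading is right, passing --driver-memory 2g in place of --conf spark.driver.memory=2g in the command from step 1 would be the variant to try.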

This seems like unexpected and undesirable behavior. Either it could be fixed, or the docs could be changed to better reflect how this works.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
