Posted to reviews@spark.apache.org by witgo <gi...@git.apache.org> on 2014/09/13 09:37:49 UTC

[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

GitHub user witgo opened a pull request:

    https://github.com/apache/spark/pull/2379

    [SPARK-2098] All Spark processes should support spark-defaults.conf, config file

    This is another implementation of #1256.
    cc @andrewor14 @vanzin
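
    For context, `spark-defaults.conf` uses the standard Java properties format, one `key value` pair per line, and only keys starting with `spark.` are loaded. A minimal sketch (the values below are placeholders, not taken from this PR):

    ```
    # conf/spark-defaults.conf
    spark.master            spark://master:7077
    spark.worker.ui.port    8082
    spark.eventLog.enabled  true
    ```

    The diffs discussed below also add a `--properties-file FILE` option so the daemons can be pointed at a non-default file.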

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/witgo/spark SPARK-2098-new

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/2379.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2379
    
----
commit 217280a18c2e4df4c7531b7387c8ec7bf03efb8b
Author: GuoQiang Li <wi...@qq.com>
Date:   2014-09-13T07:26:36Z

    All Spark processes should support spark-defaults.conf, config file

----




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r18001383
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/worker/WorkerArguments.scala ---
    @@ -47,15 +48,18 @@ private[spark] class WorkerArguments(args: Array[String], conf: SparkConf) {
       if (System.getenv("SPARK_WORKER_WEBUI_PORT") != null) {
         webUiPort = System.getenv("SPARK_WORKER_WEBUI_PORT").toInt
       }
    -  if (conf.contains("spark.worker.ui.port")) {
    -    webUiPort = conf.get("spark.worker.ui.port").toInt
    -  }
       if (System.getenv("SPARK_WORKER_DIR") != null) {
         workDir = System.getenv("SPARK_WORKER_DIR")
       }
     
       parse(args.toList)
     
    +  propertiesFile = Utils.loadDefaultSparkProperties(conf, propertiesFile)
    --- End diff --
    
    Same here, add a comment to explain ordering




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r18548720
  
    --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
    @@ -1387,6 +1387,54 @@ private[spark] object Utils extends Logging {
         }
       }
     
    +  /**
    +   * Load default Spark properties from the given file. If no file is provided,
    +   * use the common defaults file. This mutates state in the given SparkConf and
    +   * in this JVM's system properties if the config specified in the file is not
    +   * already set. Return the path of the properties file used.
    +   */
    +  def loadDefaultSparkProperties(conf: SparkConf, filePath: String = null): String = {
    +    val path = Option(filePath).getOrElse(getDefaultPropertiesFile)
    +    Option(path).foreach { confFile =>
    +      getPropertiesFromFile(confFile).filter { case (k, v) =>
    +        k.startsWith("spark.")
    +      }.foreach { case (k, v) =>
    +        conf.setIfMissing(k, v)
    +        sys.props.getOrElseUpdate(k, v)
    +      }
    +    }
    +    path
    +  }
    +
    +  /** Load properties present in the given file. */
    +  def getPropertiesFromFile(filename: String): Map[String, String] = {
    +    val file = new File(filename)
    +    require(file.exists(), s"Properties file $file does not exist")
    +    require(file.isFile(), s"Properties file $file is not a normal file")
    +
    +    val inReader = new InputStreamReader(new FileInputStream(file), "UTF-8")
    +    try {
    +      val properties = new Properties()
    +      properties.load(inReader)
    +      properties.stringPropertyNames().map(k => (k, properties(k).trim)).toMap
    +    } catch {
    +      case e: IOException =>
    +        throw new SparkException(s"Failed when loading Spark properties from $filename", e)
    +    } finally {
    +      inReader.close()
    +    }
    +  }
    +
    +  /** Return the path of the default Spark properties file. */
    +  def getDefaultPropertiesFile(): String = {
    +    sys.env.get("SPARK_CONF_DIR")
    +      .orElse(sys.env.get("SPARK_HOME").map { t => s"$t${File.separator}conf"})
    +      .map { t => new File(s"$t${File.separator}spark-defaults.conf")}
    --- End diff --
    
    nit: space before `}`




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r18001245
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/master/MasterArguments.scala ---
    @@ -38,12 +39,14 @@ private[spark] class MasterArguments(args: Array[String], conf: SparkConf) {
       if (System.getenv("SPARK_MASTER_WEBUI_PORT") != null) {
         webUiPort = System.getenv("SPARK_MASTER_WEBUI_PORT").toInt
       }
    +
    +  parse(args.toList)
    +  propertiesFile = Utils.loadDefaultSparkProperties(conf, propertiesFile)
    --- End diff --
    
    We should add a comment that explains why this line must happen before we access `conf` (i.e. because it mutates `conf`)
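
    For illustration, a hedged sketch of how that comment and ordering could read in `MasterArguments` (abbreviated; the property read at the end is only an example of an access that must come after the load):

    ```scala
    parse(args.toList)

    // This mutates `conf`, so all accesses to it must be made after this line.
    propertiesFile = Utils.loadDefaultSparkProperties(conf, propertiesFile)

    // Now safe: the defaults file has already been merged into `conf`.
    if (conf.contains("spark.master.ui.port")) {
      webUiPort = conf.get("spark.master.ui.port").toInt
    }
    ```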




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by witgo <gi...@git.apache.org>.
Github user witgo commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-56253206
  
    Jenkins, retest this please. 




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by witgo <gi...@git.apache.org>.
Github user witgo commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-59159165
  
    Cool!  Thanks @andrewor14  @vanzin 




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r17808036
  
    --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
    @@ -1357,6 +1357,55 @@ private[spark] object Utils extends Logging {
         }
       }
     
    +  /**
    +   * Load the default spark properties from the file
    +   * If file is null, then use the getDefaultConfigFile method's return.
    +   */
    +  private[spark] def loadDefaultSparkProperties(conf: SparkConf,
    +    file: String = null): String = {
    +    val path = Option(file).getOrElse(getDefaultConfigFile)
    +    Option(path).foreach { confFile =>
    +      getPropertiesFromFile(confFile).filter { case (k, v) =>
    +        k.startsWith("spark.")
    +      }.foreach { case (k, v) =>
    +        conf.setIfMissing(k, v)
    +        sys.props.getOrElseUpdate(k, v)
    +      }
    +    }
    +    path
    +  }
    +
    +  /** Load properties present in the given file. */
    +  def getPropertiesFromFile(filename: String): Map[String, String] = {
    +    val file = new File(filename)
    +    require(file.exists(), s"Properties file $file does not exist")
    +    require(file.isFile(), s"Properties file $file is not a normal file")
    +
    +    val inReader = new InputStreamReader(new FileInputStream(file), "UTF-8")
    +    try {
    +      val properties = new Properties()
    +      properties.load(inReader)
    +      properties.stringPropertyNames().map(k => (k, properties(k).trim)).toMap
    +    } catch {
    +      case e: IOException =>
    +        throw new SparkException(s"Failed when loading Spark properties from $filename", e)
    +    } finally {
    +      inReader.close()
    +    }
    +  }
    +
    +  /** Return the default spark configuration file */
    +  private[spark] def getDefaultConfigFile(): String = {
    +    val s = File.separator
    +    val configFile = sys.env.get("SPARK_CONF_DIR").map(t =>
    --- End diff --
    
    Can you reformat this so that the two `.map` calls and the `.filter` are on separate lines?
    
        foo.map {}
          .filter {}
          .map {}
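
    Applied to this method, the chained style could end up looking roughly like the following sketch (compare the `getDefaultPropertiesFile` diff quoted near the top of this thread; the trailing `.orNull` is an assumption to satisfy the `String` return type):

    ```scala
    def getDefaultPropertiesFile(): String = {
      sys.env.get("SPARK_CONF_DIR")
        .orElse(sys.env.get("SPARK_HOME").map { t => s"$t${File.separator}conf" })
        .map { t => new File(s"$t${File.separator}spark-defaults.conf") }
        .filter(_.isFile)
        .map(_.getAbsolutePath)
        .orNull
    }
    ```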





[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-55986568
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20507/consoleFull) for   PR 2379 at commit [`c1824d2`](https://github.com/apache/spark/commit/c1824d23222e2aecca415426c193ed3529a3ac73).
     * This patch **passes** unit tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r18001361
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/history/HistoryServerArguments.scala ---
    @@ -44,10 +51,17 @@ private[spark] class HistoryServerArguments(conf: SparkConf, args: Array[String]
         }
       }
     
    +  propertiesFile = Utils.loadDefaultSparkProperties(conf, propertiesFile)
    +
       private def printUsageAndExit(exitCode: Int) {
         System.err.println(
           """
    -      |Usage: HistoryServer
    +      |Usage: HistoryServer [options]
    +      |
    +      |Options:
    +      |  -d DIR, --dir DIR           Directory where app logs are stored.
    +      |  --properties-file FILE      Path to a file from which to load extra properties. If not
    +      |                              specified, this will look for conf/spark-defaults.conf.
    --- End diff --
    
    "Path to a custom Spark properties file. Default is conf/spark-defaults.conf."




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-57040598
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/20897/




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-56819375
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20804/consoleFull) for   PR 2379 at commit [`d0991dc`](https://github.com/apache/spark/commit/d0991dcab5e7ee6e10cb2723ea5aa989827b2a45).
     * This patch **passes** unit tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by witgo <gi...@git.apache.org>.
Github user witgo commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r17646251
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/worker/WorkerArguments.scala ---
    @@ -47,14 +48,15 @@ private[spark] class WorkerArguments(args: Array[String], conf: SparkConf) {
       if (System.getenv("SPARK_WORKER_WEBUI_PORT") != null) {
         webUiPort = System.getenv("SPARK_WORKER_WEBUI_PORT").toInt
       }
    -  if (conf.contains("spark.worker.ui.port")) {
    -    webUiPort = conf.get("spark.worker.ui.port").toInt
    +  if (System.getenv("spark.worker.ui.port")!= null) {
    +    webUiPort = System.getenv("spark.worker.ui.port").toInt
    --- End diff --
    
    It's like this: the Spark source contains a lot of code like `val conf = new SparkConf`, for example:
    [SparkHadoopUtil.scala#L38](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala#L38), [EventLoggingListener.scala#L216](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/EventLoggingListener.scala#L216),
    which requires us to set the `system property`.
    
    So if we don't change the initialization order, we need to set both `conf` and `sys.props`.




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r17614926
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/worker/WorkerArguments.scala ---
    @@ -47,14 +48,15 @@ private[spark] class WorkerArguments(args: Array[String], conf: SparkConf) {
       if (System.getenv("SPARK_WORKER_WEBUI_PORT") != null) {
         webUiPort = System.getenv("SPARK_WORKER_WEBUI_PORT").toInt
       }
    -  if (conf.contains("spark.worker.ui.port")) {
    -    webUiPort = conf.get("spark.worker.ui.port").toInt
    +  if (System.getenv("spark.worker.ui.port")!= null) {
    +    webUiPort = System.getenv("spark.worker.ui.port").toInt
    --- End diff --
    
    Wait, this changes the semantics of this property. `spark.worker.ui.port` is not an environment variable but a Spark property. Can you change this back?




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r17615426
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala ---
    @@ -360,14 +360,20 @@ private[spark] class Worker(
     private[spark] object Worker extends Logging {
       def main(argStrings: Array[String]) {
         SignalLogger.register(log)
    +    val args = new WorkerArguments(argStrings)
    +    Option(args.propertiesFile).foreach { file =>
    +      for ((k, v) <- Utils.getPropertiesFromFile(file) if k.startsWith("spark.")) {
    +        sys.props.getOrElseUpdate(k,v)
    +      }
    +    }
         val conf = new SparkConf
    -    val args = new WorkerArguments(argStrings, conf)
    -    val (actorSystem, _) = startSystemAndActor(args.host, args.port, args.webUiPort, args.cores,
    -      args.memory, args.masters, args.workDir)
    --- End diff --
    
    As I mentioned in the main discussion thread, if we do `conf.set` here instead of `sys.props.getOrElseUpdate` then we don't have to change the ordering.




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r17610954
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/history/HistoryServer.scala ---
    @@ -167,14 +167,19 @@ class HistoryServer(
      * This launches the HistoryServer as a Spark daemon.
      */
     object HistoryServer extends Logging {
    -  private val conf = new SparkConf
     
       val UI_PATH_PREFIX = "/history"
     
       def main(argStrings: Array[String]) {
         SignalLogger.register(log)
    -    initSecurity()
    -    new HistoryServerArguments(conf, argStrings)
    +    val arg = new HistoryServerArguments(argStrings)
    +    Option(arg.propertiesFile).foreach { file =>
    --- End diff --
    
    This block is repeated 3 times; maybe add it to Utils? Also, I wonder if this would look better:
    
        Option(arg.propertiesFile)
          .map { f => Utils.getPropertiesFromFile(f) }
          .getOrElse(Map.empty[String, String])
          .filter { case (k, v) => k.startsWith("spark.") }
          .foreach { case (k, v) => sys.props.getOrElseUpdate(k, v) }




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-55916931
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20468/consoleFull) for   PR 2379 at commit [`dae0120`](https://github.com/apache/spark/commit/dae012094195737e28b7cbbb6d4ab1f1ab0d01b3).
     * This patch merges cleanly.




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r17807955
  
    --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
    @@ -1357,6 +1357,55 @@ private[spark] object Utils extends Logging {
         }
       }
     
    +  /**
    +   * Load the default spark properties from the file
    +   * If file is null, then use the getDefaultConfigFile method's return.
    +   */
    +  private[spark] def loadDefaultSparkProperties(conf: SparkConf,
    --- End diff --
    
    super nit: modifier is not necessary since class is already `private[spark]`.




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-59158756
  
    Alright, I'm merging this into master. Thanks @witgo!




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r17745881
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/history/HistoryServerArguments.scala ---
    @@ -44,30 +50,19 @@ private[spark] class HistoryServerArguments(conf: SparkConf, args: Array[String]
         }
       }
     
    +  // Use common defaults file, if not specified by user
    +  propertiesFile = Option(propertiesFile).getOrElse(Utils.getDefaultConfigFile)
    +
       private def printUsageAndExit(exitCode: Int) {
         System.err.println(
           """
    -      |Usage: HistoryServer
    -      |
    -      |Configuration options can be set by setting the corresponding JVM system property.
    -      |History Server options are always available; additional options depend on the provider.
    -      |
    -      |History Server options:
    -      |
    -      |  spark.history.ui.port              Port where server will listen for connections
    -      |                                     (default 18080)
    -      |  spark.history.acls.enable          Whether to enable view acls for all applications
    -      |                                     (default false)
    -      |  spark.history.provider             Name of history provider class (defaults to
    -      |                                     file system-based provider)
    -      |  spark.history.retainedApplications Max number of application UIs to keep loaded in memory
    -      |                                     (default 50)
    -      |FsHistoryProvider options:
    -      |
    -      |  spark.history.fs.logDirectory      Directory where app logs are stored (required)
    -      |  spark.history.fs.updateInterval    How often to reload log data from storage (in seconds,
    -      |                                     default 10)
    -      |""".stripMargin)
    --- End diff --
    
    I still think we should add it back. It's more convenient because some of these options are actually necessary (`spark.history.fs.logDirectory`)
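
    For example, the one effectively required option could be supplied either via `-d`/`--dir` or from the defaults file (the HDFS path below is a placeholder):

    ```
    spark.history.fs.logDirectory  hdfs://namenode:8020/spark-events
    ```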




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r17808099
  
    --- Diff: core/src/test/scala/org/apache/spark/util/UtilsSuite.scala ---
    @@ -297,4 +300,21 @@ class UtilsSuite extends FunSuite {
         }
       }
     
    +  test("loading properties from file") {
    +    val outFile = File.createTempFile("test-load-spark-properties", "test")
    +    try {
    +      System.setProperty("spark.test.fileNameLoadB", "2")
    +      Files.write("spark.test.fileNameLoadA true\n" +
    +        "spark.test.fileNameLoadB 1\n", outFile, Charset.forName("UTF-8"))
    --- End diff --
    
    super nit: Guava has `com.google.common.base.Charsets.UTF_8` which avoids the hardcoded string.
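
    With that suggestion applied, the write in the test would presumably become something like this sketch (using Guava's `Files.write(CharSequence, File, Charset)` overload):

    ```scala
    import com.google.common.base.Charsets
    import com.google.common.io.Files

    Files.write("spark.test.fileNameLoadA true\n" +
      "spark.test.fileNameLoadB 1\n", outFile, Charsets.UTF_8)
    ```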




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by witgo <gi...@git.apache.org>.
Github user witgo commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r17641601
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/worker/WorkerArguments.scala ---
    @@ -47,14 +48,15 @@ private[spark] class WorkerArguments(args: Array[String], conf: SparkConf) {
       if (System.getenv("SPARK_WORKER_WEBUI_PORT") != null) {
         webUiPort = System.getenv("SPARK_WORKER_WEBUI_PORT").toInt
       }
    -  if (conf.contains("spark.worker.ui.port")) {
    -    webUiPort = conf.get("spark.worker.ui.port").toInt
    +  if (System.getenv("spark.worker.ui.port")!= null) {
    +    webUiPort = System.getenv("spark.worker.ui.port").toInt
    --- End diff --
    
    If the spark-defaults.conf file contains a `spark.worker.ui.port` setting, what should we do?




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r18552640
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/history/HistoryServerArguments.scala ---
    @@ -32,11 +34,16 @@ private[spark] class HistoryServerArguments(conf: SparkConf, args: Array[String]
           case ("--dir" | "-d") :: value :: tail =>
             logDir = value
             conf.set("spark.history.fs.logDirectory", value)
    +        System.setProperty("spark.history.fs.logDirectory", value)
    --- End diff --
    
    I had the same thought and almost made the same comment, but then I figured it might be better to keep whatever is set through the conf consistent with the sys props. Not a huge deal either way.




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r18001181
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/history/HistoryServerArguments.scala ---
    @@ -44,10 +51,17 @@ private[spark] class HistoryServerArguments(conf: SparkConf, args: Array[String]
         }
       }
     
    +  propertiesFile = Utils.loadDefaultSparkProperties(conf, propertiesFile)
    +
       private def printUsageAndExit(exitCode: Int) {
         System.err.println(
           """
    -      |Usage: HistoryServer
    +      |Usage: HistoryServer [options]
    +      |
    +      |Options:
    +      |  -d DIR, --dir DIR           Directory where app logs are stored.
    --- End diff --
    
    Ah, yeah. It works but is not the preferred way, so it's better not to put this back in.




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-55848101
  
    @witgo Not sure if I understand what you mean. The changes I am proposing actually change less code (compared to Spark master), not more. Right now a problem in this PR is that the ordering between `new SparkConf` and `WorkerArguments` (or other arguments) is changed, but I'm saying it should not be. All I'm saying is that within `WorkerArguments`, replace `sys.props.getOrElseUpdate` with `conf.setIfMissing`. Do you understand?




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r17743007
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/history/HistoryServerArguments.scala ---
    @@ -44,30 +51,27 @@ private[spark] class HistoryServerArguments(conf: SparkConf, args: Array[String]
         }
       }
     
    +  // Use common defaults file, if not specified by user
    +  propertiesFile = Option(propertiesFile).getOrElse(Utils.getDefaultConfigFile)
    --- End diff --
    
    I know I wrote this in the bug, but I'm wondering if this is really OK. Before, except for spark-submit, no daemons would read the default config file. So this has the potential to add config data that might make things behave differently when running the new Spark.
    
    Just a thought, though. I think the new behavior makes more sense, but it might trip some people.




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-58255299
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21401/




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by witgo <gi...@git.apache.org>.
Github user witgo commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-57040105
  
    @andrewor14  The code has been updated.




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r18001681
  
    --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
    @@ -1357,6 +1357,58 @@ private[spark] object Utils extends Logging {
         }
       }
     
    +  /**
    +   * Load the default spark properties from the file
    +   * Use common defaults file, if not specified by user.
    +   */
    +  def loadDefaultSparkProperties(conf: SparkConf,
    +    file: String = null): String = {
    --- End diff --
    
    Put this on the line above




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-58270876
  
    Looks like there is a legitimate test failure though; can you resolve this, @witgo?




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r18001538
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/master/MasterArguments.scala ---
    @@ -38,12 +39,14 @@ private[spark] class MasterArguments(args: Array[String], conf: SparkConf) {
       if (System.getenv("SPARK_MASTER_WEBUI_PORT") != null) {
         webUiPort = System.getenv("SPARK_MASTER_WEBUI_PORT").toInt
       }
    +
    +  parse(args.toList)
    +  propertiesFile = Utils.loadDefaultSparkProperties(conf, propertiesFile)
    --- End diff --
    
    Maybe something like
    ```
    // This mutates conf, so all accesses to it must be made after this line.
    ```




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by witgo <gi...@git.apache.org>.
Github user witgo commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r17744702
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/worker/WorkerArguments.scala ---
    @@ -47,14 +48,25 @@ private[spark] class WorkerArguments(args: Array[String], conf: SparkConf) {
       if (System.getenv("SPARK_WORKER_WEBUI_PORT") != null) {
         webUiPort = System.getenv("SPARK_WORKER_WEBUI_PORT").toInt
       }
    -  if (conf.contains("spark.worker.ui.port")) {
    -    webUiPort = conf.get("spark.worker.ui.port").toInt
    -  }
       if (System.getenv("SPARK_WORKER_DIR") != null) {
         workDir = System.getenv("SPARK_WORKER_DIR")
       }
     
       parse(args.toList)
    +  propertiesFile = Option(propertiesFile).getOrElse(Utils.getDefaultConfigFile)
    +
    +  Option(propertiesFile).foreach { file =>
    +    Utils.getPropertiesFromFile(file).filter { case (k, v) =>
    +      k.startsWith("spark.")
    +    }.foreach { case (k, v) =>
    +      conf.setIfMissing(k, v)
    +      sys.props.getOrElseUpdate(k, v)
    +    }
    +  }
    --- End diff --
    
    I don't want to put too many things in a `Utils` method. The method should be kept simple.
    How about this?
    ```scala
      def loadSparkProperties(properties: Map[String, String],
        conf: SparkConf): Unit = {
        properties.filter { case (k, v) =>
          k.startsWith("spark.")
        }.foreach { case (k, v) =>
          conf.setIfMissing(k, v)
          sys.props.getOrElseUpdate(k, v)
        }
      }
    ```
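    
    Call sites would then presumably look something like this (hypothetical sketch based on the split proposed above):
    
    ```scala
    Option(propertiesFile).foreach { file =>
      Utils.loadSparkProperties(Utils.getPropertiesFromFile(file), conf)
    }
    ```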




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by witgo <gi...@git.apache.org>.
Github user witgo commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-56253720
  
    Don't know why this didn't trigger a Jenkins retest; the code should be fine.




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-56811320
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20804/consoleFull) for   PR 2379 at commit [`d0991dc`](https://github.com/apache/spark/commit/d0991dcab5e7ee6e10cb2723ea5aa989827b2a45).
     * This patch merges cleanly.




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r17807849
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/history/HistoryServerArguments.scala ---
    @@ -44,10 +51,18 @@ private[spark] class HistoryServerArguments(conf: SparkConf, args: Array[String]
         }
       }
     
    +  // Use common defaults file, if not specified by user
    --- End diff --
    
    Comment really belongs in `loadDefaultSparkProperties`, not here (if it's not already there).




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r17645691
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/worker/WorkerArguments.scala ---
    @@ -25,7 +25,7 @@ import org.apache.spark.SparkConf
     /**
      * Command-line parser for the worker.
      */
    -private[spark] class WorkerArguments(args: Array[String], conf: SparkConf) {
    --- End diff --
    
    As mentioned elsewhere, we shouldn't pass `SparkConf` as an argument.




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r17696773
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/worker/WorkerArguments.scala ---
    @@ -47,14 +48,15 @@ private[spark] class WorkerArguments(args: Array[String], conf: SparkConf) {
       if (System.getenv("SPARK_WORKER_WEBUI_PORT") != null) {
         webUiPort = System.getenv("SPARK_WORKER_WEBUI_PORT").toInt
       }
    -  if (conf.contains("spark.worker.ui.port")) {
    -    webUiPort = conf.get("spark.worker.ui.port").toInt
    +  if (System.getenv("spark.worker.ui.port")!= null) {
    +    webUiPort = System.getenv("spark.worker.ui.port").toInt
    --- End diff --
    
    Ha, sorry about that.
    
    === TRANSLATION ===
    @witgo: In Spark, we do the following a lot: `val conf = new SparkConf`. For example, SparkHadoopUtil.scala#L38, EventLoggingListener.scala#L216. This requires us to set the `system property`. This way, if we don't change the initialization order, we need to set both `conf` and `sys.props`.
    
    @andrewor14: Yes, I understand that part. What I mean is that if we call `conf.setIfMissing` after initializing `SparkConf`, then we don't need to set the corresponding configs through `sys.props` *before* we initialize it. This way we can preserve the existing initialization order. At the same time, because `spark-defaults.conf` can also set `spark.worker.ui.port`, we need to load `spark-defaults.conf` before accessing `conf.get("spark.worker.ui.port")`. All of this logic can be implemented in `WorkerArguments`, so `Worker` doesn't need to change much. Do you understand what I mean?
    === END ===
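    
    A minimal sketch of the precedence that `conf.setIfMissing` gives (hypothetical standalone example, not code from the PR):
    
    ```scala
    val conf = new SparkConf  // already picks up any spark.* JVM system properties
    // Merge file-based defaults without clobbering values that are already set:
    for ((k, v) <- Utils.getPropertiesFromFile(propertiesFile) if k.startsWith("spark.")) {
      conf.setIfMissing(k, v)  // sys.props / CLI settings win over spark-defaults.conf
    }
    val webUiPort = conf.get("spark.worker.ui.port", "8081").toInt
    ```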




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-58260972
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21404/consoleFull) for   PR 2379 at commit [`80b0b12`](https://github.com/apache/spark/commit/80b0b12abd15620890523af2d455fef3902ad545).
     * This patch merges cleanly.




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r18001605
  
    --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
    @@ -1357,6 +1357,58 @@ private[spark] object Utils extends Logging {
         }
       }
     
    +  /**
    +   * Load the default spark properties from the file
    --- End diff --
    
    Load default Spark properties from the given file.




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-57040212
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20899/consoleFull) for   PR 2379 at commit [`b9320ad`](https://github.com/apache/spark/commit/b9320adac216faab345d1411370a5a0d95188c19).
     * This patch merges cleanly.




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r17745821
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/worker/WorkerArguments.scala ---
    @@ -47,14 +48,25 @@ private[spark] class WorkerArguments(args: Array[String], conf: SparkConf) {
       if (System.getenv("SPARK_WORKER_WEBUI_PORT") != null) {
         webUiPort = System.getenv("SPARK_WORKER_WEBUI_PORT").toInt
       }
    -  if (conf.contains("spark.worker.ui.port")) {
    -    webUiPort = conf.get("spark.worker.ui.port").toInt
    -  }
       if (System.getenv("SPARK_WORKER_DIR") != null) {
         workDir = System.getenv("SPARK_WORKER_DIR")
       }
     
       parse(args.toList)
    +  propertiesFile = Option(propertiesFile).getOrElse(Utils.getDefaultConfigFile)
    +
    +  Option(propertiesFile).foreach { file =>
    +    Utils.getPropertiesFromFile(file).filter { case (k, v) =>
    +      k.startsWith("spark.")
    +    }.foreach { case (k, v) =>
    +      conf.setIfMissing(k, v)
    +      sys.props.getOrElseUpdate(k, v)
    +    }
    +  }
    --- End diff --
    
    +1 on factoring it out into `Utils`. These few lines are identical especially across `Master` and `Worker`. Though maybe I would call it something else, like `loadDefaultSparkProperties` (the "default" here is important because we call `setIfMissing` instead of `set`, and `getOrElseUpdate` instead of just overriding the system property).




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r18100045
  
    --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
    @@ -1357,6 +1357,58 @@ private[spark] object Utils extends Logging {
         }
       }
     
    +  /**
    +   * Load the default spark properties from the file
    +   * Use common defaults file, if not specified by user.
    +   */
    +  def loadDefaultSparkProperties(conf: SparkConf,
    +    file: String = null): String = {
    +    val path = Option(file).getOrElse(getDefaultConfigFile)
    +    Option(path).foreach { confFile =>
    +      getPropertiesFromFile(confFile).filter { case (k, v) =>
    +        k.startsWith("spark.")
    +      }.foreach { case (k, v) =>
    +        conf.setIfMissing(k, v)
    +        sys.props.getOrElseUpdate(k, v)
    +      }
    +    }
    +    path
    +  }
    +
    +  /** Load properties present in the given file. */
    +  def getPropertiesFromFile(filename: String): Map[String, String] = {
    +    val file = new File(filename)
    +    require(file.exists(), s"Properties file $file does not exist")
    +    require(file.isFile(), s"Properties file $file is not a normal file")
    +
    +    val inReader = new InputStreamReader(new FileInputStream(file), "UTF-8")
    +    try {
    +      val properties = new Properties()
    +      properties.load(inReader)
    +      properties.stringPropertyNames().map(k => (k, properties(k).trim)).toMap
    +    } catch {
    +      case e: IOException =>
    +        throw new SparkException(s"Failed when loading Spark properties from $filename", e)
    +    } finally {
    +      inReader.close()
    +    }
    +  }
    +
    +  /** Return the default spark configuration file */
    +  def getDefaultConfigFile(): String = {
    +    val s = File.separator
    +    val configFile = sys.env.get("SPARK_CONF_DIR")
    +      .map(t => new File(s"$t${s}spark-defaults.conf"))
    +      .filter(_.isFile)
    +      .map(_.getAbsolutePath)
    +
    +    configFile.getOrElse(sys.env.get("SPARK_HOME")
    +      .map(t => new File(s"${t}${s}conf${s}spark-defaults.conf"))
    +      .filter(_.isFile)
    +      .map(_.getAbsolutePath)
    --- End diff --
    
    Why not? We add "/conf" to whatever `SPARK_HOME` is set to




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-58259744
  
    LGTM. Any other comments, @vanzin?




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r17615106
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/history/HistoryServerArguments.scala ---
    @@ -44,30 +50,19 @@ private[spark] class HistoryServerArguments(conf: SparkConf, args: Array[String]
         }
       }
     
    +  // Use common defaults file, if not specified by user
    +  propertiesFile = Option(propertiesFile).getOrElse(Utils.getDefaultConfigFile)
    +
       private def printUsageAndExit(exitCode: Int) {
         System.err.println(
           """
    -      |Usage: HistoryServer
    -      |
    -      |Configuration options can be set by setting the corresponding JVM system property.
    -      |History Server options are always available; additional options depend on the provider.
    -      |
    -      |History Server options:
    -      |
    -      |  spark.history.ui.port              Port where server will listen for connections
    -      |                                     (default 18080)
    -      |  spark.history.acls.enable          Whether to enable view acls for all applications
    -      |                                     (default false)
    -      |  spark.history.provider             Name of history provider class (defaults to
    -      |                                     file system-based provider)
    -      |  spark.history.retainedApplications Max number of application UIs to keep loaded in memory
    -      |                                     (default 50)
    -      |FsHistoryProvider options:
    -      |
    -      |  spark.history.fs.logDirectory      Directory where app logs are stored (required)
    -      |  spark.history.fs.updateInterval    How often to reload log data from storage (in seconds,
    -      |                                     default 10)
    -      |""".stripMargin)
    --- End diff --
    
    Why remove all this?




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r17745188
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/worker/WorkerArguments.scala ---
    @@ -47,14 +48,25 @@ private[spark] class WorkerArguments(args: Array[String], conf: SparkConf) {
       if (System.getenv("SPARK_WORKER_WEBUI_PORT") != null) {
         webUiPort = System.getenv("SPARK_WORKER_WEBUI_PORT").toInt
       }
    -  if (conf.contains("spark.worker.ui.port")) {
    -    webUiPort = conf.get("spark.worker.ui.port").toInt
    -  }
       if (System.getenv("SPARK_WORKER_DIR") != null) {
         workDir = System.getenv("SPARK_WORKER_DIR")
       }
     
       parse(args.toList)
    +  propertiesFile = Option(propertiesFile).getOrElse(Utils.getDefaultConfigFile)
    +
    +  Option(propertiesFile).foreach { file =>
    +    Utils.getPropertiesFromFile(file).filter { case (k, v) =>
    +      k.startsWith("spark.")
    +    }.foreach { case (k, v) =>
    +      conf.setIfMissing(k, v)
    +      sys.props.getOrElseUpdate(k, v)
    +    }
    +  }
    --- End diff --
    
    That makes the call sites too complicated for my taste (since all of them need to deal with "what's the default config file" and "how to load properties into a map"), but no strong opinion. 
    
    Since this method is already so single-purpose (load properties into an existing SparkConf), I feel like my suggestion makes things cleaner.
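    
    For example, each call site could then reduce to a single line (a sketch, using the helper name suggested in the earlier comment):
    
    ```scala
    // in MasterArguments / WorkerArguments / HistoryServerArguments, after parse():
    propertiesFile = Utils.loadDefaultSparkProperties(conf, propertiesFile)
    ```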




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r18548647
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/history/HistoryServerArguments.scala ---
    @@ -32,11 +34,16 @@ private[spark] class HistoryServerArguments(conf: SparkConf, args: Array[String]
           case ("--dir" | "-d") :: value :: tail =>
             logDir = value
             conf.set("spark.history.fs.logDirectory", value)
    +        System.setProperty("spark.history.fs.logDirectory", value)
    --- End diff --
    
    Is this needed since you're setting the conf above?




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by witgo <gi...@git.apache.org>.
Github user witgo commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-59153778
  
    @andrewor14 The code has been updated.




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r17614596
  
    --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
    @@ -1352,6 +1352,33 @@ private[spark] object Utils extends Logging {
         }
       }
     
    +  /** Load properties present in the given file. */
    +  def getPropertiesFromFile(filename: String): Seq[(String, String)] = {
    +    val file = new File(filename)
    +    require(file.exists(), s"Properties file $file does not exist")
    +    require(file.isFile(), s"Properties file $file is not a normal file")
    +
    +    val inReader = new InputStreamReader(new FileInputStream(file), "UTF-8")
    +    try {
    +      val properties = new Properties()
    +      properties.load(inReader)
    +      properties.stringPropertyNames().toSeq.map(k => (k, properties(k).trim))
    +    } catch {
    +      case e: IOException =>
    +        val message = s"Failed when loading Spark properties"
    +        throw new SparkException(message, e)
    +    } finally {
    +      inReader.close()
    +    }
    +  }
    +
    +  private[spark] def getDefaultConfigFile: String = {
    +    val s = File.separator
    +    Seq(
    +      sys.env.get("SPARK_CONF_DIR").map(t => new File(s"$t${s}spark-defaults.conf")),
    +      sys.env.get("SPARK_HOME").map(t => new File(s"${t}${s}conf${s}spark-defaults.conf"))).
    +      filter(_.isDefined).map(_.get).find(_.exists).map(_.getAbsolutePath).orNull
    +  }
    --- End diff --
    
    This looks kinda funky. How about the following:
    ```
    sys.env.get("SPARK_CONF_DIR").map(...)
      .orElse(sys.env.get("SPARK_HOME").map(...))
      .map(_.getAbsolutePath)
      .orNull
    ```
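    
    One possible completion of that sketch, with the file construction and existence checks filled in from the patch (an untested sketch, not the final code):
    
    ```scala
    val s = File.separator
    sys.env.get("SPARK_CONF_DIR")
      .map(d => new File(s"$d${s}spark-defaults.conf"))
      .filter(_.isFile)
      .orElse(sys.env.get("SPARK_HOME")
        .map(h => new File(s"$h${s}conf${s}spark-defaults.conf"))
        .filter(_.isFile))
      .map(_.getAbsolutePath)
      .orNull
    ```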




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-56256482
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20605/consoleFull) for   PR 2379 at commit [`ccc9833`](https://github.com/apache/spark/commit/ccc9833a75f5585dc029233d85debf5f02d75aaa).
     * This patch **passes** unit tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-55848348
  
    @witgo I have added a few more in-line comments to explain what I mean. Let me know if you have any more questions.




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by witgo <gi...@git.apache.org>.
Github user witgo commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-55982531
  
    OK, the code has been updated.




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r18001098
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/history/HistoryServerArguments.scala ---
    @@ -44,10 +51,17 @@ private[spark] class HistoryServerArguments(conf: SparkConf, args: Array[String]
         }
       }
     
    +  propertiesFile = Utils.loadDefaultSparkProperties(conf, propertiesFile)
    +
       private def printUsageAndExit(exitCode: Int) {
         System.err.println(
           """
    -      |Usage: HistoryServer
    +      |Usage: HistoryServer [options]
    +      |
    +      |Options:
    +      |  -d DIR, --dir DIR           Directory where app logs are stored.
    --- End diff --
    
    I believe we deprecated specifying the log directory through the command line. Maybe we shouldn't document this. Can you confirm, @vanzin?




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-55776666
  
    @vanzin @witgo Actually my intention was not to put everything you load from the properties file into `sys.props`. This is what `SparkSubmit` does, but only because it does not have access to the conf that the application will use. There are a few places (`Worker`, `Master`, `HistoryServer`) where setting the properties through `sys.props` makes the initialization order of `SparkConf` and the respective `*Arguments` class sensitive. It might be simpler to do `conf.set` in those places instead (we already do this elsewhere in `HistoryServerArguments`).
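    
    A minimal illustration of the ordering concern (hypothetical key and value):
    
    ```scala
    // only SparkConf instances constructed *after* this line pick the value up:
    sys.props.getOrElseUpdate("spark.history.ui.port", "18080")
    
    // independent of when this particular conf was constructed:
    conf.set("spark.history.ui.port", "18080")
    ```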




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-58260161
  
    retest this please




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r17743439
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/worker/WorkerArguments.scala ---
    @@ -47,14 +48,25 @@ private[spark] class WorkerArguments(args: Array[String], conf: SparkConf) {
       if (System.getenv("SPARK_WORKER_WEBUI_PORT") != null) {
         webUiPort = System.getenv("SPARK_WORKER_WEBUI_PORT").toInt
       }
    -  if (conf.contains("spark.worker.ui.port")) {
    -    webUiPort = conf.get("spark.worker.ui.port").toInt
    -  }
       if (System.getenv("SPARK_WORKER_DIR") != null) {
         workDir = System.getenv("SPARK_WORKER_DIR")
       }
     
       parse(args.toList)
    +  propertiesFile = Option(propertiesFile).getOrElse(Utils.getDefaultConfigFile)
    +
    +  Option(propertiesFile).foreach { file =>
    +    Utils.getPropertiesFromFile(file).filter { case (k, v) =>
    +      k.startsWith("spark.")
    +    }.foreach { case (k, v) =>
    +      conf.setIfMissing(k, v)
    +      sys.props.getOrElseUpdate(k, v)
    +    }
    +  }
    --- End diff --
    
    This whole block (from L56) is repeated three times in your patch. Could you add a helper method for this instead?
    
    You could have something like:
    
        def loadPropertiesFile(file: Option[String], conf: SparkConf): Unit = {
          val path = file.getOrElse(getDefaultConfigFile())
          getPropertiesFromFile(path)
            .filter { case (k, v) =>
              k.startsWith("spark.")
            }
            .foreach { case (k, v) =>
              conf.setIfMissing(k, v)
              sys.props.getOrElseUpdate(k, v)
            }
        }
    
    And this whole block becomes a single method call. If you're creative you could even come up with an API that could be used by the SparkSubmit code too.
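    
    e.g. each of the three call sites could then read (hypothetical, matching the sketch above):
    
    ```scala
    Utils.loadPropertiesFile(Option(propertiesFile), conf)
    ```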
    
        




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-56232823
  
    Jenkins complained about some scalastyle issue; can you double-check that you didn't cause it? Otherwise looks good (with a few nits).




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by witgo <gi...@git.apache.org>.
Github user witgo commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r18121155
  
    --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
    @@ -1357,6 +1357,58 @@ private[spark] object Utils extends Logging {
         }
       }
     
    +  /**
    +   * Load the default spark properties from the file
    +   * Use common defaults file, if not specified by user.
    +   */
    +  def loadDefaultSparkProperties(conf: SparkConf,
    +    file: String = null): String = {
    +    val path = Option(file).getOrElse(getDefaultConfigFile)
    +    Option(path).foreach { confFile =>
    +      getPropertiesFromFile(confFile).filter { case (k, v) =>
    +        k.startsWith("spark.")
    +      }.foreach { case (k, v) =>
    +        conf.setIfMissing(k, v)
    +        sys.props.getOrElseUpdate(k, v)
    +      }
    +    }
    +    path
    +  }
    +
    +  /** Load properties present in the given file. */
    +  def getPropertiesFromFile(filename: String): Map[String, String] = {
    +    val file = new File(filename)
    +    require(file.exists(), s"Properties file $file does not exist")
    +    require(file.isFile(), s"Properties file $file is not a normal file")
    +
    +    val inReader = new InputStreamReader(new FileInputStream(file), "UTF-8")
    +    try {
    +      val properties = new Properties()
    +      properties.load(inReader)
    +      properties.stringPropertyNames().map(k => (k, properties(k).trim)).toMap
    +    } catch {
    +      case e: IOException =>
    +        throw new SparkException(s"Failed when loading Spark properties from $filename", e)
    +    } finally {
    +      inReader.close()
    +    }
    +  }
    +
    +  /** Return the default spark configuration file */
    +  def getDefaultConfigFile(): String = {
    +    val s = File.separator
    +    val configFile = sys.env.get("SPARK_CONF_DIR")
    +      .map(t => new File(s"$t${s}spark-defaults.conf"))
    +      .filter(_.isFile)
    +      .map(_.getAbsolutePath)
    +
    +    configFile.getOrElse(sys.env.get("SPARK_HOME")
    +      .map(t => new File(s"${t}${s}conf${s}spark-defaults.conf"))
    +      .filter(_.isFile)
    +      .map(_.getAbsolutePath)
    --- End diff --
    
    When `sys.env.get("SPARK_CONF_DIR")` returns a defined value, the `orElse` fallback will be ignored.
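    
    A minimal illustration (the paths are made up):
    
    ```scala
    val fromConfDir = Some("/etc/spark/spark-defaults.conf")
    val fromSparkHome = Some("/opt/spark/conf/spark-defaults.conf")
    fromConfDir.orElse(fromSparkHome)  // => Some("/etc/spark/spark-defaults.conf")
    ```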




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-55987569
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20519/consoleFull) for   PR 2379 at commit [`c4a77e9`](https://github.com/apache/spark/commit/c4a77e907052322eeec604d21c729cd1c9c6a1f3).
     * This patch merges cleanly.




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-55484694
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20260/consoleFull) for   PR 2379 at commit [`217280a`](https://github.com/apache/spark/commit/217280a18c2e4df4c7531b7387c8ec7bf03efb8b).
     * This patch merges cleanly.




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-55982667
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20507/consoleFull) for   PR 2379 at commit [`c1824d2`](https://github.com/apache/spark/commit/c1824d23222e2aecca415426c193ed3529a3ac73).
     * This patch merges cleanly.




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r17614779
  
    --- Diff: core/src/test/scala/org/apache/spark/util/UtilsSuite.scala ---
    @@ -297,4 +298,16 @@ class UtilsSuite extends FunSuite {
         }
       }
     
    +  test("loading properties from file") {
    +    val outFile = File.createTempFile("test-load-spark-properties", "test")
    +    try {
    +      Files.write("spark.test.fileNameLoad true\n", outFile, Charset.forName("UTF-8"))
    --- End diff --
    
    Can you write more than 1 config to the file?
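    
    Something along these lines, perhaps (the property names are made up for the test):
    
    ```scala
    Files.write("spark.test.fileNameLoadA true\n" +
      "spark.test.fileNameLoadB 1\n", outFile, Charset.forName("UTF-8"))
    val properties = Utils.getPropertiesFromFile(outFile.getAbsolutePath)
    assert(properties.filter(_._1 == "spark.test.fileNameLoadA").size == 1)
    assert(properties.filter(_._1 == "spark.test.fileNameLoadB").size == 1)
    ```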




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r18001577
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/worker/WorkerArguments.scala ---
    @@ -122,7 +130,9 @@ private[spark] class WorkerArguments(args: Array[String], conf: SparkConf) {
           "  -i HOST, --ip IP         Hostname to listen on (deprecated, please use --host or -h)\n" +
           "  -h HOST, --host HOST     Hostname to listen on\n" +
           "  -p PORT, --port PORT     Port to listen on (default: random)\n" +
    -      "  --webui-port PORT        Port for web UI (default: 8081)")
    +      "  --webui-port PORT        Port for web UI (default: 8081)\n" +
    +      "  --properties-file FILE   Path to a file from which to load extra properties. If not \n" +
    +      "                           specified, this will look for conf/spark-defaults.conf.")
    --- End diff --
    
    Update wording




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r17681350
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/worker/WorkerArguments.scala ---
    @@ -47,14 +48,15 @@ private[spark] class WorkerArguments(args: Array[String], conf: SparkConf) {
       if (System.getenv("SPARK_WORKER_WEBUI_PORT") != null) {
         webUiPort = System.getenv("SPARK_WORKER_WEBUI_PORT").toInt
       }
    -  if (conf.contains("spark.worker.ui.port")) {
    -    webUiPort = conf.get("spark.worker.ui.port").toInt
    +  if (System.getenv("spark.worker.ui.port")!= null) {
    +    webUiPort = System.getenv("spark.worker.ui.port").toInt
    --- End diff --
    
    I didn't quite get that last part; could you guys clarify? :-p




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by witgo <gi...@git.apache.org>.
Github user witgo commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-55838336
  
    @andrewor14 @vanzin This is a very good idea: we create a SparkConf from the beginning and use it for the entire process. But it needs a lot of code changes; maybe we should create a separate JIRA.




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-57893391
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21285/consoleFull) for   PR 2379 at commit [`80b0b12`](https://github.com/apache/spark/commit/80b0b12abd15620890523af2d455fef3902ad545).
     * This patch merges cleanly.




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r17645628
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/worker/WorkerArguments.scala ---
    @@ -47,14 +48,15 @@ private[spark] class WorkerArguments(args: Array[String], conf: SparkConf) {
       if (System.getenv("SPARK_WORKER_WEBUI_PORT") != null) {
         webUiPort = System.getenv("SPARK_WORKER_WEBUI_PORT").toInt
       }
    -  if (conf.contains("spark.worker.ui.port")) {
    -    webUiPort = conf.get("spark.worker.ui.port").toInt
    +  if (System.getenv("spark.worker.ui.port")!= null) {
    +    webUiPort = System.getenv("spark.worker.ui.port").toInt
    --- End diff --
    
    We should do `conf.setIfMissing`
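    
    One possible reading of that suggestion (a hypothetical sketch, not the final code):
    
    ```scala
    // feed the env var into the conf without clobbering an explicit setting:
    Option(System.getenv("spark.worker.ui.port")).foreach { port =>
      conf.setIfMissing("spark.worker.ui.port", port)
    }
    webUiPort = conf.getInt("spark.worker.ui.port", webUiPort)
    ```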




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r18001637
  
    --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
    @@ -1357,6 +1357,58 @@ private[spark] object Utils extends Logging {
         }
       }
     
    +  /**
    +   * Load the default spark properties from the file
    +   * Use common defaults file, if not specified by user.
    --- End diff --
    
    Load default Spark properties from the given file. If no file is provided, use the common defaults file.




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r17614771
  
    --- Diff: core/src/test/scala/org/apache/spark/util/UtilsSuite.scala ---
    @@ -297,4 +298,16 @@ class UtilsSuite extends FunSuite {
         }
       }
     
    +  test("loading properties from file") {
    +    val outFile = File.createTempFile("test-load-spark-properties", "test")
    +    try {
    +      Files.write("spark.test.fileNameLoad true\n", outFile, Charset.forName("UTF-8"))
    +      val properties = Utils.getPropertiesFromFile(outFile.getAbsolutePath)
    +      assert(properties.filter(_._1 == "spark.test.fileNameLoad").size == 1)
    +    } finally {
    +      if (outFile != null) {
    --- End diff --
    
    would this ever be null?




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-59151248
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21759/consoleFull) for   PR 2379 at commit [`4ef1cbd`](https://github.com/apache/spark/commit/4ef1cbd76a5a741fe1cad8fb0e707a60b067f7d9).
     * This patch **passes all tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-58271590
  
    I just tested this on the Worker, Master and the HistoryServer and it works as expected. Awesome.




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-57041560
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20899/consoleFull) for   PR 2379 at commit [`b9320ad`](https://github.com/apache/spark/commit/b9320adac216faab345d1411370a5a0d95188c19).
     * This patch **passes** unit tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by witgo <gi...@git.apache.org>.
Github user witgo commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-56810769
  
    Jenkins, retest this please.




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-58261206
  
    LGTM, just a couple of minor things.




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r18002009
  
    --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
    @@ -1357,6 +1357,58 @@ private[spark] object Utils extends Logging {
         }
       }
     
    +  /**
    +   * Load the default spark properties from the file
    +   * Use common defaults file, if not specified by user.
    +   */
    +  def loadDefaultSparkProperties(conf: SparkConf,
    +    file: String = null): String = {
    +    val path = Option(file).getOrElse(getDefaultConfigFile)
    +    Option(path).foreach { confFile =>
    +      getPropertiesFromFile(confFile).filter { case (k, v) =>
    +        k.startsWith("spark.")
    +      }.foreach { case (k, v) =>
    +        conf.setIfMissing(k, v)
    +        sys.props.getOrElseUpdate(k, v)
    +      }
    +    }
    +    path
    +  }
    +
    +  /** Load properties present in the given file. */
    +  def getPropertiesFromFile(filename: String): Map[String, String] = {
    +    val file = new File(filename)
    +    require(file.exists(), s"Properties file $file does not exist")
    +    require(file.isFile(), s"Properties file $file is not a normal file")
    +
    +    val inReader = new InputStreamReader(new FileInputStream(file), "UTF-8")
    +    try {
    +      val properties = new Properties()
    +      properties.load(inReader)
    +      properties.stringPropertyNames().map(k => (k, properties(k).trim)).toMap
    +    } catch {
    +      case e: IOException =>
    +        throw new SparkException(s"Failed when loading Spark properties from $filename", e)
    +    } finally {
    +      inReader.close()
    +    }
    +  }
    +
    +  /** Return the default spark configuration file */
    +  def getDefaultConfigFile(): String = {
    +    val s = File.separator
    +    val configFile = sys.env.get("SPARK_CONF_DIR")
    +      .map(t => new File(s"$t${s}spark-defaults.conf"))
    +      .filter(_.isFile)
    +      .map(_.getAbsolutePath)
    +
    +    configFile.getOrElse(sys.env.get("SPARK_HOME")
    +      .map(t => new File(s"${t}${s}conf${s}spark-defaults.conf"))
    +      .filter(_.isFile)
    +      .map(_.getAbsolutePath)
    --- End diff --
    
    This map and filter are duplicated. We can rewrite this to avoid that:
    ```
    configFile
      .orElse(...)
      .filter(_.isFile)
      .map(_.getAbsolutePath)
      .orNull
    ```
    Then you don't need to do the same map and filter up there in L1402
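    
    An untested sketch of that shape (note the single trailing `filter(_.isFile)` means there is no fallback to `SPARK_HOME` once `SPARK_CONF_DIR` is set, which the follow-up comments debate):
    
    ```scala
    def getDefaultConfigFile(): String = {
      val s = File.separator
      sys.env.get("SPARK_CONF_DIR")
        .map(d => new File(s"$d${s}spark-defaults.conf"))
        .orElse(sys.env.get("SPARK_HOME")
          .map(h => new File(s"$h${s}conf${s}spark-defaults.conf")))
        .filter(_.isFile)
        .map(_.getAbsolutePath)
        .orNull
    }
    ```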




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r18001377
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/master/MasterArguments.scala ---
    @@ -83,7 +90,9 @@ private[spark] class MasterArguments(args: Array[String], conf: SparkConf) {
           "  -i HOST, --ip HOST     Hostname to listen on (deprecated, please use --host or -h) \n" +
           "  -h HOST, --host HOST   Hostname to listen on\n" +
           "  -p PORT, --port PORT   Port to listen on (default: 7077)\n" +
    -      "  --webui-port PORT      Port for web UI (default: 8080)")
    +      "  --webui-port PORT      Port for web UI (default: 8080)\n" +
    +      "  --properties-file FILE Path to a file from which to load extra properties. If not \n" +
    +      "                         specified, this will look for conf/spark-defaults.conf.")
    --- End diff --
    
    Same here, update wording as suggested elsewhere




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by witgo <gi...@git.apache.org>.
Github user witgo commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r18011958
  
    --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
    @@ -1357,6 +1357,58 @@ private[spark] object Utils extends Logging {
         }
       }
     
    +  /**
    +   * Load the default spark properties from the file
    +   * Use common defaults file, if not specified by user.
    +   */
    +  def loadDefaultSparkProperties(conf: SparkConf,
    +    file: String = null): String = {
    +    val path = Option(file).getOrElse(getDefaultConfigFile)
    +    Option(path).foreach { confFile =>
    +      getPropertiesFromFile(confFile).filter { case (k, v) =>
    +        k.startsWith("spark.")
    +      }.foreach { case (k, v) =>
    +        conf.setIfMissing(k, v)
    +        sys.props.getOrElseUpdate(k, v)
    +      }
    +    }
    +    path
    +  }
    +
    +  /** Load properties present in the given file. */
    +  def getPropertiesFromFile(filename: String): Map[String, String] = {
    +    val file = new File(filename)
    +    require(file.exists(), s"Properties file $file does not exist")
    +    require(file.isFile(), s"Properties file $file is not a normal file")
    +
    +    val inReader = new InputStreamReader(new FileInputStream(file), "UTF-8")
    +    try {
    +      val properties = new Properties()
    +      properties.load(inReader)
    +      properties.stringPropertyNames().map(k => (k, properties(k).trim)).toMap
    +    } catch {
    +      case e: IOException =>
    +        throw new SparkException(s"Failed when loading Spark properties from $filename", e)
    +    } finally {
    +      inReader.close()
    +    }
    +  }
    +
    +  /** Return the default spark configuration file */
    +  def getDefaultConfigFile(): String = {
    +    val s = File.separator
    +    val configFile = sys.env.get("SPARK_CONF_DIR")
    +      .map(t => new File(s"$t${s}spark-defaults.conf"))
    +      .filter(_.isFile)
    +      .map(_.getAbsolutePath)
    +
    +    configFile.getOrElse(sys.env.get("SPARK_HOME")
    +      .map(t => new File(s"${t}${s}conf${s}spark-defaults.conf"))
    +      .filter(_.isFile)
    +      .map(_.getAbsolutePath)
    --- End diff --
    
    This code is flawed, e.g. when `$SPARK_CONF_DIR/spark-defaults.conf` is a directory.
    
    How about this?
    
    ```scala
    val s = File.separator
    def getAbsolutePath(filePath: String): String = {
      Option(filePath)
        .map(t => new File(t))
        .filter(_.isFile)
        .map(_.getAbsolutePath)
        .orNull
    }
    
    val configFile = sys.env.get("SPARK_CONF_DIR")
      .map(t => s"$t${s}spark-defaults.conf")
      .map(getAbsolutePath)
    
    configFile.getOrElse(sys.env.get("SPARK_HOME")
      .map(t => s"${t}${s}conf${s}spark-defaults.conf")
      .map(getAbsolutePath)
      .orNull)
    ```




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r18001154
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/history/HistoryServerArguments.scala ---
    @@ -18,12 +18,14 @@
     package org.apache.spark.deploy.history
     
     import org.apache.spark.SparkConf
    +import org.apache.spark.util.Utils
     
     /**
      * Command-line parser for the master.
      */
     private[spark] class HistoryServerArguments(conf: SparkConf, args: Array[String]) {
    -  private var logDir: String = null
    +  var logDir: String = null
    +  var propertiesFile: String = null
    --- End diff --
    
    I think these can be private




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-55991087
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20519/consoleFull) for   PR 2379 at commit [`c4a77e9`](https://github.com/apache/spark/commit/c4a77e907052322eeec604d21c729cd1c9c6a1f3).
     * This patch **passes** unit tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-59147101
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21759/consoleFull) for   PR 2379 at commit [`4ef1cbd`](https://github.com/apache/spark/commit/4ef1cbd76a5a741fe1cad8fb0e707a60b067f7d9).
     * This patch merges cleanly.




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by witgo <gi...@git.apache.org>.
Github user witgo commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r17644167
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/history/HistoryServerArguments.scala ---
    @@ -44,30 +50,19 @@ private[spark] class HistoryServerArguments(conf: SparkConf, args: Array[String]
         }
       }
     
    +  // Use common defaults file, if not specified by user
    +  propertiesFile = Option(propertiesFile).getOrElse(Utils.getDefaultConfigFile)
    +
       private def printUsageAndExit(exitCode: Int) {
         System.err.println(
           """
    -      |Usage: HistoryServer
    -      |
    -      |Configuration options can be set by setting the corresponding JVM system property.
    -      |History Server options are always available; additional options depend on the provider.
    -      |
    -      |History Server options:
    -      |
    -      |  spark.history.ui.port              Port where server will listen for connections
    -      |                                     (default 18080)
    -      |  spark.history.acls.enable          Whether to enable view acls for all applications
    -      |                                     (default false)
    -      |  spark.history.provider             Name of history provider class (defaults to
    -      |                                     file system-based provider)
    -      |  spark.history.retainedApplications Max number of application UIs to keep loaded in memory
    -      |                                     (default 50)
    -      |FsHistoryProvider options:
    -      |
    -      |  spark.history.fs.logDirectory      Directory where app logs are stored (required)
    -      |  spark.history.fs.updateInterval    How often to reload log data from storage (in seconds,
    -      |                                     default 10)
    -      |""".stripMargin)
    --- End diff --
    
    I think the spark.history.* documentation is better placed in the `docs/monitoring.md` file.
    It is not necessary here.




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-55485998
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20260/consoleFull) for   PR 2379 at commit [`217280a`](https://github.com/apache/spark/commit/217280a18c2e4df4c7531b7387c8ec7bf03efb8b).
     * This patch **passes** unit tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `class JavaSparkContext(val sc: SparkContext)`
      * `class JavaStreamingContext(val ssc: StreamingContext) extends Closeable `





[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-58268969
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21404/consoleFull) for   PR 2379 at commit [`80b0b12`](https://github.com/apache/spark/commit/80b0b12abd15620890523af2d455fef3902ad545).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-58304689
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21441/
    Test PASSed.




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-59151254
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21759/
    Test PASSed.




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-58299213
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21441/consoleFull) for   PR 2379 at commit [`e1bb650`](https://github.com/apache/spark/commit/e1bb650bd8248c0f1d034fee72058a62a3d89f61).
     * This patch merges cleanly.




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-55777084
  
    @andrewor14 makes sense. Yeah, that would simplify the change a lot.




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-55778562
  
    Yes, thanks. You would want to use that one.




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by witgo <gi...@git.apache.org>.
Github user witgo commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r18121404
  
    --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
    @@ -1357,6 +1357,58 @@ private[spark] object Utils extends Logging {
         }
       }
     
    +  /**
    +   * Load the default spark properties from the file
    +   * Use common defaults file, if not specified by user.
    +   */
    +  def loadDefaultSparkProperties(conf: SparkConf,
    +    file: String = null): String = {
    +    val path = Option(file).getOrElse(getDefaultConfigFile)
    +    Option(path).foreach { confFile =>
    +      getPropertiesFromFile(confFile).filter { case (k, v) =>
    +        k.startsWith("spark.")
    +      }.foreach { case (k, v) =>
    +        conf.setIfMissing(k, v)
    +        sys.props.getOrElseUpdate(k, v)
    +      }
    +    }
    +    path
    +  }
    +
    +  /** Load properties present in the given file. */
    +  def getPropertiesFromFile(filename: String): Map[String, String] = {
    +    val file = new File(filename)
    +    require(file.exists(), s"Properties file $file does not exist")
    +    require(file.isFile(), s"Properties file $file is not a normal file")
    +
    +    val inReader = new InputStreamReader(new FileInputStream(file), "UTF-8")
    +    try {
    +      val properties = new Properties()
    +      properties.load(inReader)
    +      properties.stringPropertyNames().map(k => (k, properties(k).trim)).toMap
    +    } catch {
    +      case e: IOException =>
    +        throw new SparkException(s"Failed when loading Spark properties from $filename", e)
    +    } finally {
    +      inReader.close()
    +    }
    +  }
    +
    +  /** Return the default spark configuration file */
    +  def getDefaultConfigFile(): String = {
    +    val s = File.separator
    +    val configFile = sys.env.get("SPARK_CONF_DIR")
    +      .map(t => new File(s"$t${s}spark-defaults.conf"))
    +      .filter(_.isFile)
    +      .map(_.getAbsolutePath)
    +
    +    configFile.getOrElse(sys.env.get("SPARK_HOME")
    +      .map(t => new File(s"${t}${s}conf${s}spark-defaults.conf"))
    +      .filter(_.isFile)
    +      .map(_.getAbsolutePath)
    --- End diff --
    
    OK, I see.




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r18001709
  
    --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
    @@ -1357,6 +1357,58 @@ private[spark] object Utils extends Logging {
         }
       }
     
    +  /**
    +   * Load the default spark properties from the file
    +   * Use common defaults file, if not specified by user.
    +   */
    +  def loadDefaultSparkProperties(conf: SparkConf,
    +    file: String = null): String = {
    +    val path = Option(file).getOrElse(getDefaultConfigFile)
    +    Option(path).foreach { confFile =>
    +      getPropertiesFromFile(confFile).filter { case (k, v) =>
    +        k.startsWith("spark.")
    +      }.foreach { case (k, v) =>
    +        conf.setIfMissing(k, v)
    +        sys.props.getOrElseUpdate(k, v)
    +      }
    +    }
    +    path
    +  }
    +
    +  /** Load properties present in the given file. */
    +  def getPropertiesFromFile(filename: String): Map[String, String] = {
    +    val file = new File(filename)
    +    require(file.exists(), s"Properties file $file does not exist")
    +    require(file.isFile(), s"Properties file $file is not a normal file")
    +
    +    val inReader = new InputStreamReader(new FileInputStream(file), "UTF-8")
    +    try {
    +      val properties = new Properties()
    +      properties.load(inReader)
    +      properties.stringPropertyNames().map(k => (k, properties(k).trim)).toMap
    +    } catch {
    +      case e: IOException =>
    +        throw new SparkException(s"Failed when loading Spark properties from $filename", e)
    +    } finally {
    +      inReader.close()
    +    }
    +  }
    +
    +  /** Return the default spark configuration file */
    --- End diff --
    
    "Return the name of the default Spark configuration file."




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-56254827
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20604/consoleFull) for   PR 2379 at commit [`b911b54`](https://github.com/apache/spark/commit/b911b54978ceea2518a5bc51c4e62819d2892e6f).
     * This patch **passes** unit tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r18001754
  
    --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
    @@ -1357,6 +1357,58 @@ private[spark] object Utils extends Logging {
         }
       }
     
    +  /**
    +   * Load the default spark properties from the file
    +   * Use common defaults file, if not specified by user.
    +   */
    +  def loadDefaultSparkProperties(conf: SparkConf,
    +    file: String = null): String = {
    +    val path = Option(file).getOrElse(getDefaultConfigFile)
    +    Option(path).foreach { confFile =>
    +      getPropertiesFromFile(confFile).filter { case (k, v) =>
    +        k.startsWith("spark.")
    +      }.foreach { case (k, v) =>
    +        conf.setIfMissing(k, v)
    +        sys.props.getOrElseUpdate(k, v)
    +      }
    +    }
    +    path
    +  }
    +
    +  /** Load properties present in the given file. */
    +  def getPropertiesFromFile(filename: String): Map[String, String] = {
    +    val file = new File(filename)
    +    require(file.exists(), s"Properties file $file does not exist")
    +    require(file.isFile(), s"Properties file $file is not a normal file")
    +
    +    val inReader = new InputStreamReader(new FileInputStream(file), "UTF-8")
    +    try {
    +      val properties = new Properties()
    +      properties.load(inReader)
    +      properties.stringPropertyNames().map(k => (k, properties(k).trim)).toMap
    +    } catch {
    +      case e: IOException =>
    +        throw new SparkException(s"Failed when loading Spark properties from $filename", e)
    +    } finally {
    +      inReader.close()
    +    }
    +  }
    +
    +  /** Return the default spark configuration file */
    +  def getDefaultConfigFile(): String = {
    --- End diff --
    
    We should call this `getDefaultPropertiesFile` to be consistent




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r17614816
  
    --- Diff: core/src/test/scala/org/apache/spark/util/UtilsSuite.scala ---
    @@ -297,4 +298,16 @@ class UtilsSuite extends FunSuite {
         }
       }
     
    +  test("loading properties from file") {
    +    val outFile = File.createTempFile("test-load-spark-properties", "test")
    +    try {
    +      Files.write("spark.test.fileNameLoad true\n", outFile, Charset.forName("UTF-8"))
    +      val properties = Utils.getPropertiesFromFile(outFile.getAbsolutePath)
    +      assert(properties.filter(_._1 == "spark.test.fileNameLoad").size == 1)
    --- End diff --
    
    How about...
    ```
    assert(properties.contains("spark.test.fileNameLoad"))
    assert(properties("spark.test.fileNameLoad") === "true")
    ```
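
    Put together, a self-contained version of the test might look like this (a sketch; it assumes `Utils.getPropertiesFromFile` from this patch and the Guava `Files` helper the suite already uses):

    ```scala
    test("loading properties from file") {
      val outFile = File.createTempFile("test-load-spark-properties", "test")
      try {
        Files.write("spark.test.fileNameLoad true\n", outFile, Charset.forName("UTF-8"))
        val properties = Utils.getPropertiesFromFile(outFile.getAbsolutePath)
        assert(properties.contains("spark.test.fileNameLoad"))
        assert(properties("spark.test.fileNameLoad") === "true")
      } finally {
        outFile.delete()
      }
    }
    ```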




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r18001799
  
    --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
    @@ -1357,6 +1357,58 @@ private[spark] object Utils extends Logging {
         }
       }
     
    +  /**
    +   * Load the default spark properties from the file
    +   * Use common defaults file, if not specified by user.
    +   */
    +  def loadDefaultSparkProperties(conf: SparkConf,
    +    file: String = null): String = {
    --- End diff --
    
    Also I would rename `file` to `filepath` or something




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r18002312
  
    --- Diff: core/src/test/scala/org/apache/spark/util/UtilsSuite.scala ---
    @@ -297,4 +300,21 @@ class UtilsSuite extends FunSuite {
         }
       }
     
    +  test("loading properties from file") {
    +    val outFile = File.createTempFile("test-load-spark-properties", "test")
    +    try {
    +      System.setProperty("spark.test.fileNameLoadB", "2")
    +      Files.write("spark.test.fileNameLoadA true\n" +
    +        "spark.test.fileNameLoadB 1\n", outFile, Charset.forName("UTF-8"))
    +      val properties = Utils.getPropertiesFromFile(outFile.getAbsolutePath)
    +      properties.filter { case (k, v) => k.startsWith("spark.")}.foreach {
    +        case (k, v) => sys.props.getOrElseUpdate(k, v)
    +      }
    --- End diff --
    
    Can you put the filter and the foreach on separate lines to make this more readable
    ```
    properties
      .filter { ... }
      .foreach { ... }
    ```




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-56255414
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20605/consoleFull) for   PR 2379 at commit [`ccc9833`](https://github.com/apache/spark/commit/ccc9833a75f5585dc029233d85debf5f02d75aaa).
     * This patch merges cleanly.




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-57041561
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/20899/




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-59118853
  
    Hey, sorry @witgo, there are merge conflicts now. I'll test this again and merge it once you update.




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-57894236
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21285/consoleFull) for   PR 2379 at commit [`80b0b12`](https://github.com/apache/spark/commit/80b0b12abd15620890523af2d455fef3902ad545).
     * This patch **fails** unit tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/2379




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r17615074
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/master/MasterArguments.scala ---
    @@ -38,12 +39,15 @@ private[spark] class MasterArguments(args: Array[String], conf: SparkConf) {
       if (System.getenv("SPARK_MASTER_WEBUI_PORT") != null) {
         webUiPort = System.getenv("SPARK_MASTER_WEBUI_PORT").toInt
       }
    -  if (conf.contains("spark.master.ui.port")) {
    -    webUiPort = conf.get("spark.master.ui.port").toInt
    +  if (System.getenv("spark.master.ui.port") != null) {
    +    webUiPort = System.getenv("spark.master.ui.port").toInt
    --- End diff --
    
    Same here, not an environment variable




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r17752106
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/history/HistoryServerArguments.scala ---
    @@ -44,30 +51,27 @@ private[spark] class HistoryServerArguments(conf: SparkConf, args: Array[String]
         }
       }
     
    +  // Use common defaults file, if not specified by user
    +  propertiesFile = Option(propertiesFile).getOrElse(Utils.getDefaultConfigFile)
    --- End diff --
    
    I think all daemon properties are pretty specific to the daemons and don't affect the execution of the application, so it seems safe to me. Once we add this we should probably also deprecate `SPARK_MASTER_OPTS` etc in favor of doing it through `spark-defaults.conf`. Then maybe we can add a line in `spark-defaults.conf.template` that gives an example of a Master config being set here or something.
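
    For example, the template could carry commented-out entries along these lines (hypothetical values; the keys are the daemon ports discussed in this thread):

    ```
    # Daemon configuration can also be set via spark-defaults.conf, e.g.:
    # spark.master.ui.port   8080
    # spark.worker.ui.port   8081
    # spark.history.ui.port  18080
    ```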




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r17808054
  
    --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
    @@ -1357,6 +1357,55 @@ private[spark] object Utils extends Logging {
         }
       }
     
    +  /**
    +   * Load the default spark properties from the file
    +   * If file is null, then use the getDefaultConfigFile method's return.
    +   */
    +  private[spark] def loadDefaultSparkProperties(conf: SparkConf,
    +    file: String = null): String = {
    +    val path = Option(file).getOrElse(getDefaultConfigFile)
    +    Option(path).foreach { confFile =>
    +      getPropertiesFromFile(confFile).filter { case (k, v) =>
    +        k.startsWith("spark.")
    +      }.foreach { case (k, v) =>
    +        conf.setIfMissing(k, v)
    +        sys.props.getOrElseUpdate(k, v)
    +      }
    +    }
    +    path
    +  }
    +
    +  /** Load properties present in the given file. */
    +  def getPropertiesFromFile(filename: String): Map[String, String] = {
    +    val file = new File(filename)
    +    require(file.exists(), s"Properties file $file does not exist")
    +    require(file.isFile(), s"Properties file $file is not a normal file")
    +
    +    val inReader = new InputStreamReader(new FileInputStream(file), "UTF-8")
    +    try {
    +      val properties = new Properties()
    +      properties.load(inReader)
    +      properties.stringPropertyNames().map(k => (k, properties(k).trim)).toMap
    +    } catch {
    +      case e: IOException =>
    +        throw new SparkException(s"Failed when loading Spark properties from $filename", e)
    +    } finally {
    +      inReader.close()
    +    }
    +  }
    +
    +  /** Return the default spark configuration file */
    +  private[spark] def getDefaultConfigFile(): String = {
    +    val s = File.separator
    +    val configFile = sys.env.get("SPARK_CONF_DIR").map(t =>
    +      new File(s"$t${s}spark-defaults.conf")).filter(_.isFile).
    +      map(_.getAbsolutePath)
    +
    +    configFile.getOrElse(sys.env.get("SPARK_HOME").map(t =>
    --- End diff --
    
    Same here. Putting each call on a separate line makes it much easier to read.




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r17614683
  
    --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
    @@ -1352,6 +1352,33 @@ private[spark] object Utils extends Logging {
         }
       }
     
    +  /** Load properties present in the given file. */
    +  def getPropertiesFromFile(filename: String): Seq[(String, String)] = {
    +    val file = new File(filename)
    +    require(file.exists(), s"Properties file $file does not exist")
    +    require(file.isFile(), s"Properties file $file is not a normal file")
    +
    +    val inReader = new InputStreamReader(new FileInputStream(file), "UTF-8")
    +    try {
    +      val properties = new Properties()
    +      properties.load(inReader)
    +      properties.stringPropertyNames().toSeq.map(k => (k, properties(k).trim))
    +    } catch {
    +      case e: IOException =>
    +        val message = s"Failed when loading Spark properties"
    +        throw new SparkException(message, e)
    +    } finally {
    +      inReader.close()
    +    }
    +  }
    +
    +  private[spark] def getDefaultConfigFile: String = {
    +    val s = File.separator
    +    Seq(
    +      sys.env.get("SPARK_CONF_DIR").map(t => new File(s"$t${s}spark-defaults.conf")),
    +      sys.env.get("SPARK_HOME").map(t => new File(s"${t}${s}conf${s}spark-defaults.conf"))).
    +      filter(_.isDefined).map(_.get).find(_.exists).map(_.getAbsolutePath).orNull
    +  }
    --- End diff --
    
    Then if any of those don't fit in one line, I would do
    ```
    val sparkConfDirDefaults =
      sys.env.get("SPARK_CONF_DIR").map(...)
    val sparkHomeDefaults =
      sys.env.get("SPARK_HOME").map(...)
    sparkConfDirDefaults.orElse(sparkHomeDefaults).map(_.getAbsolutePath).orNull
    ```




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r17645710
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala ---
    @@ -360,14 +360,20 @@ private[spark] class Worker(
     private[spark] object Worker extends Logging {
       def main(argStrings: Array[String]) {
         SignalLogger.register(log)
    +    val args = new WorkerArguments(argStrings)
    +    Option(args.propertiesFile).foreach { file =>
    +      for ((k, v) <- Utils.getPropertiesFromFile(file) if k.startsWith("spark.")) {
    +        sys.props.getOrElseUpdate(k,v)
    +      }
    +    }
    --- End diff --
    
    Once you've added back `SparkConf` as an argument to `WorkerArguments`, you can move this code into `WorkerArguments` and replace `sys.props.getOrElseUpdate` with `conf.setIfMissing`.
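
    Roughly like this inside `WorkerArguments` (a sketch under the assumption above, reusing `Utils.getPropertiesFromFile` from this patch):

    ```scala
    // After parse(args.toList): fold the properties file into the conf
    // without clobbering values that are already set.
    Option(propertiesFile).foreach { file =>
      for ((k, v) <- Utils.getPropertiesFromFile(file) if k.startsWith("spark.")) {
        conf.setIfMissing(k, v)
      }
    }
    ```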




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-56819381
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/20804/




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by witgo <gi...@git.apache.org>.
Github user witgo commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r18070980
  
    --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
    @@ -1357,6 +1357,58 @@ private[spark] object Utils extends Logging {
         }
       }
     
    +  /**
    +   * Load the default spark properties from the file
    +   * Use common defaults file, if not specified by user.
    +   */
    +  def loadDefaultSparkProperties(conf: SparkConf,
    +    file: String = null): String = {
    +    val path = Option(file).getOrElse(getDefaultConfigFile)
    +    Option(path).foreach { confFile =>
    +      getPropertiesFromFile(confFile).filter { case (k, v) =>
    +        k.startsWith("spark.")
    +      }.foreach { case (k, v) =>
    +        conf.setIfMissing(k, v)
    +        sys.props.getOrElseUpdate(k, v)
    +      }
    +    }
    +    path
    +  }
    +
    +  /** Load properties present in the given file. */
    +  def getPropertiesFromFile(filename: String): Map[String, String] = {
    +    val file = new File(filename)
    +    require(file.exists(), s"Properties file $file does not exist")
    +    require(file.isFile(), s"Properties file $file is not a normal file")
    +
    +    val inReader = new InputStreamReader(new FileInputStream(file), "UTF-8")
    +    try {
    +      val properties = new Properties()
    +      properties.load(inReader)
    +      properties.stringPropertyNames().map(k => (k, properties(k).trim)).toMap
    +    } catch {
    +      case e: IOException =>
    +        throw new SparkException(s"Failed when loading Spark properties from $filename", e)
    +    } finally {
    +      inReader.close()
    +    }
    +  }
    +
    +  /** Return the default spark configuration file */
    +  def getDefaultConfigFile(): String = {
    +    val s = File.separator
    +    val configFile = sys.env.get("SPARK_CONF_DIR")
    +      .map(t => new File(s"$t${s}spark-defaults.conf"))
    +      .filter(_.isFile)
    +      .map(_.getAbsolutePath)
    +
    +    configFile.getOrElse(sys.env.get("SPARK_HOME")
    +      .map(t => new File(s"${t}${s}conf${s}spark-defaults.conf"))
    +      .filter(_.isFile)
    +      .map(_.getAbsolutePath)
    --- End diff --
    
    e.g.:
    
    ```shell
    export SPARK_CONF_DIR=/opt/spark/config
    export SPARK_HOME=/opt/spark/spark_1.10
    ```
    
    `/opt/spark/config/spark-defaults.conf` does not exist.
    `/opt/spark/spark_1.10/conf/spark-defaults.conf` exists.
    
    The above code can't load `/opt/spark/spark_1.10/conf/spark-defaults.conf`.
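
    Condensed, the evaluation under that environment looks like this (a sketch of the suggested chain with the values inlined):

    ```scala
    // import java.io.File
    // SPARK_CONF_DIR is set, so the first map yields a Some and orElse
    // never falls through to the SPARK_HOME branch; the later filter
    // then drops the nonexistent file and the whole expression is null.
    Some(new File("/opt/spark/config/spark-defaults.conf"))
      .orElse(Some(new File("/opt/spark/spark_1.10/conf/spark-defaults.conf")))
      .filter(_.isFile)        // None: /opt/spark/config/spark-defaults.conf does not exist
      .map(_.getAbsolutePath)
      .orNull                  // null
    ```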




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r18048673
  
    --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
    @@ -1357,6 +1357,58 @@ private[spark] object Utils extends Logging {
         }
       }
     
    +  /**
    +   * Load the default spark properties from the file
    +   * Use common defaults file, if not specified by user.
    +   */
    +  def loadDefaultSparkProperties(conf: SparkConf,
    +    file: String = null): String = {
    +    val path = Option(file).getOrElse(getDefaultConfigFile)
    +    Option(path).foreach { confFile =>
    +      getPropertiesFromFile(confFile).filter { case (k, v) =>
    +        k.startsWith("spark.")
    +      }.foreach { case (k, v) =>
    +        conf.setIfMissing(k, v)
    +        sys.props.getOrElseUpdate(k, v)
    +      }
    +    }
    +    path
    +  }
    +
    +  /** Load properties present in the given file. */
    +  def getPropertiesFromFile(filename: String): Map[String, String] = {
    +    val file = new File(filename)
    +    require(file.exists(), s"Properties file $file does not exist")
    +    require(file.isFile(), s"Properties file $file is not a normal file")
    +
    +    val inReader = new InputStreamReader(new FileInputStream(file), "UTF-8")
    +    try {
    +      val properties = new Properties()
    +      properties.load(inReader)
    +      properties.stringPropertyNames().map(k => (k, properties(k).trim)).toMap
    +    } catch {
    +      case e: IOException =>
    +        throw new SparkException(s"Failed when loading Spark properties from $filename", e)
    +    } finally {
    +      inReader.close()
    +    }
    +  }
    +
    +  /** Return the default spark configuration file */
    +  def getDefaultConfigFile(): String = {
    +    val s = File.separator
    +    val configFile = sys.env.get("SPARK_CONF_DIR")
    +      .map(t => new File(s"$t${s}spark-defaults.conf"))
    +      .filter(_.isFile)
    +      .map(_.getAbsolutePath)
    +
    +    configFile.getOrElse(sys.env.get("SPARK_HOME")
    +      .map(t => new File(s"${t}${s}conf${s}spark-defaults.conf"))
    +      .filter(_.isFile)
    +      .map(_.getAbsolutePath)
    --- End diff --
    
    Wait, what do you mean? I'm suggesting that you do the map and filter after you get the file, not the directory. You can use `orElse` instead of `getOrElse` for that.
    
    e.g.
    ```
    sys.env.get("SPARK_CONF_DIR").map { t => s"$t${s}spark-defaults.conf" }
      .orElse(sys.env.get("SPARK_HOME").map { t => s"${t}${s}conf${s}spark-defaults.conf" }
      .filter(_.isFile)
      .map(_.getAbsolutePath)
      .orNull
    ```




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-56790524
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/20792/




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-57039090
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20897/consoleFull) for   PR 2379 at commit [`d1e3f75`](https://github.com/apache/spark/commit/d1e3f75672069f7b3c897f054b3bc38283c579c2).
     * This patch merges cleanly.




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-58253424
  
    retest this please




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-58268981
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21404/
    Test FAILed.




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-57040597
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20897/consoleFull) for   PR 2379 at commit [`d1e3f75`](https://github.com/apache/spark/commit/d1e3f75672069f7b3c897f054b3bc38283c579c2).
     * This patch **passes** unit tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-58419872
  
    Thanks, I will look at this shortly. A minor request though: in the future could you keep around all the commits in your branch? It would help reviewing if I could see what lines have been changed in a particular commit that was pushed after my comments, for example.




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-58304684
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21441/consoleFull) for   PR 2379 at commit [`e1bb650`](https://github.com/apache/spark/commit/e1bb650bd8248c0f1d034fee72058a62a3d89f61).
     * This patch **passes all tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-56743354
  
    @witgo I made a few more comments on this patch. I believe the semantics are now correct, but the documentation and style can be improved.




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r17615063
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala ---
    @@ -360,14 +360,20 @@ private[spark] class Worker(
     private[spark] object Worker extends Logging {
       def main(argStrings: Array[String]) {
         SignalLogger.register(log)
    +    val args = new WorkerArguments(argStrings)
    +    Option(args.propertiesFile).foreach { file =>
    +      for ((k, v) <- Utils.getPropertiesFromFile(file) if k.startsWith("spark.")) {
    +        sys.props.getOrElseUpdate(k,v)
    +      }
    +    }
         val conf = new SparkConf
    -    val args = new WorkerArguments(argStrings, conf)
    -    val (actorSystem, _) = startSystemAndActor(args.host, args.port, args.webUiPort, args.cores,
    -      args.memory, args.masters, args.workDir)
    --- End diff --
    
    Hm, looks like the import ordering is a little messed up now. It seems here we want to load properties before initializing `SparkConf`, so `WorkerArguments` can no longer take in a conf. This logic seems a little strange to me.
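
    One way out of that chicken-and-egg (a sketch of the ordering the patch later settles on): keep `SparkConf` as a constructor argument and fold the defaults file in after parsing, so the file only fills in keys nothing else has set.

    ```scala
    // Inside WorkerArguments(args: Array[String], conf: SparkConf):
    parse(args.toList)

    // propertiesFile may still be null here; loadDefaultSparkProperties
    // falls back to the common defaults file and returns the path it used.
    propertiesFile = Utils.loadDefaultSparkProperties(conf, propertiesFile)
    ```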




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-55777341
  
    Just to make sure, a blind `conf.set` would override anything the user has set using system properties, which is probably not what we want. Probably better to use `conf.setIfMissing`.
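
    A minimal illustration of the difference (a sketch; `spark.worker.ui.port` as an example key):

    ```scala
    // Suppose the user launched with -Dspark.worker.ui.port=9999.
    val conf = new SparkConf  // loadDefaults picks the 9999 up from sys.props

    conf.set("spark.worker.ui.port", "8081")           // clobbers the user's 9999
    conf.setIfMissing("spark.worker.ui.port", "8081")  // keeps the user's 9999
    ```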




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r17645640
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/worker/WorkerArguments.scala ---
    @@ -47,14 +48,15 @@ private[spark] class WorkerArguments(args: Array[String], conf: SparkConf) {
       if (System.getenv("SPARK_WORKER_WEBUI_PORT") != null) {
         webUiPort = System.getenv("SPARK_WORKER_WEBUI_PORT").toInt
       }
    -  if (conf.contains("spark.worker.ui.port")) {
    -    webUiPort = conf.get("spark.worker.ui.port").toInt
    +  if (System.getenv("spark.worker.ui.port")!= null) {
    +    webUiPort = System.getenv("spark.worker.ui.port").toInt
    --- End diff --
    
    This means we need to add back `SparkConf` as an argument to `WorkerArguments`




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r18121284
  
    --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
    @@ -1357,6 +1357,58 @@ private[spark] object Utils extends Logging {
         }
       }
     
    +  /**
    +   * Load default Spark properties from the given file.
    +   * If no file is specified by the user, use the common defaults file.
    +   */
    +  def loadDefaultSparkProperties(conf: SparkConf,
    +    file: String = null): String = {
    +    val path = Option(file).getOrElse(getDefaultConfigFile)
    +    Option(path).foreach { confFile =>
    +      getPropertiesFromFile(confFile).filter { case (k, v) =>
    +        k.startsWith("spark.")
    +      }.foreach { case (k, v) =>
    +        conf.setIfMissing(k, v)
    +        sys.props.getOrElseUpdate(k, v)
    +      }
    +    }
    +    path
    +  }
    +
    +  /** Load properties present in the given file. */
    +  def getPropertiesFromFile(filename: String): Map[String, String] = {
    +    val file = new File(filename)
    +    require(file.exists(), s"Properties file $file does not exist")
    +    require(file.isFile(), s"Properties file $file is not a normal file")
    +
    +    val inReader = new InputStreamReader(new FileInputStream(file), "UTF-8")
    +    try {
    +      val properties = new Properties()
    +      properties.load(inReader)
    +      properties.stringPropertyNames().map(k => (k, properties(k).trim)).toMap
    +    } catch {
    +      case e: IOException =>
    +        throw new SparkException(s"Failed when loading Spark properties from $filename", e)
    +    } finally {
    +      inReader.close()
    +    }
    +  }
    +
    +  /** Return the default spark configuration file */
    +  def getDefaultConfigFile(): String = {
    +    val s = File.separator
    +    val configFile = sys.env.get("SPARK_CONF_DIR")
    +      .map(t => new File(s"$t${s}spark-defaults.conf"))
    +      .filter(_.isFile)
    +      .map(_.getAbsolutePath)
    +
    +    configFile.getOrElse(sys.env.get("SPARK_HOME")
    +      .map(t => new File(s"${t}${s}conf${s}spark-defaults.conf"))
    +      .filter(_.isFile)
    +      .map(_.getAbsolutePath)
    --- End diff --
    
    Isn't that what we want? If the user sets `SPARK_CONF_DIR` then we should respect it.
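
    Restated as a sketch, the precedence being defended here (paths as in the diff above):

        import java.io.File

        // SPARK_CONF_DIR/spark-defaults.conf wins; SPARK_HOME/conf/spark-defaults.conf is the fallback.
        def defaultConfigFile: Option[String] = Seq(
            sys.env.get("SPARK_CONF_DIR").map(new File(_, "spark-defaults.conf")),
            sys.env.get("SPARK_HOME").map(h => new File(new File(h, "conf"), "spark-defaults.conf"))
          ).flatten.find(_.isFile).map(_.getAbsolutePath)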




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r17807812
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/history/HistoryServerArguments.scala ---
    @@ -44,10 +51,18 @@ private[spark] class HistoryServerArguments(conf: SparkConf, args: Array[String]
         }
       }
     
    +  // Use common defaults file, if not specified by user
    +  propertiesFile = Utils.loadDefaultSparkProperties(conf, propertiesFile)
    --- End diff --
    
    `propertiesFile` is not used after this, so why do you need to re-assign it? It makes the signature of `loadDefaultSparkProperties` a little weird.
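
    One shape that avoids the re-assignment (illustrative only, not what was merged): accept an `Option` and return `Unit`:

        def loadDefaultSparkProperties(conf: SparkConf, file: Option[String] = None): Unit = {
          file.orElse(Option(Utils.getDefaultConfigFile)).foreach { path =>
            Utils.getPropertiesFromFile(path)
              .filter { case (k, _) => k.startsWith("spark.") }
              .foreach { case (k, v) => conf.setIfMissing(k, v) }
          }
        }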




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r17614862
  
    --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
    @@ -1352,6 +1352,33 @@ private[spark] object Utils extends Logging {
         }
       }
     
    +  /** Load properties present in the given file. */
    +  def getPropertiesFromFile(filename: String): Seq[(String, String)] = {
    +    val file = new File(filename)
    +    require(file.exists(), s"Properties file $file does not exist")
    +    require(file.isFile(), s"Properties file $file is not a normal file")
    +
    +    val inReader = new InputStreamReader(new FileInputStream(file), "UTF-8")
    +    try {
    +      val properties = new Properties()
    +      properties.load(inReader)
    +      properties.stringPropertyNames().toSeq.map(k => (k, properties(k).trim))
    +    } catch {
    +      case e: IOException =>
    +        val message = s"Failed when loading Spark properties"
    --- End diff --
    
    You probably don't need string interpolation here. Actually you probably don't need to put this in a variable at all.
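
    That is, roughly:

        case e: IOException =>
          throw new SparkException("Failed when loading Spark properties", e)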




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-57894238
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21285/




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-55926886
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20468/consoleFull) for PR 2379 at commit [`dae0120`](https://github.com/apache/spark/commit/dae012094195737e28b7cbbb6d4ab1f1ab0d01b3).
     * This patch **passes** unit tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-56253333
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20604/consoleFull) for PR 2379 at commit [`b911b54`](https://github.com/apache/spark/commit/b911b54978ceea2518a5bc51c4e62819d2892e6f).
     * This patch merges cleanly.




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by witgo <gi...@git.apache.org>.
Github user witgo commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r17766151
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/history/HistoryServerArguments.scala ---
    @@ -44,30 +50,19 @@ private[spark] class HistoryServerArguments(conf: SparkConf, args: Array[String]
         }
       }
     
    +  // Use common defaults file, if not specified by user
    +  propertiesFile = Option(propertiesFile).getOrElse(Utils.getDefaultConfigFile)
    +
       private def printUsageAndExit(exitCode: Int) {
         System.err.println(
           """
    -      |Usage: HistoryServer
    -      |
    -      |Configuration options can be set by setting the corresponding JVM system property.
    -      |History Server options are always available; additional options depend on the provider.
    -      |
    -      |History Server options:
    -      |
    -      |  spark.history.ui.port              Port where server will listen for connections
    -      |                                     (default 18080)
    -      |  spark.history.acls.enable          Whether to enable view acls for all applications
    -      |                                     (default false)
    -      |  spark.history.provider             Name of history provider class (defaults to
    -      |                                     file system-based provider)
    -      |  spark.history.retainedApplications Max number of application UIs to keep loaded in memory
    -      |                                     (default 50)
    -      |FsHistoryProvider options:
    -      |
    -      |  spark.history.fs.logDirectory      Directory where app logs are stored (required)
    -      |  spark.history.fs.updateInterval    How often to reload log data from storage (in seconds,
    -      |                                     default 10)
    -      |""".stripMargin)
    --- End diff --
    
    Showing only the `spark.*` options here is confusing.




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-55766467
  
    I think I have the opposite view from Andrew, in that I dislike using sys.props as an IPC mechanism, but other than that, looks good.
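
    For contrast, the two styles (illustrative values):

        // sys.props as IPC: the loader and SparkConf communicate through global mutable state.
        sys.props.getOrElseUpdate("spark.worker.ui.port", "8081")
        val conf1 = new SparkConf()  // reads sys.props behind the scenes

        // Explicit passing: parsed file entries are handed to the conf directly.
        val conf2 = new SparkConf()
        Map("spark.worker.ui.port" -> "8081").foreach { case (k, v) => conf2.setIfMissing(k, v) }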




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-56207066
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20573/consoleFull) for PR 2379 at commit [`5acc167`](https://github.com/apache/spark/commit/5acc16712f031d5e3269b9088acee8e7e6c8d431).
     * This patch merges cleanly.




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by witgo <gi...@git.apache.org>.
Github user witgo commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-58305582
  
    @andrewor14 The code has been updated.




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2379#discussion_r17680984
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/worker/WorkerArguments.scala ---
    @@ -47,14 +48,15 @@ private[spark] class WorkerArguments(args: Array[String], conf: SparkConf) {
       if (System.getenv("SPARK_WORKER_WEBUI_PORT") != null) {
         webUiPort = System.getenv("SPARK_WORKER_WEBUI_PORT").toInt
       }
    -  if (conf.contains("spark.worker.ui.port")) {
    -    webUiPort = conf.get("spark.worker.ui.port").toInt
    +  if (System.getenv("spark.worker.ui.port") != null) {
    +    webUiPort = System.getenv("spark.worker.ui.port").toInt
    --- End diff --
    
    Right, I understand that point. What I mean is: if we call `conf.setIfMissing` after `SparkConf` has been initialized, we don't need to set the relevant `sys.props` before `SparkConf` is initialized, which lets us keep the existing initialization order. At the same time, since `spark-defaults.conf` can also set `spark.worker.ui.port`, we have to read `spark-defaults.conf` first before we can call `conf.get("spark.worker.ui.port")`. All of this logic can be implemented inside `WorkerArguments`, so `Worker` barely needs to change. Do you see what I mean?
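
    Sketched (this mirrors the pattern the patch converged on; `parse` is the existing argument parser in `WorkerArguments`):

        private[spark] class WorkerArguments(args: Array[String], conf: SparkConf) {
          var webUiPort = 8081
          var propertiesFile: String = null

          parse(args.toList)  // may set propertiesFile, e.g. via a --properties-file flag

          // Read spark-defaults.conf into conf (via setIfMissing) before consulting it.
          propertiesFile = Utils.loadDefaultSparkProperties(conf, propertiesFile)
          if (conf.contains("spark.worker.ui.port")) {
            webUiPort = conf.get("spark.worker.ui.port").toInt
          }
        }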




[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2379#issuecomment-56207390
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20573/consoleFull) for PR 2379 at commit [`5acc167`](https://github.com/apache/spark/commit/5acc16712f031d5e3269b9088acee8e7e6c8d431).
     * This patch **fails** unit tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `sealed trait Matrix extends Serializable `
      * `class SparseMatrix(`
      * `sealed trait Vector extends Serializable `


