You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by pgandhi999 <gi...@git.apache.org> on 2018/05/31 14:41:10 UTC

[GitHub] spark pull request #21468: [SPARK-22151] : PYTHONPATH not picked up from the...

GitHub user pgandhi999 opened a pull request:

    https://github.com/apache/spark/pull/21468

    [SPARK-22151] : PYTHONPATH not picked up from the spark.yarn.appMaste…

    …rEnv properly
    
    Running in yarn cluster mode and trying to set pythonpath via spark.yarn.appMasterEnv.PYTHONPATH doesn't work.
    
    the yarn Client code looks at the env variables:
    val pythonPathStr = (sys.env.get("PYTHONPATH") ++ pythonPath)
    But when you set spark.yarn.appMasterEnv it puts it into the local env.
    
    So the python path set in spark.yarn.appMasterEnv isn't properly set.
    
    You can work around if you are running in cluster mode by setting it on the client like:
    
    PYTHONPATH=./addon/python/ spark-submit
    
    ## What changes were proposed in this pull request?
    In Client.scala, PYTHONPATH was being overridden, so changed code to append values to PYTHONPATH instead of overriding them.
    
    ## How was this patch tested?
    Added log statements to ApplicationMaster.scala to check for environment variable PYTHONPATH, ran a spark job in cluster mode before the change and verified the issue. Performed the same test after the change and verified the fix. 


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/pgandhi999/spark SPARK-22151

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21468.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21468
    
----
commit 0aee8faad9cb60721b153c9bc2187f87a4036b9e
Author: pgandhi <pg...@...>
Date:   2018-05-31T14:36:13Z

    [SPARK-22151] : PYTHONPATH not picked up from the spark.yarn.appMasterEnv properly
    
    Running in yarn cluster mode and trying to set pythonpath via spark.yarn.appMasterEnv.PYTHONPATH doesn't work.
    
    the yarn Client code looks at the env variables:
    val pythonPathStr = (sys.env.get("PYTHONPATH") ++ pythonPath)
    But when you set spark.yarn.appMasterEnv it puts it into the local env.
    
    So the python path set in spark.yarn.appMasterEnv isn't properly set.
    
    You can work around if you are running in cluster mode by setting it on the client like:
    
    PYTHONPATH=./addon/python/ spark-submit
    
    In Client.scala, PYTHONPATH was being overridden, so changed code to append values to PYTHONPATH

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21468: [SPARK-22151] : PYTHONPATH not picked up from the spark....

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21468
  
    **[Test build #91345 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91345/testReport)** for PR 21468 at commit [`0aee8fa`](https://github.com/apache/spark/commit/0aee8faad9cb60721b153c9bc2187f87a4036b9e).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21468: [SPARK-22151] : PYTHONPATH not picked up from the spark....

Posted by tgravescs <gi...@git.apache.org>.
Github user tgravescs commented on the issue:

    https://github.com/apache/spark/pull/21468
  
    sorry for the delay on this, will get to next monday, 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21468: [SPARK-22151] : PYTHONPATH not picked up from the spark....

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21468
  
    **[Test build #92434 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92434/testReport)** for PR 21468 at commit [`6ba543e`](https://github.com/apache/spark/commit/6ba543e9ac994a6940b71463c1ab6867166a13ef).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21468: [SPARK-22151] : PYTHONPATH not picked up from the...

Posted by tgravescs <gi...@git.apache.org>.
Github user tgravescs commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21468#discussion_r193842887
  
    --- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala ---
    @@ -813,8 +813,14 @@ private[spark] class Client(
         if (pythonPath.nonEmpty) {
           val pythonPathStr = (sys.env.get("PYTHONPATH") ++ pythonPath)
             .mkString(ApplicationConstants.CLASS_PATH_SEPARATOR)
    -      env("PYTHONPATH") = pythonPathStr
    -      sparkConf.setExecutorEnv("PYTHONPATH", pythonPathStr)
    +      val newValue =
    --- End diff --
    
    good questions
    
    - precedence: So right now you can work around this issue by exporting PYTHONPATH before you launch spark-submit, I think this is something that could just be in someone's env on the launcher box and might not be what you want in a yarn container.  I would think that specifying explicit pythonpath via spark.yarn.appMasterEnv would take precedence over that since you explicitly configured. Now the second question is where that fails with the py-files and that one isn't as clear to me since like you said its explicitly specified.    Maybe we do py-files then spark.yarn.appMasterEnv.PYTHONPATH and then last env PYTHONPATH.  that is different from the way it is now though. thoughts?
    
    - agree this should not be reflected in the executors so if it is we shouldn't do that. We should make sure the spark. executorEnv.PYTHONPATH works


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21468: [SPARK-22151] : PYTHONPATH not picked up from the spark....

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21468
  
    Can one of the admins verify this patch?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21468: [SPARK-22151] : PYTHONPATH not picked up from the...

Posted by tgravescs <gi...@git.apache.org>.
Github user tgravescs commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21468#discussion_r202713900
  
    --- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala ---
    @@ -813,8 +813,14 @@ private[spark] class Client(
         if (pythonPath.nonEmpty) {
           val pythonPathStr = (sys.env.get("PYTHONPATH") ++ pythonPath)
             .mkString(ApplicationConstants.CLASS_PATH_SEPARATOR)
    -      env("PYTHONPATH") = pythonPathStr
    -      sparkConf.setExecutorEnv("PYTHONPATH", pythonPathStr)
    +      val newValue =
    --- End diff --
    
    Also note as @vanzin  said you can just use the ++=: operator with the listbuffer to prepend and get rid of the if conditions before converting to string.
    env.get("PYTHONPATH") ++=: (sys.env.get("PYTHONPATH") ++=: pythonPath)


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21468: [SPARK-22151] : PYTHONPATH not picked up from the spark....

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21468
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93128/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21468: [SPARK-22151] : PYTHONPATH not picked up from the...

Posted by pgandhi999 <gi...@git.apache.org>.
Github user pgandhi999 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21468#discussion_r198935817
  
    --- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala ---
    @@ -811,10 +811,18 @@ private[spark] class Client(
     
         // Finally, update the Spark config to propagate PYTHONPATH to the AM and executors.
         if (pythonPath.nonEmpty) {
    -      val pythonPathStr = (sys.env.get("PYTHONPATH") ++ pythonPath)
    +      val pythonPathStr = (sys.env.get("PYTHONPATH") ++=: pythonPath)
    --- End diff --
    
    Makes sense, have changed it to ++


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21468: [SPARK-22151] : PYTHONPATH not picked up from the spark....

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21468
  
    **[Test build #91603 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91603/testReport)** for PR 21468 at commit [`5e733ae`](https://github.com/apache/spark/commit/5e733aeee66e7aedcccd9eed76f539ce77919e78).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21468: [SPARK-22151] : PYTHONPATH not picked up from the spark....

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21468
  
    **[Test build #91603 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91603/testReport)** for PR 21468 at commit [`5e733ae`](https://github.com/apache/spark/commit/5e733aeee66e7aedcccd9eed76f539ce77919e78).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21468: [SPARK-22151] : PYTHONPATH not picked up from the...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/21468


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21468: [SPARK-22151] : PYTHONPATH not picked up from the spark....

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21468
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21468: [SPARK-22151] : PYTHONPATH not picked up from the spark....

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21468
  
    **[Test build #93128 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93128/testReport)** for PR 21468 at commit [`5423bef`](https://github.com/apache/spark/commit/5423befa2c27affc0a5a54f02144a34a77af34c4).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21468: [SPARK-22151] : PYTHONPATH not picked up from the spark....

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21468
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91603/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21468: [SPARK-22151] : PYTHONPATH not picked up from the spark....

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21468
  
    Can one of the admins verify this patch?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21468: [SPARK-22151] : PYTHONPATH not picked up from the...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21468#discussion_r197971111
  
    --- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala ---
    @@ -811,10 +811,18 @@ private[spark] class Client(
     
         // Finally, update the Spark config to propagate PYTHONPATH to the AM and executors.
         if (pythonPath.nonEmpty) {
    -      val pythonPathStr = (sys.env.get("PYTHONPATH") ++ pythonPath)
    +      val pythonPathStr = (sys.env.get("PYTHONPATH") ++=: pythonPath)
             .mkString(ApplicationConstants.CLASS_PATH_SEPARATOR)
    -      env("PYTHONPATH") = pythonPathStr
    -      sparkConf.setExecutorEnv("PYTHONPATH", pythonPathStr)
    +      val newValue =
    --- End diff --
    
    This seems unnecessary. What you're trying to do here is concatenate the values of the current `PYTHONPATH` env variable, the `PYTHONPATH` created by Spark, and the `PYTHONPATH` set in `spark.yarn.appMasterEnv`.
    
    That's, respectively, `sys.env.get("PYTHONPATH")`, `pythonPath`, and `env.get("PYTHONPATH")`. So just concatenate those.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21468: [SPARK-22151] : PYTHONPATH not picked up from the...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21468#discussion_r197971721
  
    --- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala ---
    @@ -811,10 +811,18 @@ private[spark] class Client(
     
         // Finally, update the Spark config to propagate PYTHONPATH to the AM and executors.
         if (pythonPath.nonEmpty) {
    -      val pythonPathStr = (sys.env.get("PYTHONPATH") ++ pythonPath)
    +      val pythonPathStr = (sys.env.get("PYTHONPATH") ++=: pythonPath)
             .mkString(ApplicationConstants.CLASS_PATH_SEPARATOR)
    -      env("PYTHONPATH") = pythonPathStr
    -      sparkConf.setExecutorEnv("PYTHONPATH", pythonPathStr)
    +      val newValue =
    +        if (env.contains("PYTHONPATH")) {
    +          env("PYTHONPATH") + ApplicationConstants.CLASS_PATH_SEPARATOR  + pythonPathStr
    +        } else {
    +          pythonPathStr
    +        }
    +      env("PYTHONPATH") = newValue
    +      if (!sparkConf.getExecutorEnv.toMap.contains("PYTHONPATH")) {
    --- End diff --
    
    I see that the previous code was overriding this in the executor env; but perhaps the right thing here is to concatenate them, otherwise the executor might be missing the py4j/pyspark stuff this class adds.
    
    So, basically, what you want is:
    
    - driver: env.get(pp) ++ sys.env.get(pp) ++ pythonPath
    - executor: pythonPath ++ sparkConf.getExecutorEnv(pp)



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21468: [SPARK-22151] : PYTHONPATH not picked up from the spark....

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/21468
  
    ok to test


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21468: [SPARK-22151] : PYTHONPATH not picked up from the...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21468#discussion_r192857531
  
    --- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala ---
    @@ -813,8 +813,14 @@ private[spark] class Client(
         if (pythonPath.nonEmpty) {
           val pythonPathStr = (sys.env.get("PYTHONPATH") ++ pythonPath)
             .mkString(ApplicationConstants.CLASS_PATH_SEPARATOR)
    -      env("PYTHONPATH") = pythonPathStr
    -      sparkConf.setExecutorEnv("PYTHONPATH", pythonPathStr)
    +      val newValue =
    --- End diff --
    
    You could just say `env.get("PYTHONPATH") ++=: pythonPath` before turning the list into a string.
    
    But there's also two extra questions here:
    - precedence; should the env come before or after the files added with `py-files`? I kinda think after makes more sense, since files are generally provided in the command line.
    - should `appMasterEnv` be reflected in executors? With your code it is. I'm not so sure it should.



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21468: [SPARK-22151] : PYTHONPATH not picked up from the spark....

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21468
  
    Can one of the admins verify this patch?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21468: [SPARK-22151] : PYTHONPATH not picked up from the...

Posted by pgandhi999 <gi...@git.apache.org>.
Github user pgandhi999 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21468#discussion_r202839172
  
    --- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala ---
    @@ -813,8 +813,14 @@ private[spark] class Client(
         if (pythonPath.nonEmpty) {
           val pythonPathStr = (sys.env.get("PYTHONPATH") ++ pythonPath)
             .mkString(ApplicationConstants.CLASS_PATH_SEPARATOR)
    -      env("PYTHONPATH") = pythonPathStr
    -      sparkConf.setExecutorEnv("PYTHONPATH", pythonPathStr)
    +      val newValue =
    --- End diff --
    
    @tgravescs Have replaced the if-else code with ++ operator.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21468: [SPARK-22151] : PYTHONPATH not picked up from the spark....

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21468
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91345/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21468: [SPARK-22151] : PYTHONPATH not picked up from the spark....

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21468
  
    **[Test build #91571 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91571/testReport)** for PR 21468 at commit [`5e733ae`](https://github.com/apache/spark/commit/5e733aeee66e7aedcccd9eed76f539ce77919e78).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21468: [SPARK-22151] : PYTHONPATH not picked up from the spark....

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21468
  
    **[Test build #93140 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93140/testReport)** for PR 21468 at commit [`49f37a8`](https://github.com/apache/spark/commit/49f37a80bb274efa50f2ec295d6e9009eeb7b24a).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21468: [SPARK-22151] : PYTHONPATH not picked up from the spark....

Posted by redsanket <gi...@git.apache.org>.
Github user redsanket commented on the issue:

    https://github.com/apache/spark/pull/21468
  
    LGTM @pgandhi999 Hope @tgravescs can confirm it


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21468: [SPARK-22151] : PYTHONPATH not picked up from the spark....

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21468
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91571/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21468: [SPARK-22151] : PYTHONPATH not picked up from the spark....

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21468
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92434/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21468: [SPARK-22151] : PYTHONPATH not picked up from the spark....

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21468
  
    **[Test build #93128 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93128/testReport)** for PR 21468 at commit [`5423bef`](https://github.com/apache/spark/commit/5423befa2c27affc0a5a54f02144a34a77af34c4).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21468: [SPARK-22151] : PYTHONPATH not picked up from the spark....

Posted by tgravescs <gi...@git.apache.org>.
Github user tgravescs commented on the issue:

    https://github.com/apache/spark/pull/21468
  
    merged thanks @pgandhi999 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21468: [SPARK-22151] : PYTHONPATH not picked up from the spark....

Posted by pgandhi999 <gi...@git.apache.org>.
Github user pgandhi999 commented on the issue:

    https://github.com/apache/spark/pull/21468
  
    Thank you for your feedback, @redsanket 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21468: [SPARK-22151] : PYTHONPATH not picked up from the spark....

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21468
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21468: [SPARK-22151] : PYTHONPATH not picked up from the spark....

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21468
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21468: [SPARK-22151] : PYTHONPATH not picked up from the spark....

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21468
  
    **[Test build #92434 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92434/testReport)** for PR 21468 at commit [`6ba543e`](https://github.com/apache/spark/commit/6ba543e9ac994a6940b71463c1ab6867166a13ef).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21468: [SPARK-22151] : PYTHONPATH not picked up from the spark....

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21468
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21468: [SPARK-22151] : PYTHONPATH not picked up from the spark....

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21468
  
    **[Test build #91571 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91571/testReport)** for PR 21468 at commit [`5e733ae`](https://github.com/apache/spark/commit/5e733aeee66e7aedcccd9eed76f539ce77919e78).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21468: [SPARK-22151] : PYTHONPATH not picked up from the...

Posted by pgandhi999 <gi...@git.apache.org>.
Github user pgandhi999 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21468#discussion_r198935970
  
    --- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala ---
    @@ -811,10 +811,18 @@ private[spark] class Client(
     
         // Finally, update the Spark config to propagate PYTHONPATH to the AM and executors.
         if (pythonPath.nonEmpty) {
    -      val pythonPathStr = (sys.env.get("PYTHONPATH") ++ pythonPath)
    +      val pythonPathStr = (sys.env.get("PYTHONPATH") ++=: pythonPath)
             .mkString(ApplicationConstants.CLASS_PATH_SEPARATOR)
    -      env("PYTHONPATH") = pythonPathStr
    -      sparkConf.setExecutorEnv("PYTHONPATH", pythonPathStr)
    +      val newValue =
    +        if (env.contains("PYTHONPATH")) {
    +          env("PYTHONPATH") + ApplicationConstants.CLASS_PATH_SEPARATOR  + pythonPathStr
    +        } else {
    +          pythonPathStr
    +        }
    +      env("PYTHONPATH") = newValue
    +      if (!sparkConf.getExecutorEnv.toMap.contains("PYTHONPATH")) {
    --- End diff --
    
    That is indeed an issue, fixed it.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21468: [SPARK-22151] : PYTHONPATH not picked up from the...

Posted by tgravescs <gi...@git.apache.org>.
Github user tgravescs commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21468#discussion_r198864713
  
    --- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala ---
    @@ -811,10 +811,18 @@ private[spark] class Client(
     
         // Finally, update the Spark config to propagate PYTHONPATH to the AM and executors.
         if (pythonPath.nonEmpty) {
    -      val pythonPathStr = (sys.env.get("PYTHONPATH") ++ pythonPath)
    +      val pythonPathStr = (sys.env.get("PYTHONPATH") ++=: pythonPath)
             .mkString(ApplicationConstants.CLASS_PATH_SEPARATOR)
    -      env("PYTHONPATH") = pythonPathStr
    -      sparkConf.setExecutorEnv("PYTHONPATH", pythonPathStr)
    +      val newValue =
    +        if (env.contains("PYTHONPATH")) {
    +          env("PYTHONPATH") + ApplicationConstants.CLASS_PATH_SEPARATOR  + pythonPathStr
    +        } else {
    +          pythonPathStr
    +        }
    +      env("PYTHONPATH") = newValue
    +      if (!sparkConf.getExecutorEnv.toMap.contains("PYTHONPATH")) {
    +        sparkConf.setExecutorEnv("PYTHONPATH", newValue)
    --- End diff --
    
    when I did some testing of this, this was causing the job to fail, Parth is investigating.    I agree with Vanzin on the executor path comment as well.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21468: [SPARK-22151] : PYTHONPATH not picked up from the spark....

Posted by tgravescs <gi...@git.apache.org>.
Github user tgravescs commented on the issue:

    https://github.com/apache/spark/pull/21468
  
    Jenkins, ok to test


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21468: [SPARK-22151] : PYTHONPATH not picked up from the spark....

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21468
  
    **[Test build #93140 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93140/testReport)** for PR 21468 at commit [`49f37a8`](https://github.com/apache/spark/commit/49f37a80bb274efa50f2ec295d6e9009eeb7b24a).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21468: [SPARK-22151] : PYTHONPATH not picked up from the...

Posted by pgandhi999 <gi...@git.apache.org>.
Github user pgandhi999 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21468#discussion_r194148280
  
    --- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala ---
    @@ -813,8 +813,14 @@ private[spark] class Client(
         if (pythonPath.nonEmpty) {
           val pythonPathStr = (sys.env.get("PYTHONPATH") ++ pythonPath)
             .mkString(ApplicationConstants.CLASS_PATH_SEPARATOR)
    -      env("PYTHONPATH") = pythonPathStr
    -      sparkConf.setExecutorEnv("PYTHONPATH", pythonPathStr)
    +      val newValue =
    --- End diff --
    
    Thank you for your comments @vanzin , have made the necessary changes. As far as precedence is concerned, I am still not sure whether I understood your question at first, however @tgravescs clarified it for me.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21468: [SPARK-22151] : PYTHONPATH not picked up from the spark....

Posted by tgravescs <gi...@git.apache.org>.
Github user tgravescs commented on the issue:

    https://github.com/apache/spark/pull/21468
  
    +1


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21468: [SPARK-22151] : PYTHONPATH not picked up from the...

Posted by pgandhi999 <gi...@git.apache.org>.
Github user pgandhi999 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21468#discussion_r202782358
  
    --- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala ---
    @@ -813,8 +813,20 @@ private[spark] class Client(
         if (pythonPath.nonEmpty) {
           val pythonPathStr = (sys.env.get("PYTHONPATH") ++ pythonPath)
             .mkString(ApplicationConstants.CLASS_PATH_SEPARATOR)
    -      env("PYTHONPATH") = pythonPathStr
    -      sparkConf.setExecutorEnv("PYTHONPATH", pythonPathStr)
    +      val newValue =
    +        if (env.contains("PYTHONPATH")) {
    +          env("PYTHONPATH") + ApplicationConstants.CLASS_PATH_SEPARATOR  + pythonPathStr
    +        } else {
    +          pythonPathStr
    +        }
    +      env("PYTHONPATH") = newValue
    +      if (!sparkConf.getExecutorEnv.toMap.contains("PYTHONPATH")) {
    +        sparkConf.setExecutorEnv("PYTHONPATH", pythonPathStr)
    +      } else {
    +        val pythonPathExecutorEnv = sparkConf.getExecutorEnv.toMap.get("PYTHONPATH").get +
    +          ApplicationConstants.CLASS_PATH_SEPARATOR + pythonPathStr
    +        sparkConf.setExecutorEnv("PYTHONPATH", pythonPathExecutorEnv)
    --- End diff --
    
    Done


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21468: [SPARK-22151] : PYTHONPATH not picked up from the spark....

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21468
  
    **[Test build #91345 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91345/testReport)** for PR 21468 at commit [`0aee8fa`](https://github.com/apache/spark/commit/0aee8faad9cb60721b153c9bc2187f87a4036b9e).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21468: [SPARK-22151] : PYTHONPATH not picked up from the...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21468#discussion_r197970463
  
    --- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala ---
    @@ -811,10 +811,18 @@ private[spark] class Client(
     
         // Finally, update the Spark config to propagate PYTHONPATH to the AM and executors.
         if (pythonPath.nonEmpty) {
    -      val pythonPathStr = (sys.env.get("PYTHONPATH") ++ pythonPath)
    +      val pythonPathStr = (sys.env.get("PYTHONPATH") ++=: pythonPath)
    --- End diff --
    
    This is modifying `pythonPath`, but it's not used again after this line, so why?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21468: [SPARK-22151] : PYTHONPATH not picked up from the spark....

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21468
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93140/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21468: [SPARK-22151] : PYTHONPATH not picked up from the...

Posted by pgandhi999 <gi...@git.apache.org>.
Github user pgandhi999 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21468#discussion_r198936274
  
    --- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala ---
    @@ -811,10 +811,18 @@ private[spark] class Client(
     
         // Finally, update the Spark config to propagate PYTHONPATH to the AM and executors.
         if (pythonPath.nonEmpty) {
    -      val pythonPathStr = (sys.env.get("PYTHONPATH") ++ pythonPath)
    +      val pythonPathStr = (sys.env.get("PYTHONPATH") ++=: pythonPath)
             .mkString(ApplicationConstants.CLASS_PATH_SEPARATOR)
    -      env("PYTHONPATH") = pythonPathStr
    -      sparkConf.setExecutorEnv("PYTHONPATH", pythonPathStr)
    +      val newValue =
    --- End diff --
    
    I would have done it, but I need the pythonPathStr variable to set it in executorEnv so have not modified that bit. Let me know if you think otherwise.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21468: [SPARK-22151] : PYTHONPATH not picked up from the spark....

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21468
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21468: [SPARK-22151] : PYTHONPATH not picked up from the...

Posted by tgravescs <gi...@git.apache.org>.
Github user tgravescs commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21468#discussion_r202171390
  
    --- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala ---
    @@ -813,8 +813,20 @@ private[spark] class Client(
         if (pythonPath.nonEmpty) {
           val pythonPathStr = (sys.env.get("PYTHONPATH") ++ pythonPath)
             .mkString(ApplicationConstants.CLASS_PATH_SEPARATOR)
    -      env("PYTHONPATH") = pythonPathStr
    -      sparkConf.setExecutorEnv("PYTHONPATH", pythonPathStr)
    +      val newValue =
    +        if (env.contains("PYTHONPATH")) {
    +          env("PYTHONPATH") + ApplicationConstants.CLASS_PATH_SEPARATOR  + pythonPathStr
    +        } else {
    +          pythonPathStr
    +        }
    +      env("PYTHONPATH") = newValue
    +      if (!sparkConf.getExecutorEnv.toMap.contains("PYTHONPATH")) {
    +        sparkConf.setExecutorEnv("PYTHONPATH", pythonPathStr)
    +      } else {
    +        val pythonPathExecutorEnv = sparkConf.getExecutorEnv.toMap.get("PYTHONPATH").get +
    +          ApplicationConstants.CLASS_PATH_SEPARATOR + pythonPathStr
    +        sparkConf.setExecutorEnv("PYTHONPATH", pythonPathExecutorEnv)
    --- End diff --
    
    move the setExecutorEnv outside of the if and have the if return the value.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21468: [SPARK-22151] : PYTHONPATH not picked up from the spark....

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21468
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21468: [SPARK-22151] : PYTHONPATH not picked up from the spark....

Posted by pgandhi999 <gi...@git.apache.org>.
Github user pgandhi999 commented on the issue:

    https://github.com/apache/spark/pull/21468
  
    Thank you @HyukjinKwon 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org