Posted to reviews@spark.apache.org by HyukjinKwon <gi...@git.apache.org> on 2018/05/24 18:17:17 UTC

[GitHub] spark pull request #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files...

GitHub user HyukjinKwon opened a pull request:

    https://github.com/apache/spark/pull/21426

    [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files correctly into PythonRunner in submit with client mode in spark-submit

    ## What changes were proposed in this pull request?
    
    When the application is a Python file, a `.py` file passed via `--py-files` cannot be imported on the client side before context initialization. See below:
    
    ```
    $ cat /home/spark/tmp.py
    def testtest():
        return 1
    ```
    
    This works:
    
    ```
    $ cat app.py
    import pyspark
    pyspark.sql.SparkSession.builder.getOrCreate()
    import tmp
    print("************************%s" % tmp.testtest())
    
    $ ./bin/spark-submit --master yarn --deploy-mode client --py-files /home/spark/tmp.py app.py
    ...
    ************************1
    ```
    
    but this doesn't:
    
    ```
    $ cat app.py
    import pyspark
    import tmp
    pyspark.sql.SparkSession.builder.getOrCreate()
    print("************************%s" % tmp.testtest())
    
    $ ./bin/spark-submit --master yarn --deploy-mode client --py-files /home/spark/tmp.py app.py
    Traceback (most recent call last):
      File "/home/spark/spark/app.py", line 2, in <module>
        import tmp
    ImportError: No module named tmp
    ```
    
    ### How did it happen?
    
    In client mode specifically, the paths are added to PythonRunner as is:
    
    https://github.com/apache/spark/blob/628c7b517969c4a7ccb26ea67ab3dd61266073ca/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala#L430
    
    https://github.com/apache/spark/blob/628c7b517969c4a7ccb26ea67ab3dd61266073ca/core/src/main/scala/org/apache/spark/deploy/PythonRunner.scala#L49-L88
    
    The problem here is that a .py file shouldn't be added as is, since `PYTHONPATH` expects a directory or an archive such as a zip or egg.
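
The `PYTHONPATH` behaviour described above can be reproduced without Spark. The following is a minimal sketch, assuming only the Python standard library; the module name `tmp_mod_a` and the temporary directory are hypothetical stand-ins for `/home/spark/tmp.py`. It shows that a `sys.path` entry (which is what `PYTHONPATH` populates) pointing directly at a `.py` file does not make the module importable, while the directory containing it does.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical module standing in for /home/spark/tmp.py.
workdir = tempfile.mkdtemp()
module_file = os.path.join(workdir, "tmp_mod_a.py")
with open(module_file, "w") as f:
    f.write("def testtest():\n    return 1\n")


def can_import(path_entry, module_name):
    """Return True if module_name is importable with path_entry on sys.path."""
    sys.path.insert(0, path_entry)
    importlib.invalidate_caches()
    try:
        importlib.import_module(module_name)
        return True
    except ImportError:
        return False
    finally:
        # Undo the changes so the two checks do not influence each other.
        sys.path.remove(path_entry)
        sys.modules.pop(module_name, None)


# A path entry that is the .py file itself is skipped by the import system.
file_entry_works = can_import(module_file, "tmp_mod_a")
# A path entry that is the containing directory works.
dir_entry_works = can_import(workdir, "tmp_mod_a")
print(file_entry_works, dir_entry_works)
```

No error is raised for the file entry itself; the failure only shows up later as an `ImportError`, which matches the traceback above.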
    
    ### How does this PR fix it?
    
    We shouldn't simply add its parent directory, because other files in the parent directory would then also be added to the `PYTHONPATH` in client mode before context initialization.
    
    Therefore, we copy the .py files into a dedicated temp directory and add that directory to `PYTHONPATH`.
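
As a rough Python analogue of this approach (the actual change is written in Scala), the sketch below copies readable `.py` files into a fresh temporary directory and returns that directory as the path entry, leaving other entries such as archives untouched. The function name `resolve_py_files` and the file names are hypothetical, and this is not Spark's implementation.

```python
import os
import shutil
import tempfile


def resolve_py_files(py_files):
    """Replace each readable .py file with a temp directory that holds a copy of it.

    Non-.py entries (zip, egg, directories) are returned unchanged.
    Missing or unreadable .py files are dropped.
    """
    dest = None
    resolved = []
    for entry in py_files:
        if entry.endswith(".py"):
            if os.path.isfile(entry) and os.access(entry, os.R_OK):
                if dest is None:
                    # Create the directory lazily, only when it is needed.
                    dest = tempfile.mkdtemp(prefix="localPyFiles")
                shutil.copy(entry, dest)
                if dest not in resolved:
                    resolved.append(dest)
        else:
            resolved.append(entry)
    return resolved


# Usage: the parent directory holds an unrelated file that must not leak into the path.
parent = tempfile.mkdtemp()
wanted = os.path.join(parent, "tmp.py")
unrelated = os.path.join(parent, "unrelated_helper.py")
for path in (wanted, unrelated):
    with open(path, "w") as f:
        f.write("def testtest():\n    return 1\n")

paths = resolve_py_files([wanted, "deps.zip", os.path.join(parent, "missing.py")])
print(paths)
print(sorted(os.listdir(paths[0])))
```

Only the requested file ends up in the directory that goes on the path, so `unrelated_helper.py` stays unimportable, which is the reason given above for not adding the parent directory.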
    
    ## How was this patch tested?
    
    Unit tests are added, and the change was manually tested in both standalone and yarn client modes with submit.
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/HyukjinKwon/spark SPARK-24384

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21426.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21426
    
----
commit b76854dc58b4cd5c73933cff2b8b7d8e3ffb23ac
Author: hyukjinkwon <gu...@...>
Date:   2018-05-24T17:34:31Z

    Add .py files correctly into PythonRunner in submit with client mode in spark-submit

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files correc...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21426
  
    Merged build finished. Test PASSed.


---



[GitHub] spark pull request #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21426#discussion_r190683345
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala ---
    @@ -372,8 +376,27 @@ private[spark] class SparkSubmit extends Logging {
           localJars = Option(args.jars).map {
             downloadFileList(_, targetDir, sparkConf, hadoopConf, secMgr)
           }.orNull
    -      localPyFiles = Option(args.pyFiles).map {
    -        downloadFileList(_, targetDir, sparkConf, hadoopConf, secMgr)
    +      localPyFiles = Option(args.pyFiles).map { pyFiles =>
    +        if (isClientPythonSubmit) {
    +          // In case of client with submit, the python paths should be set before context
    +          // initialization.
    +          // In case of shell, the context initialization is done ahead so we are
    +          // fine but in case of client with submit, the context initialization can be done later.
    +          // We will copy the local .py files because .py file shouldn't be added
    +          // alone but its parent directory. See SPARK-24384.
    +          localPyFilesTargetDir = Utils.createTempDir(namePrefix = "localPyFiles")
    +          Utils.stringToSeq(pyFiles).map { pyFile =>
    --- End diff --
    
    This logic is copied from `downloadFileList`.


---



[GitHub] spark issue #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files correc...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21426
  
    **[Test build #91185 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91185/testReport)** for PR 21426 at commit [`90b38b9`](https://github.com/apache/spark/commit/90b38b9ed395bca7c1a872a1ceeac536e8196550).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files correc...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21426
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91140/
    Test FAILed.


---



[GitHub] spark pull request #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21426#discussion_r190778033
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala ---
    @@ -372,8 +376,27 @@ private[spark] class SparkSubmit extends Logging {
           localJars = Option(args.jars).map {
             downloadFileList(_, targetDir, sparkConf, hadoopConf, secMgr)
           }.orNull
    -      localPyFiles = Option(args.pyFiles).map {
    -        downloadFileList(_, targetDir, sparkConf, hadoopConf, secMgr)
    +      localPyFiles = Option(args.pyFiles).map { pyFiles =>
    +        if (isClientPythonSubmit) {
    --- End diff --
    
    Yup, it can be. Will try.


---



[GitHub] spark issue #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files correc...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21426
  
    **[Test build #91249 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91249/testReport)** for PR 21426 at commit [`f015e0d`](https://github.com/apache/spark/commit/f015e0d587c8d9f8cd359fecc325a19362a59c55).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark pull request #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21426#discussion_r191039095
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/PythonRunner.scala ---
    @@ -153,4 +154,30 @@ object PythonRunner {
           .map { p => formatPath(p, testWindows) }
       }
     
    +  /**
    +   * Resolves the ".py" files. ".py" file should not be added as is because PYTHONPATH does
    +   * not expect a file. This method creates a temporary directory and puts the ".py" files
    +   * if exist in the given paths.
    +   */
    +  private def resolvePyFiles(pyFiles: Array[String]): Array[String] = {
    +    val dest = Utils.createTempDir(namePrefix = "localPyFiles")
    +    pyFiles.flatMap { pyFile =>
    +      // In case of client with submit, the python paths should be set before context
    +      // initialization because the context initialization can be done later.
    +      // We will copy the local ".py" files because ".py" file shouldn't be added
    +      // alone but its parent directory in PYTHONPATH. See SPARK-24384.
    +      if (pyFile.endsWith(".py")) {
    +        val source = new File(pyFile)
    +        if (source.exists() && source.canRead) {
    --- End diff --
    
    @vanzin, do you mean that this should be checked ahead (for example in SparkSubmit) before we are in this logic?
    
    Just for clarification, this is only a sanity check. Previously the path was added but ignored; now the path is not added at all.
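
To illustrate the "added but ignored" behaviour from the Python side, here is a small sketch, assuming only the standard library; the missing path and the module name `sanity_mod` are made up. A `sys.path` entry that does not exist raises no error at import time and is simply skipped, so leaving it out does not change which modules can be imported.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical importable module in a real directory.
real_dir = tempfile.mkdtemp()
with open(os.path.join(real_dir, "sanity_mod.py"), "w") as f:
    f.write("VALUE = 42\n")

# Hypothetical path entry that does not exist on disk.
missing_entry = os.path.join(real_dir, "does_not_exist.py")

# Case 1: the missing entry is on the path (the previous behaviour).
sys.path[:0] = [missing_entry, real_dir]
importlib.invalidate_caches()
with_missing = importlib.import_module("sanity_mod").VALUE

# Case 2: the missing entry is left out (the current behaviour).
sys.path.remove(missing_entry)
sys.modules.pop("sanity_mod", None)
without_missing = importlib.import_module("sanity_mod").VALUE

print(with_missing, without_missing)
```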


---



[GitHub] spark issue #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files correc...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21426
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3662/
    Test PASSed.


---



[GitHub] spark pull request #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21426#discussion_r191855502
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/PythonRunner.scala ---
    @@ -153,4 +154,30 @@ object PythonRunner {
           .map { p => formatPath(p, testWindows) }
       }
     
    +  /**
    +   * Resolves the ".py" files. ".py" file should not be added as is because PYTHONPATH does
    +   * not expect a file. This method creates a temporary directory and puts the ".py" files
    +   * if exist in the given paths.
    +   */
    +  private def resolvePyFiles(pyFiles: Array[String]): Array[String] = {
    +    lazy val dest = Utils.createTempDir(namePrefix = "localPyFiles")
    +    pyFiles.flatMap { pyFile =>
    +      // In case of client with submit, the python paths should be set before context
    +      // initialization because the context initialization can be done later.
    +      // We will copy the local ".py" files because ".py" file shouldn't be added
    +      // alone but its parent directory in PYTHONPATH. See SPARK-24384.
    +      if (pyFile.endsWith(".py")) {
    +        val source = new File(pyFile)
    +        if (source.exists() && source.isFile && source.canRead) {
    --- End diff --
    
    Using both `exists` and `isFile` is redundant, but no biggie.
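
The same redundancy holds in Python's `pathlib`, used here only as an analogue of `java.io.File`: `is_file()` already returns `False` for a path that does not exist, so a separate `exists()` check adds nothing. The paths below are hypothetical.

```python
import tempfile
from pathlib import Path

base = Path(tempfile.mkdtemp())
regular = base / "tmp.py"
regular.write_text("def testtest():\n    return 1\n")
missing = base / "missing.py"

# A regular file, a missing path, and a directory.
candidates = [regular, missing, base]
for p in candidates:
    combined = p.exists() and p.is_file()
    # is_file() alone gives the same answer as exists() and is_file().
    print(p.name, combined, p.is_file())

redundant = all((p.exists() and p.is_file()) == p.is_file() for p in candidates)
```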


---



[GitHub] spark pull request #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21426#discussion_r191010819
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/PythonRunner.scala ---
    @@ -153,4 +154,30 @@ object PythonRunner {
           .map { p => formatPath(p, testWindows) }
       }
     
    +  /**
    +   * Resolves the ".py" files. ".py" file should not be added as is because PYTHONPATH does
    +   * not expect a file. This method creates a temporary directory and puts the ".py" files
    +   * if exist in the given paths.
    +   */
    +  private def resolvePyFiles(pyFiles: Array[String]): Array[String] = {
    +    val dest = Utils.createTempDir(namePrefix = "localPyFiles")
    +    pyFiles.flatMap { pyFile =>
    +      // In case of client with submit, the python paths should be set before context
    +      // initialization because the context initialization can be done later.
    +      // We will copy the local ".py" files because ".py" file shouldn't be added
    +      // alone but its parent directory in PYTHONPATH. See SPARK-24384.
    +      if (pyFile.endsWith(".py")) {
    +        val source = new File(pyFile)
    +        if (source.exists() && source.canRead) {
    --- End diff --
    
    `source.isFile() && source.canRead()`
    
    re: unreadable files, is there a check for it anywhere else? If not, that should be added, or the app might fail with some hard to debug exception.
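
One way to surface such a problem early, sketched in Python rather than Scala and not taken from the patch, is to validate each `.py` entry before launching and report a message that names the offending path. The helper `check_py_files` and the file names are hypothetical.

```python
import os
import tempfile


def check_py_files(py_files):
    """Return a list of human-readable problems for the given .py entries."""
    problems = []
    for entry in py_files:
        if not entry.endswith(".py"):
            continue  # archives and directories are out of scope for this sketch
        if not os.path.isfile(entry):
            problems.append("%s: not an existing regular file" % entry)
        elif not os.access(entry, os.R_OK):
            problems.append("%s: not readable by the current user" % entry)
    return problems


base = tempfile.mkdtemp()
good = os.path.join(base, "tmp.py")
with open(good, "w") as f:
    f.write("def testtest():\n    return 1\n")
missing = os.path.join(base, "missing.py")

problems = check_py_files([good, missing, "deps.zip"])
for line in problems:
    print(line)
```

Whether such a check belongs in `SparkSubmit` or in the Python-specific code is the design question raised in this thread; the sketch takes no position on that.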


---



[GitHub] spark issue #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files correc...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21426
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91139/
    Test FAILed.


---



[GitHub] spark issue #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files correc...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21426
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3559/
    Test PASSed.


---



[GitHub] spark issue #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files correc...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21426
  
    **[Test build #91185 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91185/testReport)** for PR 21426 at commit [`90b38b9`](https://github.com/apache/spark/commit/90b38b9ed395bca7c1a872a1ceeac536e8196550).


---



[GitHub] spark issue #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files correc...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21426
  
    **[Test build #91139 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91139/testReport)** for PR 21426 at commit [`15d6ae2`](https://github.com/apache/spark/commit/15d6ae219ac134a277a74f5e4884e4ebc6cfcf34).


---



[GitHub] spark issue #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files correc...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/21426
  
    retest this please


---



[GitHub] spark issue #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files correc...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21426
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3654/
    Test PASSed.


---



[GitHub] spark issue #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files correc...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21426
  
    **[Test build #91140 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91140/testReport)** for PR 21426 at commit [`39b10c5`](https://github.com/apache/spark/commit/39b10c5656a48f813a95d48d752e2d44ccb2c0d9).


---



[GitHub] spark issue #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files correc...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/21426
  
    @vanzin and @jerryshao, thank you so much.


---



[GitHub] spark pull request #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files...

Posted by jerryshao <gi...@git.apache.org>.
Github user jerryshao commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21426#discussion_r190778192
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala ---
    @@ -372,8 +376,27 @@ private[spark] class SparkSubmit extends Logging {
           localJars = Option(args.jars).map {
             downloadFileList(_, targetDir, sparkConf, hadoopConf, secMgr)
           }.orNull
    -      localPyFiles = Option(args.pyFiles).map {
    -        downloadFileList(_, targetDir, sparkConf, hadoopConf, secMgr)
    +      localPyFiles = Option(args.pyFiles).map { pyFiles =>
    +        if (isClientPythonSubmit) {
    --- End diff --
    
    Agreed with @vanzin, we can move this logic to the Python-related code.


---



[GitHub] spark issue #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files correc...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21426
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files correc...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21426
  
    **[Test build #91139 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91139/testReport)** for PR 21426 at commit [`15d6ae2`](https://github.com/apache/spark/commit/15d6ae219ac134a277a74f5e4884e4ebc6cfcf34).
     * This patch **fails due to an unknown error code, -9**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files correc...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21426
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files correc...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21426
  
    **[Test build #91118 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91118/testReport)** for PR 21426 at commit [`b76854d`](https://github.com/apache/spark/commit/b76854dc58b4cd5c73933cff2b8b7d8e3ffb23ac).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files correc...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21426
  
    Merged build finished. Test FAILed.


---



[GitHub] spark issue #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files correc...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21426
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files correc...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21426
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91249/
    Test PASSed.


---



[GitHub] spark issue #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files correc...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21426
  
    **[Test build #91240 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91240/testReport)** for PR 21426 at commit [`f015e0d`](https://github.com/apache/spark/commit/f015e0d587c8d9f8cd359fecc325a19362a59c55).
     * This patch **fails due to an unknown error code, -9**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files correc...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21426
  
    **[Test build #91118 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91118/testReport)** for PR 21426 at commit [`b76854d`](https://github.com/apache/spark/commit/b76854dc58b4cd5c73933cff2b8b7d8e3ffb23ac).


---



[GitHub] spark issue #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files correc...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21426
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files correc...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/21426
  
    cc @vanzin and @jerryshao.


---



[GitHub] spark issue #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files correc...

Posted by jerryshao <gi...@git.apache.org>.
Github user jerryshao commented on the issue:

    https://github.com/apache/spark/pull/21426
  
    Did you try remote .py files? Do they have a similar issue?


---



[GitHub] spark issue #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files correc...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21426
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files correc...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21426
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91118/
    Test PASSed.


---



[GitHub] spark issue #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files correc...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21426
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3583/
    Test PASSed.


---



[GitHub] spark pull request #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files...

Posted by jerryshao <gi...@git.apache.org>.
Github user jerryshao commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21426#discussion_r190803966
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/PythonRunner.scala ---
    @@ -153,4 +154,25 @@ object PythonRunner {
           .map { p => formatPath(p, testWindows) }
       }
     
    +  /**
    +   * Resolves the ".py" files. ".py" file should not be added as is because PYTHONPATH does
    +   * not expect a file. This method creates a temporary directory and puts the ".py" files
    +   * if exist in the given paths.
    +   */
    +  private def resolvePyFiles(pyFiles: Array[String]): Array[String] = {
    +    val dest = Utils.createTempDir(namePrefix = "localPyFiles")
    +    pyFiles.map { pyFile =>
    +      // In case of client with submit, the python paths should be set before context
    +      // initialization because the context initialization can be done later.
    +      // We will copy the local ".py" files because ".py" file shouldn't be added
    +      // alone but its parent directory in PYTHONPATH. See SPARK-24384.
    +      if (pyFile.endsWith(".py")) {
    +        val source = new File(pyFile)
    --- End diff --
    
    Shall we check whether the file exists?


---



[GitHub] spark pull request #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21426#discussion_r190808099
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/PythonRunner.scala ---
    @@ -153,4 +154,25 @@ object PythonRunner {
           .map { p => formatPath(p, testWindows) }
       }
     
    +  /**
    +   * Resolves the ".py" files. ".py" file should not be added as is because PYTHONPATH does
    +   * not expect a file. This method creates a temporary directory and puts the ".py" files
    +   * if exist in the given paths.
    +   */
    +  private def resolvePyFiles(pyFiles: Array[String]): Array[String] = {
    +    val dest = Utils.createTempDir(namePrefix = "localPyFiles")
    +    pyFiles.map { pyFile =>
    +      // In case of client with submit, the python paths should be set before context
    +      // initialization because the context initialization can be done later.
    +      // We will copy the local ".py" files because ".py" file shouldn't be added
    +      // alone but its parent directory in PYTHONPATH. See SPARK-24384.
    +      if (pyFile.endsWith(".py")) {
    +        val source = new File(pyFile)
    --- End diff --
    
    Yeap


---



[GitHub] spark issue #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files correc...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21426
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91240/
    Test FAILed.


---



[GitHub] spark pull request #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/21426


---



[GitHub] spark issue #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files correc...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21426
  
    **[Test build #91150 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91150/testReport)** for PR 21426 at commit [`3db9bad`](https://github.com/apache/spark/commit/3db9bad9375594b01916da5311273f41cb571b76).


---



[GitHub] spark issue #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files correc...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/21426
  
    I tested:
    
    submit with yarn client: .py local
    submit with yarn client: .py remote
    submit with standalone client: .py local
    submit with standalone client: .py remote
    
    they all work fine.


---



[GitHub] spark issue #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files correc...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21426
  
    Merged build finished. Test FAILed.


---



[GitHub] spark issue #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files correc...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21426
  
    Merged build finished. Test FAILed.


---



[GitHub] spark issue #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files correc...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21426
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91150/
    Test PASSed.


---



[GitHub] spark issue #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files correc...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21426
  
    Merged build finished. Test FAILed.


---



[GitHub] spark issue #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files correc...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21426
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files correc...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21426
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files correc...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21426
  
    **[Test build #91150 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91150/testReport)** for PR 21426 at commit [`3db9bad`](https://github.com/apache/spark/commit/3db9bad9375594b01916da5311273f41cb571b76).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files correc...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21426
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3572/
    Test FAILed.


---



[GitHub] spark issue #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files correc...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/21426
  
    @vanzin, for https://github.com/apache/spark/pull/21426#discussion_r191010819, mind if we proceed in a separate ticket? From what I can see, addressing this comment needs some changes to verify the file. I think we can't simply raise an exception, since from `deploy.PythonRunner`'s perspective we can't tell whether that file was downloaded or not.
    
    The most appropriate place seems to be `SparkSubmit` and `DependencyUtils.downloadFile`. It seems we should inject some code into `DependencyUtils.downloadFile`, since that's where we know the original path and where we download the file to a local path when needed, and I would like to avoid adding such changes here. It would probably need another review iteration, and the current change doesn't actually target or change the previous behaviour.


---



[GitHub] spark issue #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files correc...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21426
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3609/
    Test PASSed.


---



[GitHub] spark issue #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files correc...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21426
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91184/
    Test PASSed.


---



[GitHub] spark issue #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files correc...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21426
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91185/
    Test PASSed.


---



[GitHub] spark issue #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files correc...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21426
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files correc...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21426
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3574/
    Test PASSed.


---



[GitHub] spark pull request #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21426#discussion_r190761869
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala ---
    @@ -372,8 +376,27 @@ private[spark] class SparkSubmit extends Logging {
           localJars = Option(args.jars).map {
             downloadFileList(_, targetDir, sparkConf, hadoopConf, secMgr)
           }.orNull
    -      localPyFiles = Option(args.pyFiles).map {
    -        downloadFileList(_, targetDir, sparkConf, hadoopConf, secMgr)
    +      localPyFiles = Option(args.pyFiles).map { pyFiles =>
    +        if (isClientPythonSubmit) {
    --- End diff --
    
    Couldn't this logic be in `PythonRunner`? That's basically what SparkSubmit runs when the conditions you use to create `isClientPythonSubmit` are met.
    
    This class is already pretty hard to navigate; it'd be better to avoid adding more special cases to it.


---



[GitHub] spark issue #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files correc...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21426
  
    **[Test build #91184 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91184/testReport)** for PR 21426 at commit [`e0e9e00`](https://github.com/apache/spark/commit/e0e9e002039f65dac09ce38c5e5d94cdf9014333).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files correc...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21426
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files correc...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21426
  
    **[Test build #91249 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91249/testReport)** for PR 21426 at commit [`f015e0d`](https://github.com/apache/spark/commit/f015e0d587c8d9f8cd359fecc325a19362a59c55).


---



[GitHub] spark issue #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files correc...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21426
  
    **[Test build #91240 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91240/testReport)** for PR 21426 at commit [`f015e0d`](https://github.com/apache/spark/commit/f015e0d587c8d9f8cd359fecc325a19362a59c55).


---



[GitHub] spark issue #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files correc...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21426
  
    **[Test build #91140 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91140/testReport)** for PR 21426 at commit [`39b10c5`](https://github.com/apache/spark/commit/39b10c5656a48f813a95d48d752e2d44ccb2c0d9).
     * This patch **fails due to an unknown error code, -9**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files correc...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21426
  
    **[Test build #91184 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91184/testReport)** for PR 21426 at commit [`e0e9e00`](https://github.com/apache/spark/commit/e0e9e002039f65dac09ce38c5e5d94cdf9014333).


---



[GitHub] spark issue #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files correc...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21426
  
    Merged build finished. Test PASSed.


---



[GitHub] spark pull request #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21426#discussion_r191491127
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/PythonRunner.scala ---
    @@ -153,4 +154,30 @@ object PythonRunner {
           .map { p => formatPath(p, testWindows) }
       }
     
    +  /**
    +   * Resolves the ".py" files. ".py" file should not be added as is because PYTHONPATH does
    +   * not expect a file. This method creates a temporary directory and puts the ".py" files
    +   * if exist in the given paths.
    +   */
    +  private def resolvePyFiles(pyFiles: Array[String]): Array[String] = {
    +    val dest = Utils.createTempDir(namePrefix = "localPyFiles")
    +    pyFiles.flatMap { pyFile =>
    +      // In case of client with submit, the python paths should be set before context
    +      // initialization because the context initialization can be done later.
    +      // We will copy the local ".py" files because ".py" file shouldn't be added
    +      // alone but its parent directory in PYTHONPATH. See SPARK-24384.
    +      if (pyFile.endsWith(".py")) {
    +        val source = new File(pyFile)
    +        if (source.exists() && source.canRead) {
    --- End diff --
    
    I think providing a non-existent file to spark-submit should result in an error. Whether the error happens here or somewhere else doesn't really matter.
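
The fail-fast behaviour suggested here could look roughly like the sketch below. It is an illustration in Python rather than the Scala under review, and it is not what the PR implements; the function name `require_readable_py_files` and the error message are assumptions made up for this example.

```python
import os
import tempfile


def require_readable_py_files(py_files):
    """Raise if any local ".py" path is missing or unreadable; otherwise return the list."""
    missing = [
        p for p in py_files
        if p.endswith(".py") and not (os.path.isfile(p) and os.access(p, os.R_OK))
    ]
    if missing:
        raise FileNotFoundError(
            "Python file(s) not found or unreadable: " + ", ".join(missing))
    # Non-".py" entries (for example archives) are passed through unchecked here.
    return list(py_files)


# Hypothetical inputs: one file that exists and one that does not.
work_dir = tempfile.mkdtemp(prefix="localPyFiles")
existing = os.path.join(work_dir, "tmp.py")
with open(existing, "w") as f:
    f.write("def testtest():\n    return 1\n")
missing_file = os.path.join(work_dir, "does_not_exist.py")

checked = require_readable_py_files([existing, "deps.zip"])
try:
    require_readable_py_files([existing, missing_file])
    error_message = None
except FileNotFoundError as e:
    error_message = str(e)
print(error_message)
```

Collecting every missing path before raising lets the user fix all bad arguments in one pass instead of one per run.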


---



[GitHub] spark issue #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files correc...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/21426
  
    I haven't tried it yet, but I believe it has, since the file is downloaded to a local path. `deploy.PythonRunner` also assumes that the file is local. I will check to be doubly sure.


---



[GitHub] spark pull request #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21426#discussion_r191010406
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/PythonRunner.scala ---
    @@ -153,4 +154,30 @@ object PythonRunner {
           .map { p => formatPath(p, testWindows) }
       }
     
    +  /**
    +   * Resolves the ".py" files. ".py" file should not be added as is because PYTHONPATH does
    +   * not expect a file. This method creates a temporary directory and puts the ".py" files
    +   * if exist in the given paths.
    +   */
    +  private def resolvePyFiles(pyFiles: Array[String]): Array[String] = {
    +    val dest = Utils.createTempDir(namePrefix = "localPyFiles")
    --- End diff --
    
    `lazy`?
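
The point of making the temporary directory lazy is that it would only be created when at least one `.py` file actually needs copying. A rough Python sketch of that idea follows; the `LazyTempDir` class and this `resolve_py_files` are hypothetical illustrations, not the Scala code in the diff, and silently skipping unreadable files is an assumption made only to keep the example short.

```python
import os
import shutil
import tempfile


class LazyTempDir:
    """Creates the temporary directory on first use, similar in spirit to a Scala lazy val."""

    def __init__(self, prefix):
        self._prefix = prefix
        self._path = None

    @property
    def created(self):
        return self._path is not None

    def get(self):
        if self._path is None:
            self._path = tempfile.mkdtemp(prefix=self._prefix)
        return self._path


def resolve_py_files(py_files, dest):
    """Copy local ".py" files into dest and return the resulting Python path entries."""
    resolved = []
    for py_file in py_files:
        if py_file.endswith(".py"):
            if os.path.isfile(py_file) and os.access(py_file, os.R_OK):
                shutil.copy(py_file, dest.get())
                if dest.get() not in resolved:
                    resolved.append(dest.get())
        else:
            # Archives and directories are already valid path entries.
            resolved.append(py_file)
    return resolved


# No ".py" inputs: the directory is never created.
unused_dir = LazyTempDir("localPyFiles")
archives_only = resolve_py_files(["deps.zip", "lib.egg"], unused_dir)

# One ".py" input: the directory is created on demand and holds the copy.
source_dir = tempfile.mkdtemp()
source_file = os.path.join(source_dir, "tmp.py")
with open(source_file, "w") as f:
    f.write("def testtest():\n    return 1\n")
used_dir = LazyTempDir("localPyFiles")
with_py = resolve_py_files([source_file, "deps.zip"], used_dir)
print(unused_dir.created, used_dir.created)
```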


---



[GitHub] spark issue #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files correc...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21426
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3573/
    Test PASSed.


---



[GitHub] spark issue #21426: [SPARK-24384][PYTHON][SPARK SUBMIT] Add .py files correc...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21426
  
    Merged build finished. Test PASSed.


---
