You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by loneknightpy <gi...@git.apache.org> on 2017/10/10 20:10:56 UTC

[GitHub] spark pull request #19466: [CORE] Fix spark submit file download for standal...

GitHub user loneknightpy opened a pull request:

    https://github.com/apache/spark/pull/19466

    [CORE] Fix spark submit file download for standalone client mode

    ## What changes were proposed in this pull request?
    
    This PR makes spark-submit script to use downloaded files in local/standalone client mode.
    
    ## How was this patch tested?
    
    (Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests)
    (If this patch involves UI changes, please attach a screenshot; otherwise, remove this)
    
    Please review http://spark.apache.org/contributing.html before opening a pull request.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/loneknightpy/spark fix-spark-submit-for-standalone

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19466.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #19466
    
----
commit 1bde00c0d5c90cebdd2a8514078ace8cf33c42f8
Author: Yu Peng <lo...@gmail.com>
Date:   2017-10-10T20:06:37Z

    Fix spark submit file download for standalone client mode

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19466: [SPARK-22237] [CORE] Fix spark submit file download for ...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/19466
  
    @jerryshao 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19466: [CORE] Fix spark submit file download for standalone cli...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/19466
  
    cc @yhuai @jiangxb1987 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19466: [SPARK-22237] [CORE] Fix spark submit file download for ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19466
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82604/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19466: [SPARK-22237] [CORE] Fix spark submit file downlo...

Posted by loneknightpy <gi...@git.apache.org>.
Github user loneknightpy commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19466#discussion_r144100747
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala ---
    @@ -366,6 +366,16 @@ object SparkSubmit extends CommandLineUtils with Logging {
           localPyFiles = Option(args.pyFiles).map {
             downloadFileList(_, targetDir, sparkConf, hadoopConf, secMgr)
           }.orNull
    +
    +      if (clusterManager == STANDALONE || clusterManager == LOCAL) {
    +        // Use local files for standalone client mode.
    +        args.primaryResource = localPrimaryResource
    --- End diff --
    
    Thanks for your input. It seems the issue is caused by a conflict with our internal patch. I will close this PR for now. 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19466: [SPARK-22237] [CORE] Fix spark submit file download for ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19466
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19466: [SPARK-22237] [CORE] Fix spark submit file download for ...

Posted by jerryshao <gi...@git.apache.org>.
Github user jerryshao commented on the issue:

    https://github.com/apache/spark/pull/19466
  
    Would you please show us an example of how it breaks? The codes here which assigning all resources to local ones might work, but it covers which line is really broken, can you please describe more? Thanks!


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19466: [SPARK-22237] [CORE] Fix spark submit file downlo...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19466#discussion_r143863570
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala ---
    @@ -366,6 +366,16 @@ object SparkSubmit extends CommandLineUtils with Logging {
           localPyFiles = Option(args.pyFiles).map {
             downloadFileList(_, targetDir, sparkConf, hadoopConf, secMgr)
           }.orNull
    +
    +      if (clusterManager == STANDALONE || clusterManager == LOCAL) {
    --- End diff --
    
    Seems to me like what you want here is `clusterManager != YARN` instead.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19466: [SPARK-22237] [CORE] Fix spark submit file download for ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19466
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82603/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19466: [SPARK-22237] [CORE] Fix spark submit file download for ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19466
  
    **[Test build #82603 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82603/testReport)** for PR 19466 at commit [`86e62f6`](https://github.com/apache/spark/commit/86e62f6906bc19fb4f3b91523ab112d4315f2105).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19466: [SPARK-22237] [CORE] Fix spark submit file downlo...

Posted by loneknightpy <gi...@git.apache.org>.
Github user loneknightpy closed the pull request at:

    https://github.com/apache/spark/pull/19466


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19466: [SPARK-22237] [CORE] Fix spark submit file download for ...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/19466
  
    > here with this change all the resources should be fetched from local driver
    
    That's a good point. You should download resources just to add them to the driver's classpath, but executors can download them directly from the source.
    
    As I requested, a test case would really help in understanding what is broken here.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19466: [SPARK-22237] [CORE] Fix spark submit file download for ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19466
  
    **[Test build #82604 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82604/testReport)** for PR 19466 at commit [`807a767`](https://github.com/apache/spark/commit/807a767f8e1f90789299e83c995d9e6e50015a96).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19466: [SPARK-22237] [CORE] Fix spark submit file download for ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19466
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19466: [SPARK-22237] [CORE] Fix spark submit file downlo...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19466#discussion_r143863177
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala ---
    @@ -366,6 +366,16 @@ object SparkSubmit extends CommandLineUtils with Logging {
           localPyFiles = Option(args.pyFiles).map {
             downloadFileList(_, targetDir, sparkConf, hadoopConf, secMgr)
           }.orNull
    +
    +      if (clusterManager == STANDALONE || clusterManager == LOCAL) {
    +        // Use local files for standalone client mode.
    +        args.primaryResource = localPrimaryResource
    --- End diff --
    
    Isn't this handled by the code in L522 already?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19466: [CORE] Fix spark submit file download for standalone cli...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19466
  
    **[Test build #82603 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82603/testReport)** for PR 19466 at commit [`86e62f6`](https://github.com/apache/spark/commit/86e62f6906bc19fb4f3b91523ab112d4315f2105).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19466: [SPARK-22237] [CORE] Fix spark submit file download for ...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/19466
  
    Also please add tests, otherwise this will be broken again.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19466: [SPARK-22237] [CORE] Fix spark submit file downlo...

Posted by loneknightpy <gi...@git.apache.org>.
Github user loneknightpy commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19466#discussion_r143883926
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala ---
    @@ -366,6 +366,16 @@ object SparkSubmit extends CommandLineUtils with Logging {
           localPyFiles = Option(args.pyFiles).map {
             downloadFileList(_, targetDir, sparkConf, hadoopConf, secMgr)
           }.orNull
    +
    +      if (clusterManager == STANDALONE || clusterManager == LOCAL) {
    +        // Use local files for standalone client mode.
    +        args.primaryResource = localPrimaryResource
    --- End diff --
    
    It doesn't seem so. 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19466: [SPARK-22237] [CORE] Fix spark submit file downlo...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19466#discussion_r143890640
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala ---
    @@ -366,6 +366,16 @@ object SparkSubmit extends CommandLineUtils with Logging {
           localPyFiles = Option(args.pyFiles).map {
             downloadFileList(_, targetDir, sparkConf, hadoopConf, secMgr)
           }.orNull
    +
    +      if (clusterManager == STANDALONE || clusterManager == LOCAL) {
    +        // Use local files for standalone client mode.
    +        args.primaryResource = localPrimaryResource
    --- End diff --
    
    Can you explain what breaks without this line?
    
    The code in L522 is adding the local copy of the primary resource to the user's classpath, which should be all that's needed, right?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19466: [SPARK-22237] [CORE] Fix spark submit file download for ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19466
  
    **[Test build #82604 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82604/testReport)** for PR 19466 at commit [`807a767`](https://github.com/apache/spark/commit/807a767f8e1f90789299e83c995d9e6e50015a96).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org