Posted to reviews@spark.apache.org by loneknightpy <gi...@git.apache.org> on 2017/05/24 00:00:03 UTC

[GitHub] spark pull request #18078: [SPARK-20860] Make spark-submit download remote f...

GitHub user loneknightpy opened a pull request:

    https://github.com/apache/spark/pull/18078

    [SPARK-20860] Make spark-submit download remote files to local in client mode

    ## What changes were proposed in this pull request?
    
    This PR makes the spark-submit script download remote files to the local file system in local/standalone client mode.
    
    ## How was this patch tested?
    
    - Unit tests
    - Manual tests: added the s3a jar and tested against a file on S3.
    
    Please review http://spark.apache.org/contributing.html before opening a pull request.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/loneknightpy/spark download-jar-in-spark-submit

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/18078.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #18078
    
----
commit 4cdeeed3f04f0ec62c6909e43ffe2d9824d863f7
Author: Yu Peng <lo...@gmail.com>
Date:   2017-05-04T17:06:43Z

    Make spark-submit download remote files to local file system for local/standalone client mode

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18078: [SPARK-10643] [Core] Make spark-submit download remote f...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18078
  
    **[Test build #77412 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77412/testReport)** for PR 18078 at commit [`7eb5d1a`](https://github.com/apache/spark/commit/7eb5d1a0f87e22902101f5caeb52c3620c315e18).


[GitHub] spark pull request #18078: [SPARK-10643] [Core] Make spark-submit download r...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18078#discussion_r118644247
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala ---
    @@ -825,6 +836,41 @@ object SparkSubmit extends CommandLineUtils {
                           .mkString(",")
         if (merged == "") null else merged
       }
    +
    +  /**
    +   * Download a list of remote files to temp local files. If the file is local, the original file
    +   * will be returned.
    +   * @param fileList A comma separated file list.
    +   * @return A comma separated local files list.
    +   */
    +  private[deploy] def downloadFileList(
    +      fileList: String,
    +      hadoopConf: HadoopConfiguration): String = {
    +    require(fileList != null, "fileList cannot be null.")
    +    fileList.split(",").map(downloadFile(_, hadoopConf)).mkString(",")
    +  }
    +
    +  /**
    +   * Download a file from the remote to a local temporary directory. If the input path points to
    +   * a local path, returns it with no operation.
    +   */
    +  private[deploy] def downloadFile(path: String, hadoopConf: HadoopConfiguration): String = {
    +    require(path != null, "path cannot be null.")
    +    val uri = Utils.resolveURI(path)
    +    uri.getScheme match {
    +      case "file" | "local" =>
    +        path
    +
    +      case _ =>
    +        val fs = FileSystem.get(uri, hadoopConf)
    +        val tmpFile = new File(Files.createTempDirectory("tmp").toFile, uri.getPath)
    +        // scalastyle:off println
    +        printStream.println(s"Downloading ${uri.toString} to ${tmpFile.getAbsolutePath}.")
    +        // scalastyle:on println
    +        fs.copyToLocalFile(new Path(uri), new Path(tmpFile.getAbsolutePath))
    +        s"file:${tmpFile.getAbsolutePath}"
    --- End diff --
    
    How about calling `Utils.resolveURI(tmpFile.getAbsolutePath).toString` instead?
    
    It sounds like `Utils.resolveURI` is commonly used for this purpose.
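
To illustrate the suggestion above, here is a minimal sketch, using only the JDK, of why building the result through a URI helper tends to be safer than concatenating `file:` with a path. The `resolve` function is a hypothetical stand-in written for this example; it is not Spark's `Utils.resolveURI`, and the sample paths are made up.

```scala
import java.io.File
import java.net.URI
import scala.util.Try

object ResolveUriSketch {
  // Hypothetical stand-in, NOT Spark's Utils.resolveURI:
  // keep strings that already parse as a URI with a scheme,
  // otherwise treat the input as a local path and let File.toURI
  // produce a properly encoded file: URI.
  def resolve(path: String): URI = {
    val parsed = Try(new URI(path)).toOption.filter(_.getScheme != null)
    parsed.getOrElse(new File(path).getAbsoluteFile.toURI)
  }

  def main(args: Array[String]): Unit = {
    // A path with a space: plain concatenation yields a string that
    // java.net.URI refuses to parse.
    val localPath = "/tmp/some dir/app.jar"
    println(Try(new URI(s"file:$localPath")).isFailure)

    // The helper percent-encodes the space instead.
    println(resolve(localPath))

    // Inputs that already carry a scheme are returned unchanged.
    println(resolve("hdfs://nn:8020/jars/app.jar"))
  }
}
```

The design point is that the encoding of special characters is delegated to `java.io.File` and `java.net.URI` rather than handled by string interpolation.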


[GitHub] spark pull request #18078: [SPARK-10643] [Core] Make spark-submit download r...

Posted by loneknightpy <gi...@git.apache.org>.
Github user loneknightpy commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18078#discussion_r118627131
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala ---
    @@ -308,6 +311,15 @@ object SparkSubmit extends CommandLineUtils {
           RPackageUtils.checkAndBuildRPackage(args.jars, printStream, args.verbose)
         }
     
    +    // In client mode, download remote files.
    +    if (deployMode == CLIENT) {
    --- End diff --
    
    It seems remote files are already handled in YARN/Mesos cluster mode. I haven't tested that, because we use client mode.


[GitHub] spark issue #18078: [SPARK-10643] [Core] Make spark-submit download remote f...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18078
  
    Merged build finished. Test PASSed.


[GitHub] spark issue #18078: [SPARK-20860] Make spark-submit download remote files to...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18078
  
    **[Test build #77268 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77268/testReport)** for PR 18078 at commit [`4cdeeed`](https://github.com/apache/spark/commit/4cdeeed3f04f0ec62c6909e43ffe2d9824d863f7).


[GitHub] spark pull request #18078: [SPARK-10643] [Core] Make spark-submit download r...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18078#discussion_r118635800
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala ---
    @@ -825,6 +837,43 @@ object SparkSubmit extends CommandLineUtils {
                           .mkString(",")
         if (merged == "") null else merged
       }
    +
    +  /**
    +   * Download a list of remote files to temp local files. If the file is local, the original file
    +   * will be returned.
    +   * @param fileList A comma separated file list, it cannot be null.
    --- End diff --
    
    Nit: no need to add the comment `it cannot be null`. Just add an assert at the beginning of the function to ensure it.
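
As a minimal sketch of the precondition check suggested above: in Scala, `require` throws an `IllegalArgumentException` when the condition is false, while `assert` throws an `AssertionError` and can be elided at compile time. The helper `splitFileList` below is hypothetical and only mirrors the shape of `downloadFileList`; it performs no download.

```scala
import scala.util.Try

object NullCheckSketch {
  // Hypothetical helper for illustration only: validates its argument
  // up front, then splits the comma-separated list.
  def splitFileList(fileList: String): List[String] = {
    require(fileList != null, "fileList cannot be null.")
    fileList.split(",").toList
  }

  def main(args: Array[String]): Unit = {
    println(splitFileList("a.jar,b.jar"))

    // A null argument fails fast with IllegalArgumentException.
    val failure = Try(splitFileList(null)).failed.get
    println(failure.getClass.getSimpleName)
  }
}
```

With the check in code, the doc comment no longer needs to state that the argument cannot be null.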


[GitHub] spark issue #18078: [SPARK-10643] [Core] Make spark-submit download remote f...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/18078
  
    retest this please


[GitHub] spark issue #18078: [SPARK-10643] [Core] Make spark-submit download remote f...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18078
  
    **[Test build #77388 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77388/testReport)** for PR 18078 at commit [`e5171ca`](https://github.com/apache/spark/commit/e5171caf0370951265a1a139a91a03532bbda657).


[GitHub] spark issue #18078: [SPARK-10643] [Core] Make spark-submit download remote f...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18078
  
    Merged build finished. Test PASSed.


[GitHub] spark issue #18078: [SPARK-10643] [Core] Make spark-submit download remote f...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18078
  
    **[Test build #77434 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77434/testReport)** for PR 18078 at commit [`2d6f2cd`](https://github.com/apache/spark/commit/2d6f2cdafb30a80ec441ef921238844ace2cd283).


[GitHub] spark pull request #18078: [SPARK-10643] [Core] Make spark-submit download r...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18078#discussion_r118639691
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala ---
    @@ -825,6 +837,43 @@ object SparkSubmit extends CommandLineUtils {
                           .mkString(",")
         if (merged == "") null else merged
       }
    +
    +  /**
    +   * Download a list of remote files to temp local files. If the file is local, the original file
    +   * will be returned.
    +   * @param fileList A comma separated file list, it cannot be null.
    +   * @return A comma separated local files list.
    +   */
    +  private[deploy] def downloadFileList(
    +      fileList: String,
    +      hadoopConf: HadoopConfiguration): String = {
    +    fileList.split(",").map(downloadFile(_, hadoopConf)).mkString(",")
    +  }
    +
    +  /**
    +   * Download remote file to a temporary local file. If the file is local, the original file
    +   * will be returned.
    +   */
    +  private[deploy] def downloadFile(path: String, hadoopConf: HadoopConfiguration): String = {
    +    val uri = Utils.resolveURI(path)
    +    uri.getScheme match {
    +      case "file" | "local" =>
    +        path
    +
    +      case _ =>
    +        val fs = FileSystem.get(uri, hadoopConf)
    +        val tmpFile = new File(Files.createTempDirectory("tmp").toFile, uri.getPath)
    +        // scalastyle:off println
    +        printStream.println(s"Downloading ${uri.toString} to ${tmpFile.getAbsolutePath}.")
    +        // scalastyle:on println
    +        fs.copyToLocalFile(new Path(uri), new Path(tmpFile.getAbsolutePath))
    +        UriBuilder
    +          .fromPath(tmpFile.getAbsolutePath)
    +          .scheme("file")
    +          .build()
    +          .toString
    --- End diff --
    
    ```Scala
            val localPath = new Path(tmpFile.getAbsolutePath)
            fs.copyToLocalFile(new Path(uri), localPath)
    
            val localFS: FileSystem = localPath.getFileSystem(hadoopConf)
            localFS.makeQualified(localPath).toString
    ```
    
    Does this work?


[GitHub] spark issue #18078: [SPARK-10643] [Core] Make spark-submit download remote f...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18078
  
    **[Test build #77435 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77435/testReport)** for PR 18078 at commit [`2d6f2cd`](https://github.com/apache/spark/commit/2d6f2cdafb30a80ec441ef921238844ace2cd283).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark issue #18078: [SPARK-10643] [Core] Make spark-submit download remote f...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18078
  
    **[Test build #77434 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77434/testReport)** for PR 18078 at commit [`2d6f2cd`](https://github.com/apache/spark/commit/2d6f2cdafb30a80ec441ef921238844ace2cd283).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark pull request #18078: [SPARK-10643] [Core] Make spark-submit download r...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18078#discussion_r118639880
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala ---
    @@ -825,6 +837,43 @@ object SparkSubmit extends CommandLineUtils {
                           .mkString(",")
         if (merged == "") null else merged
       }
    +
    +  /**
    +   * Download a list of remote files to temp local files. If the file is local, the original file
    +   * will be returned.
    +   * @param fileList A comma separated file list, it cannot be null.
    +   * @return A comma separated local files list.
    +   */
    +  private[deploy] def downloadFileList(
    +      fileList: String,
    +      hadoopConf: HadoopConfiguration): String = {
    +    fileList.split(",").map(downloadFile(_, hadoopConf)).mkString(",")
    +  }
    +
    +  /**
    +   * Download remote file to a temporary local file. If the file is local, the original file
    +   * will be returned.
    +   */
    +  private[deploy] def downloadFile(path: String, hadoopConf: HadoopConfiguration): String = {
    +    val uri = Utils.resolveURI(path)
    +    uri.getScheme match {
    +      case "file" | "local" =>
    +        path
    +
    +      case _ =>
    +        val fs = FileSystem.get(uri, hadoopConf)
    +        val tmpFile = new File(Files.createTempDirectory("tmp").toFile, uri.getPath)
    +        // scalastyle:off println
    +        printStream.println(s"Downloading ${uri.toString} to ${tmpFile.getAbsolutePath}.")
    +        // scalastyle:on println
    +        fs.copyToLocalFile(new Path(uri), new Path(tmpFile.getAbsolutePath))
    +        UriBuilder
    +          .fromPath(tmpFile.getAbsolutePath)
    +          .scheme("file")
    +          .build()
    +          .toString
    --- End diff --
    
    It sounds like our code base never calls `UriBuilder` directly.


[GitHub] spark issue #18078: [SPARK-10643] [Core] Make spark-submit download remote f...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on the issue:

    https://github.com/apache/spark/pull/18078
  
    LGTM


[GitHub] spark issue #18078: [SPARK-10643] [Core] Make spark-submit download remote f...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18078
  
    Merged build finished. Test FAILed.


[GitHub] spark issue #18078: [SPARK-10643] Make spark-submit download remote files to...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18078
  
    Merged build finished. Test PASSed.


[GitHub] spark pull request #18078: [SPARK-10643] Make spark-submit download remote f...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18078#discussion_r118580623
  
    --- Diff: core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala ---
    @@ -535,7 +538,7 @@ class SparkSubmitSuite
     
       test("resolves command line argument paths correctly") {
         val jars = "/jar1,/jar2"                 // --jars
    -    val files = "hdfs:/file1,file2"          // --files
    +    val files = "local:/file1,file2"          // --files
    --- End diff --
    
    Could you expand on why we are changing this?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18078: [SPARK-10643] [Core] Make spark-submit download remote f...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18078
  
    **[Test build #77418 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77418/testReport)** for PR 18078 at commit [`3f18b81`](https://github.com/apache/spark/commit/3f18b81fd781593525619da54daa0ee3dc9cc6fb).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18078: [SPARK-10643] [Core] Make spark-submit download remote f...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18078
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77413/
    Test FAILed.


[GitHub] spark issue #18078: [SPARK-10643] Make spark-submit download remote files to...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18078
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77268/
    Test PASSed.


[GitHub] spark issue #18078: [SPARK-20860] Make spark-submit download remote files to...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/18078
  
    Please reference the existing bug (SPARK-10643) instead.


[GitHub] spark pull request #18078: [SPARK-10643] [Core] Make spark-submit download r...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18078#discussion_r118626377
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala ---
    @@ -308,6 +311,15 @@ object SparkSubmit extends CommandLineUtils {
           RPackageUtils.checkAndBuildRPackage(args.jars, printStream, args.verbose)
         }
     
    +    // In client mode, download remote files.
    +    if (deployMode == CLIENT) {
    --- End diff --
    
    Sorry, I may not have enough background knowledge: why do we only do this for client mode?


[GitHub] spark issue #18078: [SPARK-10643] [Core] Make spark-submit download remote f...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18078
  
    **[Test build #77389 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77389/testReport)** for PR 18078 at commit [`62e57df`](https://github.com/apache/spark/commit/62e57df1039435c6b98dfc756ab54320dfbb627a).


[GitHub] spark pull request #18078: [SPARK-10643] [Core] Make spark-submit download r...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18078#discussion_r118593252
  
    --- Diff: core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala ---
    @@ -535,7 +538,7 @@ class SparkSubmitSuite
     
       test("resolves command line argument paths correctly") {
         val jars = "/jar1,/jar2"                 // --jars
    -    val files = "hdfs:/file1,file2"          // --files
    +    val files = "local:/file1,file2"          // --files
    --- End diff --
    
    It is kind of difficult to test downloading a file from HDFS right now, but we should cover this scenario in the future.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18078: [SPARK-10643] Make spark-submit download remote files to...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18078
  
    **[Test build #77268 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77268/testReport)** for PR 18078 at commit [`4cdeeed`](https://github.com/apache/spark/commit/4cdeeed3f04f0ec62c6909e43ffe2d9824d863f7).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #18078: [SPARK-10643] [Core] Make spark-submit download remote f...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/18078
  
    LGTM pending Jenkins




[GitHub] spark issue #18078: [SPARK-10643] [Core] Make spark-submit download remote f...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18078
  
    **[Test build #77427 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77427/testReport)** for PR 18078 at commit [`2d6f2cd`](https://github.com/apache/spark/commit/2d6f2cdafb30a80ec441ef921238844ace2cd283).
     * This patch **fails from timeout after a configured wait of `250m`**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #18078: [SPARK-10643] [Core] Make spark-submit download remote f...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18078
  
    Merged build finished. Test FAILed.




[GitHub] spark issue #18078: [SPARK-10643] [Core] Make spark-submit download remote f...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18078
  
    **[Test build #77435 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77435/testReport)** for PR 18078 at commit [`2d6f2cd`](https://github.com/apache/spark/commit/2d6f2cdafb30a80ec441ef921238844ace2cd283).




[GitHub] spark issue #18078: [SPARK-10643] [Core] Make spark-submit download remote f...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18078
  
    **[Test build #77418 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77418/testReport)** for PR 18078 at commit [`3f18b81`](https://github.com/apache/spark/commit/3f18b81fd781593525619da54daa0ee3dc9cc6fb).
     * This patch **fails PySpark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #18078: [SPARK-10643] [Core] Make spark-submit download remote f...

Posted by loneknightpy <gi...@git.apache.org>.
Github user loneknightpy commented on the issue:

    https://github.com/apache/spark/pull/18078
  
    retest this please




[GitHub] spark issue #18078: [SPARK-10643] [Core] Make spark-submit download remote f...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18078
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77412/
    Test FAILed.




[GitHub] spark pull request #18078: [SPARK-10643] Make spark-submit download remote f...

Posted by loneknightpy <gi...@git.apache.org>.
Github user loneknightpy commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18078#discussion_r118583624
  
    --- Diff: core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala ---
    @@ -535,7 +538,7 @@ class SparkSubmitSuite
     
       test("resolves command line argument paths correctly") {
         val jars = "/jar1,/jar2"                 // --jars
    -    val files = "hdfs:/file1,file2"          // --files
    +    val files = "local:/file1,file2"          // --files
    --- End diff --
    
    This keeps the test from trying to download the file from HDFS.
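
For illustration, the scheme check that lets `local:/file1` skip the download can be sketched with plain JDK classes. This is a minimal sketch, not the patch itself: `SchemeSketch` and `needsDownload` are hypothetical names, and treating a path with no scheme as local is an assumption (the patch resolves paths through `Utils.resolveURI` before matching on the scheme).

```scala
import java.net.URI

object SchemeSketch {
  // Hypothetical helper: true when the path would have to be fetched before use.
  // Assumption: a missing scheme (getScheme == null) is treated as a local path.
  def needsDownload(path: String): Boolean =
    Option(new URI(path).getScheme) match {
      case None | Some("file") | Some("local") => false
      case Some(_) => true
    }
}
```

Under these assumptions, the test's old value `hdfs:/file1` would be classified as needing a download, while `local:/file1` and the bare `file2` would not.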




[GitHub] spark issue #18078: [SPARK-10643] Make spark-submit download remote files to...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18078
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77271/
    Test PASSed.




[GitHub] spark issue #18078: [SPARK-10643] [Core] Make spark-submit download remote f...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18078
  
    Merged build finished. Test FAILed.




[GitHub] spark pull request #18078: [SPARK-10643] [Core] Make spark-submit download r...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/18078




[GitHub] spark issue #18078: [SPARK-10643] Make spark-submit download remote files to...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18078
  
    **[Test build #77271 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77271/testReport)** for PR 18078 at commit [`6e86290`](https://github.com/apache/spark/commit/6e86290a149269b681f3aab3b32f2d829f9d41a1).




[GitHub] spark issue #18078: [SPARK-10643] [Core] Make spark-submit download remote f...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18078
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77434/
    Test PASSed.




[GitHub] spark issue #18078: [SPARK-10643] Make spark-submit download remote files to...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on the issue:

    https://github.com/apache/spark/pull/18078
  
    Could you also add the "[Core]" tag to the title? @loneknightpy 
    Also cc @cloud-fan @gatorsmile 




[GitHub] spark issue #18078: [SPARK-10643] Make spark-submit download remote files to...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18078
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request #18078: [SPARK-10643] Make spark-submit download remote f...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18078#discussion_r118580006
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala ---
    @@ -308,6 +311,15 @@ object SparkSubmit extends CommandLineUtils {
           RPackageUtils.checkAndBuildRPackage(args.jars, printStream, args.verbose)
         }
     
    +    // In client mode, download remotes files.
    --- End diff --
    
    nit: "remotes" -> "remote"




[GitHub] spark issue #18078: [SPARK-10643] [Core] Make spark-submit download remote f...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/18078
  
    Thanks! Merging to master/2.1




[GitHub] spark pull request #18078: [SPARK-10643] [Core] Make spark-submit download r...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18078#discussion_r118636289
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala ---
    @@ -825,6 +837,43 @@ object SparkSubmit extends CommandLineUtils {
                           .mkString(",")
         if (merged == "") null else merged
       }
    +
    +  /**
    +   * Download a list of remote files to temp local files. If the file is local, the original file
    +   * will be returned.
    +   * @param fileList A comma separated file list, it cannot be null.
    +   * @return A comma separated local files list.
    +   */
    +  private[deploy] def downloadFileList(
    +      fileList: String,
    +      hadoopConf: HadoopConfiguration): String = {
    +    fileList.split(",").map(downloadFile(_, hadoopConf)).mkString(",")
    +  }
    +
    +  /**
    +   * Download remote file to a temporary local file. If the file is local, the original file
    --- End diff --
    
    How about?
    
    > Downloads a file from the remote to a local temporary directory. If the input path points to a local path, returns it with no operation




[GitHub] spark issue #18078: [SPARK-10643] [Core] Make spark-submit download remote f...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18078
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77435/
    Test PASSed.




[GitHub] spark issue #18078: [SPARK-10643] Make spark-submit download remote files to...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18078
  
    **[Test build #77271 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77271/testReport)** for PR 18078 at commit [`6e86290`](https://github.com/apache/spark/commit/6e86290a149269b681f3aab3b32f2d829f9d41a1).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #18078: [SPARK-10643] [Core] Make spark-submit download remote f...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18078
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77427/
    Test FAILed.




[GitHub] spark pull request #18078: [SPARK-10643] [Core] Make spark-submit download r...

Posted by loneknightpy <gi...@git.apache.org>.
Github user loneknightpy commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18078#discussion_r118641792
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala ---
    @@ -825,6 +837,43 @@ object SparkSubmit extends CommandLineUtils {
                           .mkString(",")
         if (merged == "") null else merged
       }
    +
    +  /**
    +   * Download a list of remote files to temp local files. If the file is local, the original file
    +   * will be returned.
    +   * @param fileList A comma separated file list, it cannot be null.
    +   * @return A comma separated local files list.
    +   */
    +  private[deploy] def downloadFileList(
    +      fileList: String,
    +      hadoopConf: HadoopConfiguration): String = {
    +    fileList.split(",").map(downloadFile(_, hadoopConf)).mkString(",")
    +  }
    +
    +  /**
    +   * Download remote file to a temporary local file. If the file is local, the original file
    +   * will be returned.
    +   */
    +  private[deploy] def downloadFile(path: String, hadoopConf: HadoopConfiguration): String = {
    +    val uri = Utils.resolveURI(path)
    +    uri.getScheme match {
    +      case "file" | "local" =>
    +        path
    +
    +      case _ =>
    +        val fs = FileSystem.get(uri, hadoopConf)
    +        val tmpFile = new File(Files.createTempDirectory("tmp").toFile, uri.getPath)
    +        // scalastyle:off println
    +        printStream.println(s"Downloading ${uri.toString} to ${tmpFile.getAbsolutePath}.")
    +        // scalastyle:on println
    +        fs.copyToLocalFile(new Path(uri), new Path(tmpFile.getAbsolutePath))
    +        UriBuilder
    +          .fromPath(tmpFile.getAbsolutePath)
    +          .scheme("file")
    +          .build()
    +          .toString
    --- End diff --
    
    If UriBuilder is a concern, we can just use `file:${tmpFile.getAbsolutePath}`
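
The two ways of producing the `file:` URI can be compared using JDK classes only. This is a hedged sketch: `FileUriSketch`, `viaInterpolation`, and `viaToUri` are hypothetical names, and `File.toURI` is shown only as a possible alternative, not as something proposed in this thread.

```scala
import java.io.File

object FileUriSketch {
  // The interpolation suggested above: prefix the absolute path with "file:".
  // Characters such as spaces are passed through unescaped.
  def viaInterpolation(f: File): String = s"file:${f.getAbsolutePath}"

  // Possible alternative (assumption, not from the patch): File.toURI builds
  // a "file" URI from the absolute path and percent-encodes characters that
  // are illegal in a URI, such as spaces.
  def viaToUri(f: File): String = f.toURI.toString
}
```

The difference only matters for paths containing characters that need escaping; for a typical temp-directory path both forms should yield an equivalent `file:` URI.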




[GitHub] spark issue #18078: [SPARK-10643] [Core] Make spark-submit download remote f...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18078
  
    Merged build finished. Test FAILed.




[GitHub] spark issue #18078: [SPARK-10643] [Core] Make spark-submit download remote f...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18078
  
    **[Test build #77427 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77427/testReport)** for PR 18078 at commit [`2d6f2cd`](https://github.com/apache/spark/commit/2d6f2cdafb30a80ec441ef921238844ace2cd283).




[GitHub] spark issue #18078: [SPARK-10643] [Core] Make spark-submit download remote f...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18078
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77418/
    Test FAILed.




[GitHub] spark issue #18078: [SPARK-10643] [Core] Make spark-submit download remote f...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18078
  
    **[Test build #77413 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77413/testReport)** for PR 18078 at commit [`3f18b81`](https://github.com/apache/spark/commit/3f18b81fd781593525619da54daa0ee3dc9cc6fb).

