You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by xuanyuanking <gi...@git.apache.org> on 2018/06/12 01:56:54 UTC

[GitHub] spark pull request #21533: [SPARK-24195][Core] Bug fix for local:/ path in S...

GitHub user xuanyuanking opened a pull request:

    https://github.com/apache/spark/pull/21533

    [SPARK-24195][Core] Bug fix for local:/ path in SparkContext.addFile

    ## What changes were proposed in this pull request?
    
    In the chagnes in [SPARK-6300](https://issues.apache.org/jira/browse/SPARK-6300), essentially it change schemePath to
    ```
    new File(path).getCanonicalFile.toURI.toString
    ```
    . This has problem when path is local:, as `java.io.File` doesn't handle it.
    
    eg.
    
    new File("local:///home/user/demo/logger.config").getCanonicalFile.toURI.toString
    res1: String = file:/user/anotheruser/local:/home/user/demo/logger.config
    
    ## How was this patch tested?
    
    Add test in `SparkContextSuite`.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/xuanyuanking/spark SPARK-24195

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21533.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21533
    
----
commit f922fd8c995164cada4a8b72e92c369a827def16
Author: Yuanjian Li <xy...@...>
Date:   2018-06-12T01:51:44Z

    bug fix for local:/ path in sc.addFile

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21533: [SPARK-24195][Core] Bug fix for local:/ path in S...

Posted by xuanyuanking <gi...@git.apache.org>.
Github user xuanyuanking commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21533#discussion_r195133122
  
    --- Diff: core/src/test/scala/org/apache/spark/SparkContextSuite.scala ---
    @@ -116,49 +116,52 @@ class SparkContextSuite extends SparkFunSuite with LocalSparkContext with Eventu
       test("basic case for addFile and listFiles") {
         val dir = Utils.createTempDir()
     
    +    // file and absolute path for normal path
         val file1 = File.createTempFile("someprefix1", "somesuffix1", dir)
         val absolutePath1 = file1.getAbsolutePath
     
    +    // file and absolute path for relative path
         val file2 = File.createTempFile("someprefix2", "somesuffix2", dir)
         val relativePath = file2.getParent + "/../" + file2.getParentFile.getName + "/" + file2.getName
         val absolutePath2 = file2.getAbsolutePath
     
    +    // file and absolute path for path with local scheme
    +    val file3 = File.createTempFile("someprefix3", "somesuffix3", dir)
    +    val localPath = "local://" + file3.getParent + "/../" + file3.getParentFile.getName +
    +      "/" + file3.getName
    +    val absolutePath3 = file3.getAbsolutePath
    +
         try {
           Files.write("somewords1", file1, StandardCharsets.UTF_8)
           Files.write("somewords2", file2, StandardCharsets.UTF_8)
    -      val length1 = file1.length()
    -      val length2 = file2.length()
    +      Files.write("somewords3", file3, StandardCharsets.UTF_8)
     
    -      sc = new SparkContext(new SparkConf().setAppName("test").setMaster("local"))
    -      sc.addFile(file1.getAbsolutePath)
    -      sc.addFile(relativePath)
    -      sc.parallelize(Array(1), 1).map(x => {
    -        val gotten1 = new File(SparkFiles.get(file1.getName))
    -        val gotten2 = new File(SparkFiles.get(file2.getName))
    -        if (!gotten1.exists()) {
    +      def checkGottenFile(file: File, absolutePath: String): Unit = {
    +        val length = file.length()
    +        val gotten = new File(SparkFiles.get(file.getName))
    +        if (!gotten.exists()) {
               throw new SparkException("file doesn't exist : " + absolutePath1)
             }
    -        if (!gotten2.exists()) {
    -          throw new SparkException("file doesn't exist : " + absolutePath2)
    -        }
     
    -        if (length1 != gotten1.length()) {
    +        if (file.length() != gotten.length()) {
               throw new SparkException(
    -            s"file has different length $length1 than added file ${gotten1.length()} : " +
    +            s"file has different length $length than added file ${gotten.length()} : " +
                   absolutePath1)
             }
    -        if (length2 != gotten2.length()) {
    -          throw new SparkException(
    -            s"file has different length $length2 than added file ${gotten2.length()} : " +
    -              absolutePath2)
    -        }
     
    -        if (absolutePath1 == gotten1.getAbsolutePath) {
    +        if (absolutePath == gotten.getAbsolutePath) {
               throw new SparkException("file should have been copied :" + absolutePath1)
             }
    -        if (absolutePath2 == gotten2.getAbsolutePath) {
    -          throw new SparkException("file should have been copied : " + absolutePath2)
    -        }
    +      }
    +
    +      sc = new SparkContext(new SparkConf().setAppName("test").setMaster("local"))
    +      sc.addFile(file1.getAbsolutePath)
    +      sc.addFile(relativePath)
    +      sc.addFile(localPath)
    +      sc.parallelize(Array(1), 1).map(x => {
    --- End diff --
    
    Got it, fix in next commit.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21533: [SPARK-24195][Core] Bug fix for local:/ path in S...

Posted by jerryshao <gi...@git.apache.org>.
Github user jerryshao commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21533#discussion_r197068331
  
    --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
    @@ -1517,9 +1517,19 @@ class SparkContext(config: SparkConf) extends Logging {
        * only supported for Hadoop-supported filesystems.
        */
       def addFile(path: String, recursive: Boolean): Unit = {
    -    val uri = new Path(path).toUri
    +    var uri = new Path(path).toUri
    +    // mark the original path's scheme is local or not, there is no need to add the local file
    +    // in file server.
    +    var localFile = false
         val schemeCorrectedPath = uri.getScheme match {
    -      case null | "local" => new File(path).getCanonicalFile.toURI.toString
    +      case null =>
    +        new File(path).getCanonicalFile.toURI.toString
    +      case "local" =>
    +        localFile = true
    +        val tmpPath = new File(uri.getPath).getCanonicalFile.toURI.toString
    +        // SPARK-24195: Local is not a valid scheme for FileSystem, we should only keep path here.
    +        uri = new Path(uri.getPath).toUri
    +        tmpPath
    --- End diff --
    
    I think the change here make file with "local" scheme a no-op. This makes me think whether supporting "local" scheme in `addFile` is meaningful or not? Because file with "local" scheme is already existed on every node, also it should be aware by the user, so adding it seems not meaingful.
    
    By looking at the similar method `addJar`, there "local" jar is properly treated without adding to fileServer, and properly convert to the right scheme used by classloader.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    **[Test build #92027 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92027/testReport)** for PR 21533 at commit [`5daf804`](https://github.com/apache/spark/commit/5daf8042a5f3b6613a462e34bb1d7b1d18ffc4fc).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3922/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Ignore the files with "local" scheme...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    SGTM too


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by xuanyuanking <gi...@git.apache.org>.
Github user xuanyuanking commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    retest this please.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    **[Test build #92410 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92410/testReport)** for PR 21533 at commit [`eb46ccf`](https://github.com/apache/spark/commit/eb46ccfec084c2439a26eee38015381f091fe164).
     * This patch **fails due to an unknown error code, -9**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21533: [SPARK-24195][Core] Bug fix for local:/ path in S...

Posted by xuanyuanking <gi...@git.apache.org>.
Github user xuanyuanking commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21533#discussion_r197603726
  
    --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
    @@ -1517,9 +1517,19 @@ class SparkContext(config: SparkConf) extends Logging {
        * only supported for Hadoop-supported filesystems.
        */
       def addFile(path: String, recursive: Boolean): Unit = {
    -    val uri = new Path(path).toUri
    +    var uri = new Path(path).toUri
    +    // mark the original path's scheme is local or not, there is no need to add the local file
    +    // in file server.
    +    var localFile = false
         val schemeCorrectedPath = uri.getScheme match {
    -      case null | "local" => new File(path).getCanonicalFile.toURI.toString
    +      case null =>
    +        new File(path).getCanonicalFile.toURI.toString
    +      case "local" =>
    +        localFile = true
    +        val tmpPath = new File(uri.getPath).getCanonicalFile.toURI.toString
    +        // SPARK-24195: Local is not a valid scheme for FileSystem, we should only keep path here.
    +        uri = new Path(uri.getPath).toUri
    +        tmpPath
    --- End diff --
    
    `This makes me think whether supporting "local" scheme in addFile is meaningful or not? Because file with "local" scheme is already existed on every node, also it should be aware by the user, so adding it seems not meaingful.`
    Yeah, agree with you. The last change wants to treat "local" file without adding to fileServer and correct its scheme to "file:", but maybe add a local file, the behavior itself is a no-op? So we just forbidden user pass a file with "local" scheme in addFile? 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21533: [SPARK-24195][Core] Bug fix for local:/ path in S...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21533#discussion_r195143756
  
    --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
    @@ -1517,9 +1517,12 @@ class SparkContext(config: SparkConf) extends Logging {
        * only supported for Hadoop-supported filesystems.
        */
       def addFile(path: String, recursive: Boolean): Unit = {
    -    val uri = new Path(path).toUri
    +    var uri = new Path(path).toUri
         val schemeCorrectedPath = uri.getScheme match {
    -      case null | "local" => new File(path).getCanonicalFile.toURI.toString
    +      case null | "local" =>
    +        // SPARK-24195: Local is not a valid scheme for FileSystem, we should only keep path here.
    +        uri = new Path(uri.getPath).toUri
    --- End diff --
    
    I mean we `getPath` doesn't include scheme:
    
    ```
    scala> new Path("local://a/b/c")
    res0: org.apache.hadoop.fs.Path = local://a/b/c
    
    scala> new Path("local://a/b/c").toUri
    res1: java.net.URI = local://a/b/c
    
    scala> new Path("local://a/b/c").toUri.getScheme
    res2: String = local
    
    scala> new Path("local://a/b/c").toUri.getPath
    res3: String = /b/c
    ```
    
    why should we do this again?
    
    ```
    scala> new Path(new Path("local://a/b/c").toUri.getPath).toUri.getPath
    res4: String = /b/c
    ```



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    **[Test build #92415 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92415/testReport)** for PR 21533 at commit [`eb46ccf`](https://github.com/apache/spark/commit/eb46ccfec084c2439a26eee38015381f091fe164).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    cc @jiangxb1987 @jerryshao 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    **[Test build #92415 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92415/testReport)** for PR 21533 at commit [`eb46ccf`](https://github.com/apache/spark/commit/eb46ccfec084c2439a26eee38015381f091fe164).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/539/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Ignore the files with "local" scheme...

Posted by xuanyuanking <gi...@git.apache.org>.
Github user xuanyuanking commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    @jiangxb1987 Thanks for reminding, rephrase done.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21533: [SPARK-24195][Core] Bug fix for local:/ path in S...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21533#discussion_r194953025
  
    --- Diff: core/src/test/scala/org/apache/spark/SparkContextSuite.scala ---
    @@ -116,49 +116,52 @@ class SparkContextSuite extends SparkFunSuite with LocalSparkContext with Eventu
       test("basic case for addFile and listFiles") {
         val dir = Utils.createTempDir()
     
    +    // file and absolute path for normal path
         val file1 = File.createTempFile("someprefix1", "somesuffix1", dir)
         val absolutePath1 = file1.getAbsolutePath
     
    +    // file and absolute path for relative path
         val file2 = File.createTempFile("someprefix2", "somesuffix2", dir)
         val relativePath = file2.getParent + "/../" + file2.getParentFile.getName + "/" + file2.getName
         val absolutePath2 = file2.getAbsolutePath
     
    +    // file and absolute path for path with local scheme
    +    val file3 = File.createTempFile("someprefix3", "somesuffix3", dir)
    +    val localPath = "local://" + file3.getParent + "/../" + file3.getParentFile.getName +
    +      "/" + file3.getName
    +    val absolutePath3 = file3.getAbsolutePath
    +
         try {
           Files.write("somewords1", file1, StandardCharsets.UTF_8)
           Files.write("somewords2", file2, StandardCharsets.UTF_8)
    -      val length1 = file1.length()
    -      val length2 = file2.length()
    +      Files.write("somewords3", file3, StandardCharsets.UTF_8)
     
    -      sc = new SparkContext(new SparkConf().setAppName("test").setMaster("local"))
    -      sc.addFile(file1.getAbsolutePath)
    -      sc.addFile(relativePath)
    -      sc.parallelize(Array(1), 1).map(x => {
    -        val gotten1 = new File(SparkFiles.get(file1.getName))
    -        val gotten2 = new File(SparkFiles.get(file2.getName))
    -        if (!gotten1.exists()) {
    +      def checkGottenFile(file: File, absolutePath: String): Unit = {
    +        val length = file.length()
    +        val gotten = new File(SparkFiles.get(file.getName))
    +        if (!gotten.exists()) {
               throw new SparkException("file doesn't exist : " + absolutePath1)
             }
    -        if (!gotten2.exists()) {
    -          throw new SparkException("file doesn't exist : " + absolutePath2)
    -        }
     
    -        if (length1 != gotten1.length()) {
    +        if (file.length() != gotten.length()) {
               throw new SparkException(
    -            s"file has different length $length1 than added file ${gotten1.length()} : " +
    +            s"file has different length $length than added file ${gotten.length()} : " +
                   absolutePath1)
             }
    -        if (length2 != gotten2.length()) {
    -          throw new SparkException(
    -            s"file has different length $length2 than added file ${gotten2.length()} : " +
    -              absolutePath2)
    -        }
     
    -        if (absolutePath1 == gotten1.getAbsolutePath) {
    +        if (absolutePath == gotten.getAbsolutePath) {
               throw new SparkException("file should have been copied :" + absolutePath1)
             }
    -        if (absolutePath2 == gotten2.getAbsolutePath) {
    -          throw new SparkException("file should have been copied : " + absolutePath2)
    -        }
    --- End diff --
    
    can we not change the existing test?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21533: [SPARK-24195][Core] Bug fix for local:/ path in S...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21533#discussion_r194812872
  
    --- Diff: core/src/test/scala/org/apache/spark/SparkContextSuite.scala ---
    @@ -116,49 +116,52 @@ class SparkContextSuite extends SparkFunSuite with LocalSparkContext with Eventu
       test("basic case for addFile and listFiles") {
         val dir = Utils.createTempDir()
     
    +    // file and absolute path for normal path
         val file1 = File.createTempFile("someprefix1", "somesuffix1", dir)
         val absolutePath1 = file1.getAbsolutePath
     
    +    // file and absolute path for relative path
         val file2 = File.createTempFile("someprefix2", "somesuffix2", dir)
         val relativePath = file2.getParent + "/../" + file2.getParentFile.getName + "/" + file2.getName
         val absolutePath2 = file2.getAbsolutePath
     
    +    // file and absolute path for path with local scheme
    +    val file3 = File.createTempFile("someprefix3", "somesuffix3", dir)
    +    val localPath = "local://" + file3.getParent + "/../" + file3.getParentFile.getName +
    +      "/" + file3.getName
    +    val absolutePath3 = file3.getAbsolutePath
    +
         try {
           Files.write("somewords1", file1, StandardCharsets.UTF_8)
           Files.write("somewords2", file2, StandardCharsets.UTF_8)
    -      val length1 = file1.length()
    -      val length2 = file2.length()
    +      Files.write("somewords3", file3, StandardCharsets.UTF_8)
     
    -      sc = new SparkContext(new SparkConf().setAppName("test").setMaster("local"))
    -      sc.addFile(file1.getAbsolutePath)
    -      sc.addFile(relativePath)
    -      sc.parallelize(Array(1), 1).map(x => {
    -        val gotten1 = new File(SparkFiles.get(file1.getName))
    -        val gotten2 = new File(SparkFiles.get(file2.getName))
    -        if (!gotten1.exists()) {
    +      def checkGottenFile(file: File, absolutePath: String): Unit = {
    +        val length = file.length()
    +        val gotten = new File(SparkFiles.get(file.getName))
    +        if (!gotten.exists()) {
               throw new SparkException("file doesn't exist : " + absolutePath1)
             }
    -        if (!gotten2.exists()) {
    -          throw new SparkException("file doesn't exist : " + absolutePath2)
    -        }
     
    -        if (length1 != gotten1.length()) {
    +        if (file.length() != gotten.length()) {
               throw new SparkException(
    -            s"file has different length $length1 than added file ${gotten1.length()} : " +
    +            s"file has different length $length than added file ${gotten.length()} : " +
                   absolutePath1)
             }
    -        if (length2 != gotten2.length()) {
    -          throw new SparkException(
    -            s"file has different length $length2 than added file ${gotten2.length()} : " +
    -              absolutePath2)
    -        }
     
    -        if (absolutePath1 == gotten1.getAbsolutePath) {
    +        if (absolutePath == gotten.getAbsolutePath) {
               throw new SparkException("file should have been copied :" + absolutePath1)
             }
    -        if (absolutePath2 == gotten2.getAbsolutePath) {
    -          throw new SparkException("file should have been copied : " + absolutePath2)
    -        }
    +      }
    +
    +      sc = new SparkContext(new SparkConf().setAppName("test").setMaster("local"))
    +      sc.addFile(file1.getAbsolutePath)
    +      sc.addFile(relativePath)
    +      sc.addFile(localPath)
    +      sc.parallelize(Array(1), 1).map(x => {
    --- End diff --
    
    nit:
    
    ```
    map { x =>
      ...
    }
    ```


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91780/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21533: [SPARK-24195][Core] Bug fix for local:/ path in S...

Posted by xuanyuanking <gi...@git.apache.org>.
Github user xuanyuanking commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21533#discussion_r195655196
  
    --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
    @@ -1517,9 +1517,12 @@ class SparkContext(config: SparkConf) extends Logging {
        * only supported for Hadoop-supported filesystems.
        */
       def addFile(path: String, recursive: Boolean): Unit = {
    -    val uri = new Path(path).toUri
    +    var uri = new Path(path).toUri
         val schemeCorrectedPath = uri.getScheme match {
    -      case null | "local" => new File(path).getCanonicalFile.toURI.toString
    +      case null | "local" =>
    +        // SPARK-24195: Local is not a valid scheme for FileSystem, we should only keep path here.
    +        uri = new Path(uri.getPath).toUri
    --- End diff --
    
    @HyukjinKwon @jiangxb1987 
    Thanks for your explain, I think I know what's your meaning about `we getPath doesn't include scheme`. Actually the purpose of this code `uri = new Path(uri.getPath).toUri`, is to reassign the var in +1520, we don't want the uri including local scheme.
    ```
    Can't we just do new File(uri.getPath).getCanonicalFile.toURI.toString without this line?
    ```
    We can't because like I explained above, if we didn't do `uri = new Path(uri.getPath).toUri`, will get a exception like below:
    ```
    No FileSystem for scheme: local
    java.io.IOException: No FileSystem for scheme: local
    	at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2586)
    	at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2593)
    	at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
    	at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2632)
    	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2614)
    	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
    	at org.apache.spark.util.Utils$.getHadoopFileSystem(Utils.scala:1830)
    	at org.apache.spark.util.Utils$.doFetchFile(Utils.scala:690)
    	at org.apache.spark.util.Utils$.fetchFile(Utils.scala:486)
    	at org.apache.spark.SparkContext.addFile(SparkContext.scala:1557)
    ```


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    **[Test build #92410 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92410/testReport)** for PR 21533 at commit [`eb46ccf`](https://github.com/apache/spark/commit/eb46ccfec084c2439a26eee38015381f091fe164).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21533: [SPARK-24195][Core] Bug fix for local:/ path in S...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21533#discussion_r195660767
  
    --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
    @@ -1517,9 +1517,12 @@ class SparkContext(config: SparkConf) extends Logging {
        * only supported for Hadoop-supported filesystems.
        */
       def addFile(path: String, recursive: Boolean): Unit = {
    -    val uri = new Path(path).toUri
    +    var uri = new Path(path).toUri
         val schemeCorrectedPath = uri.getScheme match {
    -      case null | "local" => new File(path).getCanonicalFile.toURI.toString
    +      case null | "local" =>
    +        // SPARK-24195: Local is not a valid scheme for FileSystem, we should only keep path here.
    +        uri = new Path(uri.getPath).toUri
    --- End diff --
    
    I mean, at least we can do:
    
    ```
    val a = new File(uri.getPath).getCanonicalFile.toURI.toString
    uri = new Path(uri.getPath).toUri
    a
    ```
    
    `new Path(uri.getPath).toUri` for trimming the scheme looks not quite clean though. It's a-okay at least to me.



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    **[Test build #92001 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92001/testReport)** for PR 21533 at commit [`5daf804`](https://github.com/apache/spark/commit/5daf8042a5f3b6613a462e34bb1d7b1d18ffc4fc).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    **[Test build #4220 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4220/testReport)** for PR 21533 at commit [`eb46ccf`](https://github.com/apache/spark/commit/eb46ccfec084c2439a26eee38015381f091fe164).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21533: [SPARK-24195][Core] Bug fix for local:/ path in S...

Posted by xuanyuanking <gi...@git.apache.org>.
Github user xuanyuanking commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21533#discussion_r195134460
  
    --- Diff: core/src/test/scala/org/apache/spark/SparkContextSuite.scala ---
    @@ -116,49 +116,52 @@ class SparkContextSuite extends SparkFunSuite with LocalSparkContext with Eventu
       test("basic case for addFile and listFiles") {
         val dir = Utils.createTempDir()
     
    +    // file and absolute path for normal path
         val file1 = File.createTempFile("someprefix1", "somesuffix1", dir)
         val absolutePath1 = file1.getAbsolutePath
     
    +    // file and absolute path for relative path
         val file2 = File.createTempFile("someprefix2", "somesuffix2", dir)
         val relativePath = file2.getParent + "/../" + file2.getParentFile.getName + "/" + file2.getName
         val absolutePath2 = file2.getAbsolutePath
     
    +    // file and absolute path for path with local scheme
    +    val file3 = File.createTempFile("someprefix3", "somesuffix3", dir)
    +    val localPath = "local://" + file3.getParent + "/../" + file3.getParentFile.getName +
    +      "/" + file3.getName
    +    val absolutePath3 = file3.getAbsolutePath
    +
         try {
           Files.write("somewords1", file1, StandardCharsets.UTF_8)
           Files.write("somewords2", file2, StandardCharsets.UTF_8)
    -      val length1 = file1.length()
    -      val length2 = file2.length()
    +      Files.write("somewords3", file3, StandardCharsets.UTF_8)
     
    -      sc = new SparkContext(new SparkConf().setAppName("test").setMaster("local"))
    -      sc.addFile(file1.getAbsolutePath)
    -      sc.addFile(relativePath)
    -      sc.parallelize(Array(1), 1).map(x => {
    -        val gotten1 = new File(SparkFiles.get(file1.getName))
    -        val gotten2 = new File(SparkFiles.get(file2.getName))
    -        if (!gotten1.exists()) {
    +      def checkGottenFile(file: File, absolutePath: String): Unit = {
    +        val length = file.length()
    +        val gotten = new File(SparkFiles.get(file.getName))
    +        if (!gotten.exists()) {
               throw new SparkException("file doesn't exist : " + absolutePath1)
             }
    -        if (!gotten2.exists()) {
    -          throw new SparkException("file doesn't exist : " + absolutePath2)
    -        }
     
    -        if (length1 != gotten1.length()) {
    +        if (file.length() != gotten.length()) {
               throw new SparkException(
    -            s"file has different length $length1 than added file ${gotten1.length()} : " +
    +            s"file has different length $length than added file ${gotten.length()} : " +
                   absolutePath1)
             }
    -        if (length2 != gotten2.length()) {
    -          throw new SparkException(
    -            s"file has different length $length2 than added file ${gotten2.length()} : " +
    -              absolutePath2)
    -        }
     
    -        if (absolutePath1 == gotten1.getAbsolutePath) {
    +        if (absolutePath == gotten.getAbsolutePath) {
               throw new SparkException("file should have been copied :" + absolutePath1)
             }
    -        if (absolutePath2 == gotten2.getAbsolutePath) {
    -          throw new SparkException("file should have been copied : " + absolutePath2)
    -        }
    --- End diff --
    
    Actually I keep all existing test and just do clean work for reducing common code line by adding a function checkGottenFile in https://github.com/apache/spark/pull/21533/files/f922fd8c995164cada4a8b72e92c369a827def16#diff-8d5858d578a2dda1a2edb0d8cefa4f24R139. If you think it's unnecessary, I just change it back.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    **[Test build #93259 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93259/testReport)** for PR 21533 at commit [`eb46ccf`](https://github.com/apache/spark/commit/eb46ccfec084c2439a26eee38015381f091fe164).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/511/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92410/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    **[Test build #92027 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92027/testReport)** for PR 21533 at commit [`5daf804`](https://github.com/apache/spark/commit/5daf8042a5f3b6613a462e34bb1d7b1d18ffc4fc).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Ignore the files with "local" scheme...

Posted by jerryshao <gi...@git.apache.org>.
Github user jerryshao commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    Merging to master branch. Thanks all!


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21533: [SPARK-24195][Core] Bug fix for local:/ path in S...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21533#discussion_r194953034
  
    --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
    @@ -1517,9 +1517,12 @@ class SparkContext(config: SparkConf) extends Logging {
        * only supported for Hadoop-supported filesystems.
        */
       def addFile(path: String, recursive: Boolean): Unit = {
    -    val uri = new Path(path).toUri
    +    var uri = new Path(path).toUri
         val schemeCorrectedPath = uri.getScheme match {
    -      case null | "local" => new File(path).getCanonicalFile.toURI.toString
    +      case null | "local" =>
    +        // SPARK-24195: Local is not a valid scheme for FileSystem, we should only keep path here.
    +        uri = new Path(uri.getPath).toUri
    --- End diff --
    
    it changes `uri` - which is reference again below.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21533: [SPARK-24195][Core] Bug fix for local:/ path in S...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21533#discussion_r194812741
  
    --- Diff: core/src/test/scala/org/apache/spark/SparkContextSuite.scala ---
    @@ -116,49 +116,52 @@ class SparkContextSuite extends SparkFunSuite with LocalSparkContext with Eventu
       test("basic case for addFile and listFiles") {
         val dir = Utils.createTempDir()
     
    +    // file and absolute path for normal path
         val file1 = File.createTempFile("someprefix1", "somesuffix1", dir)
         val absolutePath1 = file1.getAbsolutePath
     
    +    // file and absolute path for relative path
         val file2 = File.createTempFile("someprefix2", "somesuffix2", dir)
         val relativePath = file2.getParent + "/../" + file2.getParentFile.getName + "/" + file2.getName
         val absolutePath2 = file2.getAbsolutePath
     
    +    // file and absolute path for path with local scheme
    +    val file3 = File.createTempFile("someprefix3", "somesuffix3", dir)
    +    val localPath = "local://" + file3.getParent + "/../" + file3.getParentFile.getName +
    --- End diff --
    
    Let's use string interpolation.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    **[Test build #4219 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4219/testReport)** for PR 21533 at commit [`eb46ccf`](https://github.com/apache/spark/commit/eb46ccfec084c2439a26eee38015381f091fe164).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    **[Test build #92379 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92379/testReport)** for PR 21533 at commit [`ac12568`](https://github.com/apache/spark/commit/ac12568f90e27a1748c73de57f3d63190c823278).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21533: [SPARK-24195][Core] Bug fix for local:/ path in S...

Posted by jerryshao <gi...@git.apache.org>.
Github user jerryshao commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21533#discussion_r194927085
  
    --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
    @@ -1517,9 +1517,12 @@ class SparkContext(config: SparkConf) extends Logging {
        * only supported for Hadoop-supported filesystems.
        */
       def addFile(path: String, recursive: Boolean): Unit = {
    -    val uri = new Path(path).toUri
    +    var uri = new Path(path).toUri
         val schemeCorrectedPath = uri.getScheme match {
    -      case null | "local" => new File(path).getCanonicalFile.toURI.toString
    +      case null | "local" =>
    +        // SPARK-24195: Local is not a valid scheme for FileSystem, we should only keep path here.
    +        uri = new Path(uri.getPath).toUri
    --- End diff --
    
    Yes, same question. The above line seems not useful.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    retest this please.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21533: [SPARK-24195][Core] Bug fix for local:/ path in S...

Posted by xuanyuanking <gi...@git.apache.org>.
Github user xuanyuanking commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21533#discussion_r195132870
  
    --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
    @@ -1517,9 +1517,12 @@ class SparkContext(config: SparkConf) extends Logging {
        * only supported for Hadoop-supported filesystems.
        */
       def addFile(path: String, recursive: Boolean): Unit = {
    -    val uri = new Path(path).toUri
    +    var uri = new Path(path).toUri
         val schemeCorrectedPath = uri.getScheme match {
    -      case null | "local" => new File(path).getCanonicalFile.toURI.toString
    +      case null | "local" =>
    +        // SPARK-24195: Local is not a valid scheme for FileSystem, we should only keep path here.
    +        uri = new Path(uri.getPath).toUri
    --- End diff --
    
    Yes, just as @felixcheung said, this because we will use uri in https://github.com/apache/spark/pull/21533/files/f922fd8c995164cada4a8b72e92c369a827def16#diff-364713d7776956cb8b0a771e9b62f82dR1557, if the uri with local scheme, we'll get an exception cause local is not a valid scheme for FileSystem.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/534/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21533: [SPARK-24195][Core] Bug fix for local:/ path in S...

Posted by jerryshao <gi...@git.apache.org>.
Github user jerryshao commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21533#discussion_r198682844
  
    --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
    @@ -1519,7 +1519,12 @@ class SparkContext(config: SparkConf) extends Logging {
       def addFile(path: String, recursive: Boolean): Unit = {
         val uri = new Path(path).toUri
         val schemeCorrectedPath = uri.getScheme match {
    -      case null | "local" => new File(path).getCanonicalFile.toURI.toString
    +      case null => new File(path).getCanonicalFile.toURI.toString
    +      case "local" =>
    +        logWarning("We do not support add a local file here because file with local scheme is " +
    +          "already existed on every node, there is no need to call addFile to add it again. " +
    +          "(See more discussion about this in SPARK-24195.)")
    --- End diff --
    
    Can we please rephrase to "File with 'local' scheme is not supported to add to file server, since it is already available on every node."?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by jerryshao <gi...@git.apache.org>.
Github user jerryshao commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    I think maybe we could:
    
    1) either ignore the files with "local" scheme, and let user to decide how to fetch the files, like what current fix.
    2) or copy the 'local' scheme files to the `SparkFiles#getRootDirectory` both in driver and executor. The change would be in `Utils#fetchFile`.
    
    @jiangxb1987 @vanzin what's your option?



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/4173/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21533: [SPARK-24195][Core] Bug fix for local:/ path in S...

Posted by xuanyuanking <gi...@git.apache.org>.
Github user xuanyuanking commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21533#discussion_r198705671
  
    --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
    @@ -1519,7 +1519,12 @@ class SparkContext(config: SparkConf) extends Logging {
       def addFile(path: String, recursive: Boolean): Unit = {
         val uri = new Path(path).toUri
         val schemeCorrectedPath = uri.getScheme match {
    -      case null | "local" => new File(path).getCanonicalFile.toURI.toString
    +      case null => new File(path).getCanonicalFile.toURI.toString
    +      case "local" =>
    +        logWarning("We do not support add a local file here because file with local scheme is " +
    +          "already existed on every node, there is no need to call addFile to add it again. " +
    +          "(See more discussion about this in SPARK-24195.)")
    --- End diff --
    
    Got it, rephrase done.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by jerryshao <gi...@git.apache.org>.
Github user jerryshao commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    Just take another look on this issue. I think the fix is just to make it work, but not make it work correctly.
    
    The fix here and the original code actually treats scheme "local" to "file", actually they're different in Spark.
    
    In Spark "local" scheme means resources are already on the driver/executor nodes, which means Spark doesn't need to ship resources from driver to executors via fileserver. But here it treats as "file" which will be shipped via fileserver to executors. This is semantically not correct.
    
    I think for "local" scheme, the fix should:
    
    1. Make it accessible both from driver and executors via `SparkFiles#get`. By copying resource to the folders.
    2. It should not be added into fileServer.  
    
    
    



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    Please also update the title and PR description because we changed the proposed solution in the middle.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91682/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21533: [SPARK-24195][Core] Bug fix for local:/ path in S...

Posted by xuanyuanking <gi...@git.apache.org>.
Github user xuanyuanking commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21533#discussion_r195931509
  
    --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
    @@ -1517,9 +1517,12 @@ class SparkContext(config: SparkConf) extends Logging {
        * only supported for Hadoop-supported filesystems.
        */
       def addFile(path: String, recursive: Boolean): Unit = {
    -    val uri = new Path(path).toUri
    +    var uri = new Path(path).toUri
         val schemeCorrectedPath = uri.getScheme match {
    -      case null | "local" => new File(path).getCanonicalFile.toURI.toString
    +      case null | "local" =>
    +        // SPARK-24195: Local is not a valid scheme for FileSystem, we should only keep path here.
    +        uri = new Path(uri.getPath).toUri
    --- End diff --
    
    Ah, I see, thanks. I'll do this in the next commit. Thanks for your patient explain.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3988/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/4153/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by xuanyuanking <gi...@git.apache.org>.
Github user xuanyuanking commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    retest this please.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by xuanyuanking <gi...@git.apache.org>.
Github user xuanyuanking commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    @jerryshao Great thanks for your review and detailed explain, based on your guidance, I found the behavior about the file in `local` scheme added in fileServer was introduced by this PR: https://github.com/apache/spark/commit/c4b1108c3f9658adebbdf8508d325528c3206f16#diff-364713d7776956cb8b0a771e9b62f82dL1023.
    So my last commit try to fix this as your guidance, keep the same behavior of local scheme file before https://github.com/apache/spark/commit/c4b1108c3f9658adebbdf8508d325528c3206f16, please help me check whether it is the correct semantics we want. Thanks!


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1110/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    LGTM except for the comment from @HyukjinKwon 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by xuanyuanking <gi...@git.apache.org>.
Github user xuanyuanking commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    @jerryshao Great thanks for your review and detailed explain, based on your guidance, I found the behavior about the file in `local` scheme added in fileServer was introduced by the PR https://github.com/apache/spark/commit/c4b1108c3f9658adebbdf8508d325528c3206f16, and before that, the file in `local` scheme is treated as local files: https://github.com/apache/spark/commit/c4b1108c3f9658adebbdf8508d325528c3206f16#diff-364713d7776956cb8b0a771e9b62f82dL1023.
    So my last commit try to fix this as your guidance, keep the same behavior of local scheme file before https://github.com/apache/spark/commit/c4b1108c3f9658adebbdf8508d325528c3206f16, please help me check whether it is the correct semantics we want. Thanks!


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92415/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/32/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by xuanyuanking <gi...@git.apache.org>.
Github user xuanyuanking commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    cc @felixcheung. Please take a look about this when you have time. Thanks.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92379/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93259/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    **[Test build #93259 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93259/testReport)** for PR 21533 at commit [`eb46ccf`](https://github.com/apache/spark/commit/eb46ccfec084c2439a26eee38015381f091fe164).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    **[Test build #4220 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4220/testReport)** for PR 21533 at commit [`eb46ccf`](https://github.com/apache/spark/commit/eb46ccfec084c2439a26eee38015381f091fe164).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92001/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    **[Test build #91682 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91682/testReport)** for PR 21533 at commit [`f922fd8`](https://github.com/apache/spark/commit/f922fd8c995164cada4a8b72e92c369a827def16).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    **[Test build #91780 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91780/testReport)** for PR 21533 at commit [`797cefe`](https://github.com/apache/spark/commit/797cefe599cfd4500b2bf5296420dcc402cf4eff).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21533: [SPARK-24195][Core] Bug fix for local:/ path in S...

Posted by xuanyuanking <gi...@git.apache.org>.
Github user xuanyuanking commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21533#discussion_r195133036
  
    --- Diff: core/src/test/scala/org/apache/spark/SparkContextSuite.scala ---
    @@ -116,49 +116,52 @@ class SparkContextSuite extends SparkFunSuite with LocalSparkContext with Eventu
       test("basic case for addFile and listFiles") {
         val dir = Utils.createTempDir()
     
    +    // file and absolute path for normal path
         val file1 = File.createTempFile("someprefix1", "somesuffix1", dir)
         val absolutePath1 = file1.getAbsolutePath
     
    +    // file and absolute path for relative path
         val file2 = File.createTempFile("someprefix2", "somesuffix2", dir)
         val relativePath = file2.getParent + "/../" + file2.getParentFile.getName + "/" + file2.getName
         val absolutePath2 = file2.getAbsolutePath
     
    +    // file and absolute path for path with local scheme
    +    val file3 = File.createTempFile("someprefix3", "somesuffix3", dir)
    +    val localPath = "local://" + file3.getParent + "/../" + file3.getParentFile.getName +
    --- End diff --
    
    Got it, thanks.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92027/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/258/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    Seems fine to me.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21533: [SPARK-24195][Core] Bug fix for local:/ path in S...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21533#discussion_r195310633
  
    --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
    @@ -1517,9 +1517,12 @@ class SparkContext(config: SparkConf) extends Logging {
        * only supported for Hadoop-supported filesystems.
        */
       def addFile(path: String, recursive: Boolean): Unit = {
    -    val uri = new Path(path).toUri
    +    var uri = new Path(path).toUri
         val schemeCorrectedPath = uri.getScheme match {
    -      case null | "local" => new File(path).getCanonicalFile.toURI.toString
    +      case null | "local" =>
    +        // SPARK-24195: Local is not a valid scheme for FileSystem, we should only keep path here.
    +        uri = new Path(uri.getPath).toUri
    --- End diff --
    
    Yea we can simplify this.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21533: [SPARK-24195][Core] Ignore the files with "local"...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/21533


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    Ah, then maybe I missed some histories in the transitive changes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    **[Test build #91780 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91780/testReport)** for PR 21533 at commit [`797cefe`](https://github.com/apache/spark/commit/797cefe599cfd4500b2bf5296420dcc402cf4eff).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    Exactly I agree with ^. It should be the best to implement that logic. I thought it seems we have never implemented "local" logic as described so far. So, I thought it's kind of okay.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    **[Test build #92001 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92001/testReport)** for PR 21533 at commit [`5daf804`](https://github.com/apache/spark/commit/5daf8042a5f3b6613a462e34bb1d7b1d18ffc4fc).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by jerryshao <gi...@git.apache.org>.
Github user jerryshao commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    "local" scheme was supported long ago for users who already deploy jars on every node. HDI heavily uses this feature.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Ignore the files with "local" scheme...

Posted by xuanyuanking <gi...@git.apache.org>.
Github user xuanyuanking commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    Thanks everyone for your help!


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    **[Test build #4219 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4219/testReport)** for PR 21533 at commit [`eb46ccf`](https://github.com/apache/spark/commit/eb46ccfec084c2439a26eee38015381f091fe164).
     * This patch **fails Spark unit tests**.
     * This patch **does not merge cleanly**.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    **[Test build #92379 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92379/testReport)** for PR 21533 at commit [`ac12568`](https://github.com/apache/spark/commit/ac12568f90e27a1748c73de57f3d63190c823278).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    **[Test build #91682 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91682/testReport)** for PR 21533 at commit [`f922fd8`](https://github.com/apache/spark/commit/f922fd8c995164cada4a8b72e92c369a827def16).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/97/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21533: [SPARK-24195][Core] Bug fix for local:/ path in S...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21533#discussion_r194812492
  
    --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
    @@ -1517,9 +1517,12 @@ class SparkContext(config: SparkConf) extends Logging {
        * only supported for Hadoop-supported filesystems.
        */
       def addFile(path: String, recursive: Boolean): Unit = {
    -    val uri = new Path(path).toUri
    +    var uri = new Path(path).toUri
         val schemeCorrectedPath = uri.getScheme match {
    -      case null | "local" => new File(path).getCanonicalFile.toURI.toString
    +      case null | "local" =>
    +        // SPARK-24195: Local is not a valid scheme for FileSystem, we should only keep path here.
    +        uri = new Path(uri.getPath).toUri
    --- End diff --
    
    Why is this needed? Can't we just do `new File(uri.getPath).getCanonicalFile.toURI.toString` without this line?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/279/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21533
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org