Posted to reviews@spark.apache.org by srowen <gi...@git.apache.org> on 2014/10/06 12:22:04 UTC

[GitHub] spark pull request: SPARK-3811 [CORE] More robust / standard Utils...

GitHub user srowen opened a pull request:

    https://github.com/apache/spark/pull/2670

    SPARK-3811 [CORE] More robust / standard Utils.deleteRecursively, Utils.createTempDir

    I noticed a few issues with how temp directories are created and deleted:
    
    *Minor*
    
    * Guava's `Files.createTempDir()` plus `File.deleteOnExit()` is used in many tests to make a temp dir, but `Utils.createTempDir()` seems to be the standard Spark mechanism
    * The call to `File.deleteOnExit()` could be pushed into `Utils.createTempDir()` as well, as part of this replacement
    * _I messed up the message in an exception in `Utils` in SPARK-3794; fixed here_
    
    *Bit Less Minor*
    
    * `Utils.deleteRecursively()` fails immediately if any `IOException` occurs, instead of trying to delete the remaining files and subdirectories. I've observed this leave temp dirs behind. I suggest changing it to continue in the face of an exception and, at the end, throw one of the possibly several exceptions that occurred.
    * `Utils.createTempDir()` registers a new JVM shutdown hook every time it is called, even when the new dir lies inside a dir already registered for deletion, since that check only happens inside the hook. However, `Utils` already manages a set of all dirs to delete on shutdown, `shutdownDeletePaths`, so a single hook can be registered to delete all of them on exit; this is how Tachyon temp paths are cleaned up in `TachyonBlockManager`. (See the sketch after this list.)
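
    To illustrate the single-hook idea, here is a minimal sketch (hypothetical names, not the actual patch) of registering one hook that drains a shared set of paths:

        import java.io.File
        import scala.collection.mutable

        object TempDirCleanup {
          // Shared registry of paths to delete; `Utils` already keeps one
          // of these as `shutdownDeletePaths`.
          private val pathsToDelete = mutable.HashSet[String]()

          // Registered once, instead of once per createTempDir() call.
          Runtime.getRuntime.addShutdownHook(new Thread("delete temp dirs") {
            override def run(): Unit = pathsToDelete.synchronized {
              pathsToDelete.foreach(p => deleteRecursively(new File(p)))
            }
          })

          def registerForDeletion(dir: File): Unit = pathsToDelete.synchronized {
            pathsToDelete += dir.getAbsolutePath
          }

          private def deleteRecursively(f: File): Unit = {
            // As in the patch: delete children first, then the entry itself.
            Option(f.listFiles()).foreach(_.foreach(deleteRecursively))
            f.delete()
          }
        }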
    
    I noticed a few other things that might be changed but wanted to ask first:
    
    * Shouldn't the set of dirs to delete be `File`, not just `String` paths?
    * `Utils` manages the set of `TachyonFile` that have been registered for deletion, but the shutdown hook is managed in `TachyonBlockManager`. Shouldn't this logic live together, and outside `Utils`? It's specific to Tachyon, and looks a bit odd to import in such a generic place.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/srowen/spark SPARK-3811

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/2670.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2670
    
----
commit 3a0faa4e151cac3d9d9b4b4ee87cd024d260c9b1
Author: Sean Owen <so...@cloudera.com>
Date:   2014-10-06T10:19:01Z

    Standardize on Utils.createTempDir instead of Files.createTempDir

commit da0146de0fd21f375843afb47441a2d9a4db146d
Author: Sean Owen <so...@cloudera.com>
Date:   2014-10-06T10:19:30Z

    Make Utils.deleteRecursively try to delete all paths even when an exception occurs; use one shutdown hook instead of one per method call to delete temp dirs

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: SPARK-3811 [CORE] More robust / standard Utils...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2670#discussion_r18474544
  
    --- Diff: core/src/test/scala/org/apache/spark/rdd/PairRDDFunctionsSuite.scala ---
    @@ -381,14 +382,13 @@ class PairRDDFunctionsSuite extends FunSuite with SharedSparkContext {
       }
     
       test("zero-partition RDD") {
    -    val emptyDir = Files.createTempDir()
    -    emptyDir.deleteOnExit()
    +    val emptyDir = Utils.createTempDir()
         val file = sc.textFile(emptyDir.getAbsolutePath)
    -    assert(file.partitions.size == 0)
    +    assert(file.partitions.isEmpty)
         assert(file.collect().toList === Nil)
         // Test that a shuffle on the file works, because this used to be a bug
         assert(file.map(line => (line, 1)).reduceByKey(_ + _).collect().toList === Nil)
    -    emptyDir.delete()
    +    Utils.deleteRecursively(emptyDir)
    --- End diff --
    
    OK, sounds good. The tests are inconsistent on this, but I don't mind adding an extra bit of insurance.


[GitHub] spark pull request: SPARK-3811 [CORE] More robust / standard Utils...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2670#issuecomment-58004841
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21321/consoleFull) for PR 2670 at commit [`da0146d`](https://github.com/apache/spark/commit/da0146de0fd21f375843afb47441a2d9a4db146d).
     * This patch **passes** unit tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark pull request: SPARK-3811 [CORE] More robust / standard Utils...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2670#issuecomment-58096634
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21338/


[GitHub] spark pull request: SPARK-3811 [CORE] More robust / standard Utils...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2670#discussion_r18473752
  
    --- Diff: core/src/test/scala/org/apache/spark/rdd/PairRDDFunctionsSuite.scala ---
    @@ -381,14 +382,13 @@ class PairRDDFunctionsSuite extends FunSuite with SharedSparkContext {
       }
     
       test("zero-partition RDD") {
    -    val emptyDir = Files.createTempDir()
    -    emptyDir.deleteOnExit()
    +    val emptyDir = Utils.createTempDir()
         val file = sc.textFile(emptyDir.getAbsolutePath)
    -    assert(file.partitions.size == 0)
    +    assert(file.partitions.isEmpty)
         assert(file.collect().toList === Nil)
         // Test that a shuffle on the file works, because this used to be a bug
         assert(file.map(line => (line, 1)).reduceByKey(_ + _).collect().toList === Nil)
    -    emptyDir.delete()
    +    Utils.deleteRecursively(emptyDir)
    --- End diff --
    
    I know the shutdown hook should take care of this, but for that extra warm fuzzy feeling this should be in a `finally` block.
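
    For concreteness, the suggested shape would be roughly this (a sketch reusing the test body above):

        test("zero-partition RDD") {
          val emptyDir = Utils.createTempDir()
          try {
            val file = sc.textFile(emptyDir.getAbsolutePath)
            assert(file.partitions.isEmpty)
            assert(file.collect().toList === Nil)
            // Test that a shuffle on the file works, because this used to be a bug
            assert(file.map(line => (line, 1)).reduceByKey(_ + _).collect().toList === Nil)
          } finally {
            // Runs even if an assertion above fails, so cleanup never has
            // to wait for the shutdown hook.
            Utils.deleteRecursively(emptyDir)
          }
        }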


[GitHub] spark pull request: SPARK-3811 [CORE] More robust / standard Utils...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the pull request:

    https://github.com/apache/spark/pull/2670#issuecomment-58062851
  
    LGTM, just a few minor things.


[GitHub] spark pull request: SPARK-3811 [CORE] More robust / standard Utils...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2670#issuecomment-58096619
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21338/consoleFull) for PR 2670 at commit [`071ae60`](https://github.com/apache/spark/commit/071ae6050a5ea1c70b504afdfcf91acb7d73ad19).
     * This patch **passes** unit tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark pull request: SPARK-3811 [CORE] More robust / standard Utils...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2670#discussion_r18474437
  
    --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
    @@ -666,15 +673,27 @@ private[spark] object Utils extends Logging {
        */
       def deleteRecursively(file: File) {
         if (file != null) {
    -      if (file.isDirectory() && !isSymlink(file)) {
    -        for (child <- listFilesSafely(file)) {
    -          deleteRecursively(child)
    +      try {
    +        if (file.isDirectory && !isSymlink(file)) {
    +          var savedIOException: IOException = null
    +          for (child <- listFilesSafely(file)) {
    +            try {
    +              deleteRecursively(child)
    +            } catch {
    +              // In case of multiple exceptions, only last one will be thrown
    +              case ioe: IOException => savedIOException = ioe
    +            }
    +          }
    +          if (savedIOException != null) {
    +            throw savedIOException
    +          }
             }
    -      }
    -      if (!file.delete()) {
    -        // Delete can also fail if the file simply did not exist
    -        if (file.exists()) {
    -          throw new IOException("Failed to delete: " + file.getAbsolutePath)
    +      } finally {
    --- End diff --
    
    The problem I was seeing is that listing an empty directory returned `null` for some reason, which generates an `IOException` and then leaves the parent hanging around even though it's empty. I'm not sure why it was happening, but it was reliably reproducible when running `UtilsSuite` in my IDE. Hence I wanted to make this a little more defensive... it seems just as reasonable to fail because the dir couldn't be listed as because it couldn't be deleted?


[GitHub] spark pull request: SPARK-3811 [CORE] More robust / standard Utils...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2670#discussion_r18473543
  
    --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
    @@ -251,15 +265,8 @@ private[spark] object Utils extends Logging {
           } catch { case e: IOException => ; }
         }
     
    +    dir.deleteOnExit()
    --- End diff --
    
    Yeah, it's almost just insurance. I believe it works if the dir is empty, and under some circumstances it will delete things in the right order. I have no problem removing it: while it doesn't hurt and maybe helps, it's sort of meaningless because the code should actually delete things correctly already.


[GitHub] spark pull request: SPARK-3811 [CORE] More robust / standard Utils...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/2670


[GitHub] spark pull request: SPARK-3811 [CORE] More robust / standard Utils...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2670#discussion_r18474568
  
    --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
    @@ -666,15 +673,27 @@ private[spark] object Utils extends Logging {
        */
       def deleteRecursively(file: File) {
    --- End diff --
    
    OK will do, sure.


[GitHub] spark pull request: SPARK-3811 [CORE] More robust / standard Utils...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2670#discussion_r18473624
  
    --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
    @@ -666,15 +673,27 @@ private[spark] object Utils extends Logging {
        */
       def deleteRecursively(file: File) {
         if (file != null) {
    -      if (file.isDirectory() && !isSymlink(file)) {
    -        for (child <- listFilesSafely(file)) {
    -          deleteRecursively(child)
    +      try {
    +        if (file.isDirectory && !isSymlink(file)) {
    +          var savedIOException: IOException = null
    +          for (child <- listFilesSafely(file)) {
    +            try {
    +              deleteRecursively(child)
    +            } catch {
    +              // In case of multiple exceptions, only last one will be thrown
    +              case ioe: IOException => savedIOException = ioe
    +            }
    +          }
    +          if (savedIOException != null) {
    +            throw savedIOException
    +          }
             }
    -      }
    -      if (!file.delete()) {
    -        // Delete can also fail if the file simply did not exist
    -        if (file.exists()) {
    -          throw new IOException("Failed to delete: " + file.getAbsolutePath)
    +      } finally {
    --- End diff --
    
    So, putting this in a `finally` feels weird. If the code fails to delete some child file, it will throw an exception; then it will execute this `finally` block for the parent directory, which will fail (because there are still children in it) and throw another exception.
    
    So maybe this should just be in the same block as the above, with no `try..finally`. That will make sure that the actual cause of the failure (deleting the child file) will be the one reported.
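
    A sketch of that restructuring (not the final patch):

        def deleteRecursively(file: File) {
          if (file != null) {
            if (file.isDirectory && !isSymlink(file)) {
              var savedIOException: IOException = null
              for (child <- listFilesSafely(file)) {
                try {
                  deleteRecursively(child)
                } catch {
                  // In case of multiple exceptions, only the last one is thrown
                  case ioe: IOException => savedIOException = ioe
                }
              }
              if (savedIOException != null) {
                // Propagate the real cause; the parent delete below is skipped.
                throw savedIOException
              }
            }
            // Only reached when all children were deleted successfully.
            if (!file.delete() && file.exists()) {
              throw new IOException("Failed to delete: " + file.getAbsolutePath)
            }
          }
        }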


[GitHub] spark pull request: SPARK-3811 [CORE] More robust / standard Utils...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2670#issuecomment-58004848
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21321/


[GitHub] spark pull request: SPARK-3811 [CORE] More robust / standard Utils...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2670#issuecomment-58082359
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21338/consoleFull) for PR 2670 at commit [`071ae60`](https://github.com/apache/spark/commit/071ae6050a5ea1c70b504afdfcf91acb7d73ad19).
     * This patch merges cleanly.


[GitHub] spark pull request: SPARK-3811 [CORE] More robust / standard Utils...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/2670#issuecomment-57998800
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21321/consoleFull) for PR 2670 at commit [`da0146d`](https://github.com/apache/spark/commit/da0146de0fd21f375843afb47441a2d9a4db146d).
     * This patch merges cleanly.


[GitHub] spark pull request: SPARK-3811 [CORE] More robust / standard Utils...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the pull request:

    https://github.com/apache/spark/pull/2670#issuecomment-58059674
  
    So, I've never been a fan of `File.deleteOnExit()`, and just to make sure my dislike was not unfounded, I wrote the following code:
    
        import java.io.*;

        class d {
          public static void main(String[] args) throws Exception {
            File d = new File("/tmp/foo");
            d.mkdir();
            d.deleteOnExit();  // registered while the directory is still empty

            // The directory is non-empty at JVM exit, so the registered
            // deletion silently fails: File.delete() cannot remove a
            // non-empty directory.
            File f = new File(d, "bar");
            Writer out = new FileWriter(f);
            out.close();
          }
        }
    
    And sure enough, after you run it, `/tmp/foo` and its child file are still there. I don't know if this applies to your patch (I'll be going through it next), but in general, whenever I see a `deleteOnExit()` call somewhere, it feels like laziness about cleaning things up properly.


[GitHub] spark pull request: SPARK-3811 [CORE] More robust / standard Utils...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2670#discussion_r18473333
  
    --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
    @@ -251,15 +265,8 @@ private[spark] object Utils extends Logging {
           } catch { case e: IOException => ; }
         }
     
    +    dir.deleteOnExit()
    --- End diff --
    
    See my top-level comment on this not really working for directories.


[GitHub] spark pull request: SPARK-3811 [CORE] More robust / standard Utils...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2670#issuecomment-58572853
  
    Can one of the admins verify this patch?


[GitHub] spark pull request: SPARK-3811 [CORE] More robust / standard Utils...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2670#discussion_r18473851
  
    --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
    @@ -666,15 +673,27 @@ private[spark] object Utils extends Logging {
        */
       def deleteRecursively(file: File) {
    --- End diff --
    
    Should this method remove stuff from `shutdownDeletePaths`? That would be mostly an optimization, but it feels like the right thing to do.
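
    A minimal sketch of that, assuming the existing `shutdownDeletePaths` set in `Utils`:

        def deleteRecursively(file: File) {
          if (file != null) {
            // ... delete children and the file itself, as in the patch ...

            // On success, don't leave the path registered for the shutdown
            // hook to re-examine. (If deletion threw, the registration is
            // kept so the hook can retry at exit.)
            shutdownDeletePaths.synchronized {
              shutdownDeletePaths -= file.getAbsolutePath
            }
          }
        }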


[GitHub] spark pull request: SPARK-3811 [CORE] More robust / standard Utils...

Posted by marmbrus <gi...@git.apache.org>.
Github user marmbrus commented on the pull request:

    https://github.com/apache/spark/pull/2670#issuecomment-58601401
  
    Thanks @srowen.  Merging to master.


[GitHub] spark pull request: SPARK-3811 [CORE] More robust / standard Utils...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2670#discussion_r18474833
  
    --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
    @@ -666,15 +673,27 @@ private[spark] object Utils extends Logging {
        */
       def deleteRecursively(file: File) {
         if (file != null) {
    -      if (file.isDirectory() && !isSymlink(file)) {
    -        for (child <- listFilesSafely(file)) {
    -          deleteRecursively(child)
    +      try {
    +        if (file.isDirectory && !isSymlink(file)) {
    +          var savedIOException: IOException = null
    +          for (child <- listFilesSafely(file)) {
    +            try {
    +              deleteRecursively(child)
    +            } catch {
    +              // In case of multiple exceptions, only last one will be thrown
    +              case ioe: IOException => savedIOException = ioe
    +            }
    +          }
    +          if (savedIOException != null) {
    +            throw savedIOException
    +          }
             }
    -      }
    -      if (!file.delete()) {
    -        // Delete can also fail if the file simply did not exist
    -        if (file.exists()) {
    -          throw new IOException("Failed to delete: " + file.getAbsolutePath)
    +      } finally {
    --- End diff --
    
    I see. `listFilesSafely` translates a `null` into an `IOException`; still, that should only happen if you're trying to list a directory that has wrong permissions, which would be very weird since this should only be trying to delete directories that the process itself created...
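
    For reference, the helper in question is essentially this (paraphrasing the `Utils` code of the time):

        private def listFilesSafely(file: File): Seq[File] = {
          val files = file.listFiles()
          if (files == null) {
            // File.listFiles() returns null on any I/O error, not only on
            // permission problems, so translate it into an exception.
            throw new IOException("Failed to list files for dir: " + file)
          }
          files
        }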

