Posted to commits@spark.apache.org by sr...@apache.org on 2015/12/24 14:47:57 UTC

spark git commit: [SPARK-12440][CORE] Avoid setCheckpoint warning when directory is not local

Repository: spark
Updated Branches:
  refs/heads/master 502476e45 -> ea4aab7e8


[SPARK-12440][CORE] Avoid setCheckpoint warning when directory is not local

In the `SparkContext` method `setCheckpointDir`, a warning is issued when the Spark master is not local and the directory passed as the checkpoint dir appears to be local.

In practice, when relying on the HDFS configuration file and using a relative path for the checkpoint directory (an incomplete URI without the HDFS scheme, ...), this warning should not be issued and can be confusing.
In this case the checkpoint directory is successfully created, and the checkpointing mechanism works as expected.
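
To illustrate why such a path triggers the warning, here is a minimal sketch of a scheme-only check. It uses just `java.net.URI`; the object `SchemeCheck` and the helper `looksLocalByScheme` are hypothetical names for this example and are a simplified stand-in, not Spark's `Utils.nonLocalPaths`.

```scala
import java.net.URI

object SchemeCheck {
  // Hypothetical helper: classify a path by its URI scheme alone.
  // A path with no scheme (for example a relative path) is assumed to be a local file path.
  def looksLocalByScheme(directory: String): Boolean = {
    val scheme = Option(new URI(directory).getScheme).getOrElse("file")
    scheme == "file" || scheme == "local"
  }

  def main(args: Array[String]): Unit = {
    // Relative path, no scheme: judged local even if the default filesystem is HDFS.
    println(looksLocalByScheme("checkpoints/run1"))
    // Fully qualified HDFS URI: judged non-local.
    println(looksLocalByScheme("hdfs://namenode:8020/checkpoints"))
  }
}
```

A check of this kind never consults the Hadoop configuration, so it cannot tell that a scheme-less path would actually resolve to HDFS.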

This PR uses the `FileSystem` instance created with the given directory, and checks whether it is local or not.
(The rationale is that this same `FileSystem` instance is used to create the checkpoint dir anyway, and can therefore be reliably used to determine whether the directory is local.)

The warning is only issued if the directory is on a local `FileSystem`, on top of the existing conditions.
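
The `FileSystem`-based check described above could look roughly like the sketch below. It assumes `hadoop-common` is on the classpath; the object `CheckpointDirCheck` and the helper `isOnLocalFileSystem` are hypothetical names, and this is not the code contained in the diff of this commit.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{LocalFileSystem, Path}

object CheckpointDirCheck {
  // Hypothetical helper: resolve the directory against the Hadoop configuration,
  // as is done when the checkpoint dir is created, and test whether the
  // resulting FileSystem is the local one.
  def isOnLocalFileSystem(directory: String, hadoopConf: Configuration): Boolean = {
    val fs = new Path(directory).getFileSystem(hadoopConf)
    fs.isInstanceOf[LocalFileSystem]
  }
}
```

With a configuration whose default filesystem is HDFS, a relative path such as `checkpoints` would resolve to a non-local `FileSystem`, so a warning guarded by this check would not fire.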

Author: pierre-borckmans <pi...@realimpactanalytics.com>

Closes #10392 from pierre-borckmans/SPARK-12440_CheckpointDir_Warning_NonLocal.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/ea4aab7e
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/ea4aab7e
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/ea4aab7e

Branch: refs/heads/master
Commit: ea4aab7e87fbcf9ac90f93af79cc892b56508aa0
Parents: 502476e
Author: pierre-borckmans <pi...@realimpactanalytics.com>
Authored: Thu Dec 24 13:48:21 2015 +0000
Committer: Sean Owen <so...@cloudera.com>
Committed: Thu Dec 24 13:48:21 2015 +0000

----------------------------------------------------------------------
 core/src/main/scala/org/apache/spark/SparkContext.scala | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/ea4aab7e/core/src/main/scala/org/apache/spark/SparkContext.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/SparkContext.scala b/core/src/main/scala/org/apache/spark/SparkContext.scala
index 67230f4..d506782 100644
--- a/core/src/main/scala/org/apache/spark/SparkContext.scala
+++ b/core/src/main/scala/org/apache/spark/SparkContext.scala
@@ -2073,8 +2073,9 @@ class SparkContext(config: SparkConf) extends Logging with ExecutorAllocationCli
     // its own local file system, which is incorrect because the checkpoint files
     // are actually on the executor machines.
     if (!isLocal && Utils.nonLocalPaths(directory).isEmpty) {
-      logWarning("Checkpoint directory must be non-local " +
-        "if Spark is running on a cluster: " + directory)
+      logWarning("Spark is not running in local mode, therefore the checkpoint directory " +
+        s"must not be on the local filesystem. Directory '$directory' " +
+        "appears to be on the local filesystem.")
     }
 
     checkpointDir = Option(directory).map { dir =>

