Posted to issues@spark.apache.org by "Josh Rosen (JIRA)" <ji...@apache.org> on 2014/08/12 02:08:12 UTC

[jira] [Updated] (SPARK-2975) SPARK_LOCAL_DIRS may cause problems when running in local mode

     [ https://issues.apache.org/jira/browse/SPARK-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Josh Rosen updated SPARK-2975:
------------------------------

    Priority: Critical  (was: Minor)

I'm raising the priority of this issue to 'critical', since it causes problems when running on a cluster if some tasks are small enough to be run locally on the driver.

Here's an example exception:

{code}
org.apache.spark.SparkException: Job aborted due to stage failure: Task 21 in stage 0.0 failed 1 times, most recent failure: Lost task 21.0 in stage 0.0 (TID 21, localhost): java.io.IOException: No such file or directory
        java.io.UnixFileSystem.createFileExclusively(Native Method)
        java.io.File.createNewFile(File.java:1006)
        java.io.File.createTempFile(File.java:1989)
        org.apache.spark.util.Utils$.fetchFile(Utils.scala:335)
        org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor$Executor$$updateDependencies$3.apply(Executor.scala:342)
        org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor$Executor$$updateDependencies$3.apply(Executor.scala:340)
        scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772)
        scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
        scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
        scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226)
        scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39)
        scala.collection.mutable.HashMap.foreach(HashMap.scala:98)
        scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771)
        org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$updateDependencies(Executor.scala:340)
        org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:180)
        java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        java.lang.Thread.run(Thread.java:745)
Driver stacktrace:
        at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1153)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1142)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1141)
        at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
        at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1141)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:682)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:682)
        at scala.Option.foreach(Option.scala:236)
        at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:682)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1359)
        at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
        at akka.actor.ActorCell.invoke(ActorCell.scala:456)
        at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
        at akka.dispatch.Mailbox.run(Mailbox.scala:219)
        at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
        at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
        at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
        at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
        at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
{code}
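
Here's a minimal sketch of one way to hit this in local mode. It's untested as written and is only an illustration, assuming {{SPARK_LOCAL_DIRS}} is exported to a directory other than the configured {{spark.local.dir}} before the application starts; {{sc.addFile}} forces the first task through {{Executor.updateDependencies}} / {{Utils.fetchFile}}, which is the code path in the stack trace above:

{code}
// Untested reproduction sketch, assuming something like
//   export SPARK_LOCAL_DIRS=/mnt/spark-local
// was set in the shell before launching, while spark.local.dir keeps its default.
import org.apache.spark.{SparkConf, SparkContext}

object Spark2975Repro {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("SPARK-2975 repro")
    val sc = new SparkContext(conf)

    // addFile makes the first task go through Executor.updateDependencies ->
    // Utils.fetchFile, which is where "No such file or directory" surfaces.
    sc.addFile("/etc/hosts")
    sc.parallelize(1 to 100, 4).count()

    sc.stop()
  }
}
{code}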

> SPARK_LOCAL_DIRS may cause problems when running in local mode
> --------------------------------------------------------------
>
>                 Key: SPARK-2975
>                 URL: https://issues.apache.org/jira/browse/SPARK-2975
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.0.0, 1.1.0
>            Reporter: Josh Rosen
>            Priority: Critical
>
> If we're running Spark in local mode and {{SPARK_LOCAL_DIRS}} is set, the {{Executor}} modifies {{SparkConf}} so that this value overrides {{spark.local.dir}}.  Normally, this is safe because the modification takes place before {{SparkEnv}} is created.  In local mode, the Executor uses an existing {{SparkEnv}} rather than creating a new one, so it winds up with a {{DiskBlockManager}} that created local directories with the original {{spark.local.dir}} setting, while other components attempt to use directories specified by the _new_ {{spark.local.dir}}, leading to problems (see the sketch after this description for the sequencing in miniature).
> I discovered this issue while testing Spark 1.1.0-snapshot1, but I think it will also affect Spark 1.0 (haven't confirmed this, though).
> (I posted some comments at https://github.com/apache/spark/pull/299#discussion-diff-15975800, but also opening this JIRA so this isn't forgotten.)
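
To make the sequencing in the description concrete, here is a simplified, self-contained sketch of the ordering hazard. The names are hypothetical stand-ins; this is not the actual Spark code, only the same pattern in miniature:

{code}
import java.io.File

// Hypothetical stand-ins for the components named in the description above;
// not Spark source, only the ordering pattern it describes.
object LocalDirOrdering {
  def main(args: Array[String]): Unit = {
    val conf = scala.collection.mutable.Map("spark.local.dir" -> "/tmp/spark-orig")

    // Step 1 (plays the role of DiskBlockManager): create directories from the
    // *current* spark.local.dir and remember them.
    val blockDir = new File(conf("spark.local.dir"), "blockmgr")
    blockDir.mkdirs()

    // Step 2 (plays the role of the Executor constructor): override the setting
    // from SPARK_LOCAL_DIRS. In a separate executor JVM this runs before step 1,
    // so the two agree; in local mode it runs after, so they diverge.
    sys.env.get("SPARK_LOCAL_DIRS").foreach(d => conf("spark.local.dir") = d)

    // Step 3 (plays the role of Utils.fetchFile): use the *new* setting, which
    // may point at directories nobody created -> IOException as in the trace.
    val fetchDir = new File(conf("spark.local.dir"))
    File.createTempFile("fetch", ".tmp", fetchDir)
  }
}
{code}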



--
This message was sent by Atlassian JIRA
(v6.2#6252)
