Posted to issues@spark.apache.org by "Eran Medan (JIRA)" <ji...@apache.org> on 2015/05/13 01:21:03 UTC

[jira] [Comment Edited] (SPARK-5311) EventLoggingListener throws exception if log directory does not exist

    [ https://issues.apache.org/jira/browse/SPARK-5311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14540972#comment-14540972 ] 

Eran Medan edited comment on SPARK-5311 at 5/12/15 11:20 PM:
-------------------------------------------------------------

Why "Won't Fix"? This used to work before 1.3.


was (Author: eranation):
Why "Won't Fix"? This used to work before 1.3.
When launching an EC2 cluster this makes things harder, since you need to create the folder and rsync it to all nodes... isn't that so?


> EventLoggingListener throws exception if log directory does not exist
> ---------------------------------------------------------------------
>
>                 Key: SPARK-5311
>                 URL: https://issues.apache.org/jira/browse/SPARK-5311
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.3.0
>            Reporter: Josh Rosen
>            Assignee: Josh Rosen
>            Priority: Blocker
>
> If the log directory does not exist, EventLoggingListener throws an IllegalArgumentException.  Here's a simple reproduction using the master branch (1.3.0):
> {code}
> ./bin/spark-shell --conf spark.eventLog.enabled=true --conf spark.eventLog.dir=/tmp/nonexistent-dir
> {code}
> where /tmp/nonexistent-dir does not exist but its parent /tmp does.  This results in the following exception:
> {code}
> 15/01/18 17:10:44 INFO HttpServer: Starting HTTP Server
> 15/01/18 17:10:44 INFO Utils: Successfully started service 'HTTP file server' on port 62729.
> 15/01/18 17:10:44 WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
> 15/01/18 17:10:44 INFO Utils: Successfully started service 'SparkUI' on port 4041.
> 15/01/18 17:10:44 INFO SparkUI: Started SparkUI at http://joshs-mbp.att.net:4041
> 15/01/18 17:10:45 INFO Executor: Using REPL class URI: http://192.168.1.248:62726
> 15/01/18 17:10:45 INFO AkkaUtils: Connecting to HeartbeatReceiver: akka.tcp://sparkDriver@joshs-mbp.att.net:62728/user/HeartbeatReceiver
> 15/01/18 17:10:45 INFO NettyBlockTransferService: Server created on 62730
> 15/01/18 17:10:45 INFO BlockManagerMaster: Trying to register BlockManager
> 15/01/18 17:10:45 INFO BlockManagerMasterActor: Registering block manager localhost:62730 with 265.4 MB RAM, BlockManagerId(<driver>, localhost, 62730)
> 15/01/18 17:10:45 INFO BlockManagerMaster: Registered BlockManager
> java.lang.IllegalArgumentException: Log directory /tmp/nonexistent-dir does not exist.
> 	at org.apache.spark.scheduler.EventLoggingListener.start(EventLoggingListener.scala:90)
> 	at org.apache.spark.SparkContext.<init>(SparkContext.scala:363)
> 	at org.apache.spark.repl.SparkILoop.createSparkContext(SparkILoop.scala:986)
> 	at $iwC$$iwC.<init>(<console>:9)
> 	at $iwC.<init>(<console>:18)
> 	at <init>(<console>:20)
> 	at .<init>(<console>:24)
> 	at .<clinit>(<console>)
> 	at .<init>(<console>:7)
> 	at .<clinit>(<console>)
> 	at $print(<console>)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:606)
> 	at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:852)
> 	at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1125)
> 	at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:674)
> 	at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:705)
> 	at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:669)
> 	at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:828)
> 	at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:873)
> 	at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:785)
> 	at org.apache.spark.repl.SparkILoopInit$$anonfun$initializeSpark$1.apply(SparkILoopInit.scala:123)
> 	at org.apache.spark.repl.SparkILoopInit$$anonfun$initializeSpark$1.apply(SparkILoopInit.scala:122)
> 	at org.apache.spark.repl.SparkIMain.beQuietDuring(SparkIMain.scala:270)
> 	at org.apache.spark.repl.SparkILoopInit$class.initializeSpark(SparkILoopInit.scala:122)
> 	at org.apache.spark.repl.SparkILoop.initializeSpark(SparkILoop.scala:60)
> 	at org.apache.spark.repl.SparkILoop$$anonfun$process$1$$anonfun$apply$mcZ$sp$5.apply$mcV$sp(SparkILoop.scala:945)
> 	at org.apache.spark.repl.SparkILoopInit$class.runThunks(SparkILoopInit.scala:147)
> 	at org.apache.spark.repl.SparkILoop.runThunks(SparkILoop.scala:60)
> 	at org.apache.spark.repl.SparkILoopInit$class.postInitialization(SparkILoopInit.scala:106)
> 	at org.apache.spark.repl.SparkILoop.postInitialization(SparkILoop.scala:60)
> 	at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply$mcZ$sp(SparkILoop.scala:962)
> 	at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply(SparkILoop.scala:916)
> 	at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply(SparkILoop.scala:916)
> 	at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
> 	at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:916)
> 	at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1011)
> 	at org.apache.spark.repl.Main$.main(Main.scala:31)
> 	at org.apache.spark.repl.Main.main(Main.scala)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:606)
> 	at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:365)
> 	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
> 	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> {code}
> It looks like the directory existence check was introduced in https://github.com/apache/spark/commit/456451911d11cc0b6738f31b1e17869b1fb51c87?diff=unified.  This is a change of behavior / regression from earlier Spark versions, which would create the event log directory if it did not exist.
> I think the intent of this check may have been to handle cases where the event directory path corresponds to an existing file, so maybe we can guard the `!isDirectory` check with an `exists` check first and change the error message to be more specific.
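> To make that concrete, here is a minimal sketch of what such a guarded check could look like (hypothetical code, not the actual patch; the method name and the Hadoop FileSystem handle are assumptions based on the stack trace above):
> {code}
> import org.apache.hadoop.fs.{FileSystem, Path}
>
> // Hypothetical sketch only: check existence before checking that the path
> // is a directory, so each failure mode gets its own specific message.
> def validateLogBaseDir(fs: FileSystem, logBaseDir: String): Unit = {
>   val path = new Path(logBaseDir)
>   if (!fs.exists(path)) {
>     // Pre-1.3 behavior was to create the directory here instead of failing.
>     throw new IllegalArgumentException(s"Log directory $logBaseDir does not exist.")
>   } else if (!fs.getFileStatus(path).isDirectory) {
>     throw new IllegalArgumentException(
>       s"Log directory $logBaseDir exists but is not a directory.")
>   }
> }
> {code}
> With a guard like that, a nonexistent directory and a path that points at a regular file would each produce a distinct, actionable error instead of the single message above.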



