Posted to issues@spark.apache.org by "Imran Rashid (JIRA)" <ji...@apache.org> on 2016/09/27 02:53:20 UTC

[jira] [Created] (SPARK-17676) FsHistoryProvider should ignore hidden files

Imran Rashid created SPARK-17676:
------------------------------------

             Summary: FsHistoryProvider should ignore hidden files
                 Key: SPARK-17676
                 URL: https://issues.apache.org/jira/browse/SPARK-17676
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
            Reporter: Imran Rashid
            Assignee: Imran Rashid
            Priority: Minor


FsHistoryProvider currently reads every file in the log dir, including hidden files (names beginning with ".").  However, it also writes a hidden file *itself* to that dir as part of a trick to find the scan time according to the file system, and that file cannot be parsed as an event log:

{code}
    val fileName = "." + UUID.randomUUID().toString
    val path = new Path(logDir, fileName)
    val fos = fs.create(path)
{code}

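It deletes the temp file immediately, but we've seen cases where that race ends badly: the scan picks up the temp file before it is deleted, and an error is logged.  The error is harmless (the file is ignored and Spark moves on to the other log files), but it is very confusing for users, so we should avoid it by ignoring hidden files.  In the trace below, the UUID part of the temp file's name ends up being treated as a compression codec name: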

{noformat}
2016-09-26 09:10:03,016 ERROR org.apache.spark.deploy.history.FsHistoryProvider: Exception encountered when attempting to load application log hdfs://XXX/user/spark/applicationHistory/.3a5e987c-ace5-4568-9ccd-6285010e399a 
java.lang.IllegalArgumentException: Codec [3a5e987c-ace5-4568-9ccd-6285010e399a] is not available. Consider setting spark.io.compression.codec=lzf 
at org.apache.spark.io.CompressionCodec$$anonfun$createCodec$1.apply(CompressionCodec.scala:72) 
at org.apache.spark.io.CompressionCodec$$anonfun$createCodec$1.apply(CompressionCodec.scala:72) 
at scala.Option.getOrElse(Option.scala:120) 
at org.apache.spark.io.CompressionCodec$.createCodec(CompressionCodec.scala:72) 
at org.apache.spark.scheduler.EventLoggingListener$$anonfun$8$$anonfun$apply$1.apply(EventLoggingListener.scala:309) 
at org.apache.spark.scheduler.EventLoggingListener$$anonfun$8$$anonfun$apply$1.apply(EventLoggingListener.scala:309) 
at scala.collection.mutable.MapLike$class.getOrElseUpdate(MapLike.scala:189) 
at scala.collection.mutable.AbstractMap.getOrElseUpdate(Map.scala:91) 
at org.apache.spark.scheduler.EventLoggingListener$$anonfun$8.apply(EventLoggingListener.scala:309) 
at org.apache.spark.scheduler.EventLoggingListener$$anonfun$8.apply(EventLoggingListener.scala:308) 
at scala.Option.map(Option.scala:145) 
at org.apache.spark.scheduler.EventLoggingListener$.openEventLog(EventLoggingListener.scala:308) 
at org.apache.spark.deploy.history.FsHistoryProvider.org$apache$spark$deploy$history$FsHistoryProvider$$replay(FsHistoryProvider.scala:518) 
at org.apache.spark.deploy.history.FsHistoryProvider$$anonfun$10.apply(FsHistoryProvider.scala:359) 
at org.apache.spark.deploy.history.FsHistoryProvider$$anonfun$10.apply(FsHistoryProvider.scala:356) 
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251) 
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251) 
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) 
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47) 
at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:251) 
at scala.collection.AbstractTraversable.flatMap(Traversable.scala:105) 
at org.apache.spark.deploy.history.FsHistoryProvider.org$apache$spark$deploy$history$FsHistoryProvider$$mergeApplicationListing(FsHistoryProvider.scala:356)
at org.apache.spark.deploy.history.FsHistoryProvider$$anonfun$checkForLogs$1$$anon$4.run(FsHistoryProvider.scala:277) 
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
at java.lang.Thread.run(Thread.java:745)
{noformat}
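
A minimal sketch of the idea, not the actual patch. `HiddenLogFilter`, `isHidden`, `visibleLogs` and `codecNameOf` are hypothetical names used only for illustration. It works on plain file names so it runs without Hadoop on the classpath; inside FsHistoryProvider the name would presumably come from each listed entry's `getPath.getName`. `codecNameOf` is an assumption about how the codec name is derived from the file name, chosen because it reproduces the "Codec [...] is not available" message in the trace above.

```scala
object HiddenLogFilter {
  // Hidden files are the ones whose name begins with ".".
  def isHidden(fileName: String): Boolean = fileName.startsWith(".")

  // Keep only the entries that should be considered as event logs.
  def visibleLogs(fileNames: Seq[String]): Seq[String] =
    fileNames.filterNot(isHidden)

  // Assumption: the codec name is whatever follows the last "." in the name.
  // For a hidden temp file that is the UUID itself, which is not a valid codec.
  def codecNameOf(fileName: String): Option[String] =
    fileName.split("\\.").tail.lastOption
}

object HiddenLogFilterDemo extends App {
  val listing = Seq(
    "app-1",                                  // hypothetical event log name
    "app-2.lzf",                              // hypothetical compressed event log
    ".3a5e987c-ace5-4568-9ccd-6285010e399a"   // temp file name from the trace
  )
  println(HiddenLogFilter.visibleLogs(listing))
  println(HiddenLogFilter.codecNameOf(listing.last))
}
```

With a filter like this applied before the replay step, the temp file would never reach the code that opens event logs, so the race could no longer produce the error.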



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org