You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@spark.apache.org by "Imran Rashid (JIRA)" <ji...@apache.org> on 2016/09/27 02:53:20 UTC
[jira] [Created] (SPARK-17676) FsHistoryProvider should ignore
hidden files
Imran Rashid created SPARK-17676:
------------------------------------
Summary: FsHistoryProvider should ignore hidden files
Key: SPARK-17676
URL: https://issues.apache.org/jira/browse/SPARK-17676
Project: Spark
Issue Type: Bug
Components: Spark Core
Reporter: Imran Rashid
Assignee: Imran Rashid
Priority: Minor
FsHistoryProvider currently reads hidden files (beginning with ".") from the log dir. However, it is writing a hidden file *itself* to that dir, which cannot be parsed, as part of a trick to find the scan time according to the file system:
{code}
val fileName = "." + UUID.randomUUID().toString
val path = new Path(logDir, fileName)
val fos = fs.create(path)
{code}
It does delete the tmp file immediately, but we've seen cases where that race ends badly, and there is a logged error. The error is harmless (the log file is ignored and spark moves on to the other log files), but the logged error is very confusing for users, so we should avoid it.
{noformat}
2016-09-26 09:10:03,016 ERROR org.apache.spark.deploy.history.FsHistoryProvider: Exception encountered when attempting to load application log hdfs://XXX/user/spark/applicationHistory/.3a5e987c-ace5-4568-9ccd-6285010e399a
java.lang.IllegalArgumentException: Codec [3a5e987c-ace5-4568-9ccd-6285010e399a] is not available. Consider setting spark.io.compression.codec=lzf
at org.apache.spark.io.CompressionCodec$$anonfun$createCodec$1.apply(CompressionCodec.scala:72)
at org.apache.spark.io.CompressionCodec$$anonfun$createCodec$1.apply(CompressionCodec.scala:72)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.io.CompressionCodec$.createCodec(CompressionCodec.scala:72)
at org.apache.spark.scheduler.EventLoggingListener$$anonfun$8$$anonfun$apply$1.apply(EventLoggingListener.scala:309)
at org.apache.spark.scheduler.EventLoggingListener$$anonfun$8$$anonfun$apply$1.apply(EventLoggingListener.scala:309)
at scala.collection.mutable.MapLike$class.getOrElseUpdate(MapLike.scala:189)
at scala.collection.mutable.AbstractMap.getOrElseUpdate(Map.scala:91)
at org.apache.spark.scheduler.EventLoggingListener$$anonfun$8.apply(EventLoggingListener.scala:309)
at org.apache.spark.scheduler.EventLoggingListener$$anonfun$8.apply(EventLoggingListener.scala:308)
at scala.Option.map(Option.scala:145)
at org.apache.spark.scheduler.EventLoggingListener$.openEventLog(EventLoggingListener.scala:308)
at org.apache.spark.deploy.history.FsHistoryProvider.org$apache$spark$deploy$history$FsHistoryProvider$$replay(FsHistoryProvider.scala:518)
at org.apache.spark.deploy.history.FsHistoryProvider$$anonfun$10.apply(FsHistoryProvider.scala:359)
at org.apache.spark.deploy.history.FsHistoryProvider$$anonfun$10.apply(FsHistoryProvider.scala:356)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:251)
at scala.collection.AbstractTraversable.flatMap(Traversable.scala:105)
at org.apache.spark.deploy.history.FsHistoryProvider.org$apache$spark$deploy$history$FsHistoryProvider$$mergeApplicationListing(FsHistoryProvider.scala:356)
at org.apache.spark.deploy.history.FsHistoryProvider$$anonfun$checkForLogs$1$$anon$4.run(FsHistoryProvider.scala:277)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
{noformat}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org