Posted to issues@spark.apache.org by "Apache Spark (JIRA)" <ji...@apache.org> on 2017/07/18 17:21:00 UTC

[jira] [Commented] (SPARK-21447) Spark history server fails to render compressed inprogress history file in some cases.

    [ https://issues.apache.org/jira/browse/SPARK-21447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16091834#comment-16091834 ] 

Apache Spark commented on SPARK-21447:
--------------------------------------

User 'ericvandenbergfb' has created a pull request for this issue:
https://github.com/apache/spark/pull/18673

> Spark history server fails to render compressed inprogress history file in some cases.
> --------------------------------------------------------------------------------------
>
>                 Key: SPARK-21447
>                 URL: https://issues.apache.org/jira/browse/SPARK-21447
>             Project: Spark
>          Issue Type: Bug
>          Components: Web UI
>    Affects Versions: 2.0.0
>         Environment: Spark History Server
>            Reporter: Eric Vandenberg
>            Priority: Minor
>
> We've observed that the Spark History Server sometimes fails to load event data from a compressed .inprogress Spark history file.  The existing logic in ReplayListenerBus reads the file line by line; if the last line cannot be parsed as JSON and the file is in progress (maybeTruncated), the failure is ignored and the events read so far are accepted as best effort.
> For compressed files, the output stream compresses the JSON-serialized event data on the fly and periodically flushes its output to disk when internal buffers are full.  As a consequence, a partial compression frame may be flushed, and because it is not a complete frame it cannot be decompressed.  If the Spark History Server attempts to read such a compressed .inprogress file, it throws an EOFException.  This is analogous to failing to JSON-parse the last line of the file because the full line was not flushed.  The difference is that, because the file is compressed, it is the compression frame that may be incomplete, and decompressing a partial frame fails in a different way that the code doesn't currently handle:
> 17/07/13 17:24:59 ERROR FsHistoryProvider: Exception encountered when attempting to load application log hdfs://********/user/hadoop/******/spark/logs/job_**********-*************-*****.lz4.inprogress
> java.io.EOFException: Stream ended prematurely
>         at org.apache.spark.io.LZ4BlockInputStream.readFully(LZ4BlockInputStream.java:230)
>         at org.apache.spark.io.LZ4BlockInputStream.refill(LZ4BlockInputStream.java:203)
>         at org.apache.spark.io.LZ4BlockInputStream.read(LZ4BlockInputStream.java:125)
>         at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284)
>         at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326)
>         at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
>         at java.io.InputStreamReader.read(InputStreamReader.java:184)
>         at java.io.BufferedReader.fill(BufferedReader.java:161)
>         at java.io.BufferedReader.readLine(BufferedReader.java:324)
>         at java.io.BufferedReader.readLine(BufferedReader.java:389)
>         at scala.io.BufferedSource$BufferedLineIterator.hasNext(BufferedSource.scala:72)
>         at scala.collection.Iterator$$anon$21.hasNext(Iterator.scala:836)
>         at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:461)
>         at org.apache.spark.scheduler.ReplayListenerBus.replay(ReplayListenerBus.scala:66)
>         at org.apache.spark.deploy.history.FsHistoryProvider.org$apache$spark$deploy$history$FsHistoryProvider$$replay(FsHistoryProvider.scala:601)
>         at org.apache.spark.deploy.history.FsHistoryProvider.org$apache$spark$deploy$history$FsHistoryProvider$$mergeApplicationListing(FsHistoryProvider.scala:409)
>         at org.apache.spark.deploy.history.FsHistoryProvider$$anonfun$checkForLogs$3$$anon$4.run(FsHistoryProvider.scala:310)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
>  
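
For illustration only, here is a minimal Java sketch of the best-effort handling the description asks for: keep whatever decompressed cleanly and ignore the EOFException only when the file may be truncated. It is not the change in the pull request and not Spark code. The class and method names (TruncatedLogReader, readEvents, simulateInProgressLog) are hypothetical, and java.util.zip GZIP stands in for Spark's LZ4 block stream so that the example is self-contained. The simulated file is cut exactly at a sync-flush boundary, so it models a stream that was never closed rather than a frame cut mid-way; both end in an EOFException on read.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, not part of Spark.
class TruncatedLogReader {

    /**
     * Decompresses as much of the log as possible and splits it into lines.
     * If maybeTruncated is true, an EOFException from the decompressor is
     * treated as "end of usable data"; otherwise it is rethrown.
     * A partial last line may still be returned; the caller is expected to
     * handle that separately (as the JSON-parse fallback already does).
     */
    static List<String> readEvents(byte[] compressed, boolean maybeTruncated)
            throws IOException {
        ByteArrayOutputStream plain = new ByteArrayOutputStream();
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                plain.write(buf, 0, n); // keep every byte that decompressed cleanly
            }
        } catch (EOFException e) {
            if (!maybeTruncated) {
                throw e; // a finished log must decompress completely
            }
            // In-progress log: fall through with the data read so far.
        }
        String text = new String(plain.toByteArray(), StandardCharsets.UTF_8);
        List<String> events = new ArrayList<>();
        for (String line : text.split("\n")) {
            if (!line.isEmpty()) {
                events.add(line);
            }
        }
        return events;
    }

    /**
     * Simulates an in-progress log: two events reach "disk" via a sync flush,
     * a third is written afterwards but its bytes are dropped.
     */
    static byte[] simulateInProgressLog() throws IOException {
        ByteArrayOutputStream disk = new ByteArrayOutputStream();
        GZIPOutputStream gz = new GZIPOutputStream(disk, true); // true = sync flush
        gz.write("{\"Event\":\"A\"}\n{\"Event\":\"B\"}\n".getBytes(StandardCharsets.UTF_8));
        gz.flush();
        int flushed = disk.size(); // bytes a concurrent reader could see
        gz.write("{\"Event\":\"C\"}\n".getBytes(StandardCharsets.UTF_8));
        gz.close();
        return Arrays.copyOf(disk.toByteArray(), flushed);
    }
}
```

Calling readEvents(simulateInProgressLog(), true) returns the two flushed events, while passing false for maybeTruncated rethrows the EOFException. The sketch reads raw bytes rather than wrapping the stream in a BufferedReader, so that data already decompressed is not discarded when the exception is thrown inside the reader's buffer refill.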



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
