Posted to issues@spark.apache.org by "Brian Schrameck (JIRA)" <ji...@apache.org> on 2016/08/03 18:55:21 UTC

[jira] [Created] (SPARK-16882) Failures in JobGenerator Thread are Swallowed, Job Does Not Fail

Brian Schrameck created SPARK-16882:
---------------------------------------

             Summary: Failures in JobGenerator Thread are Swallowed, Job Does Not Fail
                 Key: SPARK-16882
                 URL: https://issues.apache.org/jira/browse/SPARK-16882
             Project: Spark
          Issue Type: Bug
          Components: Scheduler, Streaming
    Affects Versions: 1.5.0
         Environment: CDH 5.6.1, CentOS 6.7
            Reporter: Brian Schrameck


When using the fileStream functionality to read a directory containing a large number of files over a long period of time, the JVM's garbage-collection overhead limit can be reached. In this case the JobGenerator thread threw the exception, but it was completely swallowed and did not cause the job to fail. There were no errors in the ApplicationMaster, and the job silently sat there without processing any further batches.

Any fatal exception, not just this particular OutOfMemoryError, should be handled appropriately: the job should be killed and exit with the correct failure code.

We are running in YARN cluster mode on a CDH 5.6.1 cluster.
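The underlying JVM behavior can be reproduced without Spark: an Error thrown in a background thread terminates only that thread, while the rest of the process keeps running. The sketch below (plain Java, not Spark code; the class name and handler logic are hypothetical) demonstrates this and the common fail-fast workaround of installing a default uncaught-exception handler that could call System.exit so YARN would record the application as failed instead of letting it hang:

```java
public class UncaughtDemo {
    /**
     * Starts a worker thread that dies with an Error, mimicking the
     * JobGenerator thread. Returns true if the process-wide default
     * uncaught-exception handler observed the failure.
     */
    static boolean runWorker() throws InterruptedException {
        final boolean[] handled = {false};
        Thread.setDefaultUncaughtExceptionHandler((t, e) -> {
            System.out.println("fatal error in " + t.getName() + ": " + e.getMessage());
            handled[0] = true;
            // A real driver could call System.exit(1) here so the
            // ApplicationMaster reports a non-zero exit code instead of
            // silently continuing with no further batches.
        });
        Thread worker = new Thread(() -> {
            // Simulate the fatal error seen in the JobGenerator thread.
            throw new OutOfMemoryError("simulated GC overhead limit exceeded");
        }, "JobGenerator");
        worker.start();
        worker.join();
        // The main thread is still alive at this point: without an explicit
        // exit in the handler, the process keeps running even though the
        // worker (and with it, batch generation) is dead.
        return handled[0];
    }

    public static void main(String[] args) throws InterruptedException {
        boolean handled = runWorker();
        System.out.println("main thread still alive; handler fired = " + handled);
    }
}
```

Note the OutOfMemoryError here is merely constructed and thrown, not produced by actual heap exhaustion, which is enough to exercise the uncaught-exception path.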

{noformat}Exception in thread "JobGenerator" java.lang.OutOfMemoryError: GC overhead limit exceeded
	at java.lang.AbstractStringBuilder.<init>(AbstractStringBuilder.java:68)
	at java.lang.StringBuilder.<init>(StringBuilder.java:89)
	at org.apache.hadoop.fs.Path.<init>(Path.java:109)
	at org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:430)
	at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1494)
	at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1534)
	at org.apache.hadoop.fs.ChecksumFileSystem.listStatus(ChecksumFileSystem.java:569)
	at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1494)
	at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1534)
	at org.apache.spark.streaming.dstream.FileInputDStream.findNewFiles(FileInputDStream.scala:195)
	at org.apache.spark.streaming.dstream.FileInputDStream.compute(FileInputDStream.scala:146)
	at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1$$anonfun$apply$7.apply(DStream.scala:350)
	at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1$$anonfun$apply$7.apply(DStream.scala:350)
	at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
	at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1.apply(DStream.scala:349)
	at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1.apply(DStream.scala:349)
	at org.apache.spark.streaming.dstream.DStream.createRDDWithLocalProperties(DStream.scala:399)
	at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1.apply(DStream.scala:344)
	at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1.apply(DStream.scala:342)
	at scala.Option.orElse(Option.scala:257)
	at org.apache.spark.streaming.dstream.DStream.getOrCompute(DStream.scala:339)
	at org.apache.spark.streaming.dstream.ForEachDStream.generateJob(ForEachDStream.scala:38)
	at org.apache.spark.streaming.DStreamGraph$$anonfun$1.apply(DStreamGraph.scala:120)
	at org.apache.spark.streaming.DStreamGraph$$anonfun$1.apply(DStreamGraph.scala:120)
	at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251)
	at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251)
	at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
	at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:251)
	at scala.collection.AbstractTraversable.flatMap(Traversable.scala:105)
	at org.apache.spark.streaming.DStreamGraph.generateJobs(DStreamGraph.scala:120)
	at org.apache.spark.streaming.scheduler.JobGenerator$$anonfun$2.apply(JobGenerator.scala:247){noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org