Posted to issues@spark.apache.org by "Colin Woodbury (JIRA)" <ji...@apache.org> on 2017/06/08 14:49:18 UTC
[jira] [Updated] (SPARK-21022) foreach swallows exceptions
[ https://issues.apache.org/jira/browse/SPARK-21022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Colin Woodbury updated SPARK-21022:
-----------------------------------
Description:
An `RDD.foreach` or `RDD.foreachPartition` call will swallow exceptions thrown inside its closure, but not if the exception was thrown earlier in the call chain (e.g. in a preceding `map`). An example:
{code:none}
package examples

import org.apache.spark._

object Shpark {
  def main(args: Array[String]) {
    implicit val sc: SparkContext = new SparkContext(
      new SparkConf().setMaster("local[*]").setAppName("blahfoobar")
    )

    /* DOESN'T THROW
    sc.parallelize(0 until 10000000)
      .foreachPartition { _.map { i =>
        println("BEFORE THROW")
        throw new Exception("Testing exception handling")
        println(i)
      }}
    */

    /* DOESN'T THROW, nor does anything print.
     * Commenting out the exception runs the prints.
     * (i.e. `foreach` is sufficient to "run" an RDD)
    sc.parallelize(0 until 100000)
      .foreach({ i =>
        println("BEFORE THROW")
        throw new Exception("Testing exception handling")
        println(i)
      })
    */

    /* Throws! */
    sc.parallelize(0 until 100000)
      .map({ i =>
        println("BEFORE THROW")
        throw new Exception("Testing exception handling")
        i
      })
      .foreach(i => println(i))

    println("JOB DONE!")
    System.in.read
    sc.stop()
  }
}
{code}
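Worth noting for the first example (a guess at the mechanism, not confirmed against the Spark source): `Iterator.map` is lazy, so the closure handed to `foreachPartition` only builds a mapped iterator and never consumes it, meaning the `throw` may never execute at all. A minimal sketch that forces the iterator instead:
{code:none}
// Sketch only: the first example rewritten to consume the partition
// iterator eagerly with `foreach` rather than building a lazy `map`.
// Here the throw actually runs on the executor and fails the task.
sc.parallelize(0 until 10000000)
  .foreachPartition { iter =>
    iter.foreach { i =>
      println("BEFORE THROW")
      throw new Exception("Testing exception handling")
    }
  }
{code}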
When exceptions are swallowed, the jobs don't appear to fail, and the driver exits normally. When one _is_ thrown, as in the last example, the exception propagates up to the driver and can be caught with try/catch.
The expected behaviour is for exceptions thrown in a `foreach` closure to propagate and crash the driver, as they do when thrown in a `map`.
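For completeness, a rough sketch of catching the third example's failure at the driver (assuming the usual Spark behaviour of wrapping task failures in a `SparkException`; not verified against a particular version):
{code:none}
// Sketch: a failed job surfaces at the driver, typically wrapped in a
// SparkException, where an ordinary try/catch can handle it.
import org.apache.spark.SparkException

try {
  sc.parallelize(0 until 100000)
    .map { i =>
      println("BEFORE THROW")
      throw new Exception("Testing exception handling")
      i
    }
    .foreach(i => println(i))
} catch {
  case e: SparkException =>
    println(s"Caught at driver: ${e.getMessage}")
}
{code}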
> foreach swallows exceptions
> ---------------------------
>
> Key: SPARK-21022
> URL: https://issues.apache.org/jira/browse/SPARK-21022
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 2.1.1
> Reporter: Colin Woodbury
> Priority: Minor