Posted to issues@spark.apache.org by "Colin Woodbury (JIRA)" <ji...@apache.org> on 2017/06/08 14:49:18 UTC

[jira] [Updated] (SPARK-21022) foreach swallows exceptions

     [ https://issues.apache.org/jira/browse/SPARK-21022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Colin Woodbury updated SPARK-21022:
-----------------------------------
    Description: 
An `RDD.foreach` or `RDD.foreachPartition` call will swallow exceptions thrown inside its closure, but not exceptions thrown earlier in the call chain. An example:

{code:none}
package examples

import org.apache.spark._

object Shpark {
  def main(args: Array[String]) {
    implicit val sc: SparkContext = new SparkContext(
      new SparkConf().setMaster("local[*]").setAppName("blahfoobar")
    )

    /* DOESN'T THROW
    sc.parallelize(0 until 10000000)
      .foreachPartition { _.map { i =>
        println("BEFORE THROW")
        throw new Exception("Testing exception handling")
        println(i)
      }}
     */

    /* DOESN'T THROW, nor does anything print.
     * Commenting out the exception runs the prints.
     * (i.e. `foreach` is sufficient to "run" an RDD)
    sc.parallelize(0 until 100000)
      .foreach({ i =>
        println("BEFORE THROW")
        throw new Exception("Testing exception handling")
        println(i)
      })
     */

    /* Throws! */
    sc.parallelize(0 until 100000)
      .map({ i =>
        println("BEFORE THROW")
        throw new Exception("Testing exception handling")
        i
      })
      .foreach(i => println(i))

    println("JOB DONE!")

    System.in.read

    sc.stop()
  }
}
{code}

When exceptions are swallowed, the jobs don't appear to fail, and the driver exits normally. When one _is_ thrown, as in the last example, the exception propagates to the driver and can be caught with try/catch.
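
For reference, here is a minimal driver-side sketch of that last case (an editor's addition, not from the original report), assuming the same local-mode `SparkContext` as above. A failing task aborts the job, and the failure is rethrown on the driver wrapped in an `org.apache.spark.SparkException`, which an ordinary try/catch can handle:

{code:none}
// Sketch: a task that throws causes the job to abort, and the failure
// is rethrown on the driver wrapped in a SparkException.
try {
  sc.parallelize(0 until 100000)
    .map { i =>
      println("BEFORE THROW")
      throw new Exception("Testing exception handling")
      i
    }
    .foreach(i => println(i))
} catch {
  case e: org.apache.spark.SparkException =>
    println(s"Caught on the driver: ${e.getMessage}")
}
{code}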

The expected behaviour is for exceptions thrown inside a `foreach` closure to propagate and crash the driver, just as they do when thrown inside a `map`.
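
One note on the first snippet (an editor's observation, not part of the original report): `foreachPartition` hands its closure a plain `scala.collection.Iterator`, and `Iterator.map` is lazy, so the mapped iterator is discarded without ever being consumed and the closure body never runs. A sketch of a variant that forces the iterator:

{code:none}
// Sketch: consuming the partition iterator (here with Iterator.foreach
// instead of the lazy Iterator.map) actually runs the closure, and the
// exception is then thrown inside the task.
sc.parallelize(0 until 100000)
  .foreachPartition { it =>
    it.foreach { i =>
      println("BEFORE THROW")
      throw new Exception("Testing exception handling")
    }
  }
{code}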

> foreach swallows exceptions
> ---------------------------
>
>                 Key: SPARK-21022
>                 URL: https://issues.apache.org/jira/browse/SPARK-21022
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.1.1
>            Reporter: Colin Woodbury
>            Priority: Minor
>



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org