Posted to issues@spark.apache.org by "Josh Rosen (JIRA)" <ji...@apache.org> on 2015/01/03 04:35:34 UTC

[jira] [Created] (SPARK-5063) Raise more helpful errors when SparkContext methods are called inside of transformations

Josh Rosen created SPARK-5063:
---------------------------------

             Summary: Raise more helpful errors when SparkContext methods are called inside of transformations
                 Key: SPARK-5063
                 URL: https://issues.apache.org/jira/browse/SPARK-5063
             Project: Spark
          Issue Type: Improvement
          Components: Spark Core
            Reporter: Josh Rosen
            Assignee: Josh Rosen


Spark does not support nested RDDs or performing Spark actions inside of transformations; this usually leads to NullPointerExceptions (see SPARK-718 as one example).  The confusing NPE is one of the most common sources of Spark questions on StackOverflow:

- https://stackoverflow.com/questions/13770218/call-of-distinct-and-map-together-throws-npe-in-spark-library/14130534#14130534
- https://stackoverflow.com/questions/23793117/nullpointerexception-in-scala-spark-appears-to-be-caused-be-collection-type/23793399#23793399
- https://stackoverflow.com/questions/25997558/graphx-ive-got-nullpointerexception-inside-mapvertices/26003674#26003674

(those are just a sample of the ones that I've answered personally; there are many others).
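The failure mode in those questions can be reproduced with a few lines. This is an illustrative sketch (assuming an existing SparkContext named {{sc}}): when {{rdd2}} is captured in the closure passed to {{rdd1.map}}, it is serialized and shipped to executors, where its reference to the SparkContext is null, so any method that needs the context throws an NPE.

{code:scala}
val rdd1 = sc.parallelize(1 to 10)
val rdd2 = sc.parallelize(11 to 20)

// rdd2 is captured in the task closure; on the executor its
// SparkContext field is null, so this throws a confusing NPE
// instead of a clear "don't nest RDD operations" error.
rdd1.map { x =>
  rdd2.filter(_ > x).count()
}.collect()
{code}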

I think that we should add some logic to attempt to detect these sorts of errors: we can use a DynamicVariable to track whether the current thread is executing a task, then throw a more useful error when the RDD constructor or the SparkContext job-submission methods are called from inside a task.
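A rough sketch of the mechanism (illustrative only; the names {{TaskGuard}}, {{runTask}}, and {{assertOnDriver}} are hypothetical, not actual Spark internals): executors would wrap task execution in {{withValue(true)}}, and driver-only entry points would check the flag before proceeding.

{code:scala}
import scala.util.DynamicVariable

object TaskGuard {
  // Thread-scoped flag: true while the current thread is running a task.
  private val insideTask = new DynamicVariable[Boolean](false)

  // Executors would wrap each task body in this.
  def runTask[T](body: => T): T = insideTask.withValue(true)(body)

  // The RDD constructor and SparkContext job-submission methods
  // would call this first, replacing the eventual NPE with a
  // descriptive error at the point of misuse.
  def assertOnDriver(op: String): Unit = {
    if (insideTask.value) {
      throw new IllegalStateException(
        s"$op cannot be used inside a task; RDDs and SparkContext " +
        "methods may only be used on the driver.")
    }
  }
}
{code}

Because DynamicVariable scopes the flag per thread, the check adds no coordination overhead and cannot produce false positives on the driver.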



