Posted to issues@spark.apache.org by "Josh Rosen (JIRA)" <ji...@apache.org> on 2015/01/03 04:35:34 UTC

[jira] [Updated] (SPARK-5063) Raise more helpful errors when RDD actions or transformations are called inside of transformations

     [ https://issues.apache.org/jira/browse/SPARK-5063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Josh Rosen updated SPARK-5063:
------------------------------
    Summary: Raise more helpful errors when RDD actions or transformations are called inside of transformations  (was: Raise more helpful errors when SparkContext methods are called inside of transformations)

> Raise more helpful errors when RDD actions or transformations are called inside of transformations
> --------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-5063
>                 URL: https://issues.apache.org/jira/browse/SPARK-5063
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>            Reporter: Josh Rosen
>            Assignee: Josh Rosen
>
> Spark does not support nested RDDs or invoking Spark actions inside of transformations; doing so usually leads to NullPointerExceptions (see SPARK-718 as one example). The confusing NPE is one of the most common sources of Spark questions on StackOverflow (a minimal example of the failing pattern is sketched after the links below):
> - https://stackoverflow.com/questions/13770218/call-of-distinct-and-map-together-throws-npe-in-spark-library/14130534#14130534
> - https://stackoverflow.com/questions/23793117/nullpointerexception-in-scala-spark-appears-to-be-caused-be-collection-type/23793399#23793399
> - https://stackoverflow.com/questions/25997558/graphx-ive-got-nullpointerexception-inside-mapvertices/26003674#26003674
> (those are just a sample of the ones that I've answered personally; there are many others).
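> As a minimal illustration of the failing pattern (the variable names here are made up for the example), the closure passed to map() below captures another RDD, so the nested count() action runs inside a task on an executor and fails with an NPE:
>
>     val rdd1 = sc.parallelize(1 to 10)
>     val rdd2 = sc.parallelize(1 to 10)
>     // NPE at runtime: count() is a Spark action invoked from inside a transformation
>     val broken = rdd1.map(x => x * rdd2.count())
>     broken.collect()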
> I think that we should add some logic to attempt to detect these sorts of errors: we can use a DynamicVariable to check whether we're inside a task, and throw a more useful error when the RDD constructor or the SparkContext job-submission methods are called from inside one.
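> A minimal sketch of that idea, assuming hypothetical names (TaskGuard, runAsTask, and assertOnDriver are illustrative only, not existing Spark APIs):
>
>     import scala.util.DynamicVariable
>
>     object TaskGuard {
>       // Thread-local flag: true while the current thread is running a task.
>       private val insideTask = new DynamicVariable[Boolean](false)
>
>       // The executor's task runner would wrap user code in this.
>       def runAsTask[T](body: => T): T = insideTask.withValue(true)(body)
>
>       // Called from the RDD constructor and the SparkContext job-submission
>       // methods before doing any work.
>       def assertOnDriver(operation: String): Unit = {
>         if (insideTask.value) {
>           throw new IllegalStateException(
>             operation + " cannot be used inside a task; RDD transformations and " +
>             "actions can only be invoked by the driver, not inside of other " +
>             "transformations.")
>         }
>       }
>     }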



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org