Posted to issues@spark.apache.org by "Patrick Wendell (JIRA)" <ji...@apache.org> on 2014/09/29 20:35:33 UTC

[jira] [Comment Edited] (SPARK-2331) SparkContext.emptyRDD has wrong return type

    [ https://issues.apache.org/jira/browse/SPARK-2331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14152018#comment-14152018 ] 

Patrick Wendell edited comment on SPARK-2331 at 9/29/14 6:34 PM:
-----------------------------------------------------------------

Yeah we could have made this a wider type in the public signature. However, it is not possible to do that while maintaining compatibility (others may be relying on this returning an EmptyRDD).

This issue is not related to covariance, because the type parameter here is always String. The compiler does in fact understand that EmptyRDD[String] is a subtype of RDD[String].

The issue with the original example is that the Scala compiler always infers the narrowest type it can. In a foldLeft expression it will by default take the result type from the seed, i.e. EmptyRDD[String], unless you upcast it to a more general type, as you are doing with the explicit type parameter. And the union operation requires an exact type match on the two RDDs, including the type parameter.
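The inference behavior can be seen without Spark at all. In this sketch, Box and Empty are hypothetical stand-ins for RDD and EmptyRDD; foldLeft's accumulator type is fixed by its seed unless widened with an explicit type parameter:

{code}
// Hypothetical stand-ins for RDD[T] and EmptyRDD[T] (not Spark classes).
class Box[T]
class Empty[T] extends Box[T]

// Stand-in for an operation that, like RDD.union, returns the wider type.
def combine[T](a: Box[T], b: Box[T]): Box[T] = new Box[T]

// Without the explicit [Box[String]] parameter, foldLeft would infer the
// accumulator type Empty[String] from the seed, and the body's Box[String]
// result would fail to type check.
val folded: Box[String] =
  Seq("a", "b", "c").foldLeft[Box[String]](new Empty[String]) { (acc, _) =>
    combine(acc, new Box[String])
  }
{code}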


was (Author: pwendell):
Yeah we could have made this a wider type in the public signature. However, it is not possible to do that while maintaining compatibility (others may be relying on this returning an EmptyRDD).

For now though you can safely cast it to work around this:

{code}
scala> sc.emptyRDD[String].asInstanceOf[RDD[String]]
res7: org.apache.spark.rdd.RDD[String] = EmptyRDD[3] at emptyRDD at <console>:14
{code}

> SparkContext.emptyRDD has wrong return type
> -------------------------------------------
>
>                 Key: SPARK-2331
>                 URL: https://issues.apache.org/jira/browse/SPARK-2331
>             Project: Spark
>          Issue Type: Bug
>    Affects Versions: 1.0.0
>            Reporter: Ian Hummel
>
> The return type for SparkContext.emptyRDD is EmptyRDD[T].
> It should be RDD[T]. As it stands, you have to add extra type annotations to code like the following (which creates a union of RDDs over some subset of paths in a folder):
> val rdds = Seq("a", "b", "c").foldLeft[RDD[String]](sc.emptyRDD[String]) { (rdd, path) ⇒
>   rdd.union(sc.textFile(path))
> }



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
