Posted to user@spark.apache.org by Pavlos Katsogridakis <ka...@ics.forth.gr> on 2014/09/10 14:57:19 UTC

nested rdd operation

Hi,

I have a question about Spark.
This program, run in spark-shell:

val filerdd = sc.textFile("NOTICE",2)
val maprdd = filerdd.map( word => filerdd.map( word2 => (word2+word)  ) )
maprdd.collect()

throws a NullPointerException.
Can somebody explain why I cannot have a nested RDD operation?

--pavlos

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org


Re: nested rdd operation

Posted by Sean Owen <so...@cloudera.com>.
You can't use an RDD inside an operation on an RDD. Here your map
function references filerdd, and that is what fails. It looks like you
want the Cartesian product of the RDD with itself, so take a look at
the cartesian() method. Note that for an RDD of n elements the result
has n * n elements, so it may not be a good idea to compute such a
thing on anything large.
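
A minimal sketch of the cartesian() approach, assuming it is run in
spark-shell where sc is already defined. A small in-memory RDD stands
in for the NOTICE file so the result is easy to check; the names pairs
and combined are illustrative and not from the original post. The
likely reason the nested version fails is that an RDD is only usable
from the driver: inside a closure running on an executor it has no
valid SparkContext, so calling map on it fails.

```scala
// Assumption: running inside spark-shell, where `sc` (the SparkContext)
// is predefined. To use the original data instead, replace the next line
// with: val filerdd = sc.textFile("NOTICE", 2)
val filerdd = sc.parallelize(Seq("a", "b", "c"), 2)

// cartesian() pairs every element of this RDD with every element of the
// other one. It is called on the driver, so no RDD is referenced inside
// a closure that runs on the executors.
val pairs = filerdd.cartesian(filerdd)

// Same concatenation order as the original attempt: the element from the
// second RDD (word2) comes first, then the element from the first (word).
val combined = pairs.map { case (word, word2) => word2 + word }

// The result has n * n elements, so look at a sample with take() rather
// than pulling everything to the driver with collect().
combined.take(5).foreach(println)
```

With real data, each element produced by textFile is a line of the
file rather than a word, so the output grows with the square of the
line count.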

On Wed, Sep 10, 2014 at 1:57 PM, Pavlos Katsogridakis
<ka...@ics.forth.gr> wrote:
> Hi ,
>
> I have a question on spark
> this programm on spark-shell
>
> val filerdd = sc.textFile("NOTICE",2)
> val maprdd = filerdd.map( word => filerdd.map( word2 => (word2+word)  ) )
> maprdd.collect()
>
> throws NULL pointer exception ,
> can somebody explain why i cannot have a nested rdd operation ?
>
> --pavlos
