Posted to user@spark.apache.org by Daniel Haviv <da...@veracity-group.com> on 2016/06/06 17:43:09 UTC
groupByKey returns an emptyRDD
Hi,
I wrapped the following code into a jar:
val test = sc.parallelize(Seq(("daniel", "a"), ("daniel", "b"), ("test", "1)")))
val agg = test.groupByKey()
agg.collect.foreach(r=>{println(r._1)})
The result of groupByKey is an empty RDD; when I run the same
code in the spark-shell it works as expected.
Any ideas?
Thank you,
Daniel
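For reference, here is the grouping the snippet above should produce, illustrated with plain Scala collections rather than Spark (a hypothetical analogy, not the Spark API — RDD.groupByKey behaves like groupBy on the key followed by projecting out the values):

```scala
// Plain-collections sketch of what groupByKey should return for this input.
// With Spark, the same pairs would come back from agg.collect.
val data = Seq(("daniel", "a"), ("daniel", "b"), ("test", "1)"))

// Group by the first tuple element (the key), keep only the values.
val grouped: Map[String, Seq[String]] =
  data.groupBy(_._1).map { case (k, pairs) => (k, pairs.map(_._2)) }

// Two keys should appear: "daniel" (with values a, b) and "test" (with 1)).
grouped.keys.foreach(println)
```

So an empty result from collect on the jar run, but not in the shell, points at how the job is packaged or launched rather than at groupByKey itself.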
Re: groupByKey returns an emptyRDD
Posted by Ted Yu <yu...@gmail.com>.
Can you give us a bit more information?
- how you packaged the code into a jar
- the command you used for execution
- the version of Spark
- a related log snippet
Thanks
On Mon, Jun 6, 2016 at 10:43 AM, Daniel Haviv <
daniel.haviv@veracity-group.com> wrote:
> Hi,
> I wrapped the following code into a jar:
>
> val test = sc.parallelize(Seq(("daniel", "a"), ("daniel", "b"), ("test", "1)")))
>
> val agg = test.groupByKey()
> agg.collect.foreach(r=>{println(r._1)})
>
>
> The result of groupByKey is an empty RDD; when I run the same code in the spark-shell it works as expected.
>
>
> Any ideas?
>
>
> Thank you,
>
> Daniel