Posted to user@spark.apache.org by Adrian Mocanu <am...@verticalscope.com> on 2014/03/24 17:44:59 UTC
remove duplicates
I have a DStream like this:
..RDD[a,b],RDD[b,c]..
Is there a way to remove duplicates across the entire DStream, not just within each RDD? That is, I would like only one of the b's to survive, so the output would be either:
..RDD[a],RDD[b,c].. or ..RDD[a,b],RDD[c]..
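To make the desired behaviour concrete, here is a minimal sketch in plain Python. It is not Spark code: each RDD is modelled as an ordinary list, and the function name `dedupe_across_batches` is made up for illustration. It keeps the first occurrence of each element, which corresponds to the second of the two outputs above.

```python
from typing import Hashable, Iterable, Iterator, List


def dedupe_across_batches(
    batches: Iterable[List[Hashable]],
) -> Iterator[List[Hashable]]:
    """Yield each batch with previously seen elements removed.

    Hypothetical helper: batches stand in for the RDDs of a DStream.
    """
    seen = set()  # every element emitted so far, across all batches
    for batch in batches:
        fresh = []
        for item in batch:
            if item not in seen:
                seen.add(item)
                fresh.append(item)
        yield fresh


# The stream from the question: ..RDD[a,b],RDD[b,c]..
stream = [["a", "b"], ["b", "c"]]
result = list(dedupe_across_batches(stream))
print(result)  # [['a', 'b'], ['c']]
```

Note that the `seen` set in this sketch grows without bound, so on a real, long-running stream whatever plays that role would presumably need some form of state management or expiry.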
Thanks
-Adrian