Posted to user@spark.apache.org by Adrian Mocanu <am...@verticalscope.com> on 2014/04/24 17:26:08 UTC

reduceByKeyAndWindow - spark internals

If I have this code:
import org.apache.spark.streaming.Seconds
val stream1 = doublesInputStream.window(Seconds(10), Seconds(2))
val stream2 = stream1.reduceByKeyAndWindow(_ + _, Seconds(10), Seconds(10))

Does reduceByKeyAndWindow merge all RDDs from stream1 that arrived within the 10-second window?

For example, in the first 10 seconds stream1 will have 5 RDDs. Does reduceByKeyAndWindow merge these 5 RDDs into 1 RDD and remove duplicates?
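
To make the question concrete, the two stages can be simulated with plain Scala collections, without Spark. This is only a sketch under assumptions not stated above: a batch interval of 2 seconds, string keys with double values, and one record ("a", 1.0) per batch. The object and helper names are hypothetical and only mimic the DStream operations; they are not Spark's implementation.

```scala
object WindowReduceSketch {
  // One batch of keyed records, standing in for one RDD (assumed shape).
  type Batch = Seq[(String, Double)]

  // Mimics window(length, slide) when slide equals the batch interval:
  // the RDD emitted at batch index i is the union of the last n batches.
  def window(batches: Seq[Batch], n: Int): Seq[Batch] =
    batches.indices.map { i =>
      batches.slice(math.max(0, i - n + 1), i + 1).flatten
    }

  // Mimics reduceByKeyAndWindow(f, length, slide) when length == slide
  // == n intervals: every n-th interval, union the last n RDDs and
  // combine the values of each key with f.
  def reduceByKeyAndWindow(rdds: Seq[Batch], n: Int)(
      f: (Double, Double) => Double): List[Map[String, Double]] =
    rdds.grouped(n).map { group =>
      group.flatten
        .groupBy(_._1)
        .map { case (k, vs) => k -> vs.map(_._2).reduce(f) }
    }.toList

  def main(args: Array[String]): Unit = {
    // Five 2-second batches = the first 10 seconds of input.
    val batches: Seq[Batch] = Seq.fill(5)(Seq("a" -> 1.0))
    val stream1 = window(batches, 5)                       // window(10s, 2s)
    val stream2 = reduceByKeyAndWindow(stream1, 5)(_ + _)  // (10s, 10s)
    println(stream2)
    // For comparison: the same reduce applied to the raw batches.
    println(reduceByKeyAndWindow(batches, 5)(_ + _))
  }
}
```

In this simulation the 5 RDDs of stream1 are unioned and the values of each key are combined with the reduce function; nothing is deduplicated. Because stream1 is already windowed, each input batch appears in several of its RDDs, so the sum over stream1 is larger than the sum over the raw batches.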

-Adrian


FW: reduceByKeyAndWindow - spark internals

Posted by Adrian Mocanu <am...@verticalscope.com>.
Any suggestions where I can find this in the documentation or elsewhere?

Thanks
