Posted to user@spark.apache.org by aecc <al...@gmail.com> on 2014/01/23 21:12:19 UTC

Is data within the batchDuration in an RDD of a DStream reliable?

Hi.

I know that every RDD received in a DStream is replicated to 2 nodes by
default. However, if I choose a large batchDuration (say, 5 min), is the
data received during that interval also stored reliably? If so, how? As
far as I know, it is the RDDs that are stored reliably (once an RDD has
its complete data for the batchDuration).


