Posted to user@spark.apache.org by Ji ZHANG <zh...@gmail.com> on 2014/09/17 10:49:14 UTC

Do I Need to Set Checkpoint Interval for Every DStream?

Hi,

I'm using Spark Streaming 1.0. I create a DStream with KafkaUtils and
apply some operations to it. The last operation is reduceByWindow, so
I assumed the checkpoint interval would automatically be set to at
least 10 seconds. But it still checkpoints every 2 seconds (my batch
interval), and the log shows:

[2014-09-17 16:43:25,096] INFO Checkpoint interval automatically set to 12000 ms (org.apache.spark.streaming.dstream.ReducedWindowedDStream)
[2014-09-17 16:43:25,105] INFO Checkpoint interval = null (org.apache.spark.streaming.kafka.KafkaInputDStream)
[2014-09-17 16:43:25,107] INFO Checkpoint interval = null (org.apache.spark.streaming.dstream.MappedDStream)
[2014-09-17 16:43:25,108] INFO Checkpoint interval = null (org.apache.spark.streaming.dstream.MappedDStream)
[2014-09-17 16:43:25,108] INFO Checkpoint interval = null (org.apache.spark.streaming.dstream.FilteredDStream)
[2014-09-17 16:43:25,109] INFO Checkpoint interval = null (org.apache.spark.streaming.dstream.FlatMappedDStream)
[2014-09-17 16:43:25,110] INFO Checkpoint interval = null (org.apache.spark.streaming.dstream.FlatMappedDStream)
[2014-09-17 16:43:25,110] INFO Checkpoint interval = null (org.apache.spark.streaming.dstream.ShuffledDStream)
[2014-09-17 16:43:25,111] INFO Checkpoint interval = 12000 ms (org.apache.spark.streaming.dstream.ReducedWindowedDStream)
[2014-09-17 16:43:25,111] INFO Checkpoint interval = null (org.apache.spark.streaming.dstream.ForEachDStream)

So does this mean I have to set the checkpoint interval explicitly on
every DStream?
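
For context on the 12000 ms figure in the log, here is a rough sketch
of the arithmetic, assuming the default is chosen as the smallest
multiple of the stream's slide interval that is at least 10 seconds.
That rule, the function name, and the 10-second floor parameter are
assumptions for illustration, not something the log itself confirms.

```python
def default_checkpoint_interval_ms(slide_ms: int, floor_ms: int = 10_000) -> int:
    """Smallest multiple of slide_ms that is >= floor_ms.

    Hypothetical helper: it mirrors the assumed rule described above,
    it is not a Spark API.
    """
    if slide_ms <= 0:
        raise ValueError("slide interval must be positive")
    # Integer ceiling division avoids floating-point rounding.
    multiples = -(-floor_ms // slide_ms)
    return slide_ms * multiples


if __name__ == "__main__":
    # Print the assumed default for a few candidate slide intervals.
    for slide_ms in (2_000, 4_000, 6_000, 12_000):
        print(slide_ms, default_checkpoint_interval_ms(slide_ms))
```

Under this assumption, a 2-second slide interval would give 10000 ms,
while 12000 ms would correspond to a slide interval of 4, 6, or 12
seconds.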

Thanks.
