Posted to user@spark.apache.org by KhajaAsmath Mohammed <md...@gmail.com> on 2018/01/22 17:36:03 UTC

Production Critical: Data loss in Spark Streaming

Hi,

I have been using Spark Streaming with Kafka. I have to restart the application daily because of a KMS issue, and after the restart the offsets do not match the point where I left off. I am creating the streaming context from a checkpoint directory with:

val streamingContext = StreamingContext.getOrCreate(checkPointDir, () =>
  createStreamingContext(checkPointDir, sparkSession, batchInt, kafkaParams,
    topicsSet, config, sparkConfig))
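
For context, below is a minimal sketch of what a createStreamingContext factory might look like with the Kafka 0-10 direct stream API. The original function body is not shown, so the body here is an assumption; the config and sparkConfig arguments from the call above are omitted for brevity. The key point of the getOrCreate pattern is that the stream creation and all output operations must be defined inside this factory, so the same job graph can be rebuilt from the checkpoint on restart.

import org.apache.spark.sql.SparkSession
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe

// Hypothetical factory (parameter names assumed from the call above): the
// stream setup and output operations all live inside this function so that
// StreamingContext.getOrCreate can rebuild the same DAG when it recovers
// from the checkpoint directory.
def createStreamingContext(
    checkPointDir: String,
    sparkSession: SparkSession,
    batchInt: Int,
    kafkaParams: Map[String, Object],
    topicsSet: Set[String]): StreamingContext = {

  val ssc = new StreamingContext(sparkSession.sparkContext, Seconds(batchInt))
  ssc.checkpoint(checkPointDir) // metadata checkpointing, including offsets

  val stream = KafkaUtils.createDirectStream[String, String](
    ssc,
    PreferConsistent,
    Subscribe[String, String](topicsSet, kafkaParams))

  stream.foreachRDD { rdd =>
    // Process each batch here. On a restart through getOrCreate, offsets are
    // restored from the checkpoint, not from offsets committed to Kafka.
    rdd.map(_.value).count()
  }

  ssc
}

Note that a checkpoint produced this way is only recoverable while the application code is unchanged; if the checkpoint cannot be loaded, getOrCreate silently falls back to the factory and the stream starts from whatever the Kafka consumer settings dictate.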

Batch 1:

Batch 2: After restart and completion of two batches.

[inline image showing the offset mismatch; not included in the plain-text archive]
Thanks,
Asmath