Posted to user@spark.apache.org by KhajaAsmath Mohammed <md...@gmail.com> on 2018/01/22 17:36:03 UTC
Production Critical : Data loss in spark streaming
Hi,
I have been using Spark Streaming with Kafka. I have to restart the
application daily due to a KMS issue, and after the restart the offsets do
not match the point where I left off. I am creating the streaming context
from a checkpoint directory with:
val streamingContext = StreamingContext.getOrCreate(checkPointDir, () =>
  createStreamingContext(checkPointDir, sparkSession, batchInt, kafkaParams,
    topicsSet, config, sparkConfig))
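For reference, a minimal sketch of what the factory function passed to
getOrCreate typically needs to contain when using the Kafka direct stream
(spark-streaming-kafka-0-10). The body of createStreamingContext below is
an assumption, not the poster's actual code, and processing logic is
elided; the key points are that all DStream setup must happen inside the
factory, and that offsets can be committed back to Kafka only after a
batch's output succeeds, so a restart resumes from the last committed
position even if the checkpoint is unusable:

```scala
import org.apache.kafka.clients.consumer.ConsumerRecord
import org.apache.spark.sql.SparkSession
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.{CanCommitOffsets, HasOffsetRanges}
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe

def createStreamingContext(
    checkPointDir: String,
    sparkSession: SparkSession,
    batchInt: Long,
    kafkaParams: Map[String, Object],   // should include "enable.auto.commit" -> false
    topicsSet: Set[String]): StreamingContext = {

  val ssc = new StreamingContext(sparkSession.sparkContext, Seconds(batchInt))
  ssc.checkpoint(checkPointDir)   // enable metadata checkpointing

  // The DStream must be created inside this factory; a context restored
  // from the checkpoint cannot recover streams defined outside it.
  val stream = KafkaUtils.createDirectStream[String, String](
    ssc, PreferConsistent, Subscribe[String, String](topicsSet, kafkaParams))

  stream.foreachRDD { rdd =>
    val offsetRanges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges
    // ... process the batch here ...
    // Commit offsets only after the output succeeds, so the consumer
    // group's committed position never runs ahead of completed work.
    stream.asInstanceOf[CanCommitOffsets].commitAsync(offsetRanges)
  }
  ssc
}
```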
Batch 1:
Batch 2: After restart and completion of two batches.
[Inline screenshot comparing the offsets, not preserved in the plain text archive]
Thanks,
Asmath