Posted to user@spark.apache.org by "Charles O. Bajomo" <ch...@pretechconsulting.co.uk> on 2017/02/23 14:20:25 UTC

[Spark Streaming] Batch versus streaming

Hello, 

I am reading data from a JMS queue and I need to prevent any data loss, so I have a custom Java receiver that only acks messages once they have been stored. Sometimes my program crashes because I can't control the flow rate from the queue; it overwhelms the job and I end up losing data. So I tried converting it into a batch job where the driver reads off the JMS queue indefinitely, generates an RDD to process the data, and then acks the messages once it's done. I was surprised it worked. Is there any reason not to just use this in place of streaming? I am using Spark 1.5.0, by the way.
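
In case it helps, here is a minimal sketch of the batch-loop idea I described, assuming a javax.jms broker with CLIENT_ACKNOWLEDGE sessions and TextMessage payloads; createConnectionFactory(), store(), the queue name, and the batch-size cap are all placeholders for my actual setup, not anything Spark provides:

import javax.jms.*;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import java.util.ArrayList;
import java.util.List;

public class JmsBatchLoop {
    public static void main(String[] args) throws JMSException {
        JavaSparkContext sc = new JavaSparkContext(
            new SparkConf().setAppName("jms-batch-loop"));

        // Placeholder: in practice this comes from the broker's client library.
        ConnectionFactory factory = createConnectionFactory();
        Connection conn = factory.createConnection();
        // CLIENT_ACKNOWLEDGE lets the driver ack only after the RDD job finishes.
        Session session = conn.createSession(false, Session.CLIENT_ACKNOWLEDGE);
        MessageConsumer consumer = session.createConsumer(session.createQueue("MY.QUEUE"));
        conn.start();

        final int batchSize = 1000;      // cap so one batch can't overwhelm a job
        final long pollTimeoutMs = 5000;

        while (true) {
            List<String> payloads = new ArrayList<>();
            Message last = null;
            // Drain up to batchSize messages, or stop when the queue goes quiet.
            while (payloads.size() < batchSize) {
                Message m = consumer.receive(pollTimeoutMs);
                if (m == null) break;
                payloads.add(((TextMessage) m).getText()); // assuming TextMessage
                last = m;
            }
            if (payloads.isEmpty()) continue;

            // Process the batch as an ordinary RDD job. If it throws, we never
            // ack, the broker redelivers, and nothing is lost.
            sc.parallelize(payloads).foreach(p -> store(p));

            // In a CLIENT_ACKNOWLEDGE session, acking the last message also
            // acks everything received before it on this session.
            last.acknowledge();
        }
    }

    private static ConnectionFactory createConnectionFactory() { /* broker-specific */ return null; }
    private static void store(String payload) { /* sink logic, e.g. write to storage */ }
}

The key design point is that the driver controls the pull rate (via batchSize and the receive timeout), which is exactly what I couldn't do with the receiver-based streaming approach.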

Thanks