Posted to user@spark.apache.org by Udbhav Agarwal <ud...@syncoms.com> on 2017/02/09 05:37:06 UTC

Spark stream parallel streaming

Hi,
I am using spark streaming for processing messages from kafka for real time analytics. I am trying to fine tune my streaming process. Currently my spark streaming system is reading a batch of messages from kafka topic and processing each message one at a time. I have set properties in spark streaming for increasing parallelism for tasks it performs to process that one message.
The problem is that processing a single message picked up in a batch still takes a long time, because my workflow is inherently heavy. What I would like to implement is a way for the other messages picked up in that batch to be sent for processing in parallel alongside the first one. That would reduce the overall processing time, since some messages take longer than others, and the fast ones would no longer wait for one slow message to finish.
Is this kind of implementation possible with Spark Streaming? If not, do I need to use some other tool alongside Spark Streaming to get this kind of processing? What are my options?
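To illustrate what I mean, here is a conceptual sketch in plain Python (not Spark) of the difference between processing a batch sequentially and processing its messages concurrently; in Spark Streaming the analogous step would be spreading the batch across more partitions (for example with rdd.repartition(n) inside foreachRDD) so that parallel tasks each handle a subset of the messages. The message contents and per-message costs below are made up for illustration:

```python
from concurrent.futures import ThreadPoolExecutor
import time

def process_message(msg):
    # Placeholder for the real per-message workflow; the sleep stands in
    # for messages whose processing times vary.
    time.sleep(msg["cost"])
    return msg["id"]

# A hypothetical batch: some messages are cheap, some are slow.
batch = [{"id": i, "cost": 0.01 * (i % 3)} for i in range(9)]

# Sequential: total time is the sum of all per-message costs.
start = time.perf_counter()
sequential = [process_message(m) for m in batch]
seq_time = time.perf_counter() - start

# Parallel: slow messages no longer block fast ones, so wall-clock
# time approaches the cost of the slowest message per worker.
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = list(pool.map(process_message, batch))
par_time = time.perf_counter() - start

print(sequential == parallel)  # map preserves input order, so results match
print(par_time < seq_time)
```

This is only meant to show the scenario I described: with parallel dispatch, the fast messages complete without waiting behind a slow one, and the results come back in the same order.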
Thanks in advance.

Thanks,
Udbhav Agarwal