Posted to user@spark.apache.org by UMESH CHAUDHARY <um...@gmail.com> on 2015/08/13 17:55:30 UTC

Streaming on Exponential Data

Hi,
I was working with the non-reliable (receiver-based) version of Spark-Kafka
streaming, i.e. KafkaUtils.createStream. For testing purposes I was consuming
data from Kafka at a constant rate, and it behaved as expected.
But when the data in Kafka grew exponentially, my program started crashing
with "Cannot Compute split on input data...", and I also saw in the console
logs that it kept accumulating data in memory while receiving from Kafka.

How does Spark Streaming behave towards exponentially growing data?
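
For reference, the receiver-based API buffers received blocks in executor
memory, so one common mitigation is to cap the receiver's ingestion rate in
the Spark configuration. A minimal sketch of spark-defaults.conf entries
(the values below are illustrative, not recommendations from this thread):

```
# Cap each receiver at a fixed number of records per second so a burst in
# Kafka cannot outrun the processing rate (value is a hypothetical example).
spark.streaming.receiver.maxRate           10000

# From Spark 1.5 onwards, backpressure can adjust the rate dynamically
# based on scheduling delay and processing time.
spark.streaming.backpressure.enabled       true
```

With a rate cap in place, excess data stays in Kafka instead of piling up in
receiver memory, at the cost of higher end-to-end latency during bursts.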

Re: Streaming on Exponential Data

Posted by Hemant Bhanawat <he...@gmail.com>.
What does "exponential data" mean? Does it mean that the amount of data
being received from the stream in a batch interval is increasing
exponentially as time progresses?

Does your process have enough memory to handle the data for a batch
interval?

You may want to share Spark task UI snapshots and logs.
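
As a rough sketch of the memory question above, the per-batch footprint can
be estimated from the ingestion rate, batch interval, and record size (all
numbers below are hypothetical examples, not values from this thread):

```python
def batch_memory_bytes(records_per_sec, batch_interval_sec,
                       bytes_per_record, replication=1):
    """Rough upper bound on the memory one batch of received data occupies.

    The receiver-based API replicates received blocks (2x by default with
    StorageLevel.MEMORY_AND_DISK_SER_2), hence the replication factor.
    """
    return records_per_sec * batch_interval_sec * bytes_per_record * replication

# e.g. 50k records/s, 10 s batches, ~1 KiB records, 2x replication
est = batch_memory_bytes(50_000, 10, 1_024, replication=2)
print(f"{est / 2**30:.1f} GiB per batch")  # -> 1.0 GiB per batch
```

If this estimate approaches the memory available to the executors, batches
will not fit once the input rate grows, which matches the symptom of data
continuously accumulating in memory.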



On Thu, Aug 13, 2015 at 9:25 PM, UMESH CHAUDHARY <um...@gmail.com>
wrote:

> Hi,
> I was working with the non-reliable (receiver-based) version of Spark-Kafka
> streaming, i.e. KafkaUtils.createStream. For testing purposes I was
> consuming data from Kafka at a constant rate, and it behaved as expected.
> But when the data in Kafka grew exponentially, my program started crashing
> with "Cannot Compute split on input data...", and I also saw in the console
> logs that it kept accumulating data in memory while receiving from Kafka.
>
> How does Spark Streaming behave towards exponentially growing data?
>