Posted to dev@spark.apache.org by Jeniba Johnson <Je...@lntinfotech.com> on 2014/12/16 11:32:21 UTC

Data Loss - Spark streaming

Hi,

I need a clarification. When running the streaming examples, suppose the batch interval is set to 5 minutes, and data is collected from the input source (Flume) and processed for those 5 minutes.
What happens to the data that keeps flowing from the input source to Spark Streaming in the meantime? Is that data stored somewhere, or is it lost?
If it is lost, what is the solution for capturing all of the data without any loss in Spark Streaming?

Awaiting your kind reply.


Regards,
Jeniba Johnson


________________________________
The contents of this e-mail and any attachment(s) may contain confidential or privileged information for the intended recipient(s). Unintended recipients are prohibited from taking action on the basis of information in this e-mail and using or disseminating the information, and must notify the sender and delete it from their system. L&T Infotech will not accept responsibility or liability for the accuracy or completeness of, or the presence of any virus or disabling code in this e-mail"

Re: Data Loss - Spark streaming

Posted by Ryan Williams <ry...@gmail.com>.
TD's portion seems to start at 27:24: http://youtu.be/jcJq3ZalXD8?t=27m24s

On Tue Dec 16 2014 at 7:13:43 AM Gerard Maas <ge...@gmail.com> wrote:

> Hi Jeniba,
>
> The second part of this meetup recording has a very good answer to your
> question.  TD explains the current behavior and the on-going work in Spark
> Streaming to fix HA.
> https://www.youtube.com/watch?v=jcJq3ZalXD8
>
>
> -kr, Gerard.

Re: Data Loss - Spark streaming

Posted by Gerard Maas <ge...@gmail.com>.
Hi Jeniba,

The second part of this meetup recording has a very good answer to your
question.  TD explains the current behavior and the on-going work in Spark
Streaming to fix HA.
https://www.youtube.com/watch?v=jcJq3ZalXD8


-kr, Gerard.
