Posted to dev@samza.apache.org by Yan Fang <ya...@gmail.com> on 2014/07/08 18:19:16 UTC

How does Samza buffer messages?

I was a little confused by the statement "Samza takes a different approach
to buffering. We buffer to disk at every hop between a StreamTask."
<http://samza.incubator.apache.org/learn/documentation/0.7.0/comparisons/storm.html>

What does "buffer to disk" mean here? I do not understand how Samza handles
the situation where messages arrive faster than they can be processed.
Thank you.

Cheers,

Fang, Yan
yanfang724@gmail.com
+1 (206) 849-4108

Re: How does Samza buffer messages?

Posted by Yan Fang <ya...@gmail.com>.
Cool. Got it now. Thank you.

Cheers,

Fang, Yan
yanfang724@gmail.com
+1 (206) 849-4108


On Tue, Jul 8, 2014 at 9:33 AM, Jakob Homan <jg...@gmail.com> wrote:

> This refers to our use of Kafka as the place where we deposit messages after
> one job has processed them and (potentially) before the next job does.
> Since Kafka writes all messages to disk and we (generally) write all
> messages to Kafka, this is our buffering to disk. The statement could be
> more explicit that this holds when using Kafka, and not necessarily with
> other producers or consumers.
>
> This approach is in contrast to other systems that try to keep messages in
> memory before passing them to another processor.
> -jg

Re: How does Samza buffer messages?

Posted by Jakob Homan <jg...@gmail.com>.
This refers to our use of Kafka as the place where we deposit messages after
one job has processed them and (potentially) before the next job does.
Since Kafka writes all messages to disk and we (generally) write all
messages to Kafka, this is our buffering to disk. The statement could be
more explicit that this holds when using Kafka, and not necessarily with
other producers or consumers.

This approach is in contrast to other systems that try to keep messages in
memory before passing them to another processor.
-jg
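
To make the answer above concrete for the "processing is slower than
receiving" case, here is a minimal sketch in Python. It is not Samza or
Kafka code: DiskLog, run_demo, the file name, and the message rates are all
hypothetical, and a plain file stands in for one Kafka topic partition. The
point it illustrates is that a slow downstream job does not lose messages or
fill up memory; it only falls further behind in a log that already lives on
disk, and it catches up later by reading from its own offset (in real Kafka,
only within the configured retention).

```python
import os
import tempfile


class DiskLog:
    """Toy append-only log on disk, standing in for one Kafka topic partition."""

    def __init__(self, path):
        self.path = path
        open(self.path, "a").close()  # create the file if it is missing

    def append(self, message):
        # One message per line; a message's offset is its line number.
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(message + "\n")

    def read_from(self, offset, max_messages):
        # Return up to max_messages messages starting at the given offset.
        with open(self.path, encoding="utf-8") as f:
            lines = f.read().splitlines()
        return lines[offset:offset + max_messages]

    def end_offset(self):
        # Offset that the next appended message would get.
        with open(self.path, encoding="utf-8") as f:
            return sum(1 for _ in f)


def run_demo():
    with tempfile.TemporaryDirectory() as tmp:
        log = DiskLog(os.path.join(tmp, "topic-0.log"))
        consumer_offset = 0   # position of the (hypothetical) downstream job
        processed = []
        lag_per_round = []

        for round_no in range(5):
            # The upstream job writes 3 messages per round ...
            for i in range(3):
                log.append(f"msg-{round_no}-{i}")
            # ... but the downstream job only manages 1 per round.
            for message in log.read_from(consumer_offset, 1):
                processed.append(message)
                consumer_offset += 1
            lag_per_round.append(log.end_offset() - consumer_offset)

        # Upstream has stopped; downstream drains the backlog from disk.
        while consumer_offset < log.end_offset():
            batch = log.read_from(consumer_offset, 4)
            processed.extend(batch)
            consumer_offset += len(batch)

        return lag_per_round, processed


if __name__ == "__main__":
    lag, processed = run_demo()
    print("lag after each round:", lag)
    print("messages processed in total:", len(processed))
```

The backlog grows while the consumer is slow, but it grows on disk and the
producer never has to wait or drop anything. A system that buffers between
processors in memory would instead have to block the producer, drop
messages, or risk running out of memory in the same situation.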

