You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@storm.apache.org by Jinhua Luo <lu...@gmail.com> on 2017/12/21 05:09:18 UTC

Does storm support incremental windowing operation?

Hi All,

The window may buffer many tuples before evaluation, does storm
support incremental aggregation upon the window, just like flink does?

Re: Does storm support incremental windowing operation?

Posted by Jungtaek Lim <ka...@gmail.com>.
Jinhua,

I thought about this a bit more, and found that incremental windowing
operation may not be suitable (more clear, not effective) by nature of
Storm Acker model.
(JStorm recently dropped acker model which may be why you can't see tuple
ack in their code.)

In acker model, we should store the tuple until the tuple is expired to
ensure we send correct XOR value to Acker,
(The tuple should be retained while emitting new tuples with anchoring the
tuple.)

While we could still have a room for optimization via dropping the tuple's
values to reduce the size, distributed snapshotting looks better to live
with window & stateful operations. I think we could take distributed
snapshot into account because Acker has been pointed out to performance
bottleneck for a long time since it requires calculating target task of
Acker based on message ID, and sending metadata tuple to Acker (incurring
network transfer) for every tuple.

Thanks,
Jungtaek Lim (HeartSaVioR)

2017년 12월 22일 (금) 오후 5:47, Jungtaek Lim <ka...@gmail.com>님이 작성:

> Thanks again.
>
> I guess windowing implementation in JStorm is basically inspired by Storm
> (meaning copying the codebase, which is allowed on Apache Software License.
> Heron is doing the same thing.), but looks like they have been going
> forward afterwards. By "technically", we could also migrate the code from
> JStorm, but better to try to understand the code and implement our own, if
> that's feasible.
>
> Thanks,
> Jungtaek Lim (HeartSaVioR)
>
> 2017년 12월 22일 (금) 오후 5:29, Jinhua Luo <lu...@gmail.com>님이 작성:
>
>> Yes, you could check the jstorm codes relation to this topic:
>>
>>
>> https://github.com/alibaba/jstorm/blob/master/jstorm-core/src/main/java/com/alibaba/jstorm/window/WindowedBoltExecutor.java
>>
>> But it seems that it does not consider tuple ack.
>>
>>
>> 2017-12-22 16:18 GMT+08:00 Jungtaek Lim <ka...@gmail.com>:
>> > Thanks for the pointer. Looks like it can be achieved with stateful
>> > windowing, but it's just based on skimming so there may be hard parts in
>> > detail.
>> > I'll spend time to read it and see how we can achieve it. (Both core
>> API and
>> > stream API which will be brought to Storm 2.0.0.)
>> >
>> > Thanks,
>> > Jungtaek Lim (HeartSaVioR)
>> >
>> > 2017년 12월 22일 (금) 오후 5:08, Jinhua Luo <lu...@gmail.com>님이 작성:
>> >>
>> >>
>> >>
>> https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/stream/operators/windows.html#reducefunction
>> >>
>> >>
>> https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/stream/operators/windows.html#processwindowfunction-with-incremental-aggregation
>> >>
>> >> Jstorm also supports incremental aggregation upon the window.
>> >>
>> >> 2017-12-22 13:31 GMT+08:00 Jungtaek Lim <ka...@gmail.com>:
>> >> > Hi Jinhua,
>> >> >
>> >> > could you refer the link of doc for Flink? I'm not exactly aware of
>> >> > incremental aggregation upon the window, so let me take a look at.
>> >> >
>> >> > 2017년 12월 21일 (목) 오후 2:09, Jinhua Luo <lu...@gmail.com>님이 작성:
>> >> >>
>> >> >> Hi All,
>> >> >>
>> >> >> The window may buffer many tuples before evaluation, does storm
>> >> >> support incremental aggregation upon the window, just like flink
>> does?
>>
>

Re: Does storm support incremental windowing operation?

Posted by Jungtaek Lim <ka...@gmail.com>.
Thanks again.

I guess windowing implementation in JStorm is basically inspired by Storm
(meaning copying the codebase, which is allowed on Apache Software License.
Heron is doing the same thing.), but looks like they have been going
forward afterwards. By "technically", we could also migrate the code from
JStorm, but better to try to understand the code and implement our own, if
that's feasible.

Thanks,
Jungtaek Lim (HeartSaVioR)

2017년 12월 22일 (금) 오후 5:29, Jinhua Luo <lu...@gmail.com>님이 작성:

> Yes, you could check the jstorm codes relation to this topic:
>
>
> https://github.com/alibaba/jstorm/blob/master/jstorm-core/src/main/java/com/alibaba/jstorm/window/WindowedBoltExecutor.java
>
> But it seems that it does not consider tuple ack.
>
>
> 2017-12-22 16:18 GMT+08:00 Jungtaek Lim <ka...@gmail.com>:
> > Thanks for the pointer. Looks like it can be achieved with stateful
> > windowing, but it's just based on skimming so there may be hard parts in
> > detail.
> > I'll spend time to read it and see how we can achieve it. (Both core API
> and
> > stream API which will be brought to Storm 2.0.0.)
> >
> > Thanks,
> > Jungtaek Lim (HeartSaVioR)
> >
> > 2017년 12월 22일 (금) 오후 5:08, Jinhua Luo <lu...@gmail.com>님이 작성:
> >>
> >>
> >>
> https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/stream/operators/windows.html#reducefunction
> >>
> >>
> https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/stream/operators/windows.html#processwindowfunction-with-incremental-aggregation
> >>
> >> Jstorm also supports incremental aggregation upon the window.
> >>
> >> 2017-12-22 13:31 GMT+08:00 Jungtaek Lim <ka...@gmail.com>:
> >> > Hi Jinhua,
> >> >
> >> > could you refer the link of doc for Flink? I'm not exactly aware of
> >> > incremental aggregation upon the window, so let me take a look at.
> >> >
> >> > 2017년 12월 21일 (목) 오후 2:09, Jinhua Luo <lu...@gmail.com>님이 작성:
> >> >>
> >> >> Hi All,
> >> >>
> >> >> The window may buffer many tuples before evaluation, does storm
> >> >> support incremental aggregation upon the window, just like flink
> does?
>

Re: Does storm support incremental windowing operation?

Posted by Jinhua Luo <lu...@gmail.com>.
Yes, you could check the jstorm codes relation to this topic:

https://github.com/alibaba/jstorm/blob/master/jstorm-core/src/main/java/com/alibaba/jstorm/window/WindowedBoltExecutor.java

But it seems that it does not consider tuple ack.


2017-12-22 16:18 GMT+08:00 Jungtaek Lim <ka...@gmail.com>:
> Thanks for the pointer. Looks like it can be achieved with stateful
> windowing, but it's just based on skimming so there may be hard parts in
> detail.
> I'll spend time to read it and see how we can achieve it. (Both core API and
> stream API which will be brought to Storm 2.0.0.)
>
> Thanks,
> Jungtaek Lim (HeartSaVioR)
>
> 2017년 12월 22일 (금) 오후 5:08, Jinhua Luo <lu...@gmail.com>님이 작성:
>>
>>
>> https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/stream/operators/windows.html#reducefunction
>>
>> https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/stream/operators/windows.html#processwindowfunction-with-incremental-aggregation
>>
>> Jstorm also supports incremental aggregation upon the window.
>>
>> 2017-12-22 13:31 GMT+08:00 Jungtaek Lim <ka...@gmail.com>:
>> > Hi Jinhua,
>> >
>> > could you refer the link of doc for Flink? I'm not exactly aware of
>> > incremental aggregation upon the window, so let me take a look at.
>> >
>> > 2017년 12월 21일 (목) 오후 2:09, Jinhua Luo <lu...@gmail.com>님이 작성:
>> >>
>> >> Hi All,
>> >>
>> >> The window may buffer many tuples before evaluation, does storm
>> >> support incremental aggregation upon the window, just like flink does?

Re: Does storm support incremental windowing operation?

Posted by Jungtaek Lim <ka...@gmail.com>.
Thanks for the pointer. Looks like it can be achieved with stateful
windowing, but it's just based on skimming so there may be hard parts in
detail.
I'll spend time to read it and see how we can achieve it. (Both core API
and stream API which will be brought to Storm 2.0.0.)

Thanks,
Jungtaek Lim (HeartSaVioR)

2017년 12월 22일 (금) 오후 5:08, Jinhua Luo <lu...@gmail.com>님이 작성:

>
> https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/stream/operators/windows.html#reducefunction
>
> https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/stream/operators/windows.html#processwindowfunction-with-incremental-aggregation
>
> Jstorm also supports incremental aggregation upon the window.
>
> 2017-12-22 13:31 GMT+08:00 Jungtaek Lim <ka...@gmail.com>:
> > Hi Jinhua,
> >
> > could you refer the link of doc for Flink? I'm not exactly aware of
> > incremental aggregation upon the window, so let me take a look at.
> >
> > 2017년 12월 21일 (목) 오후 2:09, Jinhua Luo <lu...@gmail.com>님이 작성:
> >>
> >> Hi All,
> >>
> >> The window may buffer many tuples before evaluation, does storm
> >> support incremental aggregation upon the window, just like flink does?
>

Re: Does storm support incremental windowing operation?

Posted by Jinhua Luo <lu...@gmail.com>.
https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/stream/operators/windows.html#reducefunction
https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/stream/operators/windows.html#processwindowfunction-with-incremental-aggregation

Jstorm also supports incremental aggregation upon the window.

2017-12-22 13:31 GMT+08:00 Jungtaek Lim <ka...@gmail.com>:
> Hi Jinhua,
>
> could you refer the link of doc for Flink? I'm not exactly aware of
> incremental aggregation upon the window, so let me take a look at.
>
> 2017년 12월 21일 (목) 오후 2:09, Jinhua Luo <lu...@gmail.com>님이 작성:
>>
>> Hi All,
>>
>> The window may buffer many tuples before evaluation, does storm
>> support incremental aggregation upon the window, just like flink does?

Re: Does storm support incremental windowing operation?

Posted by Jungtaek Lim <ka...@gmail.com>.
Hi Jinhua,

could you refer the link of doc for Flink? I'm not exactly aware of
incremental aggregation upon the window, so let me take a look at.

2017년 12월 21일 (목) 오후 2:09, Jinhua Luo <lu...@gmail.com>님이 작성:

> Hi All,
>
> The window may buffer many tuples before evaluation, does storm
> support incremental aggregation upon the window, just like flink does?
>

Re: Does storm support incremental windowing operation?

Posted by Manish Sharma <ma...@gmail.com>.
Guava provides cachestats out of the box, we found it pretty useful..
Other than that, you are right, we are just using it as plain old k/v map
Cheers, /Manish


On 12/21/17 8:55 PM, Stephen Powis wrote:
> Is there a reason to use Guava cache to aggregate over just a plain 
> old Map?  Curious as aggregation is a common use case for us and never 
> thought to look to Guava for it.
>
> Thanks!
>
> On Fri, Dec 22, 2017 at 1:50 PM, Manish Sharma <maaand@gmail.com 
> <ma...@gmail.com>> wrote:
>
>     We added a guava cache in the bolt's execute method to aggregate tuples and wait for the tick signal.
>
>     You can control the tick frequency with TOPOLOGY_TICK_TUPLE_FREQ_SECSin the main topology.
>
>     This is a Spark use case IMO.
>
>     cheers, /Manish
>
>
>     On Wed, Dec 20, 2017 at 9:09 PM, Jinhua Luo <luajit.io@gmail.com
>     <ma...@gmail.com>> wrote:
>
>         Hi All,
>
>         The window may buffer many tuples before evaluation, does storm
>         support incremental aggregation upon the window, just like
>         flink does?
>
>
>


Re: Does storm support incremental windowing operation?

Posted by Stephen Powis <sp...@salesforce.com>.
Is there a reason to use Guava cache to aggregate over just a plain old
Map?  Curious as aggregation is a common use case for us and never thought
to look to Guava for it.

Thanks!

On Fri, Dec 22, 2017 at 1:50 PM, Manish Sharma <ma...@gmail.com> wrote:

> We added a guava cache in the bolt's execute method to aggregate tuples and wait for the tick signal.
>
> You can control the tick frequency with TOPOLOGY_TICK_TUPLE_FREQ_SECS in the main topology.
>
>
> This is a Spark use case IMO.
>
>
> cheers, /Manish
>
>
>
>
> On Wed, Dec 20, 2017 at 9:09 PM, Jinhua Luo <lu...@gmail.com> wrote:
>
>> Hi All,
>>
>> The window may buffer many tuples before evaluation, does storm
>> support incremental aggregation upon the window, just like flink does?
>>
>
>

Re: Does storm support incremental windowing operation?

Posted by Manish Sharma <ma...@gmail.com>.
We added a guava cache in the bolt's execute method to aggregate
tuples and wait for the tick signal.

You can control the tick frequency with TOPOLOGY_TICK_TUPLE_FREQ_SECS
in the main topology.


This is a Spark use case IMO.


cheers, /Manish




On Wed, Dec 20, 2017 at 9:09 PM, Jinhua Luo <lu...@gmail.com> wrote:

> Hi All,
>
> The window may buffer many tuples before evaluation, does storm
> support incremental aggregation upon the window, just like flink does?
>