Posted to user@storm.apache.org by Rajat Gangwar <ra...@gmail.com> on 2017/04/01 04:51:55 UTC

Re: how to consume/reset aggregated bucket

Let me rephrase my question:

Storm keeps aggregating into the same bucket after partitioning tuples by the
grouped fields.

- I want to consume all the aggregated content every hour and reset the
aggregation count back to zero. Storm should not update the bucket while it
is being consumed.
- Storm should then continue aggregating into the same bucket.
- After one hour, the same process repeats.
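
To make the requirement concrete, here is a rough plain-Java sketch of the
behaviour I am after. It is not Storm API, and every name in it (UserBuckets,
Bucket, add, drain) is hypothetical and made up for illustration. The idea is
that the hourly consumer swaps the whole set of buckets for a fresh empty one,
so the drained buckets can no longer be updated.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical names throughout; plain Java, not Storm API.
class UserBuckets {
    // One bucket per user: running sum plus the invoice IDs that went into it.
    static final class Bucket {
        double sum = 0.0;
        final List<String> invoiceIds = new ArrayList<>();
    }

    private Map<String, Bucket> current = new HashMap<>();

    // Called for every incoming invoice.
    synchronized void add(String userId, String invoiceId, double amount) {
        Bucket b = current.computeIfAbsent(userId, k -> new Bucket());
        b.sum += amount;
        b.invoiceIds.add(invoiceId);
    }

    // Called once per hour: returns everything aggregated so far and starts
    // over with an empty map, so later add() calls cannot touch the result.
    synchronized Map<String, Bucket> drain() {
        Map<String, Bucket> drained = current;
        current = new HashMap<>();
        return drained;
    }

    public static void main(String[] args) {
        UserBuckets buckets = new UserBuckets();
        buckets.add("alice", "inv-1", 10.0);
        buckets.add("alice", "inv-2", 5.5);
        Map<String, Bucket> hourly = buckets.drain(); // consume and reset
        buckets.add("alice", "inv-3", 1.0);           // lands in a new bucket
        Bucket a = hourly.get("alice");
        System.out.println(a.invoiceIds + " sum=" + a.sum);
    }
}
```

The synchronized methods are only there so that add and drain cannot interleave
if they are called from different threads.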


How can I achieve this in Storm?
Any input would help me a lot.


Thanks,
Rajat


On Tue, Mar 28, 2017 at 12:45 PM, Rajat Gangwar <ra...@gmail.com>
wrote:

> Use case:
>
> Every hour, 100K invoices (entities) are created and pushed to Storm.
> These invoices belong to 'n' users.
>
> Storm does a group aggregation on users and creates an aggregated bucket
> per user. Along with the sum of the invoices, each bucket also contains the
> invoice IDs, so that we know which invoices were aggregated.
> These buckets are updated continuously.
>
> Problem: We need to consume these buckets under two rules:
>                 a) the bucket has aggregated 1000 invoices, or
>                 b) the bucket was created 1 hour ago.
>
> In both cases we need to consume the bucket and act on the invoices
> aggregated so far. Before consuming, we need to make sure that Storm stops
> updating that bucket and creates/updates another bucket for the same
> customer if new invoices come in.
>
>
> Is there a way to achieve this in Storm?
>
>
> Thanks,
> Rajat
>
>
>