Posted to user@storm.apache.org by Geoffry Sumter <vi...@gmail.com> on 2014/10/28 03:42:46 UTC

Merge step at set interval

Hello!

I'm looking to process a large amount of data from weather stations around
the world. Each individual station sends its data in quite rapidly.
I'd like to do some independent processing and quality control of each
message I receive. Then I need a step that merges all the station data into
an n-dimensional array towards the end of my flow. However, the data is
only really useful in this form if /most/ stations have reported most of
their variables. This tends to happen after about 3 minutes, which is how
long I'd like to wait before I run the merge step. Is this possible
with Storm, or is this waiting-and-batching step a bad fit for it? I'd love
to read any relevant blog posts!

Thanks!
- Geoffry

Re: Merge step at set interval

Posted by Jagat Singh <ja...@gmail.com>.
Did you check this?

https://storm.incubator.apache.org/documentation/Trident-tutorial.html
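
For what it's worth, the "wait about 3 minutes, then merge" step from the
question can be sketched independently of Storm as a small buffer that keeps
the latest value per (station, variable) and is flushed once per interval.
This is only a rough sketch under assumptions: StationMergeBuffer, its
methods, and the coverage threshold are hypothetical names made up for
illustration, not part of Storm or Trident, and a 2-D station-by-variable
grid stands in for the n-dimensional array.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper (not a Storm API): buffers the latest reading per
// (station, variable) and hands out a merged grid once per interval.
final class StationMergeBuffer {
    private final int rows;
    private final int cols;
    private final double minCoverage; // assumed threshold, e.g. 0.8 for "most"
    private final Map<String, Integer> stationIndex = new HashMap<>();
    private final Map<String, Integer> variableIndex = new HashMap<>();
    private double[][] grid;
    private int filled; // number of grid cells that currently hold a value

    StationMergeBuffer(String[] stations, String[] variables, double minCoverage) {
        this.rows = stations.length;
        this.cols = variables.length;
        this.minCoverage = minCoverage;
        for (int i = 0; i < rows; i++) stationIndex.put(stations[i], i);
        for (int j = 0; j < cols; j++) variableIndex.put(variables[j], j);
        this.grid = emptyGrid();
    }

    // NaN marks "not reported yet" in the grid.
    private double[][] emptyGrid() {
        double[][] g = new double[rows][cols];
        for (double[] row : g) Arrays.fill(row, Double.NaN);
        return g;
    }

    // Record one quality-controlled reading; later readings overwrite earlier ones.
    void add(String station, String variable, double value) {
        Integer s = stationIndex.get(station);
        Integer v = variableIndex.get(variable);
        if (s == null || v == null || Double.isNaN(value)) return; // skip unknown or empty
        if (Double.isNaN(grid[s][v])) filled++;
        grid[s][v] = value;
    }

    // Fraction of (station, variable) cells that have reported so far.
    double coverage() {
        return (double) filled / (rows * cols);
    }

    // Call once per interval (e.g. every 3 minutes). Returns the merged grid if
    // enough cells reported, otherwise null. The buffer is reset either way.
    double[][] flush() {
        double[][] result = coverage() >= minCoverage ? grid : null;
        grid = emptyGrid();
        filled = 0;
        return result;
    }
}
```

In a Storm topology, something like this could presumably live inside a
single merging bolt, with flush() called on a timer. One option I'm aware of
is tick tuples, configured through topology.tick.tuple.freq.secs (180 for
3 minutes); how that would map onto Trident batches is not something this
sketch tries to answer.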


