Posted to user@storm.apache.org by Cheney Chen <tb...@gmail.com> on 2016/10/05 19:10:37 UTC

How to iterate through mongodb under Storm + Trident

Hi there,

I'm using Storm 1.0.1. My use case is this: given one signal (received
from a Kafka spout), fetch a set of users (from MongoDB), then do stream
processing on them.

I'd prefer Trident, since its lambda-like API is so convenient, but I
don't have a good idea of how to iterate through MongoDB. Looking at the
Storm source code, I found good examples for insert and update, but none
for looping over query results.

So my questions are:
1. Is it a good idea to iterate over MongoDB in Storm stream processing?
2. If so, is there a good approach or example?
3. If not, why not?
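
For readers following the thread, the use case described here is a fan-out:
one signal comes in and one record per matching user goes out. The sketch
below shows only that shape, using plain Java. `UserLookup` and
`SignalFanOut` are hypothetical names, the in-memory map is a stand-in for
MongoDB, and no Storm or MongoDB driver calls are used. In Trident, logic of
this shape would usually sit inside a function's execute method, with the
`Consumer` replaced by the collector; that wiring is an assumption and is
not shown.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical stand-in for a MongoDB query: returns a cursor-like iterator
// over the user IDs matching one signal. Not part of Storm or the Mongo driver.
interface UserLookup {
    Iterator<String> usersFor(String signal);
}

// Models the fan-out step: one signal in, one output record per matching user.
class SignalFanOut {
    private final UserLookup lookup;

    SignalFanOut(UserLookup lookup) {
        this.lookup = lookup;
    }

    // Emits "signal:user" records one at a time, so the full result set
    // is never collected into a list by this method.
    void execute(String signal, Consumer<String> emit) {
        Iterator<String> cursor = lookup.usersFor(signal);
        while (cursor.hasNext()) {
            emit.accept(signal + ":" + cursor.next());
        }
    }

    public static void main(String[] args) {
        // In-memory map used purely as an illustrative fake database.
        Map<String, List<String>> fakeDb = new HashMap<>();
        fakeDb.put("promo", Arrays.asList("alice", "bob"));
        UserLookup lookup =
            s -> fakeDb.getOrDefault(s, new ArrayList<String>()).iterator();
        new SignalFanOut(lookup).execute("promo", System.out::println);
    }
}
```

Emitting from an iterator rather than a materialized list keeps memory flat,
but it does not reduce how long a large query holds up the operator.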


-- 
Regards,
Qili Chen (Cheney)

E-mail: tbcql1986@gmail.com
MP: (+1) 4086217503

Re: How to iterate through mongodb under Storm + Trident

Posted by Cheney Chen <tb...@gmail.com>.
Thank you, Jungtaek.

I feel this was not a good design for stream processing. I'm afraid a
huge query to an external DB in one bolt would stall the stream and cause
more problems.

So I'm going to change the implementation and move the query outside
Storm.
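
As a rough illustration of what moving the query out of the topology could
look like, the sketch below drains a cursor in bounded batches and hands each
batch to a publisher, so that downstream consumers only ever see small
messages. `BatchPublisher`, `publishInBatches`, and the batch size are
hypothetical; the iterator stands in for a MongoDB cursor and the `Consumer`
for whatever transport is used (for example a Kafka producer). None of this
comes from the thread itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper that would run outside the Storm topology.
class BatchPublisher {

    // Drains the cursor in chunks of at most batchSize and passes each chunk
    // to the publisher. Returns the number of batches published.
    static int publishInBatches(Iterator<String> cursor, int batchSize,
                                Consumer<List<String>> publish) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int batches = 0;
        List<String> batch = new ArrayList<>(batchSize);
        while (cursor.hasNext()) {
            batch.add(cursor.next());
            if (batch.size() == batchSize) {
                publish.accept(batch);
                batch = new ArrayList<>(batchSize); // fresh list per batch
                batches++;
            }
        }
        if (!batch.isEmpty()) { // final, possibly short, batch
            publish.accept(batch);
            batches++;
        }
        return batches;
    }

    public static void main(String[] args) {
        // Stand-in data; a real job would iterate a database cursor instead.
        Iterator<String> users =
            Arrays.asList("alice", "bob", "carol", "dave", "erin").iterator();
        int n = publishInBatches(users, 2, b -> System.out.println("publish " + b));
        System.out.println(n + " batches");
    }
}
```

Bounded batches keep each downstream message small, at the cost of running
and monitoring one more process outside Storm.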




Re: How to iterate through mongodb under Storm + Trident

Posted by Jungtaek Lim <ka...@gmail.com>.
Why not experiment? :)

Assuming you already have a Kafka spout, the pull from MongoDB will happen
in a bolt (Trident operator).
It then depends on how long your pull query takes for a given signal, since
pulling data from an external data source tends to be the major contributor
to latency. Your query, the table schema, the indexes, and other factors all
matter. So only experimenting with your query, your table schema, and your
infrastructure will give you a valid answer.
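
One simple way to run such an experiment is to time how long it takes to
drain the results for a single signal. The sketch below does that in plain
Java. `TimedLookup`, `Measurement`, and the simulated query are hypothetical;
the `Function` stands in for the real MongoDB query, and the 50 ms sleep is
an arbitrary placeholder for a round trip, not a measured figure.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.function.Function;

// Hypothetical helper for measuring one lookup; not a Storm or Mongo API.
class TimedLookup {

    // Result of one measured lookup: rows returned and elapsed wall time.
    static final class Measurement {
        final int rows;
        final long elapsedNanos;

        Measurement(int rows, long elapsedNanos) {
            this.rows = rows;
            this.elapsedNanos = elapsedNanos;
        }
    }

    // Runs the query for one signal, drains the cursor fully, and reports
    // how many rows came back and how long the whole pull took.
    static Measurement measure(Function<String, Iterator<String>> query,
                               String signal) {
        long start = System.nanoTime();
        Iterator<String> cursor = query.apply(signal);
        int rows = 0;
        while (cursor.hasNext()) {
            cursor.next();
            rows++;
        }
        return new Measurement(rows, System.nanoTime() - start);
    }

    public static void main(String[] args) {
        // Simulated slow query; replace with the real lookup when experimenting.
        Function<String, Iterator<String>> slowQuery = signal -> {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return Arrays.asList("alice", "bob", "carol").iterator();
        };
        Measurement m = measure(slowQuery, "promo");
        System.out.println(m.rows + " rows in "
            + (m.elapsedNanos / 1_000_000) + " ms");
    }
}
```

Running this against representative signals, with and without the relevant
index, would give a first idea of whether the per-signal latency is
acceptable inside the topology.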

- Jungtaek Lim (HeartSaVioR)


Re: How to iterate through mongodb under Storm + Trident

Posted by Cheney Chen <tb...@gmail.com>.
Wondering if anyone has any clues?



