Posted to dev@storm.apache.org by Ankur Garg <an...@gmail.com> on 2015/10/11 18:07:39 UTC

Multiple Spouts in Same topology or Topology per spout

Hi,

I have a situation where I want to read messages from different queues
hosted on a RabbitMQ server.

There are three ways I can think of to leverage Apache Storm here:

1) Use the same Spout (say Spout A) to read messages from the different
queues and, based on the messages received, emit them to different Bolts.

2) Use different Spouts (Spout A, Spout B and so on) within the same
topology (say Topology A) to read messages from the different queues.

3) Use different Spouts, one within each topology (Topology A, Topology B
and so on), to read messages from the different queues.

Which is the best approach, considering I want high throughput (more queue
messages processed concurrently)?

Also, if I use the same topology for all Spouts (currently the requirement
is for 2 Spouts), will a failure in one Spout (or its associated Bolts)
affect the second, or will they both continue working separately even if
there is a failure in Spout B?

Cost-wise, how much would it take to maintain two different topologies?

Looking for inputs from members here.

Thanks
Ankur

Re: Multiple Spouts in Same topology or Topology per spout

Posted by Ankur Garg <an...@gmail.com>.
Hi Ravi,

Thanks for the reply. I got your point about using different bolts for MySQL
and MongoDB.

One thing though: is it a good idea to use different topologies within the
same cluster?

The rationale is this: if I use the same topology but different bolts to do
the processing, I believe a failure in any one of the bolts will cause the
entire message to be replayed. This may not cause any real problem in either
database (multiple inserts won't cause any problem), but the overall
throughput of the topology will be affected.

With different topologies, the idea is to separate execution into different
sets of spouts and bolts. So if the topology responsible for one task fails,
it won't affect the other topologies.

If my rationale is correct, how does maintaining different topologies affect
cost? Also, for simulating and testing this at my end, can I test this on a
local cluster?

Thanks
Ankur


Re: Multiple Spouts in Same topology or Topology per spout

Posted by Ravi Sharma <pi...@gmail.com>.
Hi Ankur,

Storm's design is stateless, so Storm can't store any information about
which bolts succeeded and which failed.
The idea is to replay the message without affecting the final outcome
(meaning that if the MySQL insert succeeded, a replay shouldn't add a
second row).

Looking at it from afar, I would say you may be fixing an issue which hasn't
happened yet. The assumption is that one DB will be failing a lot; I guess
this may not be the real case.
Any of the DBs can fail once in a while, and replaying shouldn't affect your
performance (say fewer than 10% of messages fail); you will be planning
at least 50% more capacity than your max load anyway.
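
As a rough check on the numbers above, here is a minimal sketch. It assumes
each processing attempt fails independently with probability p and that a
failed message is replayed until it succeeds; under that assumption the
expected number of attempts per message is 1 / (1 - p). The class and method
names are made up for illustration and are not part of Storm.

```java
/** Back-of-envelope replay load, assuming attempts fail independently with probability p. */
final class ReplayLoad {
    private ReplayLoad() {}

    /** Expected attempts per message when failed attempts are replayed until one succeeds. */
    static double expectedAttempts(double failureProbability) {
        if (failureProbability < 0.0 || failureProbability >= 1.0) {
            throw new IllegalArgumentException("failure probability must be in [0, 1)");
        }
        // Mean of a geometric distribution with success probability (1 - p).
        return 1.0 / (1.0 - failureProbability);
    }

    /** True if the replay overhead fits inside the spare capacity (0.5 means 50% headroom). */
    static boolean fitsHeadroom(double failureProbability, double headroom) {
        return expectedAttempts(failureProbability) <= 1.0 + headroom;
    }
}
```

With a 10% failure rate this gives about 1.11 attempts per message, roughly
11% extra load, which fits comfortably inside 50% spare capacity.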


If you really want it to be very effective, I'd say use something like Redis
and store each bolt's status there, keyed by message id, so that every time a
bolt is about to start processing it checks whether it has already completed
that message successfully, and if so skips it.
I have defined my own MessageId object and always put a retry count in it.
The first attempt goes out with 0, and in that case you can avoid the
Redis/NoSQL checks.
But then you are adding one more technology, which increases the
complexity.
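
The status check described above could look roughly like the sketch below.
It is only an illustration: an in-memory set stands in for Redis, and the
names TrackedMessageId and BoltStatusStore, as well as the "bolt:messageId"
key layout, are assumptions made for this example rather than anything Storm
provides.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical message id that carries a retry count (0 on first delivery). */
final class TrackedMessageId {
    final String id;
    final int retryCount;

    TrackedMessageId(String id, int retryCount) {
        this.id = id;
        this.retryCount = retryCount;
    }

    /** Id the spout would use when re-emitting the message after a failure. */
    TrackedMessageId nextRetry() {
        return new TrackedMessageId(id, retryCount + 1);
    }
}

/** In-memory stand-in for the Redis status store (assumed key: "bolt:messageId"). */
class BoltStatusStore {
    private final Set<String> completed = new HashSet<>();
    private int lookups = 0;

    /** Returns true if the named bolt still has to do its work for this message. */
    boolean shouldProcess(String boltName, TrackedMessageId msg) {
        if (msg.retryCount == 0) {
            // First delivery: nothing can be completed yet, so skip the lookup.
            return true;
        }
        lookups++;
        return !completed.contains(boltName + ":" + msg.id);
    }

    /** Records that the named bolt finished this message successfully. */
    void markCompleted(String boltName, TrackedMessageId msg) {
        completed.add(boltName + ":" + msg.id);
    }

    /** Number of store lookups performed so far (only replays cost a lookup). */
    int lookupCount() {
        return lookups;
    }
}
```

On a replay, a bolt that already called markCompleted for the message is
skipped, while a bolt that failed earlier still runs. First deliveries never
touch the store, which is the point of carrying the retry count in the id.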


Whatever design you choose, I would still suggest using two bolts. MongoDB
and MySQL are different clusters (hardware) and different technologies
(software), so they will have different throughput and scalability. And as
per your requirement, you don't care whether the data reaches both at exactly
the same time; there is no atomicity (basically it's not one transaction), so
you don't want to slow down one system because the other is slower.


My last suggestion is to go with two spouts. Both will read from the same
topic (not queue), so all messages will be delivered to both spouts. One
spout will send messages to the MySQL bolt, the other to the MongoDB bolt.
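
To make the topic versus queue distinction concrete, here is a toy sketch in
plain Java. It does not use the RabbitMQ client; FanoutTopic is a made-up
class that simply copies every published message into each subscriber's own
queue, which is roughly what a RabbitMQ fanout exchange with one bound queue
per consumer does.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Toy fan-out: every published message is copied to each subscriber's own queue. */
class FanoutTopic {
    private final List<Queue<String>> subscriberQueues = new ArrayList<>();

    /** Registers a subscriber and returns the private queue it will read from. */
    Queue<String> subscribe() {
        Queue<String> queue = new ArrayDeque<>();
        subscriberQueues.add(queue);
        return queue;
    }

    /** Delivers one copy of the message to every subscriber registered so far. */
    void publish(String message) {
        for (Queue<String> queue : subscriberQueues) {
            queue.add(message);
        }
    }
}
```

Because each spout drains its own copy, a slow or failing MongoDB path does
not hold back the MySQL path, and a replay on one side does not repeat the
insert on the other.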


Ravi.





Re: Multiple Spouts in Same topology or Topology per spout

Posted by Ankur Garg <an...@gmail.com>.
LOL .. I was looking for something better :) .. As you can see, having
multiple bolts here does not help much. It would have helped had there been
a provision to skip the already executed bolts.


I believe this should be there in Storm.

Thanks
Ankur


Re: Multiple Spouts in Same topology or Topology per spout

Posted by Susheel Kumar Gadalay <sk...@gmail.com>.
Check and insert
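
A minimal sketch of what "check and insert" could mean here, assuming every
message carries a unique id that can serve as the key. An in-memory map
stands in for the MySQL table, and the IdempotentStore class is hypothetical;
a real bolt would talk to the database instead.

```java
import java.util.HashMap;
import java.util.Map;

/** In-memory stand-in for a table keyed by message id (hypothetical). */
class IdempotentStore {
    private final Map<String, String> rows = new HashMap<>();

    /**
     * Inserts the payload only if no row exists yet for this message id.
     * Returns true if a row was written, false if the insert was skipped.
     */
    boolean checkAndInsert(String messageId, String payload) {
        // putIfAbsent returns null only when the key was not present before.
        return rows.putIfAbsent(messageId, payload) == null;
    }

    /** Number of rows currently stored. */
    int rowCount() {
        return rows.size();
    }
}
```

With this in place a replayed tuple leaves the table unchanged. Against a
real database, a unique key on the message id plus something like MySQL's
INSERT ... ON DUPLICATE KEY UPDATE, or an upsert in MongoDB, would likely be
safer than a separate SELECT followed by an INSERT, since the two-step
version can race when two executors handle the same message.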

On 10/12/15, Ankur Garg <an...@gmail.com> wrote:
> But what if the MongoDB bolt has some error? In that case I suppose the
> entire tuple will be replayed from the Spout, meaning it will have to redo
> the insert into MySQL. Is there a way I can skip inserting into MySQL?
>
> On Mon, Oct 12, 2015 at 1:54 PM, Susheel Kumar Gadalay
> <sk...@gmail.com> wrote:
>
>> It is better to have 2 bolts: a MySQL bolt and a MongoDB bolt.
>>
>> Let the MySQL bolt forward the tuple to the MongoDB bolt, so in case of an
>> error it won't emit.
>>
>> On 10/12/15, Ankur Garg <an...@gmail.com> wrote:
>> > I have a situation where the tuple received from the Spout has to be
>> > saved to a MySQL database and to MongoDB as well.
>> >
>> > Which is better: using 1 bolt to save it into both MySQL and MongoDB, or
>> > 2 separate bolts (one for saving into MySQL and the other for saving
>> > into MongoDB)?
>> >
>> > What happens when an exception occurs while saving into MySQL? I believe
>> > I will be notified through the fail method in my Spout. So if I
>> > reprocess it using 2 bolts, I believe it will again be sent to the bolt
>> > that saves into MongoDB.
>> >
>> > If the above is true, will having 2 separate bolts be of any advantage?
>> > How can I configure things so that a failure while inserting into MySQL
>> > does not impact inserting into MongoDB?
>> >
>> > Thanks
>> > Ankur
>> >
>> > On Sun, Oct 11, 2015 at 10:57 PM, Ravi Sharma <pi...@gmail.com> wrote:
>> >
>> >> That depends on whether the spout error has affected the JVM or is a
>> >> normal application error.
>> >>
>> >> As for performance issues in case of a lot of errors, I don't think
>> >> there is any issue because of the errors themselves, but of course if
>> >> you are retrying these messages on failure, then you will be processing
>> >> more messages than normal and overall throughput will go down.
>> >>
>> >> Ravi
>> >>
>> >> If your topology has acknowledgment enabled, that means the spout will
>> >> always receive
>> >> On 11 Oct 2015 18:15, "Ankur Garg" <an...@gmail.com> wrote:
>> >>
>> >>> Thanks for the replies, Abhishek and Ravi.
>> >>>
>> >>> One question though: going with one topology with multiple spouts,
>> >>> what if something goes wrong in one spout or its associated bolts?
>> >>> Does it impact the other spout as well?
>> >>>
>> >>> Thanks
>> >>> Ankur
>> >>>
>> >>> On Sun, Oct 11, 2015 at 10:21 PM, Ravi Sharma <pi...@gmail.com> wrote:
>> >>>
>> >>>> There are no 100% right answers; you will have to test and see what
>> >>>> fits.
>> >>>>
>> >>>> Personally I would suggest multiple spouts in one topology, and if you
>> >>>> have N nodes where the topology will be running, then each Spout
>> >>>> (reading from one queue) should run N times in parallel.
>> >>>>
>> >>>> With 2 queues and, say, 4 nodes, that means one topology with:
>> >>>> 4 spouts reading from Queue1 on different nodes
>> >>>> 4 spouts reading from Queue2 on different nodes
>> >>>>
>> >>>> Ravi.
>> >>>>
>> >>>> On Sun, Oct 11, 2015 at 5:25 PM, Abhishek priya
>> >>>> <abhishek.priya@gmail.com> wrote:
>> >>>>
>> >>>>> I guess this is a question where there are no really correct answers.
>> >>>>> I'd certainly avoid #1, as it is better to keep logic separate and
>> >>>>> lightweight.
>> >>>>>
>> >>>>> If your downstream bolts are the same, then it makes sense to keep
>> >>>>> them in the same topology, but if they are totally different, I'd
>> >>>>> keep them in two different topologies. That will allow me to deploy
>> >>>>> and scale each topology independently. But if the rest of the logic
>> >>>>> is the same, topology scaling and resource utilization will be better
>> >>>>> with one topology.
>> >>>>>
>> >>>>> I hope this helps..
>> >>>>>
>> >>>>> Sent somehow....

Re: Multiple Spouts in Same topology or Topology per spout

Posted by Ankur Garg <an...@gmail.com>.
But what if the MongoDB bolt has some error? In that case I suppose the
entire tuple will be replayed from the Spout, meaning it will have to redo
the insert into MySQL. Is there a way I can skip inserting into MySQL?

On Mon, Oct 12, 2015 at 1:54 PM, Susheel Kumar Gadalay <sk...@gmail.com>
wrote:

> It is better to have 2 bolts - mysql bolt and mongodb bolt.
>
> Let mysql bolt forward the tuple to mongodb bolt, so in case of error
> it won't  emit.

Re: Multiple Spouts in Same topology or Topology per spout

Posted by Susheel Kumar Gadalay <sk...@gmail.com>.
It is better to have two bolts: a MySQL bolt and a MongoDB bolt.

Let the MySQL bolt forward the tuple to the MongoDB bolt, so that in case
of an error it won't emit.
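
The chaining idea can be sketched in plain Java, without any Storm classes:
each writer reports success or failure, and the chain stops at the first
failure, so the MongoDB writer never sees a tuple that MySQL rejected. In a
real bolt the "return false" branch would correspond to failing the input
tuple instead of emitting it. The names ChainedWriters, mysqlWriter and
mongoWriter are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

class ChainedWriters {
    // Runs the writers in order and stops at the first one that reports failure.
    // Returns true when every writer succeeded (ack), false otherwise (fail).
    static boolean process(String tuple, List<Predicate<String>> writers) {
        for (Predicate<String> writer : writers) {
            if (!writer.test(tuple)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> mongoRows = new ArrayList<>();
        // Hypothetical writers: the MySQL one rejects the tuple "bad".
        Predicate<String> mysqlWriter = tuple -> !tuple.equals("bad");
        Predicate<String> mongoWriter = tuple -> mongoRows.add(tuple);

        List<Predicate<String>> chain = Arrays.asList(mysqlWriter, mongoWriter);
        System.out.println(process("good", chain)); // true
        System.out.println(process("bad", chain));  // false
        System.out.println(mongoRows);              // [good]
    }
}
```

Note that the ordering only protects MongoDB from MySQL failures; a MongoDB
failure after a successful MySQL insert still triggers a replay of the whole
tuple.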

On 10/12/15, Ankur Garg <an...@gmail.com> wrote:
> So I have a situation where the tuple received by the spout has to be saved
> to both a MySQL database and MongoDB.
>
> Which is better: using one bolt to save into both MySQL and MongoDB, or two
> separate bolts (one saving into MySQL and the other into MongoDB)?
>
> What happens when an exception occurs while saving into MySQL? I believe
> I will be notified through the fail method in my spout. So if I
> replay the tuple with two bolts, I believe it will again be sent to the bolt
> that saves into MongoDB.
>
> If the above is true, is there any advantage to having two separate bolts?
> How can I configure things so that a failure while inserting into MySQL does
> not affect inserting into MongoDB?
>
> Thanks
> Ankur

Re: Multiple Spouts in Same topology or Topology per spout

Posted by Ankur Garg <an...@gmail.com>.
So I have a situation where the tuple received by the spout has to be saved
to both a MySQL database and MongoDB.

Which is better: using one bolt to save into both MySQL and MongoDB, or two
separate bolts (one saving into MySQL and the other into MongoDB)?

What happens when an exception occurs while saving into MySQL? I believe
I will be notified through the fail method in my spout. So if I
replay the tuple with two bolts, I believe it will again be sent to the bolt
that saves into MongoDB.

If the above is true, is there any advantage to having two separate bolts?
How can I configure things so that a failure while inserting into MySQL does
not affect inserting into MongoDB?

Thanks
Ankur

On Sun, Oct 11, 2015 at 10:57 PM, Ravi Sharma <pi...@gmail.com> wrote:

> That depends on whether your spout error has affected the JVM or is a normal
> application error.
>
> As for performance when there are a lot of errors, I don't think the
> errors themselves cause any issue, but of course if you are retrying these
> messages on failure, you will be processing more messages than normal
> and overall throughput will go down.
>
> Ravi
>
> If your topology has acknowledgment enabled, that means the spout will always
> receive

Re: Multiple Spouts in Same topology or Topology per spout

Posted by Rudraneel chakraborty <ru...@gmail.com>.
Can you give me an example where multiple dependent topologies have been
used, say where different topologies together infer a big complex event?

On Sunday, 11 October 2015, Ravi Sharma <pi...@gmail.com> wrote:

> That depends on whether your spout error has affected the JVM or is a normal
> application error.
>
> As for performance when there are a lot of errors, I don't think the
> errors themselves cause any issue, but of course if you are retrying these
> messages on failure, you will be processing more messages than normal
> and overall throughput will go down.
>
> Ravi
>
> If your topology has acknowledgment enabled, that means the spout will always
> receive

-- 
Rudraneel Chakraborty
Carleton University Real Time and Distributed Systems Research

Re: Multiple Spouts in Same topology or Topology per spout

Posted by Ravi Sharma <pi...@gmail.com>.
That depends on whether your spout error has affected the JVM or is a normal
application error.

As for performance when there are a lot of errors, I don't think the
errors themselves cause any issue, but of course if you are retrying these
messages on failure, you will be processing more messages than normal
and overall throughput will go down.
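
To put a rough number on the retry cost, here is a back-of-the-envelope
sketch. It assumes every attempt fails independently with the same
probability p and that failed messages are retried until they succeed; real
failures are rarely that uniform. Under those assumptions the expected number
of attempts per message is 1 / (1 - p). The class and method names are
hypothetical.

```java
class RetryCost {
    // Expected attempts per message when each attempt fails independently
    // with the given probability and is retried until it succeeds.
    static double expectedAttempts(double failureProbability) {
        if (failureProbability < 0.0 || failureProbability >= 1.0) {
            throw new IllegalArgumentException("failure probability must be in [0, 1)");
        }
        return 1.0 / (1.0 - failureProbability);
    }

    // Distinct messages completed per second, given how many attempts
    // (first tries plus retries) the topology can handle per second.
    static double usefulThroughput(double attemptsPerSecond, double failureProbability) {
        return attemptsPerSecond / expectedAttempts(failureProbability);
    }

    public static void main(String[] args) {
        System.out.println(expectedAttempts(0.2));
        System.out.println(usefulThroughput(1000.0, 0.2));
    }
}
```

For example, with a 20% failure rate each message needs 1.25 attempts on
average, so a topology that can handle 1000 attempts per second completes
roughly 800 distinct messages per second.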

Ravi

If your topology has acknowledgment enabled, that means the spout will always
receive
On 11 Oct 2015 18:15, "Ankur Garg" <an...@gmail.com> wrote:

>
> Thanks for the reply, Abhishek and Ravi.
>
> One question though: going with one topology with multiple spouts, what
> if something goes wrong in one spout or its associated bolts? Does it
> affect the other spout as well?
>
> Thanks
> Ankur

Re: Multiple Spouts in Same topology or Topology per spout

Posted by Ankur Garg <an...@gmail.com>.
Thanks for the reply, Abhishek and Ravi.

One question though: going with one topology with multiple spouts, what
if something goes wrong in one spout or its associated bolts? Does it
affect the other spout as well?

Thanks
Ankur

On Sun, Oct 11, 2015 at 10:21 PM, Ravi Sharma <pi...@gmail.com> wrote:

> There are no 100% right answers; you will have to test and see what fits.
>
> Personally I would suggest multiple spouts in one topology, and if you have N
> nodes where the topology will be running, then each spout (reading from one
> queue) should run N times in parallel.
>
> For example, with 2 queues and, say, 4 nodes, use one topology with:
> 4 spout instances reading from Queue1 on different nodes
> 4 spout instances reading from Queue2 on different nodes
>
> Ravi.
>
> On Sun, Oct 11, 2015 at 5:25 PM, Abhishek priya <ab...@gmail.com>
> wrote:
>
>> I guess this is a question with no really correct answer. I would
>> certainly avoid #1, as it is better to keep logic separate and lightweight.
>>
>> If your downstream bolts are the same, then it makes sense to keep them in
>> the same topology, but if they are totally different, I would keep them in
>> two different topologies. That lets you deploy and scale each topology
>> independently. But if the rest of the logic is the same, topology scaling and
>> resource utilization will be better with one topology.
>>
>> I hope this helps..
>>
>> Sent somehow....
>>
>> > On Oct 11, 2015, at 9:07 AM, Ankur Garg <an...@gmail.com> wrote:
>> >
>> > Hi ,
>> >
>> > So I have a situation where I want to read messages from different
>> > queues hosted on a RabbitMQ server.
>> >
>> > Now, there are three ways I can think of to leverage Apache Storm
>> > here:
>> >
>> > 1) Use the same spout (say Spout A) to read messages from different
>> > queues and, based on the messages received, emit them to different bolts.
>> >
>> > 2) Use different spouts (Spout A, Spout B and so on) within the same
>> > topology (say Topology A) to read messages from different queues.
>> >
>> > 3) Use different spouts, one within each topology (Topology A, Topology
>> > B and so on), to read messages from different queues.
>> >
>> > Which is the best way to process this, considering I want high
>> > throughput (more queue messages processed concurrently)?
>> >
>> > Also, if I use the same topology for all spouts (currently the
>> > requirement is for 2 spouts), will a failure in one spout (or its
>> > associated bolts) affect the second, or will they both continue working
>> > separately even if there is some failure in Spout B?
>> >
>> > Cost-wise, how much would it be to maintain two different topologies?
>> >
>> > Looking for inputs from members here.
>> >
>> > Thanks
>> > Ankur
>> >
>> >
>>
>
>

Re: Multiple Spouts in Same topology or Topology per spout

Posted by Ankur Garg <an...@gmail.com>.
Thanks for the reply, Abhishek and Ravi.

One question though: going with one topology with multiple spouts, what
if something goes wrong in one spout or its associated bolts? Does it
affect the other spout as well?

Thanks
Ankur

On Sun, Oct 11, 2015 at 10:21 PM, Ravi Sharma <pi...@gmail.com> wrote:

> There are no 100% right answers; you will have to test and see what fits.
>
> Personally I would suggest multiple spouts in one topology, and if you have N
> nodes where the topology will be running, then each spout (reading from one
> queue) should run N times in parallel.
>
> For example, with 2 queues and, say, 4 nodes, use one topology with:
> 4 spout instances reading from Queue1 on different nodes
> 4 spout instances reading from Queue2 on different nodes
>
> Ravi.
>
> On Sun, Oct 11, 2015 at 5:25 PM, Abhishek priya <ab...@gmail.com>
> wrote:
>
>> I guess this is a question with no really correct answer. I would
>> certainly avoid #1, as it is better to keep logic separate and lightweight.
>>
>> If your downstream bolts are the same, then it makes sense to keep them in
>> the same topology, but if they are totally different, I would keep them in
>> two different topologies. That will allow me to deploy and scale each
>> topology independently. But if the rest of the logic is the same, I think
>> scaling and resource utilization will be better with one topology.
>>
>> I hope this helps..
>>
>> Sent somehow....
>>
>> > On Oct 11, 2015, at 9:07 AM, Ankur Garg <an...@gmail.com> wrote:
>> >
>> > Hi ,
>> >
>> > So I have a situation where I want to read messages from different
>> queues hosted in a Rabbitmq Server .
>> >
>> > Now , there are three ways which I can think to leverage Apache Storm
>> here :-
>> >
>> > 1) Use the same Spout (say Spout A) to read messages from different
>> queues and based on the messages received emit it to different Bolts.
>> >
>> > 2) Use different Spout (Spout A and Spout B and so on) within the same
>> topology (say Topology A) to read messages from different queues .
>> >
>> > 3) Use different Spouts, one within each Topology (Topology A, Topology
>> B and so on), to read messages from different queues.
>> >
>> > Which is the best way to process this, considering I want high
>> throughput (more queue messages processed concurrently)?
>> >
>> > Also, if I use the same Topology for all Spouts (currently the
>> requirement is for 2 spouts), will a failure in one Spout (or its associated
>> Bolts) affect the second, or will they both continue working separately even
>> if there is some failure in Spout B?
>> >
>> > Cost wise , how much would it be to maintain two different topologies .
>> >
>> > Looking for inputs from members here.
>> >
>> > Thanks
>> > Ankur
>> >
>> >
>>
>
>

Re: Multiple Spouts in Same topology or Topology per spout

Posted by Ravi Sharma <pi...@gmail.com>.
There is no 100% right answer; you will have to test and see what fits.

Personally I would suggest multiple spouts in one topology, and if you have N
nodes where the topology will be running, then each spout (reading from one
queue) should run N times in parallel.

With 2 queues and, say, 4 nodes, that means one topology with:
4 spouts reading from Queue1 on different nodes
4 spouts reading from Queue2 on different nodes
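
A rough sketch of that layout in plain Java, in case it helps. SpoutLayout
and its methods are names made up for illustration, not Storm API; in a real
topology each number would be the parallelism hint passed to
TopologyBuilder.setSpout. Note that the hint only sets how many spout
executors run. As far as I know, Storm's scheduler decides which nodes they
land on, so "one per node" is an assumption to verify on your cluster.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not part of Storm): plans one spout component per
// queue, each with a parallelism hint equal to the number of nodes.
class SpoutLayout {

    // Returns spout component id -> parallelism hint, in queue order.
    static Map<String, Integer> plan(List<String> queues, int nodeCount) {
        if (nodeCount < 1) {
            throw new IllegalArgumentException("nodeCount must be >= 1");
        }
        Map<String, Integer> layout = new LinkedHashMap<>();
        for (String queue : queues) {
            // "spout-<queue>" is an assumed naming scheme for component ids.
            layout.put("spout-" + queue, nodeCount);
        }
        return layout;
    }

    // Total number of spout executors across all spout components.
    static int totalExecutors(Map<String, Integer> layout) {
        int total = 0;
        for (int parallelism : layout.values()) {
            total += parallelism;
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Integer> layout = plan(Arrays.asList("Queue1", "Queue2"), 4);
        for (Map.Entry<String, Integer> e : layout.entrySet()) {
            System.out.println(e.getKey() + " -> parallelism " + e.getValue());
        }
        System.out.println("total spout executors: " + totalExecutors(layout));
    }
}
```

For 2 queues and 4 nodes this gives 4 executors per queue, 8 in total.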

Ravi.

On Sun, Oct 11, 2015 at 5:25 PM, Abhishek priya <ab...@gmail.com>
wrote:

> I guess this is a question with no really correct answer. I would
> certainly avoid #1, as it is better to keep logic separate and lightweight.
>
> If your downstream bolts are the same, then it makes sense to keep them in
> the same topology, but if they are totally different, I would keep them in
> two different topologies. That will allow me to deploy and scale each
> topology independently. But if the rest of the logic is the same, I think
> scaling and resource utilization will be better with one topology.
>
> I hope this helps..
>
> Sent somehow....
>
> > On Oct 11, 2015, at 9:07 AM, Ankur Garg <an...@gmail.com> wrote:
> >
> > Hi ,
> >
> > So I have a situation where I want to read messages from different
> queues hosted in a Rabbitmq Server .
> >
> > Now , there are three ways which I can think to leverage Apache Storm
> here :-
> >
> > 1) Use the same Spout (say Spout A) to read messages from different
> queues and based on the messages received emit it to different Bolts.
> >
> > 2) Use different Spout (Spout A and Spout B and so on) within the same
> topology (say Topology A) to read messages from different queues .
> >
> > 3) Use different Spouts, one within each Topology (Topology A, Topology B
> and so on), to read messages from different queues.
> >
> > Which is the best way to process this, considering I want high throughput
> (more queue messages processed concurrently)?
> >
> > Also, if I use the same Topology for all Spouts (currently the
> requirement is for 2 spouts), will a failure in one Spout (or its associated
> Bolts) affect the second, or will they both continue working separately even
> if there is some failure in Spout B?
> >
> > Cost wise , how much would it be to maintain two different topologies .
> >
> > Looking for inputs from members here.
> >
> > Thanks
> > Ankur
> >
> >
>

Re: Multiple Spouts in Same topology or Topology per spout

Posted by Abhishek priya <ab...@gmail.com>.
I guess this is a question with no really correct answer. I would certainly avoid #1, as it is better to keep logic separate and lightweight.

If your downstream bolts are the same, then it makes sense to keep them in the same topology, but if they are totally different, I would keep them in two different topologies. That will allow me to deploy and scale each topology independently. But if the rest of the logic is the same, I think scaling and resource utilization will be better with one topology.

I hope this helps..

Sent somehow....

> On Oct 11, 2015, at 9:07 AM, Ankur Garg <an...@gmail.com> wrote:
> 
> Hi ,
> 
> So I have a situation where I want to read messages from different queues hosted in a Rabbitmq Server . 
> 
> Now , there are three ways which I can think to leverage Apache Storm here :-
> 
> 1) Use the same Spout (say Spout A) to read messages from different queues and based on the messages received emit it to different Bolts.
> 
> 2) Use different Spout (Spout A and Spout B and so on) within the same topology (say Topology A) to read messages from different queues .
> 
> 3) Use different Spouts, one within each Topology (Topology A, Topology B and so on), to read messages from different queues.
>
> Which is the best way to process this, considering I want high throughput (more queue messages processed concurrently)?
>
> Also, if I use the same Topology for all Spouts (currently the requirement is for 2 spouts), will a failure in one Spout (or its associated Bolts) affect the second, or will they both continue working separately even if there is some failure in Spout B?
> 
> Cost wise , how much would it be to maintain two different topologies .
> 
> Looking for inputs from members here.
> 
> Thanks
> Ankur
> 
>