Posted to user@storm.apache.org by "Rupani, Nishant" <Ni...@morganstanley.com> on 2015/02/12 10:50:20 UTC

Topology question

Hi,

I am trying out Storm for the first time, so please bear with me if my questions are naive. We are trying to build a topology with a spout and three bolts:

-       Spout to receive the alerts

-       Bolt #1 to fetch the subscribers list

-       Bolt #2 to fetch the users who are authorized to receive the alerts

-       Bolt #3 to match bolt #1 and #2 and prepare the final list

Here bolt #3 depends on the tuples from both #1 and #2. I understand how a bolt can receive tuples from one source, but I couldn't figure out how to build my topology so that bolt #3 receives from both. Can someone please tell me whether this is doable, or do I need to process all the bolts serially?

Thank you in advance.

Regards,
Nishant



________________________________

NOTICE: Morgan Stanley is not acting as a municipal advisor and the opinions or views contained herein are not intended to be, and do not constitute, advice within the meaning of Section 975 of the Dodd-Frank Wall Street Reform and Consumer Protection Act. If you have received this communication in error, please destroy all electronic and paper copies; do not disclose, use or act upon the information; and notify the sender immediately. Mistransmission is not intended to waive confidentiality or privilege. Morgan Stanley reserves the right, to the extent permitted under applicable law, to monitor electronic communications. This message is subject to terms available at the following link: http://www.morganstanley.com/disclaimers If you cannot access these links, please notify us by reply message and we will send the contents to you. By messaging with Morgan Stanley you consent to the foregoing.

RE: Topology question

Posted by "Rupani, Nishant" <Ni...@morganstanley.com>.
Understood. Thank you!

Regards,
Nishant


From: Nathan Leung [mailto:ncleung@gmail.com]
Sent: Friday, February 13, 2015 11:37 AM
To: user
Subject: Re: Topology question

Bolt 3 would get tuples for Bolt 1 and Bolt 2 separately; each tuple would result in a separate call to execute().  You would have to keep tuples in the Bolt's memory until the matching tuple from the other bolt comes in.  If you are doing a join like this, you can use fieldsGrouping to make sure that all tuples for the same visitor go to the same bolt task.

Alternatively, as you mentioned in your original email, you can process the tuples serially.  In this case it does not matter which grouping bolt 3 uses.  This will be the easier (and, in my opinion, better) option for your logic.

Cache is also something you manage yourself in the bolt.  There are several levels of caching you can do.  I've used both Google Guava and Couchbase for caching purposes.



Re: Topology question

Posted by Nathan Leung <nc...@gmail.com>.
Bolt 3 would get tuples for Bolt 1 and Bolt 2 separately; each tuple would
result in a separate call to execute().  You would have to keep tuples in
the Bolt's memory until the matching tuple from the other bolt comes in.
If you are doing a join like this, you can use fieldsGrouping to make sure
that all tuples for the same visitor go to the same bolt task.
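
A minimal sketch of the buffering described above, written in plain Java without any Storm classes so it can be read on its own. Everything here is an assumption for illustration: the class name AlertJoinBuffer, the alert id used as the join key, and the idea that each tuple carries a set of user names. The taskFor method only illustrates why a fields grouping on the same key keeps both halves together; it is not Storm's actual routing code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of Storm's API. A bolt 3 implementation could
// hold one of these as a field and call it from execute().
class AlertJoinBuffer {
    // Halves of the join still waiting for their counterpart, keyed by an
    // assumed alert id carried on every tuple.
    private final Map<String, Set<String>> waitingSubscribers = new HashMap<>();
    private final Map<String, Set<String>> waitingAuthorized = new HashMap<>();

    // A tuple from bolt 1 arrived. Returns the final list if bolt 2's tuple for
    // the same alert is already buffered; otherwise buffers and returns empty.
    Optional<Set<String>> onSubscribers(String alertId, Set<String> subscribers) {
        Set<String> authorized = waitingAuthorized.remove(alertId);
        if (authorized == null) {
            waitingSubscribers.put(alertId, subscribers);
            return Optional.empty();
        }
        return Optional.of(match(subscribers, authorized));
    }

    // A tuple from bolt 2 arrived. Mirror image of onSubscribers.
    Optional<Set<String>> onAuthorized(String alertId, Set<String> authorized) {
        Set<String> subscribers = waitingSubscribers.remove(alertId);
        if (subscribers == null) {
            waitingAuthorized.put(alertId, authorized);
            return Optional.empty();
        }
        return Optional.of(match(subscribers, authorized));
    }

    // Number of alerts still waiting for their other half.
    int pending() {
        return waitingSubscribers.size() + waitingAuthorized.size();
    }

    // Final list = subscribers who are also authorized (sorted for stable output).
    private static Set<String> match(Set<String> subscribers, Set<String> authorized) {
        Set<String> result = new TreeSet<>(subscribers);
        result.retainAll(authorized);
        return result;
    }

    // Illustration only: hashing the key picks the same task index for both
    // halves, which is the property a fields grouping on that key relies on.
    static int taskFor(String alertId, int numTasks) {
        return Math.floorMod(alertId.hashCode(), numTasks);
    }
}
```

Note that a half whose counterpart never arrives stays in the map forever in this sketch; a real bolt would need some timeout or size limit on the buffered entries.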

Alternatively, as you mentioned in your original email, you can process the
tuples serially.  In this case it does not matter which grouping bolt 3
uses.  This will be the easier (and, in my opinion, better) option for your
logic.
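
For comparison, a rough sketch of the serial option in plain Java: one component does both lookups for an alert and matches them in a single call, so nothing has to be buffered. The class name SerialAlertMatcher and the two lookup functions are hypothetical stand-ins for whatever actually fetches the subscriber and authorized-user lists.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Function;

// Hypothetical single-step version: both lookups and the match happen in one
// call, which is what a single bolt's execute() could do per alert.
class SerialAlertMatcher {
    private final Function<String, Set<String>> subscriberLookup;
    private final Function<String, Set<String>> authorizedLookup;

    SerialAlertMatcher(Function<String, Set<String>> subscriberLookup,
                       Function<String, Set<String>> authorizedLookup) {
        this.subscriberLookup = subscriberLookup;
        this.authorizedLookup = authorizedLookup;
    }

    // Returns the subscribers of this alert type who are also authorized.
    Set<String> recipientsFor(String alertType) {
        Set<String> recipients = new TreeSet<>(subscriberLookup.apply(alertType));
        recipients.retainAll(authorizedLookup.apply(alertType));
        return recipients;
    }
}
```

The trade-off is that the two lookups run one after the other instead of in parallel on different workers.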

Cache is also something you manage yourself in the bolt.  There are several
levels of caching you can do.  I've used both Google Guava and Couchbase
for caching purposes.
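
As a rough illustration of a cache managed inside the bolt, here is a stdlib-only sketch of a small time-based cache. Guava's CacheBuilder offers expiry and size limits out of the box; the hand-rolled ExpiringCache below is a hypothetical stand-in that only shows the idea, and its name, constructor parameters, and injected clock are assumptions made for the example. It is not thread-safe, on the assumption that it is used from a single bolt task.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Hypothetical per-bolt cache: reloads a value once it is older than ttlMillis.
class ExpiringCache<K, V> {
    private static final class Entry<T> {
        final T value;
        final long loadedAtMillis;

        Entry(T value, long loadedAtMillis) {
            this.value = value;
            this.loadedAtMillis = loadedAtMillis;
        }
    }

    private final Map<K, Entry<V>> entries = new HashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock;      // e.g. System::currentTimeMillis
    private final Function<K, V> loader;   // the expensive lookup being cached

    ExpiringCache(long ttlMillis, LongSupplier clock, Function<K, V> loader) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
        this.loader = loader;
    }

    // Returns the cached value, calling the loader when the entry is missing
    // or has reached its time-to-live.
    V get(K key) {
        long now = clock.getAsLong();
        Entry<V> entry = entries.get(key);
        if (entry == null || now - entry.loadedAtMillis >= ttlMillis) {
            entry = new Entry<>(loader.apply(key), now);
            entries.put(key, entry);
        }
        return entry.value;
    }
}
```

Each bolt task would hold its own instance, so the same list may be loaded once per task; a shared external cache avoids that at the cost of a network hop.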


On Fri, Feb 13, 2015 at 12:33 AM, Rupani, Nishant <
Nishant.Rupani@morganstanley.com> wrote:

> Leung,
>
> Thank you for your suggestion. Can you please help me with some more
> details?
>
> If Bolt 3 subscribes to bolt 1 and 2, will bolt 3 effectively wait until
> both bolt 1 and 2 are done processing an alert? We need to match
> subscribers with authorized users, so we need this behavior.
>
> If Bolt #1 (fetch subscriber list) is executed by server1 of the cluster
> and Bolt #2 (get authorized user list) is executed by server2, will Storm
> make sure that the output of bolt 1 and 2 lands on the same instance of
> bolt 3? Because we need to match the output of bolt 1 and 2, we need this
> too.
>
> Fetching the list of subscribers and authorized people is not
> straightforward. It depends on a lot of conditions (sometimes dynamic) and
> varies per alert type. I will see if I can cache it. How is cache
> management done? If I have another process that maintains this cache, do I
> need to run it on all servers of the cluster?
>
> Regards,
> Nishant
>

RE: Topology question

Posted by "Rupani, Nishant" <Ni...@morganstanley.com>.
Leung,

Thank you for your suggestion. Can you please help me with some more details?

If Bolt 3 subscribes to bolt 1 and 2, will bolt 3 effectively wait until both bolt 1 and 2 are done processing an alert? We need to match subscribers with authorized users, so we need this behavior.

If Bolt #1 (fetch subscriber list) is executed by server1 of the cluster and Bolt #2 (get authorized user list) is executed by server2, will Storm make sure that the output of bolt 1 and 2 lands on the same instance of bolt 3? Because we need to match the output of bolt 1 and 2, we need this too.

Fetching the list of subscribers and authorized people is not straightforward. It depends on a lot of conditions (sometimes dynamic) and varies per alert type. I will see if I can cache it. How is cache management done? If I have another process that maintains this cache, do I need to run it on all servers of the cluster?

Regards,
Nishant

From: Nathan Leung [mailto:ncleung@gmail.com]
Sent: Thursday, February 12, 2015 6:28 PM
To: user
Subject: RE: Topology question


You wouldn't need streams.

Bolt 1 subscribes to spout
Bolt 2 subscribes to spout
Bolt 3 subscribes to Bolts 1,2

However, unless your user and subscriber lists are changing frequently, why not just use one bolt and read them during initialization?

RE: Topology question

Posted by Nathan Leung <nc...@gmail.com>.
You wouldn't need streams.

Bolt 1 subscribes to spout
Bolt 2 subscribes to spout
Bolt 3 subscribes to Bolts 1,2

However, unless your user and subscriber lists are changing frequently, why
not just use one bolt and read them during initialization?
On Feb 12, 2015 7:30 AM, "Brunner, Bill" <bi...@baml.com> wrote:

> Have bolts 1 and 2 emit on a defined stream, and have bolt 3 listen to
> that stream.  I don’t remember the exact terminology in Storm because I’m
> using Trident, but that’s the gist.
>

RE: Topology question

Posted by "Brunner, Bill" <bi...@baml.com>.
Have bolts 1 and 2 emit on a defined stream, and have bolt 3 listen to that stream.  I don't remember the exact terminology in Storm because I'm using Trident, but that's the gist.

----------------------------------------------------------------------
This message, and any attachments, is for the intended recipient(s) only, may contain information that is privileged, confidential and/or proprietary and subject to important terms and conditions available at http://www.bankofamerica.com/emaildisclaimer.   If you are not the intended recipient, please delete this message.