Posted to user@storm.apache.org by "Garcia-Contractor, Joseph (CORP)" <Jo...@ADP.com> on 2015/09/28 17:08:12 UTC

Starting and stopping storm

Hi all,

               I am a DevOps guy and I need to implement a Storm cluster with proper start and stop init scripts on a Linux server.  I have already been through the documentation and it seems simple enough.  I am using supervisord as my process manager.  However, I am having a debate with one of the developers using Storm about the proper way to shut down Storm, and I am hoping that you fine folks can help us out in this regard.

               The developer believes that before you tell supervisord to kill (SIGTERM) the Storm workers, supervisor, and nimbus, you must first issue a "storm deactivate topology-name", and only then tell supervisord to kill the various processes.  He believes this because he doesn't know whether Storm will do an orderly shutdown on SIGTERM, and there is a chance that something will get screwed up.  This also means that when you start Storm, after nimbus is up, you need to issue a "storm activate topology-name".

               I am of the belief that because Storm is fail-fast and because it guarantees data processing, none of that is necessary and you can just tell supervisord to stop the processes.

               So who is right here?
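
               For reference, the supervisord stanzas I have in mind look roughly like this (the paths, user, and program names are placeholders I made up, not from any official example):

    ; /etc/supervisord.d/storm.ini -- hypothetical layout
    [program:storm-nimbus]
    command=/opt/storm/bin/storm nimbus
    user=storm
    autostart=true
    autorestart=true
    ; "supervisorctl stop" sends stopsignal, then SIGKILL after stopwaitsecs
    stopsignal=TERM
    stopwaitsecs=30

    [program:storm-supervisor]
    command=/opt/storm/bin/storm supervisor
    user=storm
    autostart=true
    autorestart=true
    stopsignal=TERM
    stopwaitsecs=30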


Re: Starting and stopping storm

Posted by Parth Brahmbhatt <pb...@hortonworks.com>.
If you are not upgrading Storm you can just deactivate and perform host
maintenance.  You could also do this in a rolling fashion so you never have
to deactivate the topologies.  When a node goes down, nimbus will assign all
of its work to some other node; when it comes back up, nimbus will take it
up as a candidate for future scheduling.  If you plan to do this often you
should notify the devs who are going to run topologies on the cluster that
they may lose local state often and their performance may vary every time
the reboot happens.  If availability is no concern, you can just deactivate,
and once the topologies are deactivated you can do your maintenance reboots
and activate the topologies after the reboots are done.
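
In script form, the rolling variant could look something like the
following. This is just an untested sketch; the hostnames and the
supervisord program name are made up:

  #!/bin/bash
  # Roll through the worker nodes one at a time so the cluster stays up;
  # nimbus reassigns the stopped node's work to the remaining nodes.
  for host in storm-s1 storm-s2 storm-s3; do
    ssh "$host" "supervisorctl stop storm-supervisor"
    ssh "$host" "sudo reboot" || true
    sleep 60   # give the host time to actually go down
    # wait until the host answers again before moving on
    until ssh -o ConnectTimeout=5 "$host" true 2>/dev/null; do
      sleep 10
    done
    ssh "$host" "supervisorctl start storm-supervisor"
  done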

If you are setting up a new cluster I highly recommend starting at the 0.10
version, as we have made significant changes in that release, one of which
is support for rolling upgrades.  So in the future, if you are on version
0.10 or above, you will not have to kill topologies when you upgrade Storm
(or downgrade to another 0.10-or-above version).

Thanks
Parth

On 9/29/15, 11:04 AM, "Garcia-Contractor, Joseph (CORP)"
<Jo...@ADP.com> wrote:

>Ahh, so here is the thing... I am not upgrading anything.  I am in the
>process of setting up Storm 0.9.5.  I am at the point where I need to
>know how to properly start and stop Storm from init scripts without
>potentially damaging the topologies.  The use case for this is day-to-day
>operations like a kernel upgrade that requires a reboot.


RE: Starting and stopping storm

Posted by "Garcia-Contractor, Joseph (CORP)" <Jo...@ADP.com>.
Ahh, so here is the thing... I am not upgrading anything.  I am in the process of setting up Storm 0.9.5.  I am at the point where I need to know how to properly start and stop Storm from init scripts without potentially damaging the topologies.  The use case for this is day-to-day operations like a kernel upgrade that requires a reboot.

-----Original Message-----
From: Parth Brahmbhatt [mailto:pbrahmbhatt@hortonworks.com] 
Sent: Tuesday, September 29, 2015 2:00 PM
To: user@storm.apache.org
Subject: Re: Starting and stopping storm

Can you share what version you are on and what version you are trying to upgrade to? 

Thanks
Parth


Re: Starting and stopping storm

Posted by Parth Brahmbhatt <pb...@hortonworks.com>.
Can you share what version you are on and what version you are trying to
upgrade to? 

Thanks
Parth

On 9/29/15, 10:55 AM, "Matthias J. Sax" <mj...@apache.org> wrote:

>As far as I know, running topologies are restarted by Nimbus if the
>cluster goes online again. I have never tested it for a deactivated
>topology. But I would guess that there is no difference.
>
>-Matthias


RE: Starting and stopping storm

Posted by "Garcia-Contractor, Joseph (CORP)" <Jo...@ADP.com>.
I checked this already... an inactive topology, when the storm cluster is shut down, will remain inactive when the cluster is brought back up.  :( That’s why I already added the code below.
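
For what it's worth, the plan is to call that function from the startup flow once nimbus is actually reachable, roughly like this (the retry counts are arbitrary):

  # after supervisord has started nimbus and the supervisors, wait
  # until "storm list" succeeds before re-activating topologies
  for attempt in $(seq 1 30); do
    if storm list >/dev/null 2>&1; then
      activate_topos
      break
    fi
    sleep 10
  done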

-----Original Message-----
From: Matthias J. Sax [mailto:mjsax@apache.org] 
Sent: Tuesday, September 29, 2015 1:55 PM
To: user@storm.apache.org
Subject: Re: Starting and stopping storm

As far as I know, running topologies are restarted by Nimbus if the cluster goes online again. I have never tested it for a deactivated topology. But I would guess that there is no difference.

-Matthias


Re: Starting and stopping storm

Posted by "Matthias J. Sax" <mj...@apache.org>.
As far as I know, running topologies are restarted by Nimbus if the
cluster goes online again. I have never tested it for a deactivated
topology. But I would guess that there is no difference.

-Matthias

On 09/29/2015 07:46 PM, Stephen Powis wrote:
> I have no idea what happens if you bring down all of the nodes in the
> cluster while the topologies are deactivated.  I'd suggest testing it
> and seeing, or maybe someone else can speak up?
> 
> Also depending on the version of storm you're upgrading from, there may
> be different steps involved that may complicate things.
> 
> See release notes around upgrading from 0.8.x to 0.9.0:
> https://storm.apache.org/2013/12/08/storm090-released.html#api-compatibility-and-upgrading
> for just an example.
> 
> Additionally, depending on whether the storm client API changes
> significantly between versions, it may require recompiling existing
> topology code against the new API version before it can run properly
> on the new storm cluster version.  Taking a wild guess... this will
> probably only be a problem when upgrading major versions, and less of
> a concern for minor version upgrades, but again I don't really know
> that for sure.
> 
> 
> On Tue, Sep 29, 2015 at 1:36 PM, Garcia-Contractor, Joseph (CORP)
> <Joseph.Garcia-Contractor@adp.com> wrote:
> 
>     Stephen,
> 
>     Thank you for the response!  Helps out a lot.
> 
>     So a further question.  And forgive my lack of knowledge here, I am
>     not the one using Storm, only deploying and running it, so I don’t
>     understand all the reasoning behind why something is done a certain
>     way in Storm.
> 
>     Let’s say I have deactivated all the topologies.  Is it necessary to
>     then kill the topology?  Could I not just wait a set amount of time
>     to ensure the tuples have cleared, say 5 minutes, and then bring
>     down the nodes?
> 
>     The reason I ask this is because it is a lot easier to activate the
>     topologies after the nodes are back up with a non-interactive
>     script.  I would like to avoid using “storm jar” to load the
>     topology because that means I need to hard code stuff into my
>     scripts or come up with a separate conf file for my script.  See my
>     current code below:
> 
>     function deactivate_topos {
>       # Parse "storm list" output: keep everything from the header
>       # separator on, drop the separator line itself, and emit
>       # "name:status" pairs.
>       STORM_TOPO_STATUS=$(storm list |
>         sed -n -e '/^-------------------------------------------------------------------/,$p' |
>         sed -e '/^-------------------------------------------------------------------/d' |
>         awk '{print $1 ":" $2}')
> 
>       for i in $STORM_TOPO_STATUS
>       do
>         IFS=':' read TOPO_NAME TOPO_STATUS <<< "$i"
>         echo "$TOPO_NAME $TOPO_STATUS"
>         if [ "$TOPO_STATUS" = 'ACTIVE' ]; then
>           storm deactivate "${TOPO_NAME}"
>         fi
>         # show the topology table again so the state change is visible
>         storm list | sed -n -e '/^-------------------------------------------------------------------/,$p'
>       done
>     }
> 
>     function activate_topos {
>       STORM_TOPO_STATUS=$(storm list |
>         sed -n -e '/^-------------------------------------------------------------------/,$p' |
>         sed -e '/^-------------------------------------------------------------------/d' |
>         awk '{print $1 ":" $2}')
> 
>       for i in $STORM_TOPO_STATUS
>       do
>         IFS=':' read TOPO_NAME TOPO_STATUS <<< "$i"
>         echo "$TOPO_NAME $TOPO_STATUS"
>         if [ "$TOPO_STATUS" = 'INACTIVE' ]; then
>           storm activate "${TOPO_NAME}"
>         fi
>         storm list | sed -n -e '/^-------------------------------------------------------------------/,$p'
>       done
>     }
> 
>     From: Stephen Powis [mailto:spowis@salesforce.com]
>     Sent: Tuesday, September 29, 2015 12:45 PM
>     To: user@storm.apache.org
>     Subject: Re: Starting and stopping storm
> 
>     I would imagine the safest way would be to elect to deactivate each
>     running topology, which should make your spouts stop emitting
>     tuples.  You'd wait for all of the currently processing tuples to
>     finish processing, and then kill the topology. 
> 
>     If tuples get processed quickly in your topologies, you can
>     effectively do this by selecting kill and giving it a long enough
>     wait time.  IE -- Telling storm to kill your topology after 30
>     seconds means it will deactivate your spouts for 30 seconds, waiting
>     for existing tuples to finish getting processed, and then kill off
>     the topology.
> 
>     Then bring down each node, upgrade it, bring it back online and
>     resubmit your topologies.
> 
>     On Tue, Sep 29, 2015 at 10:02 AM, Garcia-Contractor, Joseph (CORP)
>     <Joseph.Garcia-Contractor@adp.com> wrote:
> 
>     I don't think I got my question across right or I am confused.
> 
>     Let me break this down in a simpler fashion.
> 
>     I have a Storm Cluster named "The Quiet Storm" ;) here is what it
>     consists of:
> 
>     ******
>     Server ZK1: Running Zookeeper
>     Server ZK2: Running Zookeeper
>     Server ZK3: Running Zookeeper
> 
>     Server N1: SupervisorD running Storm Nimbus
> 
>     Server S1: SupervisorD running Storm Supervisor with 4 workers.
>     Server S2: SupervisorD running Storm Supervisor with 4 workers.
>     Server S3: SupervisorD running Storm Supervisor with 4 workers.
>     ******
> 
>     Now the "The Quiet Storm" can have 1-n number of topologies running
>     on it.
> 
>     I need to shut down all the servers in the cluster for maintenance. 
>     What is the procedure to do this without doing harm to the currently
>     running topologies?
> 
>     Thank you,
> 
>     Joe
> 
>     -----Original Message-----
>     From: Matthias J. Sax [mailto:mjsax@apache.org]
>     Sent: Monday, September 28, 2015 12:15 PM
>     To: user@storm.apache.org
>     Subject: Re: Starting and stopping storm
> 
>     Hi,
> 
>     as always: it depends. ;)
> 
>     Storm itself cleans up its own resources just fine. However, if the
>     running topology needs to clean up/release resources before it is
>     shut down, Storm is not of any help. Even if there is a Spout/Bolt
>     cleanup() method, Storm does not guarantee that it will be called.
> 
>     Thus, using "storm deactivate" is a good way to achieve proper cleanup.
>     However, the topology must provide some code for it, too. On the
>     call to Spout.deactivate(), it must emit a special "clean-up"
>     message (that you have to design by yourself) that must propagate
>     through the whole topology, ie, each bolt must forward this message
>     to all its output streams. Furthermore, bolts must to the clean-up
>     if they receive this message.
> 
>     Long story short: "storm deactivate" before "storm kill" makes only
>     sense if the topology requires proper cleanup and if the topology
>     itself can react/cleanup properly on Spout.deactivate().
> 
>     Using "storm activate" in not necessary in any case.
> 
>     -Matthias
> 
> 
>     On 09/28/2015 05:08 PM, Garcia-Contractor, Joseph (CORP) wrote:
>     > Hi all,
>     >
>     >
>     >
>     >                I am a DevOps guy and I need implement a storm cluster
>     > with the proper start and stop init scripts on a Linux server.  I
>     > already went through the documentation and it seems simple enough.  I
>     > am using supervisor as my process manager.  I am however having a
>     > debate with one of the developers using Storm on the proper way to
>     > shutdown Storm and I am hoping that you fine folks can help us out
>     in this regard.
>     >
>     >
>     >
>     >                The developer believes that before you tell supervisor
>     > to kill (SIGTERM) the storm workers, supervisor, and nimbus, you must
>     > first issue a "storm deactivate topology-name", then tell supervisor
>     > to kill all the various processes.  He believes this because he
>     > doesn't know if Storm will do an orderly shutdown on SIGTERM and that
>     > there is a chance that something will get screwed up.  This also means
>     > that when you start storm, after nimbus is up, you need to issue a
>     > ""storm activate topology-name".
>     >
>     >
>     >
>     >                I am of the belief that because of storms fast fail and
>     > because it guarantees data processing, none of that is necessary and
>     > that you can just tell supervisor to stop the process.
>     >
>     >
>     >
>     >                So who is right here?
>     >
>     > ----------------------------------------------------------------------
>     > -- This message and any attachments are intended only for the use of
>     > the addressee and may contain information that is privileged and
>     > confidential. If the reader of the message is not the intended
>     > recipient or an authorized representative of the intended recipient,
>     > you are hereby notified that any dissemination of this communication
>     > is strictly prohibited. If you have received this communication in
>     > error, notify the sender immediately by return email and delete the
>     > message and any attachments from your system.
> 
>     ----------------------------------------------------------------------
>     This message and any attachments are intended only for the use of
>     the addressee and may contain information that is privileged and
>     confidential. If the reader of the message is not the intended
>     recipient or an authorized representative of the intended recipient,
>     you are hereby notified that any dissemination of this communication
>     is strictly prohibited. If you have received this communication in
>     error, notify the sender immediately by return email and delete the
>     message and any attachments from your system.____
> 
>     __ __
> 
> 


Re: Starting and stopping storm

Posted by Stephen Powis <sp...@salesforce.com>.
I have no idea what happens if you bring down all of the nodes in the
cluster while the topologies are deactivated.  I'd suggest testing it and
seeing, or maybe someone else can speak up?

Also, depending on the version of Storm you're upgrading from, there may be
additional steps involved that complicate things.

See the release notes on upgrading from 0.8.x to 0.9.0 for one example:
https://storm.apache.org/2013/12/08/storm090-released.html#api-compatibility-and-upgrading

Additionally, if the Storm client API changes significantly between
versions, you may need to recompile existing topology code against the new
API version before it can run properly on the new cluster version. Taking
a wild guess, this will probably only be a problem when upgrading across
major versions and less of a concern for minor version upgrades, but again
I don't know that for sure.



RE: Starting and stopping storm

Posted by "Garcia-Contractor, Joseph (CORP)" <Jo...@ADP.com>.
Stephen,

Thank you for the response!  Helps out a lot.

So a further question, and forgive my lack of knowledge here: I am not the one using Storm, only deploying and running it, so I don't understand all the reasoning behind why things are done a certain way in Storm.

Let's say I have deactivated all the topologies. Is it necessary to then kill each topology? Could I not just wait a set amount of time to ensure the tuples have cleared, say 5 minutes, and then bring down the nodes?

The reason I ask is that it is a lot easier to activate the topologies after the nodes are back up with a non-interactive script. I would like to avoid using "storm jar" to load the topology, because that would mean hard-coding things into my scripts or coming up with a separate conf file for my script. See my current code below:

function deactivate_topos {
  # List topologies as "name:status" pairs: keep everything after the
  # dashed header line of "storm list", drop the dashes themselves, and
  # print the first two columns.
  STORM_TOPO_STATUS=$(storm list | sed -n -e '/^-------------------------------------------------------------------/,$p' | sed -e '/^-------------------------------------------------------------------/d' | awk '{print $1 ":" $2}')

  for i in $STORM_TOPO_STATUS
  do
    IFS=':' read TOPO_NAME TOPO_STATUS <<< "$i"
    echo "$TOPO_NAME $TOPO_STATUS"
    if [ "$TOPO_STATUS" = 'ACTIVE' ]; then
      storm deactivate "${TOPO_NAME}"
    fi
    # Show the current topology table after each change.
    storm list | sed -n -e '/^-------------------------------------------------------------------/,$p'
  done
}

function activate_topos {
  # Mirror image of deactivate_topos: reactivate anything INACTIVE.
  STORM_TOPO_STATUS=$(storm list | sed -n -e '/^-------------------------------------------------------------------/,$p' | sed -e '/^-------------------------------------------------------------------/d' | awk '{print $1 ":" $2}')

  for i in $STORM_TOPO_STATUS
  do
    IFS=':' read TOPO_NAME TOPO_STATUS <<< "$i"
    echo "$TOPO_NAME $TOPO_STATUS"
    if [ "$TOPO_STATUS" = 'INACTIVE' ]; then
      storm activate "${TOPO_NAME}"
    fi
    storm list | sed -n -e '/^-------------------------------------------------------------------/,$p'
  done
}
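
As a usage sketch, these functions could be wired into the maintenance sequence like this. This is only an illustration: the supervisord program names (storm-nimbus, storm-supervisor) and the five-minute drain time are hypothetical placeholders for whatever your configuration actually uses.

# Stop sequence (assumes the functions above are sourced; program names
# are placeholders for your own supervisord configuration):
deactivate_topos
sleep 300                              # let in-flight tuples drain (5 min)
supervisorctl stop storm-supervisor    # stop the workers on this node
supervisorctl stop storm-nimbus        # stop nimbus last

# ...perform maintenance and reboots, then on the way back up:
supervisorctl start storm-nimbus
supervisorctl start storm-supervisor
activate_topos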



Re: Starting and stopping storm

Posted by Stephen Powis <sp...@salesforce.com>.
I would imagine the safest way would be to deactivate each running
topology, which should make your spouts stop emitting tuples. You'd then
wait for all of the currently processing tuples to finish, and then kill
the topology.

If tuples get processed quickly in your topologies, you can effectively do
this by issuing the kill and giving it a long enough wait time. I.e.,
telling Storm to kill your topology after 30 seconds means it will
deactivate your spouts for 30 seconds, wait for existing tuples to finish
processing, and then kill off the topology.

Then bring down each node, upgrade it, bring it back online and resubmit
your topologies.
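
For example, the wait time can be given to the kill command itself; the topology name below is just a placeholder:

# Deactivate the spouts, wait 30 seconds for in-flight tuples to finish,
# then kill the topology:
storm kill my-topology -w 30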


RE: Starting and stopping storm

Posted by "Garcia-Contractor, Joseph (CORP)" <Jo...@ADP.com>.
I don't think I got my question across right or I am confused.

Let me break this down in a simpler fashion.

I have a Storm cluster named "The Quiet Storm" ;) Here is what it consists of:

******
Server ZK1: Running Zookeeper
Server ZK2: Running Zookeeper
Server ZK3: Running Zookeeper

Server N1: SupervisorD running Storm Nimbus

Server S1: SupervisorD running Storm Supervisor with 4 workers.
Server S2: SupervisorD running Storm Supervisor with 4 workers.
Server S3: SupervisorD running Storm Supervisor with 4 workers.
******

Now the "The Quiet Storm" can have 1-n number of topologies running on it.

I need to shut down all the servers in the cluster for maintenance.  What is the procedure for doing this without harming the currently running topologies?

Thank you,

Joe


Re: Starting and stopping storm

Posted by "Matthias J. Sax" <mj...@apache.org>.
Hi,

as always: it depends. ;)

Storm itself cleans up its own resources just fine. However, if the
running topology needs to clean up/release resources before it is shut
down, Storm is not of any help. Even though there is a Spout/Bolt cleanup()
method, Storm does not guarantee that it will be called.

Thus, using "storm deactivate" is a good way to achieve proper cleanup.
However, the topology must provide some code for it, too. On the call to
Spout.deactivate(), it must emit a special "clean-up" message (that you
have to design by yourself) that must propagate through the whole
topology, ie, each bolt must forward this message to all its output
streams. Furthermore, bolts must to the clean-up if they receive this
message.

Long story short: "storm deactivate" before "storm kill" only makes
sense if the topology requires proper cleanup and if the topology itself
can react/clean up properly on Spout.deactivate().
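
As a rough sketch of that sequence (the topology name is a placeholder, and how long the clean-up message needs to propagate depends entirely on your topology):

# Trigger Spout.deactivate() so the topology can run its own clean-up,
# give the clean-up message time to flow through every bolt, then kill:
storm deactivate my-topology
sleep 60          # drain time is topology-specific; adjust as needed
storm kill my-topology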

Using "storm activate" in not necessary in any case.

-Matthias

