Posted to user@storm.apache.org by P Ghosh <ja...@gmail.com> on 2014/05/01 21:16:49 UTC

Best practice for shutting down storm

I have a few topologies running. The spout puts the ID of each object it
emits into a WIP list in Redis. When the spout's ack or fail method is
called, it removes that ID from the WIP list.
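
A minimal sketch of the bookkeeping described above, for illustration only. The class name WipTracker and its method names are hypothetical, and a plain Python set stands in for the Redis WIP list; the real spout code is not shown in this thread.

```python
class WipTracker:
    """Tracks IDs that were emitted but not yet acked or failed.

    Assumption: an in-memory set stands in for the Redis WIP list.
    """

    def __init__(self):
        self._wip = set()

    def on_emit(self, msg_id):
        # Record the ID at the moment the spout emits the tuple.
        self._wip.add(msg_id)

    def on_ack(self, msg_id):
        # An ack means the tuple tree completed; drop the ID.
        self._wip.discard(msg_id)

    def on_fail(self, msg_id):
        # A fail also resolves the tuple; drop the ID.
        self._wip.discard(msg_id)

    def size(self):
        return len(self._wip)


tracker = WipTracker()
for msg_id in ("a", "b", "c"):
    tracker.on_emit(msg_id)
tracker.on_ack("a")
tracker.on_fail("b")
# "c" never received an ack or fail, so it is left over in WIP.
leftover = tracker.size()
```

Any ID still present after a restart corresponds to a tuple whose ack or fail callback never ran.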

The environment and application are undergoing a lot of changes, and as a
result I occasionally have to restart the topology or the Storm
cluster itself.

The problem is that after a restart I see quite a few messages left in WIP,
which means the spout never received an ack or fail for those messages.

My restart process has been:
    1. Kill the topology from the UI. (I find killing from the UI more
responsive than from the command line: the killed topology goes away very
quickly. If I do it from the command line, the "killed" topology remains in
the list for a long time, hindering my ability to relaunch it.) I typically
kill it with a 0-second wait time (maybe this is where I'm going wrong).

    2. Go to each VM and stop the
              a> supervisor
              b> logviewer
    3. Go to nimbus and shut down
             a> ui/nimbus/logviewer
    4. Go to zookeeper and shut down zookeeper


I thought this was the proper flow, but the leftover messages I see in WIP
make me doubt it.

Any thoughts would be helpful.

Thanks,
Prasun

Re: Best practice for shutting down storm

Posted by Nathan Leung <nc...@gmail.com>.
Hi Prasun,

Acks and fails should continue to be handled while the topology is
deactivated.  For step 2 I would consider adding a timeout, just in case
the WIP size never reaches zero.
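
One way the wait-with-timeout could look, as a rough sketch rather than a tested recipe. The function name wait_for_drain and its parameters are hypothetical, and get_wip_size is a stand-in for whatever call reads the WIP size from Redis.

```python
import time


def wait_for_drain(get_wip_size, timeout_secs, poll_secs=1.0,
                   clock=time.monotonic, sleep=time.sleep):
    """Poll until the WIP size reaches zero or the timeout expires.

    Returns True if WIP drained, False if the timeout was hit first.
    get_wip_size is a caller-supplied function (assumption).
    """
    deadline = clock() + timeout_secs
    while True:
        if get_wip_size() == 0:
            return True
        if clock() >= deadline:
            return False
        sleep(poll_secs)


# Fake WIP counter that shrinks by one on every poll.
remaining = [3]


def fake_wip_size():
    size = remaining[0]
    remaining[0] = max(0, size - 1)
    return size


drained = wait_for_drain(fake_wip_size, timeout_secs=5, poll_secs=0.01)
# A WIP size that never changes hits the timeout instead of hanging forever.
stuck = wait_for_drain(lambda: 7, timeout_secs=0.05, poll_secs=0.01)
```

A False result tells the shutdown script that some tuples never resolved, so it can log the leftover IDs before going on to kill the topology.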

-Nathan


On Thu, May 1, 2014 at 3:36 PM, Prasun Ghosh <pr...@apple.com> wrote:

> Thanks Nathan,
>
> So, my shutdown script should be:
> 1. Deactivate the topology.
> 2. Wait for the WIP size to become zero.
> The code that removes items from WIP lives in the spout's ack/fail
> methods. If I deactivate the topology, and hence the spout, will that
> code still execute and remove items from WIP?
> In short, when I deactivate the topology, are we just pausing the calls to
> nextTuple() on the spout while everything else continues to work as is?
> 3. Initiate shutdown.
> - Thanks,
> Prasun Ghosh
>  Apple Inc.
> Information Security

Re: Best practice for shutting down storm

Posted by Prasun Ghosh <pr...@apple.com>.
Thanks Nathan,

So, my shutdown script should be:
	1. Deactivate the topology.
	2. Wait for the WIP size to become zero.
		The code that removes items from WIP lives in the spout's ack/fail methods. If I deactivate the topology, and hence the spout, will that code still execute and remove items from WIP?
		In short, when I deactivate the topology, are we just pausing the calls to nextTuple() on the spout while everything else continues to work as is?
	3. Initiate shutdown.
- Thanks,
Prasun Ghosh
 Apple Inc.
Information Security






On May 1, 2014, at 12:27 PM, Nathan Leung <nc...@gmail.com> wrote:

> You can deactivate the topology, which will shut off the spouts.  Then, after a period of time (enough for all your bolts to drain), kill the topology.  I believe this is what kill with a non-zero timeout does as well.  Kill with a zero timeout will kill the worker processes without letting them drain, which is why those tuples were never acked or failed.


Re: Best practice for shutting down storm

Posted by Nathan Leung <nc...@gmail.com>.
You can deactivate the topology, which will shut off the spouts.  Then,
after a period of time (enough for all your bolts to drain), kill the
topology.  I believe this is what kill with a non-zero timeout does as
well.  Kill with a zero timeout will kill the worker processes without
letting them drain, which is why those tuples were never acked or failed.
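
The deactivate-then-kill sequence might be scripted roughly as follows. This is a sketch under assumptions: it relies on the `storm deactivate` and `storm kill -w` commands of the Storm command-line client, while the helper names, the topology name "my-topology", and the 30-second wait are made up for illustration.

```python
import subprocess


def shutdown_commands(topology, wait_secs):
    """Return the CLI invocations for a deactivate-then-kill shutdown."""
    return [
        # Stop the spouts from emitting new tuples.
        ["storm", "deactivate", topology],
        # Kill with a non-zero wait so in-flight tuples can drain first.
        ["storm", "kill", topology, "-w", str(wait_secs)],
    ]


def shutdown(topology, wait_secs, run=subprocess.check_call):
    # `run` is injectable so the sequence can be dry-run without a cluster.
    for cmd in shutdown_commands(topology, wait_secs):
        run(cmd)


# Dry run: collect the commands instead of executing them.
issued = []
shutdown("my-topology", 30, run=issued.append)
```

The wait passed to the kill command is the only drain window in this sketch, so it should be at least as long as the bolts need to finish their pending tuples.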

