Posted to user@storm.apache.org by Emmanuel <el...@msn.com> on 2015/03/16 20:05:32 UTC

Updating a live topology

Hello,
Can someone explain to me how one would update a live topology? I assume one would run a new topology with the updated code and switch off the old one, but how do you make the transition in terms of data flow?
How do you tell the old topology to stop pulling data from the queue when the new topology is started? Or is there any guarantee that the stopped topology will finish processing the data it ingested before shutting down?
I would be very grateful if someone could enlighten me on this.
Thanks,
Emmanuel

Re: Updating a live topology

Posted by xu...@gmail.com.
Hi liguozhong,



Do you mean we can use ZooKeeper to help with the migration? What is the watch node you mentioned?

I am also very interested in learning how to upgrade a live topology. I read somewhere that people are working on a "swap" command, but I am not sure what the status of that is.

Thanks,

Jia


—
Sent from Mailbox

On Tue, Mar 17, 2015 at 6:15 AM, 李国忠 <li...@qq.com> wrote:

> You need ZooKeeper between the old topology and the new topology.
> old topology --> watch nodeA --> ZooKeeper --> new topology --> change nodeA state --> watcher --> old topology stops pulling --> ok.
> Am I right?
> ------------------ Original ------------------
> From: "Emmanuel" <el...@msn.com>
> Date: Tuesday, March 17, 2015, 3:05 AM
> To: "user" <us...@storm.apache.org>
> Subject: Updating a live topology
> Hello,
> Can someone explain to me how one would update a live topology?
> I assume one would run a new topology with the updated code and switch off the old one, but how do you make the transition in terms of data flow?
> How do you tell the old topology to stop pulling data from the queue when the new topology is started?
> Or is there any guarantee that the stopped topology will finish processing the data it ingested before shutting down?
> I would be very grateful if someone could enlighten me on this.
> Thanks
> Emmanuel

Re: Updating a live topology

Posted by 李国忠 <li...@qq.com>.
You need ZooKeeper between the old topology and the new topology.
old topology --> watch nodeA --> ZooKeeper --> new topology --> change nodeA state --> watcher --> old topology stops pulling --> ok.
Am I right?
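
Roughly, the hand-off could look something like this (just a sketch against the
plain ZooKeeper Java client; the znode path /migration/stop-old-spout and the
class name are made up, and the parent znode is assumed to exist already):

    import java.util.concurrent.atomic.AtomicBoolean;

    import org.apache.zookeeper.CreateMode;
    import org.apache.zookeeper.Watcher;
    import org.apache.zookeeper.ZooDefs;
    import org.apache.zookeeper.ZooKeeper;

    // Hypothetical helper: the old topology's spout polls shouldStopPulling()
    // in nextTuple(); the new topology calls signalOldTopologyToStop() once it
    // is up and healthy.
    public class MigrationSignal {

        // Made-up znode path; its parent (/migration) must already exist.
        private static final String STOP_NODE = "/migration/stop-old-spout";

        private final ZooKeeper zk;
        private final AtomicBoolean stopPulling = new AtomicBoolean(false);

        public MigrationSignal(String zkConnectString) throws Exception {
            this.zk = new ZooKeeper(zkConnectString, 30000, event -> { });
            watchStopNode();
        }

        // ZooKeeper watches fire only once, so re-register after every event.
        private void watchStopNode() throws Exception {
            boolean alreadyThere = zk.exists(STOP_NODE, event -> {
                if (event.getType() == Watcher.Event.EventType.NodeCreated) {
                    stopPulling.set(true);
                } else {
                    try {
                        watchStopNode();
                    } catch (Exception ignored) {
                        // best effort; real code should handle reconnects
                    }
                }
            }) != null;
            if (alreadyThere) {
                stopPulling.set(true);
            }
        }

        // Checked by the old spout before it pulls anything from the queue.
        public boolean shouldStopPulling() {
            return stopPulling.get();
        }

        // Called by the new topology (or an operator) once it is running.
        public void signalOldTopologyToStop() throws Exception {
            zk.create(STOP_NODE, new byte[0],
                      ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
        }
    }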




------------------ Original ------------------
From: "Emmanuel" <el...@msn.com>
Date: Tuesday, March 17, 2015, 3:05 AM
To: "user" <us...@storm.apache.org>
Subject: Updating a live topology



Hello,

Can someone explain to me how one would update a live topology?
I assume one would run a new topology with the updated code and switch off the old one, but how do you make the transition in terms of data flow?

How do you tell the old topology to stop pulling data from the queue when the new topology is started?
Or is there any guarantee that the stopped topology will finish processing the data it ingested before shutting down?

I would be very grateful if someone could enlighten me on this.

Thanks
Emmanuel

Re: Updating a live topology

Posted by Richards Peter <hb...@gmail.com>.
Hi,

A Storm topology usually gets killed after the tuple timeout (storm kill [-w
timeout_in_seconds] topology_name), assuming that timeout_in_seconds is not
passed as a parameter. Once the kill command is issued, Storm will not invoke
nextTuple() on the spout anymore, but it will wait to receive all the pending
acknowledgements (which should ideally arrive within the tuple timeout). If
nextTuple() is not invoked anymore, no new tuples will be emitted.
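
For example (the topology name and wait time below are placeholders):

    storm kill my-topology -w 60

With that, Storm stops calling nextTuple() right away, waits roughly 60 seconds
for in-flight tuples to be acked or to time out, and only then shuts the
workers down; without -w the wait defaults to the topology's message timeout.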

I think that with the aforementioned behaviour you can kill the old topology
and start the new one. However, you need to check whether your queuing
system will also deliver the unacknowledged messages to the new topology if
it is launched before the tuple timeout interval (i.e. before the first
topology is killed).
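
For what it is worth, if the queue happens to be Kafka read through the
storm-kafka spout, offsets are committed to ZooKeeper under zkRoot + "/" + id,
so a replacement topology whose spout reuses the same zkRoot and id resumes
from the last committed offset and replays whatever the old topology emitted
but never acked. A sketch (the connect string, topic and id are placeholders):

    import storm.kafka.BrokerHosts;
    import storm.kafka.KafkaSpout;
    import storm.kafka.SpoutConfig;
    import storm.kafka.ZkHosts;

    public class KafkaSpoutFactory {
        // Keep zkRoot and id identical between the old and the new topology so
        // the new spout picks up from the offsets the old one committed.
        public static KafkaSpout build() {
            BrokerHosts hosts = new ZkHosts("zk1:2181");           // ZK connect string
            SpoutConfig cfg = new SpoutConfig(hosts, "events",     // topic
                                              "/kafka-offsets",    // zkRoot
                                              "my-topology-spout");// consumer id
            return new KafkaSpout(cfg);
        }
    }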

Hope this helps.
Richards Peter.

Re: Updating a live topology

Posted by Tero Paananen <te...@gmail.com>.
> Can someone explain to me how one would update a live topology?

This was always problematic for our topology, but this is how we did
it. I make no claims about it being optimal or even the best way to do
it. It worked for us after we figured out all the kinks.

For topology changes that are backwards compatible, we would just
deploy the new topology with a new name (e.g. name-<timestamp>), and
once Storm UI showed that the new topology was up and running properly
we would kill the old topology. Make sure the timeout you're using to
kill the old topology works for you, i.e. if you have any in-memory
batching or other in-memory stateful processing, give the workers
enough time to finish whatever they're doing.
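
In concrete terms the swap was roughly the following (names, timestamps and
the main class are placeholders, and it assumes your main method takes the
topology name as an argument):

    storm jar my-topology.jar com.example.MyTopology my-topology-1426550000
    # watch Storm UI until the new topology is processing normally
    storm kill my-topology-1426400000 -w 120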

The above obviously requires twice the resources for the topology
temporarily, because you're running the old and new versions side by
side for a short period.


For backwards-incompatible changes you can't do it that way if you're
processing a lot of events, because you would swamp your topology with
errors, unless your error handling is designed in such a way that you
can recover easily.

What we used to do is kill the old topology to stop processing events
entirely (same caveats as above). Events would start backing up in the
message queue (e.g. Kafka). We would then deploy the new topology
after the old topology was 100% gone. To prepare for this we would
copy the topology jar files to be deployed onto the Nimbus server, and
then use the Storm command line tool on the Nimbus server to kill the
old topology and deploy the new one. We did this to avoid transferring
the topology jar file to the Nimbus server over the network during the
swap. This way the new topology gets up and running MUCH faster,
minimizing your downtime.
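
As a rough sequence (host names, paths, class and topology names are all
placeholders):

    scp target/my-topology-2.0.jar nimbus-host:/tmp/
    ssh nimbus-host
    storm kill my-topology -w 120
    # wait until storm list / Storm UI no longer shows the old topology
    storm jar /tmp/my-topology-2.0.jar com.example.MyTopology my-topology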

You need to make sure your topology can handle the rush of events
backed up in whatever event source you're using in this case.
Depending on your event volume, you could end up with an incredibly
high event spike compared to your normal rate. When we had
prolonged downtime for maintenance we would back up events into a raw
events data store and then use batch processing later to catch up.
YMMV.
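
One knob that helps absorb that replay spike is capping how many tuples each
spout task may have in flight at once; a minimal sketch (the value is only a
placeholder to tune):

    import backtype.storm.Config;

    public class DrainConfig {
        public static Config build() {
            Config conf = new Config();
            // Limit tuples emitted but not yet acked per spout task, so the
            // backlog drains at a rate the bolts can sustain rather than all
            // at once (only takes effect for spouts that emit tuples with
            // message ids).
            conf.setMaxSpoutPending(5000);  // placeholder; tune for your topology
            return conf;
        }
    }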

-TPP