Posted to user@storm.apache.org by "Alexander S. Klimov" <al...@microsoft.com> on 2014/03/03 10:49:46 UTC

swapping topologies

Hi guys,

In the discussions I found about changing the scale of a topology (how many bolt and spout tasks a given topology should have), the recommendation was to swap topologies instead of trying to change the topology definition dynamically.

For that, it is recommended to deploy the new topology in a deactivated state, pause consumption in the spouts of the old topology, and then, after the message timeout, activate consumption in the new topology.


1.      Is this recommendation correct?

2.      How can a topology be deployed in a deactivated state?

3.      Is there a standard way to pause consumption in the spouts of the old topology?

Thanks,
Alex

Re: swapping topologies

Posted by Kang Xiao <kx...@gmail.com>.
Hi Alexander,

We implemented a “storm update” feature to update a running topology online. You can refer to this JIRA issue:

https://issues.apache.org/jira/browse/STORM-167


--  
Best Regards!

肖康 (Kang Xiao, <kxiao.tiger@gmail.com>)
Distributed Software Engineer

On Friday, March 7, 2014, at 8:40, Alexander S. Klimov wrote:

> I found this piece of information:
> http://storm.incubator.apache.org/documentation/Running-topologies-on-a-production-cluster.html
>                  
> Updating a running topology
> To update a running topology, the only option currently is to kill the current topology and resubmit a new one. A planned feature is to implement a storm swap command that swaps a running topology with a new one, ensuring minimal downtime and no chance of both topologies processing tuples at the same time.
> Alright… That’s not great, and we can probably try to code around this problem to reduce downtime.
>   
> Is there an ETA for the “storm swap” command?
>   
> Thanks,
> Alex  

RE: swapping topologies

Posted by "Alexander S. Klimov" <al...@microsoft.com>.
I found this piece of information:
http://storm.incubator.apache.org/documentation/Running-topologies-on-a-production-cluster.html

Updating a running topology
To update a running topology, the only option currently is to kill the current topology and resubmit a new one. A planned feature is to implement a storm swap command that swaps a running topology with a new one, ensuring minimal downtime and no chance of both topologies processing tuples at the same time.
Alright... That's not great, and we can probably try to code around this problem to reduce downtime.

Is there an ETA for the "storm swap" command?

Thanks,
Alex

RE: swapping topologies

Posted by "Alexander S. Klimov" <al...@microsoft.com>.
Hi guys,

Sorry for bringing the thread up again. I'm just trying to understand the change management mechanism recommended for Storm. As far as I understand, the recommended way to change a topology is to swap the old topology with a new one.

If that is correct, does the algorithm outlined in my previous mail sound right?

Does anyone update a topology or change the scale of bolts/spouts differently? Or, in general, is there no need to minimize downtime when making changes to a production topology, since we can rely on Kafka or a similar system to capture the traffic while the system is down?
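To make the "rely on Kafka" option concrete, here is a minimal sketch of the idea, assuming a durable log with a committed read offset. The class DurableLogSketch and its methods are hypothetical and are not the Kafka client API; they only model why messages produced during downtime are not lost.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a durable, replayable log (NOT the Kafka API).
class DurableLogSketch {
    private final List<String> log = new ArrayList<>();
    private int committedOffset = 0; // position of the next unread message

    // Producers keep appending even while no topology is consuming.
    void append(String msg) {
        log.add(msg);
    }

    // Reads up to max messages from the committed offset and advances it.
    List<String> poll(int max) {
        int end = Math.min(log.size(), committedOffset + max);
        List<String> batch = new ArrayList<>(log.subList(committedOffset, end));
        committedOffset = end;
        return batch;
    }

    // Number of messages waiting to be consumed.
    int backlog() {
        return log.size() - committedOffset;
    }

    public static void main(String[] args) {
        DurableLogSketch queue = new DurableLogSketch();
        queue.append("m1");
        queue.append("m2");
        System.out.println("old topology read: " + queue.poll(10));

        // Old topology is killed here; traffic keeps arriving.
        queue.append("m3");
        queue.append("m4");
        queue.append("m5");
        System.out.println("backlog during downtime: " + queue.backlog());

        // New topology resumes from the committed offset.
        System.out.println("new topology read: " + queue.poll(10));
    }
}
```

With this approach the cost of a kill-and-resubmit is added latency while the backlog is worked off, not lost data, provided the offset is committed only for messages that were fully processed.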

Thanks,
Alex

RE: swapping topologies

Posted by "Alexander S. Klimov" <al...@microsoft.com>.
The spout interface defines the following methods: activate (http://nathanmarz.github.io/storm/doc/backtype/storm/spout/ISpout.html#activate()) and deactivate (http://nathanmarz.github.io/storm/doc/backtype/storm/spout/ISpout.html#deactivate()).
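As a rough illustration of what those two hooks mean for consumption, here is a minimal sketch, assuming a spout that reads from an in-memory queue. PausableSpoutSketch is a hypothetical stand-in and does not implement Storm's ISpout. As far as I can tell from the ISpout Javadoc, Storm itself does not call nextTuple while a spout is deactivated; the active flag below only models that behavior.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

// Hypothetical stand-in for a spout; it does NOT implement Storm's ISpout.
class PausableSpoutSketch {
    private final Queue<String> source;                       // messages waiting upstream
    private final List<String> emitted = new ArrayList<>();   // messages handed to the topology
    private boolean active;

    PausableSpoutSketch(List<String> messages, boolean startActive) {
        this.source = new ArrayDeque<>(messages);
        this.active = startActive;
    }

    void activate() {
        active = true;
    }

    void deactivate() {
        active = false;
    }

    // Emits at most one message per call, and only while the spout is active.
    void nextTuple() {
        if (!active) {
            return;
        }
        String msg = source.poll();
        if (msg != null) {
            emitted.add(msg);
        }
    }

    List<String> emitted() {
        return emitted;
    }

    int pending() {
        return source.size();
    }

    public static void main(String[] args) {
        PausableSpoutSketch spout = new PausableSpoutSketch(Arrays.asList("a", "b", "c"), true);
        spout.nextTuple();   // emits "a"
        spout.deactivate();
        spout.nextTuple();   // ignored while deactivated
        spout.activate();
        spout.nextTuple();   // emits "b"
        System.out.println(spout.emitted() + " pending=" + spout.pending()); // prints "[a, b] pending=1"
    }
}
```

Passing startActive=false models a spout in a topology that was submitted in a deactivated state: it holds its source untouched until activate is called.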

My guess is that the algorithm for swapping topologies would be the following:

1.       Deploy the new topology in a deactivated state. (I'm not sure yet how to accomplish this.)

2.       Call Deactivate for the old topology in the Nimbus UI or console and specify the message timeout.

3.       The spouts in the old topology have their deactivate method triggered, so they know they are about to be shut down. After this event, no new messages are sent to the old topology; only the messages already in flight are drained.

a.       The old topology could be reactivated if the new topology is broken or buggy. For that, the activate method of its spouts could be called again.

4.       Meanwhile, the new topology could be activated.
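The ordering of the steps above can be sketched as follows. This is only a model under stated assumptions: TopologySwapSketch, Topology, and swap are hypothetical names, not Storm APIs, and "one tuple acked per tick" is a made-up stand-in for real processing. As far as I know, the Storm command-line client offers `storm deactivate <name>` and `storm activate <name>`; the sketch does not call them.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the swap sequence; none of these names are Storm APIs.
class TopologySwapSketch {
    static class Topology {
        final String name;
        boolean active;
        int inFlight; // tuples emitted by spouts but not yet fully processed

        Topology(String name, boolean active, int inFlight) {
            this.name = name;
            this.active = active;
            this.inFlight = inFlight;
        }
    }

    // Deactivates the old topology, waits for it to drain (bounded by the
    // message timeout), then activates the new one. Returns a log of steps.
    static List<String> swap(Topology oldT, Topology newT, int timeoutTicks) {
        if (newT.active) {
            throw new IllegalStateException("new topology must be deployed deactivated");
        }
        List<String> steps = new ArrayList<>();

        oldT.active = false; // step 2: no new messages enter the old topology
        steps.add("deactivated " + oldT.name);

        int waited = 0;      // step 3: drain, but never longer than the timeout
        while (oldT.inFlight > 0 && waited < timeoutTicks) {
            oldT.inFlight--; // assumption: one in-flight tuple completes per tick
            waited++;
        }
        steps.add("drained " + oldT.name + " after " + waited + " ticks, inFlight=" + oldT.inFlight);

        newT.active = true;  // step 4: only now does the new topology consume
        steps.add("activated " + newT.name);
        return steps;
    }

    public static void main(String[] args) {
        Topology oldT = new Topology("old-topology", true, 3);
        Topology newT = new Topology("new-topology", false, 0);
        for (String step : swap(oldT, newT, 5)) {
            System.out.println(step);
        }
    }
}
```

In this model the two topologies are never active at the same moment, and the rollback in step a. would amount to setting the old topology's active flag back to true before the new one is activated.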

Does this sound correct? How can a topology be deployed in a deactivated state?

Thanks,
Alex
