Posted to dev@storm.apache.org by "Roland Jungnickel (JIRA)" <ji...@apache.org> on 2015/05/29 11:05:19 UTC

[jira] [Commented] (STORM-167) proposal for storm topology online update

    [ https://issues.apache.org/jira/browse/STORM-167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14564443#comment-14564443 ] 

Roland Jungnickel commented on STORM-167:
-----------------------------------------

Hi [~parth.brahmbhatt],

Just wondering if there is any update on this feature?

Thanks, Roland

> proposal for storm topology online update
> -----------------------------------------
>
>                 Key: STORM-167
>                 URL: https://issues.apache.org/jira/browse/STORM-167
>             Project: Apache Storm
>          Issue Type: New Feature
>            Reporter: James Xu
>            Assignee: Parth Brahmbhatt
>            Priority: Minor
>
> https://github.com/nathanmarz/storm/issues/540
> Currently, a topology's code can only be updated by killing the topology and re-submitting a new one. During the kill and re-submit process, some requests may be delayed or fail, which is a problem for online services. We therefore propose adding online topology updates.
> Mission
> Update the code of a running topology gracefully, one worker after another, without interrupting the service entirely. Only the topology code is updated, not the topology DAG structure (components, streams, and task numbers).
> Proposal
> * the client uses "storm update topology-name new-jar-file" to submit an update request for new-jar-file
> * nimbus updates the stormdist dir and links topology-dir to the new one
> * nimbus updates the topology version on zk
> * the supervisors running this topology update it
> ** check the topology version on zk; if it differs from the local version, a topology update begins
> ** each supervisor schedules the update of the topology's workers at a rand(expect-max-update-time) time point
> ** sync-supervisor downloads the latest code from nimbus
> ** sync-process checks the local worker heartbeat version (to be added); if it differs from the version downloaded by sync-supervisor, it kills the worker
> ** sync-process restarts the killed worker
> ** the new worker heartbeats to zk with its version (to be added), which can be displayed on the web UI to check update progress
> This feature is deployed in our production clusters. It is really useful for topologies that handle online requests awaiting a response: the topology jar can be updated without taking the entire service offline.
> We hope that this feature is useful for others too.
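
For readers following the thread, the supervisor-side steps of the quoted proposal could be sketched roughly as below. This is only a toy in-memory simulation, not Storm code: the Worker and Supervisor classes, the rolling_update function, and the seed parameter are hypothetical names invented for illustration, the ZooKeeper version is passed in as a plain integer, and the sketch assumes each supervisor picks one random delay and then restarts all of its stale workers together.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Worker:
    port: int
    version: int  # version reported in the worker heartbeat (hypothetical field)


@dataclass
class Supervisor:
    name: str
    local_version: int
    workers: list = field(default_factory=list)

    def schedule_update(self, zk_version, expect_max_update_time, rng):
        """Return a random delay in seconds, or None if already up to date."""
        if zk_version == self.local_version:
            return None
        return rng.uniform(0, expect_max_update_time)

    def sync_supervisor(self, zk_version):
        """Stand-in for downloading the latest code from nimbus."""
        self.local_version = zk_version

    def sync_process(self):
        """Kill and restart every worker whose heartbeat version is stale."""
        restarted = []
        for i, worker in enumerate(self.workers):
            if worker.version != self.local_version:
                # "Restart": replace the worker with one running the new version.
                self.workers[i] = Worker(worker.port, self.local_version)
                restarted.append(worker.port)
        return restarted


def rolling_update(supervisors, zk_version, expect_max_update_time, seed=None):
    """Update supervisors one after another, ordered by their random delays."""
    rng = random.Random(seed)
    plan = []
    for sup in supervisors:
        delay = sup.schedule_update(zk_version, expect_max_update_time, rng)
        if delay is not None:
            plan.append((delay, sup))
    plan.sort(key=lambda item: item[0])

    log = []
    for _delay, sup in plan:
        sup.sync_supervisor(zk_version)
        log.append((sup.name, sup.sync_process()))
    return log
```

The random delay is what spreads the restarts out over expect-max-update-time, so that supervisors do not all kill their workers at the same moment; a supervisor whose local version already matches is skipped entirely.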



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)