Posted to dev@storm.apache.org by "Rick Kellogg (JIRA)" <ji...@apache.org> on 2015/10/09 02:15:27 UTC

[jira] [Updated] (STORM-44) Replication

     [ https://issues.apache.org/jira/browse/STORM-44?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rick Kellogg updated STORM-44:
------------------------------
    Component/s: storm-core

> Replication
> -----------
>
>                 Key: STORM-44
>                 URL: https://issues.apache.org/jira/browse/STORM-44
>             Project: Apache Storm
>          Issue Type: Wish
>          Components: storm-core
>            Reporter: James Xu
>
> https://github.com/nathanmarz/storm/issues/132
> This is an idea to replicate a computation across many tasks on different machines. The "replication" part is already possible since you can implement your own grouping which sends the tuple to multiple tasks. What is needed is help from Nimbus to make sure those tasks run on different machines.
> Replicated computation would be useful for doing things like highly available DRPC. Essentially you do the same DRPC multiple times and at the end pick the first one that finishes for the result.
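The "replication" half described above (a grouping that sends each tuple to several tasks) could look roughly like the sketch below. It is only an illustration: the class name, the `replicas` parameter, and the hash-based choice of tasks are assumptions, and the class is kept free of Storm types so it runs on its own. In Storm the same selection logic would sit in the `chooseTasks` method of a `CustomStreamGrouping`. Nothing here places the chosen tasks on different machines, which is the part that needs help from Nimbus.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Sketch of the task-selection logic for a replicating grouping.
 * Hypothetical class; not part of Storm.
 */
final class ReplicatedGroupingSketch {
    private final List<Integer> targetTasks; // task ids of the consuming component
    private final int replicas;              // assumed parameter: copies sent per tuple

    ReplicatedGroupingSketch(List<Integer> targetTasks, int replicas) {
        if (replicas < 1 || replicas > targetTasks.size()) {
            throw new IllegalArgumentException("replicas must be between 1 and the number of target tasks");
        }
        this.targetTasks = new ArrayList<>(targetTasks);
        this.replicas = replicas;
    }

    /** Returns `replicas` distinct task ids; equal tuples always map to the same ids. */
    List<Integer> chooseTasks(List<Object> values) {
        // floorMod keeps the start index non-negative even for negative hash codes.
        int start = Math.floorMod(values.hashCode(), targetTasks.size());
        List<Integer> chosen = new ArrayList<>(replicas);
        for (int i = 0; i < replicas; i++) {
            // Walk consecutive slots so the chosen tasks never repeat.
            chosen.add(targetTasks.get((start + i) % targetTasks.size()));
        }
        return chosen;
    }
}
```

With four target tasks and `replicas = 2`, every tuple is delivered to two distinct tasks, and a repeated request lands on the same pair.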
> -----------------------------------------------------------------------------------------------------
> LiSu: I am trying to implement this replication feature. My idea is to have replica tasks for each primary task. The replica and the primary receive the same input and perform the same computation. While the primary is running, the replica's output is blocked; if the primary fails, the replica's output is used instead. Currently I set a switch in the ComponentCommon (or in the topology context of each task) to control the blocking. Once I detect on Nimbus that the primary's heartbeat has stopped, I change the state of the switch. My problem is that a change to the ComponentCommon on Nimbus is not reflected in the workers. Does anyone have ideas on how the state of the switch could be announced to the tasks in time?
> -----------------------------------------------------------------------------------------------------
> nathanmarz: Trying to coordinate the replicas with each other like this is a dead-end. The point of this feature is that if a replica suddenly dies, there's no loss of availability because the computation is happening anyway on another task. Obviously, replicated tasks require there to be no side effects.
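The uncoordinated approach described in the thread (run the same computation on every replica and take whichever result arrives first) can be sketched outside Storm with the standard `ExecutorService.invokeAny` call. This is a minimal illustration, not Storm's DRPC code: the class and method names are hypothetical, and the replicas are modelled as plain `Callable`s, which are assumed to be free of side effects.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical helper: runs identical replicas and returns the first successful result. */
final class FirstResultWins {
    static <T> T firstResult(List<Callable<T>> replicas) {
        // One thread per replica so all of them start immediately.
        ExecutorService pool = Executors.newFixedThreadPool(replicas.size());
        try {
            // invokeAny returns the result of a replica that completed without
            // throwing; replicas that fail are skipped, and the rest are cancelled.
            return pool.invokeAny(replicas);
        } catch (ExecutionException e) {
            throw new IllegalStateException("all replicas failed", e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for replicas", e);
        } finally {
            pool.shutdownNow();
        }
    }
}
```

The replicas never talk to each other: if one dies, the caller still gets an answer from another, which is why the computation must not have side effects.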



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)