Posted to dev@storm.apache.org by "Yuzhao Chen (JIRA)" <ji...@apache.org> on 2016/08/18 04:00:24 UTC

[jira] [Updated] (STORM-2044) Nimbus should not make assignments crazily when Pacemaker goes down

     [ https://issues.apache.org/jira/browse/STORM-2044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuzhao Chen updated STORM-2044:
-------------------------------
    Summary: Nimbus should not make assignments crazily when Pacemaker goes down  (was: Nimbus should not make assignments crazy when Pacemaker down)

> Nimbus should not make assignments crazily when Pacemaker goes down
> -------------------------------------------------------------------
>
>                 Key: STORM-2044
>                 URL: https://issues.apache.org/jira/browse/STORM-2044
>             Project: Apache Storm
>          Issue Type: Improvement
>          Components: storm-core
>    Affects Versions: 1.0.2
>         Environment: CentOS 6.5
>            Reporter: Yuzhao Chen
>              Labels: patch
>             Fix For: 1.1.0
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Pacemaker is currently a stand-alone service with no HA. When it goes down, all worker heartbeats are lost. Even if Pacemaker comes back up immediately, recovery takes a long time when there are dozens of GBs of heartbeats. Until the worker heartbeats are fully restored, Nimbus treats these workers as dead because their heartbeats have timed out, and it keeps reassigning these "dead" workers until the heartbeats return to normal. As a result, many topologies are reassigned during the recovery period and throughput drops sharply.
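
The failure mode described above can be illustrated with a small sketch. It is not Storm's code: the function names, the in-memory heartbeat map, and the grace-window parameters (store_restarted_at, grace_secs) are all hypothetical and only model the behavior the report describes, plus one possible way to suppress reassignment while the heartbeat store is still recovering.

```python
from typing import Dict, List, Optional


def dead_workers(heartbeats: Dict[str, float], workers: List[str],
                 now: float, timeout_secs: float) -> List[str]:
    """Return workers whose last heartbeat is missing or older than timeout_secs."""
    dead = []
    for worker in workers:
        last = heartbeats.get(worker)
        if last is None or now - last > timeout_secs:
            dead.append(worker)
    return dead


def workers_to_reassign(heartbeats: Dict[str, float], workers: List[str],
                        now: float, timeout_secs: float,
                        store_restarted_at: Optional[float] = None,
                        grace_secs: float = 0.0) -> List[str]:
    """Hypothetical guard: skip reassignment while the heartbeat store recovers.

    store_restarted_at and grace_secs are assumptions made for this sketch;
    they are not existing Storm settings.
    """
    recovering = (store_restarted_at is not None
                  and now - store_restarted_at < grace_secs)
    if recovering:
        # Missing heartbeats are not trustworthy yet, so reassign nothing.
        return []
    return dead_workers(heartbeats, workers, now, timeout_secs)


workers = ["w1", "w2", "w3"]

# Normal operation: every worker reported recently, nothing is dead.
healthy = {"w1": 95.0, "w2": 98.0, "w3": 99.0}
print(dead_workers(healthy, workers, now=100.0, timeout_secs=30.0))

# Heartbeat store restarted at t=100 and has only received w1 so far:
# w2 and w3 look dead although they are still running.
partial = {"w1": 101.0}
print(dead_workers(partial, workers, now=102.0, timeout_secs=30.0))

# With the hypothetical grace window, nothing is reassigned yet.
print(workers_to_reassign(partial, workers, now=102.0, timeout_secs=30.0,
                          store_restarted_at=100.0, grace_secs=60.0))
```

In this model the second call reports w2 and w3 as dead only because their heartbeats have not been restored yet, which is the spurious reassignment the report complains about. The grace window is one possible mitigation among several and is shown only to make the problem concrete.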



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)