Posted to dev@storm.apache.org by "Rick Kellogg (JIRA)" <ji...@apache.org> on 2015/10/05 03:50:26 UTC

[jira] [Updated] (STORM-589) Suboptimal default worker hb timeouts for nimbus & supervisor

     [ https://issues.apache.org/jira/browse/STORM-589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rick Kellogg updated STORM-589:
-------------------------------
    Component/s: storm-core

> Suboptimal default worker hb timeouts for nimbus & supervisor
> -------------------------------------------------------------
>
>                 Key: STORM-589
>                 URL: https://issues.apache.org/jira/browse/STORM-589
>             Project: Apache Storm
>          Issue Type: Bug
>          Components: storm-core
>    Affects Versions: 0.9.2-incubating
>            Reporter: Derek Dagit
>            Priority: Minor
>
> Both worker heartbeat timeouts for nimbus and supervisor are set to 30 seconds by default:
> https://github.com/apache/storm/blob/3bbdc166bda7fb1a39b6906eda40da9bc83d5d4c/conf/defaults.yaml#L58
> https://github.com/apache/storm/blob/3bbdc166bda7fb1a39b6906eda40da9bc83d5d4c/conf/defaults.yaml#L118
> This means that the timing of a worker's death relative to its last heartbeats determines whether the supervisor relaunches it or nimbus reassigns it.
> If the supervisor's heartbeat timeout is found to have expired first, the worker is relaunched.  If the nimbus heartbeat timeout is found to have expired first, the worker is rescheduled.
> We may want the nimbus time-out to be larger than the supervisor time-out, to give the supervisor a chance to relaunch the worker before nimbus re-assigns it.
> As always, users administering clusters are encouraged to set these values as needed.
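
The race described above can be illustrated with a small sketch. This is a simplified model, not Storm code: the function name, its arguments, and the timestamps are hypothetical, and it assumes each daemon acts exactly when its timeout expires (real daemons poll on their own intervals). It only shows why equal timeouts make the outcome depend on heartbeat timing, and why a larger nimbus timeout favors a local relaunch.

```python
def first_responder(last_hb_seen_by_supervisor, last_hb_seen_by_nimbus,
                    supervisor_timeout=30, nimbus_timeout=30):
    """Return which daemon's timeout expires first after a worker dies.

    Hypothetical model: times are in seconds, and each argument is the
    timestamp of the last heartbeat that daemon observed from the worker.
    """
    supervisor_deadline = last_hb_seen_by_supervisor + supervisor_timeout
    nimbus_deadline = last_hb_seen_by_nimbus + nimbus_timeout
    if supervisor_deadline < nimbus_deadline:
        return "supervisor relaunches"
    if nimbus_deadline < supervisor_deadline:
        return "nimbus reassigns"
    return "tie"


# Equal 30 s timeouts: whichever daemon saw the older heartbeat acts first.
print(first_responder(100, 102))
print(first_responder(102, 100))

# Larger nimbus timeout (60 s, an arbitrary illustrative value):
# the supervisor's deadline now comes first in the same scenario.
print(first_responder(102, 100, nimbus_timeout=60))
```

With the defaults equal, the second call resolves in favor of nimbus only because its last observed heartbeat happened to be two seconds older. Raising the nimbus timeout removes that dependence as long as the gap between the two timeouts exceeds the largest expected skew between the heartbeats.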


