Posted to issues@hbase.apache.org by "Bharath Vissapragada (Jira)" <ji...@apache.org> on 2020/06/22 17:35:00 UTC

[jira] [Updated] (HBASE-24360) RollingBatchRestartRsAction loses track of dead servers

     [ https://issues.apache.org/jira/browse/HBASE-24360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bharath Vissapragada updated HBASE-24360:
-----------------------------------------
    Fix Version/s: 1.7.0

> RollingBatchRestartRsAction loses track of dead servers
> -------------------------------------------------------
>
>                 Key: HBASE-24360
>                 URL: https://issues.apache.org/jira/browse/HBASE-24360
>             Project: HBase
>          Issue Type: Test
>          Components: integration tests
>    Affects Versions: 2.3.0
>            Reporter: Nick Dimiduk
>            Assignee: Nick Dimiduk
>            Priority: Major
>             Fix For: 3.0.0-alpha-1, 2.3.0, 1.7.0
>
>
> {{RollingBatchRestartRsAction}} doesn't handle failure cases when tracking its list of dead servers. The original author believed that a failure to restart a server would result in a retry. However, the action removes the server from the dead list before the restart is known to have succeeded, so that state is lost and the retry never occurs. Because the action never looks back at the current state of the cluster and relies only on its local state for the current invocation, it never realizes the abandoned server is still dead. The fix is to remove a server from the dead list only when the {{startRs}} invocation claims to have been successful.
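
The failure mode described above can be illustrated with a minimal, self-contained sketch. This is not the actual HBase code or the committed patch: the class name DeadServerTracking, the Starter interface, and both methods are hypothetical stand-ins, and servers are represented as plain strings. It only contrasts removing a server before the start attempt with removing it after the start call returns without throwing.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Illustrative only: names below are assumptions, not HBase APIs.
final class DeadServerTracking {

  /** Hypothetical stand-in for a start call; throws when the restart fails. */
  interface Starter {
    void startRs(String server) throws Exception;
  }

  /** Premature removal: the server leaves the dead list before the start is attempted. */
  static Deque<String> restartWithPrematureRemoval(List<String> dead, Starter starter) {
    Deque<String> deadServers = new ArrayDeque<>(dead);
    int attempts = deadServers.size();
    for (int i = 0; i < attempts; i++) {
      String server = deadServers.remove(); // local state is dropped here
      try {
        starter.startRs(server);
      } catch (Exception e) {
        // The failure is swallowed and the server is no longer tracked.
      }
    }
    return deadServers;
  }

  /** Careful removal: the server leaves the dead list only after startRs returns normally. */
  static Deque<String> restartWithCarefulRemoval(List<String> dead, Starter starter) {
    Deque<String> deadServers = new ArrayDeque<>(dead);
    int attempts = deadServers.size();
    for (int i = 0; i < attempts; i++) {
      String server = deadServers.peek(); // look, but do not remove yet
      try {
        starter.startRs(server);
        deadServers.remove(); // success: stop tracking this server
      } catch (Exception e) {
        // Failure: keep tracking, but rotate to the back so other servers get a turn.
        deadServers.remove();
        deadServers.addLast(server);
      }
    }
    return deadServers;
  }
}
```

In this sketch, a server whose start attempt throws is still present in the returned collection under the careful variant, so a later pass could retry it. Under the premature-removal variant the returned collection is always empty, regardless of which starts failed.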



--
This message was sent by Atlassian Jira
(v8.3.4#803005)