Posted to issues@hbase.apache.org by "Andrew Purtell (JIRA)" <ji...@apache.org> on 2014/11/08 01:10:33 UTC

[jira] [Commented] (HBASE-12450) Unbalance chaos monkey might kill all region servers without starting them back

    [ https://issues.apache.org/jira/browse/HBASE-12450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14202982#comment-14202982 ] 

Andrew Purtell commented on HBASE-12450:
----------------------------------------

+1

> Unbalance chaos monkey might kill all region servers without starting them back
> -------------------------------------------------------------------------------
>
>                 Key: HBASE-12450
>                 URL: https://issues.apache.org/jira/browse/HBASE-12450
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Virag Kothari
>            Assignee: Virag Kothari
>            Priority: Minor
>             Fix For: 0.98.8, 0.99.2
>
>         Attachments: HBASE-12450.patch
>
>
> UnbalanceKillAndRebalanceAction kills region servers, runs the balancer, and then starts the killed servers again. But if the balance fails, an exception is thrown and the region servers are never started. For me, the balance kept failing with a socket timeout (default 1 min) because the master runs one iteration of the balancer for up to 5 mins (default config). Eventually all servers are killed but never started back.
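
The failure mode described above can be illustrated with a minimal sketch. This is not the attached patch and not the real HBase chaos monkey API: the Cluster interface and the killBalanceRestart method are hypothetical names made up for illustration. It only shows one way to make sure that servers that were killed get started again even when the balance call throws (for example on a socket timeout), by doing the restart in a finally block.

```java
import java.util.ArrayList;
import java.util.List;

public class KillBalanceRestartSketch {

    /** Hypothetical stand-in for the cluster operations; not an HBase interface. */
    public interface Cluster {
        void kill(String server);
        void balance() throws Exception;
        void start(String server);
    }

    /**
     * Kills the given servers, triggers a balance, and restarts every server
     * that was actually killed. The restart runs in a finally block, so a
     * failed balance cannot leave the killed servers down.
     */
    public static void killBalanceRestart(Cluster cluster, List<String> victims)
            throws Exception {
        List<String> killed = new ArrayList<>();
        try {
            for (String server : victims) {
                cluster.kill(server);
                killed.add(server); // remember only what was really killed
            }
            cluster.balance();      // may throw, e.g. on a client-side timeout
        } finally {
            for (String server : killed) {
                cluster.start(server);
            }
        }
    }
}
```

With this shape the balance exception still propagates to the caller, so the failure stays visible, but the cluster gets its region servers back first.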


