Posted to issues@hbase.apache.org by "stack (JIRA)" <ji...@apache.org> on 2018/09/12 19:19:00 UTC

[jira] [Updated] (HBASE-21191) Add a holding-pattern if no assign for meta or namespace (Can happen if masterprocwals have been cleared).

     [ https://issues.apache.org/jira/browse/HBASE-21191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-21191:
--------------------------
    Attachment: HBASE-21191.branch-2.1.001.patch

> Add a holding-pattern if no assign for meta or namespace (Can happen if masterprocwals have been cleared).
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-21191
>                 URL: https://issues.apache.org/jira/browse/HBASE-21191
>             Project: HBase
>          Issue Type: Sub-task
>          Components: amv2
>            Reporter: stack
>            Assignee: stack
>            Priority: Major
>         Attachments: HBASE-21191.branch-2.1.001.patch
>
>
> If the masterprocwals have been removed -- operator error, HDFS data loss, or because we have gotten ourselves into a pathological state where we have hundreds of masterprocwals to process and it is taking too long so we just want to start over -- then master startup will have a dilemma. Master startup needs hbase:meta to be online. If the masterprocwals have been removed, there may be no outstanding assign or servercrashprocedure with coverage for hbase:meta (I ran into this issue repeatedly in internal testing when purging masterprocwals on a large test cluster). Worse, when master startup cannot find an online hbase:meta, it exits after exhausting the RPC retries.
> So, we need a holding-pattern for master startup when hbase:meta is not online, if only so that an operator can schedule an assign for meta or schedule fixup procedures (HBASE-20786 has discussion on why we cannot just auto-schedule an assign of meta).
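
For illustration only, the holding-pattern described above might look roughly like the sketch below. This is not the code in the attached patch: the class name, the method name, and both suppliers are hypothetical stand-ins for whatever the master actually uses to check meta state and shutdown state. It only shows the idea of polling and waiting instead of exiting once retries run out.

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch; none of these names come from HBase itself.
public class MetaHoldingPattern {

    /**
     * Holds the caller until metaOnline reports true.
     * Returns true if hbase:meta came online, false if the wait was
     * abandoned because a stop was requested or the thread was interrupted.
     */
    public static boolean waitForMeta(BooleanSupplier metaOnline,
                                      BooleanSupplier stopRequested,
                                      long pollMillis) {
        while (!metaOnline.getAsBoolean()) {
            if (stopRequested.getAsBoolean()) {
                return false; // master is shutting down; give up the wait
            }
            // Tell the operator why startup is parked rather than exiting.
            System.out.println("hbase:meta is not online; holding until an assign is scheduled");
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
        return true;
    }
}
```

Under these assumptions, startup would call something like `MetaHoldingPattern.waitForMeta(metaCheck, stopCheck, 1000L)` and continue only when it returns true, which gives an operator time to schedule the assign by hand.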



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)