Posted to issues@hbase.apache.org by "Guanghao Zhang (JIRA)" <ji...@apache.org> on 2019/03/04 09:04:00 UTC

[jira] [Updated] (HBASE-21156) [hbck2] Queue an assign of hbase:meta and bulk assign/unassign

     [ https://issues.apache.org/jira/browse/HBASE-21156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Guanghao Zhang updated HBASE-21156:
-----------------------------------
    Fix Version/s: 2.2.0
                   3.0.0

> [hbck2] Queue an assign of hbase:meta and bulk assign/unassign
> --------------------------------------------------------------
>
>                 Key: HBASE-21156
>                 URL: https://issues.apache.org/jira/browse/HBASE-21156
>             Project: HBase
>          Issue Type: Sub-task
>          Components: hbck2
>    Affects Versions: 2.1.0
>            Reporter: stack
>            Assignee: stack
>            Priority: Critical
>             Fix For: 3.0.0, 2.2.0, 2.1.1
>
>         Attachments: HBASE-21156.branch-2.1.001.patch, HBASE-21156.branch-2.1.002.patch, HBASE-21156.branch-2.1.003.patch, HBASE-21156.branch-2.1.004.patch, HBASE-21156.branch-2.1.005.patch
>
>
> We need this to effect repair when there is damage.
> If procedure WALs AND a server WAL dir are lost or cleaned, or we crash during a partial split (unlikely scenarios but nonetheless possible), a Master can be stuck, unable to become active, because there is no assign procedure for hbase:meta in the system.
> The reasonable argument over in HBASE-21035 has it that attempts at auto-repair under these extremes could cause other issues, so, at least until we learn more, we punt to the operator for fix-up.
> To reproduce the catastrophe, see notes in HBASE-21035 (and [~allan163]'s test).
> UPDATE: HBASE-21191 makes the Master assume a "holding pattern" if on startup it does not have an assign for meta (possible if we lose all Master WAL Procs). The holding pattern is needed because we were exiting after one minute of RPC'ing to the old meta location. To inject an assign, Admin#assign won't work: it gets rejected because the "Master is initializing". So we need to be able to assign hbase:meta even if the "Master is initializing". Also, while in here, add the ability to bulk assign, because assigning a Region-at-a-time from the shell only works if the offlined region count is in the low 10s; it fails when thousands are offline.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)