Posted to issues@hbase.apache.org by "Andrew Kyle Purtell (Jira)" <ji...@apache.org> on 2021/10/20 15:46:00 UTC

[jira] [Commented] (HBASE-26383) HBCK incorrectly reports inconsistencies for recently split regions following a master failover

    [ https://issues.apache.org/jira/browse/HBASE-26383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17431326#comment-17431326 ] 

Andrew Kyle Purtell commented on HBASE-26383:
---------------------------------------------

Let me apply the recommended change and, assuming successful review, make a 2.4.8 release to ship it. 

> HBCK incorrectly reports inconsistencies for recently split regions following a master failover
> -----------------------------------------------------------------------------------------------
>
>                 Key: HBASE-26383
>                 URL: https://issues.apache.org/jira/browse/HBASE-26383
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 2.4.3
>            Reporter: Benoit Sigoure
>            Assignee: Andrew Kyle Purtell
>            Priority: Critical
>             Fix For: 2.4.8
>
>
> When a region P splits into A and B, then following a master failover the newly active master reports that P is in an inconsistent state. This appears to be a regression introduced in HBASE-25847 (cc [~andrew.purtell@gmail.com]), which changed {{regionInfo.isParentSplit()}} to {{regionState.isSplit()}}. After the restart the region state is CLOSED (rather than SPLIT), so both the region state and the region info should be checked, presumably with {{regionState.isSplit() || regionInfo.isSplit()}}. The situation resolves itself once a major compaction occurs and P is GCed, but having the master incorrectly report inconsistencies is pretty bad. We had a pretty big outage due to a series of operator errors as our SRE team tried to fix an inconsistency that, in fact, didn't even exist.
> Thanks to Stack for helping look over this issue and Vlad Hanciuta for root causing the bug.
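
For illustration only, the difference between the two checks described in the report can be sketched as below. This is a minimal model, not the HBase code or the actual patch: the classes RegionState, RegionInfo and State here are hypothetical stand-ins defined inside the example, and the method names isSplitParentStateOnly and isSplitParent are made up for the sketch. It assumes, as the report states, that after a master failover the parent's region state is CLOSED while its region info still records the split.

```java
public class SplitParentCheck {
    // Hypothetical stand-ins, NOT the HBase classes of the same name.
    enum State { OPEN, CLOSED, SPLIT }

    // Models the master's in-memory view of a region.
    static final class RegionState {
        private final State state;
        RegionState(State state) { this.state = state; }
        boolean isSplit() { return state == State.SPLIT; }
    }

    // Models the region descriptor, which remembers that the region was split.
    static final class RegionInfo {
        private final boolean split;
        RegionInfo(boolean split) { this.split = split; }
        boolean isSplit() { return split; }
    }

    // State-only check, as the report describes the behaviour after HBASE-25847.
    static boolean isSplitParentStateOnly(RegionState rs, RegionInfo ri) {
        return rs.isSplit();
    }

    // Combined check suggested in the report: either source may mark the split.
    static boolean isSplitParent(RegionState rs, RegionInfo ri) {
        return rs.isSplit() || ri.isSplit();
    }

    public static void main(String[] args) {
        // Parent region P after a master failover, per the report:
        // state is CLOSED, but the region info still says it was split.
        RegionState stateAfterFailover = new RegionState(State.CLOSED);
        RegionInfo parentInfo = new RegionInfo(true);

        // Prints false: P is not recognised as a split parent.
        System.out.println(isSplitParentStateOnly(stateAfterFailover, parentInfo));
        // Prints true: P is recognised as a split parent.
        System.out.println(isSplitParent(stateAfterFailover, parentInfo));
    }
}
```

In this model the state-only check misses P, which is the case the report says gets flagged as an inconsistency, while the combined check still recognises P as a split parent before and after the failover.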



--
This message was sent by Atlassian Jira
(v8.3.4#803005)