Posted to issues@hbase.apache.org by "Wellington Chevreuil (Jira)" <ji...@apache.org> on 2024/04/25 09:20:00 UTC

[jira] [Assigned] (HBASE-28533) Region split failure due to region quota limit leaves Hmaster's in memory state for the region in SPLITTING after procedure rollback

     [ https://issues.apache.org/jira/browse/HBASE-28533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wellington Chevreuil reassigned HBASE-28533:
--------------------------------------------

    Assignee: Daniel Roudnitsky

> Region split failure due to region quota limit leaves Hmaster's in memory state for the region in SPLITTING after procedure rollback
> ------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-28533
>                 URL: https://issues.apache.org/jira/browse/HBASE-28533
>             Project: HBase
>          Issue Type: Bug
>          Components: Region Assignment
>         Environment: Tested on HBase Version 2.5.8 and latest master branch 
>            Reporter: Daniel Roudnitsky
>            Assignee: Daniel Roudnitsky
>            Priority: Major
>
> When a SplitTableRegionProcedure runs for a region whose namespace is at its maximum region quota, the split procedure fails and rolls back, but HMaster's in-memory RegionStateNode for the region is left in the SPLITTING state. HMaster then refuses to start any subsequent merge/split/move procedure for that region because it believes the region is not OPEN, until the master is restarted and its in-memory record of region states is reset.
> In the first step of the split procedure, SPLIT_TABLE_REGION_PREPARE, the parent region's RegionStateNode state is set to SPLITTING; this transition is not written to the meta table. In the next step, SPLIT_TABLE_REGION_PRE_OPERATION, the region quota check is done, a QuotaExceededException is thrown, and the procedure ends in the ROLLEDBACK state without reverting the RegionStateNode to OPEN. HMaster is left believing the region is SPLITTING according to its in-memory RegionStates, while according to meta and the assigned region server the region is still online.
> To reproduce in HBase shell:
> {code:java}
> > create_namespace 'test_ns', {'hbase.namespace.quota.maxregions'=> 2}
> > create 'test_ns:test_table', 'f1', {NUMREGIONS => 2, SPLITALGO => 'UniformSplit'}
> > region_a = <first region from list_regions 'test_ns:test_table'>
> > region_b = <second region from list_regions 'test_ns:test_table'>
> > split region_a, 'x'
> # HMaster will report: 
> pid=405, state=ROLLEDBACK, exception=org.apache.hadoop.hbase.quotas.QuotaExceededException via master-split-regions:org.apache.hadoop.hbase.quotas.QuotaExceededException: Region split not possible for :<region_a> as quota limits are exceeded ; SplitTableRegionProcedure table=test_ns:test_table, parent=...
> > merge_region region_a, region_b
> ERROR: org.apache.hadoop.hbase.exceptions.MergeRegionException: org.apache.hadoop.hbase.client.DoNotRetryRegionException: <region_a> is not OPEN; state=SPLITTING
> > stop_master # trigger HMaster failover
> > merge_region region_a, region_b # merge now succeeds {code}
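The failure mode described above can be illustrated with a small toy model. This is only a sketch, not HBase code: the class SplitRollbackSketch, the types RegionNode and QuotaExceeded, the method runSplit, and the revertOnRollback flag are all hypothetical names invented for illustration, and the quota rule used (a split adds one net region) is an assumption of the model. It shows how a state change made in an early step survives a rollback unless the rollback path explicitly undoes it.

```java
/** Toy model only: none of these types are real HBase classes. */
public class SplitRollbackSketch {

    enum RegionState { OPEN, SPLITTING }

    /** Hypothetical stand-in for the master's in-memory RegionStateNode. */
    static final class RegionNode {
        RegionState state = RegionState.OPEN;
    }

    /** Hypothetical stand-in for QuotaExceededException. */
    static final class QuotaExceeded extends RuntimeException {
        QuotaExceeded(String msg) { super(msg); }
    }

    /**
     * Runs the two steps from the description and rolls back on a quota failure.
     * revertOnRollback=false models the reported behaviour;
     * revertOnRollback=true models one possible fix (an assumption, not the actual patch).
     */
    static RegionState runSplit(RegionNode node, int regionsInNamespace,
                                int maxRegions, boolean revertOnRollback) {
        try {
            // Step 1 (PREPARE): the state changes in memory only, nothing is persisted.
            node.state = RegionState.SPLITTING;

            // Step 2 (PRE_OPERATION): quota check. Assumption of this model:
            // a split replaces one parent with two daughters, i.e. one extra region.
            if (regionsInNamespace + 1 > maxRegions) {
                throw new QuotaExceeded("quota limits are exceeded");
            }
            // Later steps of a successful split are omitted from this sketch.
        } catch (QuotaExceeded e) {
            // Rollback: without an explicit revert the node stays SPLITTING.
            if (revertOnRollback) {
                node.state = RegionState.OPEN;
            }
        }
        return node.state;
    }

    public static void main(String[] args) {
        // Namespace with 2 regions and maxregions = 2, as in the reproduction steps.
        System.out.println(runSplit(new RegionNode(), 2, 2, false)); // prints SPLITTING
        System.out.println(runSplit(new RegionNode(), 2, 2, true));  // prints OPEN
    }
}
```

In the first call the toy region stays in SPLITTING after the rollback, which corresponds to the "is not OPEN; state=SPLITTING" error seen on the later merge_region attempt. In the second call the rollback path restores OPEN.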



--
This message was sent by Atlassian Jira
(v8.20.10#820010)