Posted to issues@hbase.apache.org by "Xiang Li (JIRA)" <ji...@apache.org> on 2019/03/05 15:13:00 UTC

[jira] [Commented] (HBASE-20690) Moving table to target rsgroup needs to handle TableStateNotFoundException

    [ https://issues.apache.org/jira/browse/HBASE-20690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16784549#comment-16784549 ] 

Xiang Li commented on HBASE-20690:
----------------------------------

Hi [~xucang], I have been working on this JIRA for some days and would like to provide some updates.
# I think patch v002, uploaded by [~andrewcheng], is the right way to go.
# However, patch v002 also introduces some potential issues, as described below.

Patch v002 moves the following operations from postCreateTable() to preCreateTableAction():
# Decide which rsgroup the table goes to, according to the table's namespace
# Call RSGroupInfoManagerImpl#moveTables() to update the rsgroup information (without actually moving the regions)

The changes above also affect the procedure that creates the hbase:rsgroup table when HMaster starts, by triggering a race condition on the cluster schema service. In HMaster#finishActiveMasterInitialization(), hbase:rsgroup is created by
{code}
this.balancer.initialize();  // line 1060
{code}
which calls HMaster#createSystemTable() internally, where a CreateTableProcedure is scheduled. preCreateTableAction() is then called, and the following statement is executed to look up the rsgroup configured on the table's namespace
{code}
String groupName =
        master.getClusterSchema().getNamespace(desc.getTableName().getNamespaceAsString())
                .getConfigurationValue(RSGroupInfo.NAMESPACE_DESC_PROP_GROUP);
{code}
But getClusterSchema() might return null here because the cluster schema service is not ready yet. It does not become ready until the following statement, which comes later (line 1132 versus line 1060), is called in HMaster#finishActiveMasterInitialization()
{code}
initClusterSchemaService();   // line 1132
{code}
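
To illustrate the kind of guard I have in mind, here is a minimal sketch. It is not taken from any patch: SchemaStub, resolveGroup() and the property key are hypothetical stand-ins for the real HBase classes, and falling back to the default rsgroup when the schema service is not yet available is only one possible choice.

```java
import java.util.Collections;
import java.util.Map;

public class RsGroupLookupSketch {
    /** Hypothetical stand-in for the cluster schema service; NOT the HBase ClusterSchema interface. */
    interface SchemaStub {
        /** Returns the configuration of the namespace, or null if the namespace is unknown. */
        Map<String, String> getNamespaceConfig(String namespace);
    }

    // Illustrative key standing in for RSGroupInfo.NAMESPACE_DESC_PROP_GROUP (assumed value).
    static final String GROUP_PROP = "hbase.rsgroup.name";
    static final String DEFAULT_GROUP = "default";

    /** Resolves the rsgroup for a namespace, tolerating a schema service that is not ready. */
    static String resolveGroup(SchemaStub schema, String namespace) {
        if (schema == null) {
            // Schema service not initialized yet (e.g. during master startup): fall back.
            return DEFAULT_GROUP;
        }
        Map<String, String> conf = schema.getNamespaceConfig(namespace);
        if (conf == null) {
            return DEFAULT_GROUP;
        }
        String group = conf.get(GROUP_PROP);
        return group != null ? group : DEFAULT_GROUP;
    }

    public static void main(String[] args) {
        // Schema service not ready: the lookup must not throw a NullPointerException.
        System.out.println(resolveGroup(null, "hbase")); // prints "default"

        // Schema service ready and the namespace carries an rsgroup setting.
        SchemaStub ready = ns -> Collections.singletonMap(GROUP_PROP, "group1");
        System.out.println(resolveGroup(ready, "demo")); // prints "group1"
    }
}
```

Whether a fallback like this is acceptable for system tables, or whether the creation of hbase:rsgroup should instead wait for initClusterSchemaService(), still needs to be decided.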


> Moving table to target rsgroup needs to handle TableStateNotFoundException
> --------------------------------------------------------------------------
>
>                 Key: HBASE-20690
>                 URL: https://issues.apache.org/jira/browse/HBASE-20690
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Ted Yu
>            Assignee: Xiang Li
>            Priority: Major
>         Attachments: HBASE-20690.master.001.patch, HBASE-20690.master.002.patch
>
>
> This is related code:
> {code}
> if (targetGroup != null) {
>   for (TableName table: tables) {
>     if (master.getAssignmentManager().isTableDisabled(table)) {
>       LOG.debug("Skipping move regions because the table" + table + " is disabled.");
>       continue;
>     }
> {code}
> In a stack trace [~rmani] showed me:
> {code}
> 2018-06-06 07:10:44,893 ERROR [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=20000] master.TableStateManager: Unable to get table demo:tbl1 state
> org.apache.hadoop.hbase.master.TableStateManager$TableStateNotFoundException: demo:tbl1
> at org.apache.hadoop.hbase.master.TableStateManager.getTableState(TableStateManager.java:193)
> at org.apache.hadoop.hbase.master.TableStateManager.isTableState(TableStateManager.java:143)
> at org.apache.hadoop.hbase.master.assignment.AssignmentManager.isTableDisabled(AssignmentManager.java:346)
> at org.apache.hadoop.hbase.rsgroup.RSGroupAdminServer.moveTables(RSGroupAdminServer.java:407)
> at org.apache.hadoop.hbase.rsgroup.RSGroupAdminEndpoint.assignTableToGroup(RSGroupAdminEndpoint.java:447)
> at org.apache.hadoop.hbase.rsgroup.RSGroupAdminEndpoint.postCreateTable(RSGroupAdminEndpoint.java:470)
> at org.apache.hadoop.hbase.master.MasterCoprocessorHost$12.call(MasterCoprocessorHost.java:334)
> at org.apache.hadoop.hbase.master.MasterCoprocessorHost$12.call(MasterCoprocessorHost.java:331)
> at org.apache.hadoop.hbase.coprocessor.CoprocessorHost$ObserverOperationWithoutResult.callObserver(CoprocessorHost.java:540)
> at org.apache.hadoop.hbase.coprocessor.CoprocessorHost.execOperation(CoprocessorHost.java:614)
> at org.apache.hadoop.hbase.master.MasterCoprocessorHost.postCreateTable(MasterCoprocessorHost.java:331)
> at org.apache.hadoop.hbase.master.HMaster$3.run(HMaster.java:1768)
> at org.apache.hadoop.hbase.master.procedure.MasterProcedureUtil.submitProcedure(MasterProcedureUtil.java:131)
> at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:1750)
> at org.apache.hadoop.hbase.master.MasterRpcServices.createTable(MasterRpcServices.java:593)
> at org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:131)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
> {code}
> The logic should take potential TableStateNotFoundException into account.


