Posted to issues@hbase.apache.org by "Ted Yu (JIRA)" <ji...@apache.org> on 2018/01/15 18:17:00 UTC

[jira] [Comment Edited] (HBASE-19757) System table gets stuck after enabling region server group feature in secure cluster

    [ https://issues.apache.org/jira/browse/HBASE-19757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16325436#comment-16325436 ] 

Ted Yu edited comment on HBASE-19757 at 1/15/18 6:16 PM:
---------------------------------------------------------

In master, we have the following code in RSGroupInfoManagerImpl#refresh():
{code:java}
    if(!masterServices.isInitialized()) {
      specialTables = Arrays.asList(AccessControlLists.ACL_TABLE_NAME, TableName.META_TABLE_NAME,
          TableName.NAMESPACE_TABLE_NAME, RSGROUP_TABLE_NAME);
    } else {
      specialTables =
          masterServices.listTableNamesByNamespace(NamespaceDescriptor.SYSTEM_NAMESPACE_NAME_STR);
    }
{code}
If the acl table is about to be created but does not exist yet, the call in the else branch may return a list that does not include hbase:acl as one of the special tables.

In RSGroupBasedLoadBalancer, since hbase:acl then has no rs group, no server is provided for the hbase:acl table, leading to the deadlock.
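
The gap can be illustrated with a minimal, self-contained sketch. It uses plain strings instead of HBase classes; the class name, the method names and the "union" variant are hypothetical and are not taken from any of the attached patches. The table name strings are assumed to match the constants in the snippet above.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not the HBase implementation.
public class SpecialTablesSketch {
    // Assumed string forms of the constants listed in the if branch.
    static final List<String> KNOWN_SYSTEM_TABLES =
        Arrays.asList("hbase:acl", "hbase:meta", "hbase:namespace", "hbase:rsgroup");

    // Mimics the else branch: only tables that already exist in the
    // system namespace are treated as special.
    static Set<String> fromNamespaceListing(List<String> existingSystemTables) {
        return new LinkedHashSet<>(existingSystemTables);
    }

    // One possible alternative (an assumption, not the committed fix):
    // always start from the hard-coded list, then add whatever the
    // namespace listing returns.
    static Set<String> unionWithKnown(List<String> existingSystemTables) {
        Set<String> result = new LinkedHashSet<>(KNOWN_SYSTEM_TABLES);
        result.addAll(existingSystemTables);
        return result;
    }

    public static void main(String[] args) {
        // hbase:acl has not been created yet at this point.
        List<String> existing =
            Arrays.asList("hbase:meta", "hbase:namespace", "hbase:rsgroup");

        // prints "false": hbase:acl is missed, so it gets no rs group
        System.out.println(fromNamespaceListing(existing).contains("hbase:acl"));
        // prints "true": the hard-coded list still covers hbase:acl
        System.out.println(unionWithKnown(existing).contains("hbase:acl"));
    }
}
```

With the listing-only variant, a table that is created after the listing is taken is never marked as special, which matches the symptom described above.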


was (Author: yuzhihong@gmail.com):
In master, we have the following code in RSGroupInfoManagerImpl#refresh()
{code}
    if(!masterServices.isInitialized()) {
      specialTables = Arrays.asList(AccessControlLists.ACL_TABLE_NAME, TableName.META_TABLE_NAME,
          TableName.NAMESPACE_TABLE_NAME, RSGROUP_TABLE_NAME);
    } else {
      specialTables =
          masterServices.listTableNamesByNamespace(NamespaceDescriptor.SYSTEM_NAMESPACE_NAME_STR);
    }
{code}
If the acl table is about to be created, the call in the else branch may end up not having hbase:acl as one of the special tables.
By always using the assignment in the if block, TestRSGroupsWithACL passes.

> System table gets stuck after enabling region server group feature in secure cluster
> ------------------------------------------------------------------------------------
>
>                 Key: HBASE-19757
>                 URL: https://issues.apache.org/jira/browse/HBASE-19757
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Ted Yu
>            Assignee: Ted Yu
>            Priority: Major
>         Attachments: 19757.v1.txt, 19757.v2.txt, 19757.v3.txt
>
>
> I was testing on an hbase-2 secure cluster against hadoop 3 where some tables were created without the region server group feature.
> After adding RSGroupAdminEndpoint and RSGroupBasedLoadBalancer to hbase-site, I restarted the whole cluster.
> After the restart, the hbase:meta region got stuck in transition (forever).
> {code}
> 2018-01-10 21:20:16,696 INFO  [org.apache.hadoop.hbase.rsgroup.RSGroupInfoManagerImpl$RSGroupStartupWorker-ctr-e137-1514896590304-8706-01-000002.hwx.site,20000,1515619212617]  zookeeper.MetaTableLocator: Failed verification of hbase:meta,,1 at address=ctr-e137-1514896590304-8706-01-000004.hwx.site,16020,1515618538016, exception=org.apache.hadoop.hbase.NotServingRegionException: hbase:meta,,1 is not online on ctr-e137-1514896590304-8706-01-000004.hwx.site,16020,1515619181453
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:3314)
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:3291)
>         at org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1355)
>         at org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegionInfo(RSRpcServices.java:1667)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)