Posted to dev@hbase.apache.org by "Biju Nair (JIRA)" <ji...@apache.org> on 2018/05/23 23:07:00 UTC

[jira] [Created] (HBASE-20632) Failure of RSes belonging to RSgroup for System tables makes the cluster unavailable

Biju Nair created HBASE-20632:
---------------------------------

             Summary: Failure of RSes belonging to RSgroup for System tables makes the cluster unavailable
                 Key: HBASE-20632
                 URL: https://issues.apache.org/jira/browse/HBASE-20632
             Project: HBase
          Issue Type: Bug
          Components: master, regionserver
    Affects Versions: 3.0.0
            Reporter: Biju Nair


This was reproduced on a local (non-HDFS) cluster. The steps are as follows:
 * Start a single-node cluster and start an additional RS using {{local-regionservers.sh}}
 * Through the hbase shell, add a new rsgroup:
{noformat}
hbase(main):001:0> add_rsgroup 'test_rsgroup'
Took 0.5503 seconds
hbase(main):002:0> list_rsgroups
NAME          SERVER / TABLE
test_rsgroup
default       server dob2-r3n13:16020
              server dob2-r3n13:16022
              table hbase:meta
              table hbase:acl
              table hbase:quota
              table hbase:namespace
              table hbase:rsgroup
2 row(s)
Took 0.0419 seconds{noformat}

 * Move one of the region servers to the new {{rsgroup}}:
{noformat}
hbase(main):004:0> move_servers_rsgroup 'test_rsgroup',['dob2-r3n13:16020']
Took 6.4894 seconds
hbase(main):005:0> exit{noformat}

 * Stop the region server that is left in the {{default}} rsgroup:
{noformat}
local-regionservers.sh stop 2{noformat}
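
For convenience, the steps above can be collected into one script. This is only a sketch: the group name, host and port are the ones from this report, the {{rsgroup_commands}} helper and the {{RUN_REPRO}} switch are made up for illustration, and it assumes the HBase {{bin}} directory is on the {{PATH}} with the single-node cluster already running. It also assumes that the RS started with offset 2 is the one listed on port 16022.

```shell
#!/usr/bin/env bash
# Sketch of the reproduction steps from this report (not an official tool).
set -euo pipefail

RSGROUP="test_rsgroup"
SERVER_TO_MOVE="dob2-r3n13:16020"   # host:port as shown by list_rsgroups

# Hypothetical helper: prints the HBase shell commands used in the report.
rsgroup_commands() {
  cat <<EOF
add_rsgroup '${RSGROUP}'
list_rsgroups
move_servers_rsgroup '${RSGROUP}',['${SERVER_TO_MOVE}']
EOF
}

# Only touch the cluster when explicitly asked to (RUN_REPRO=1).
if [ "${RUN_REPRO:-0}" = "1" ]; then
  local-regionservers.sh start 2      # start the additional RS
  rsgroup_commands | hbase shell -n   # -n runs the shell non-interactively
  local-regionservers.sh stop 2       # stop the RS left in the default rsgroup
fi
```

Running it with {{RUN_REPRO=1}} should leave the cluster in the state described below; without it, the script only defines the helper and changes nothing.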

The cluster becomes unusable, and stays that way even if the region server is restarted or all the services are brought down and back up.

In version {{1.1.x}}, the cluster recovers fine. It looks like {{meta}} is assigned to a {{dummy}} regionserver, and when the regionserver is restarted, {{meta}} gets assigned. The following is what we can see in the {{master}} UI when the {{rs}} is down:
{noformat}
1588230740	hbase:meta,,1.1588230740 state=PENDING_OPEN, ts=Wed May 23 18:24:01 EDT 2018 (1s ago), server=localhost,1,1{noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)