Posted to user@hbase.apache.org by WangYQ <wa...@163.com> on 2016/10/31 02:15:01 UTC

Possible bug in ServerShutDownHandler that leaves a region stuck in RIT

HBase version: 0.98.10
When a region server (RS) dies, the master uses ServerShutDownHandler to process the dead RS.
ServerShutDownHandler calls RegionStates.serverOffline to get all the regions that need to be reassigned.


In the method RegionStates.serverOffline, at line 538, there is:
LOG.warn("THIS SHOULD NOT HAPPEN......");


However, there is a scenario that makes line 538 execute and leaves a region stuck in RIT (region in transition).


Steps to reproduce the problem:
1. HMaster asks rs1 to open region1.
2. rs1 opens region1 successfully.
3. HMaster uses the method "handleRegion" to process the RS_ZK_REGION_OPEN ZK event,
    which includes: deleting the ZNode, updating RIT, moving the region to the online state, and updating lastAssignment.
4. Before HMaster processes the node-deleted event, rs1 dies,
    so HMaster skips the remaining steps that would bring the region online.
5. HMaster submits ServerShutDownHandler to process the dead rs1.
   It finds region1 in RIT in the online state, and the serverName in RIT equals the dead server.


Then, in the method RegionStates.serverOffline, line 538 is reached and this is logged:
LOG.warn("THIS SHOULD NOT HAPPEN......");


As a result, region1 stays in RIT until we restart HMaster to refresh its in-memory state.
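
To make the sequence easier to follow, here is a small toy model of the race. It is only a sketch and is not the real HBase code: the class RitRaceSketch, its State enum, and the methods updateRit, nodeDeleted, isInTransition and warnings are all hypothetical names made up for illustration. It also assumes that serverOffline only picks up regions that are still being opened, and that anything else on the dead server falls into the warn branch and is left in RIT, which is the behaviour described in the steps above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model of the race; NOT the real HBase classes. All names are hypothetical.
class RitRaceSketch {
    enum State { PENDING_OPEN, OPENING, OPEN }

    // One entry of the (hypothetical) regions-in-transition map.
    static final class RegionState {
        final String server;
        final State state;

        RegionState(String server, State state) {
            this.server = server;
            this.state = state;
        }
    }

    private final Map<String, RegionState> regionsInTransition = new LinkedHashMap<>();
    private final List<String> warnings = new ArrayList<>();

    // Steps 1-3: the master records the region's state while it is in transition.
    void updateRit(String region, String server, State state) {
        regionsInTransition.put(region, new RegionState(server, state));
    }

    // End of step 3 (node-deleted event handled): the region leaves RIT.
    void nodeDeleted(String region) {
        regionsInTransition.remove(region);
    }

    // Step 5: collect the regions in RIT on the dead server that should be reassigned.
    // Assumption: only regions still being opened are picked up; anything else
    // lands in the warn branch and is left in RIT.
    List<String> serverOffline(String deadServer) {
        List<String> toReassign = new ArrayList<>();
        for (Map.Entry<String, RegionState> e : regionsInTransition.entrySet()) {
            RegionState rs = e.getValue();
            if (!deadServer.equals(rs.server)) {
                continue;
            }
            if (rs.state == State.PENDING_OPEN || rs.state == State.OPENING) {
                toReassign.add(e.getKey());
            } else {
                warnings.add("THIS SHOULD NOT HAPPEN: " + e.getKey() + " in state " + rs.state);
            }
        }
        return toReassign;
    }

    boolean isInTransition(String region) {
        return regionsInTransition.containsKey(region);
    }

    List<String> warnings() {
        return warnings;
    }

    public static void main(String[] args) {
        RitRaceSketch master = new RitRaceSketch();
        master.updateRit("region1", "rs1", State.OPENING); // steps 1-2
        master.updateRit("region1", "rs1", State.OPEN);    // step 3, only partly done
        // Step 4: rs1 dies here, so nodeDeleted("region1") is never called.
        List<String> toReassign = master.serverOffline("rs1"); // step 5
        System.out.println("to reassign: " + toReassign);
        System.out.println("warnings: " + master.warnings());
        System.out.println("region1 still in RIT: " + master.isInTransition("region1"));
    }
}
```

In this model region1 is never returned for reassignment and is never removed from the map, which matches what we see on the master: the region stays in RIT and only the warning is logged.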