Posted to issues@hbase.apache.org by "Allan Yang (JIRA)" <ji...@apache.org> on 2018/11/01 11:29:00 UTC

[jira] [Created] (HBASE-21421) Do not kill RS if reportOnlineRegions fails

Allan Yang created HBASE-21421:
----------------------------------

             Summary: Do not kill RS if reportOnlineRegions fails
                 Key: HBASE-21421
                 URL: https://issues.apache.org/jira/browse/HBASE-21421
             Project: HBase
          Issue Type: Sub-task
    Affects Versions: 2.0.2, 2.1.1
            Reporter: Allan Yang
            Assignee: Allan Yang


In the periodic regionServerReport call from the RS to the master, the master calls master.getAssignmentManager().reportOnlineRegions() to check whether the RS's state differs from the master's. If the RS holds a region that the master thinks should be on another RS, the master kills the RS.

But the regionServerReport can lag (due to network delays, for example), so it may not represent the current state of the RegionServer. Besides, when onlining a region we call reportRegionStateTransition and retry forever until it is successfully reported to the master. We can rely on the reportRegionStateTransition calls instead.

I have encountered cases where the regions were closed on the RS and reportRegionStateTransition reported this to the master successfully. But later, a lagging regionServerReport told the master that the region was online on the RS (which it no longer was; the call was probably generated some time earlier and delayed by the network somehow). The master then thought the region should be on another RS and killed the RS, which should not happen.
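
The sequence above can be replayed with a toy model. This is only a sketch and not HBase code: the class StaleReportSketch, its methods, and the killOnMismatch flag are hypothetical names made up for illustration, and the real logic in the AssignmentManager is more involved. It shows a periodic report that was built before a region moved and arrives afterwards, first with the kill-on-mismatch behavior and then with a warn-only behavior along the lines this issue proposes.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model only, NOT HBase code. Every name in this class is hypothetical.
class StaleReportSketch {
    // Master's view: region name -> server the master believes hosts it.
    private final Map<String, String> regionToServer = new HashMap<>();
    private final Set<String> killedServers = new HashSet<>();
    private final boolean killOnMismatch;

    StaleReportSketch(boolean killOnMismatch) {
        this.killOnMismatch = killOnMismatch;
    }

    // Stands in for the reliable, retried-until-success transition report.
    void reportTransition(String region, String server, boolean opened) {
        if (opened) {
            regionToServer.put(region, server);
        } else {
            // Only clear the mapping if it still points at this server.
            regionToServer.remove(region, server);
        }
    }

    // Stands in for the periodic report, which may arrive late.
    void periodicReport(String server, Set<String> onlineRegions) {
        for (String region : onlineRegions) {
            String expected = regionToServer.get(region);
            if (expected != null && !expected.equals(server)) {
                if (killOnMismatch) {
                    killedServers.add(server);
                } else {
                    System.out.println("WARN: " + server + " reported " + region
                        + " but master expects it on " + expected);
                }
            }
        }
    }

    boolean isKilled(String server) {
        return killedServers.contains(server);
    }

    // Replays the scenario; returns true if rs1 ends up killed.
    static boolean replay(boolean killOnMismatch) {
        StaleReportSketch master = new StaleReportSketch(killOnMismatch);
        master.reportTransition("r1", "rs1", true);

        // rs1 builds its periodic report now, but the network delays it.
        Set<String> delayedReport = Collections.singleton("r1");

        // Meanwhile r1 is closed on rs1 and opened on rs2, both reported reliably.
        master.reportTransition("r1", "rs1", false);
        master.reportTransition("r1", "rs2", true);

        // The stale report finally arrives at the master.
        master.periodicReport("rs1", delayedReport);
        return master.isKilled("rs1");
    }
}
```

Calling StaleReportSketch.replay(true) marks rs1 as killed even though rs1 had already closed the region, while StaleReportSketch.replay(false) only logs a warning and leaves rs1 running.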



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)