Posted to issues@hbase.apache.org by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org> on 2010/07/29 02:26:19 UTC

[jira] Updated: (HBASE-2755) Duplicate assignment of a region after region server recovery

     [ https://issues.apache.org/jira/browse/HBASE-2755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jean-Daniel Cryans updated HBASE-2755:
--------------------------------------

    Fix Version/s: 0.90.0

Running 0.89.20100726, I hit this bug right when I started my little 8-node cluster (which uses a 5-node ZK ensemble hosted elsewhere) carrying about 550 regions. When I disabled my table, 2 regions were left assigned (because they were already double assigned). One thing special about my configuration is that the base scanner scans every 30 seconds instead of 60 (a configuration sketch follows the log below). Here's the evidence that it took the master a long time to even assign the region after determining it was unassigned:

{noformat}
2010-07-28 15:05:18,585 DEBUG org.apache.hadoop.hbase.master.BaseScanner: Current assignment of TestTable,0030543447,1277496465702.4f30002dbe81eeb4c0a11892f4af8033. is not valid;  serverAddress=sv2borg172:60020, startCode=1277492567399 unknown.
2010-07-28 15:05:18,586 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Creating UNASSIGNED region 4f30002dbe81eeb4c0a11892f4af8033 in state = M2ZK_REGION_OFFLINE
2010-07-28 15:05:18,588 DEBUG org.apache.hadoop.hbase.master.HMaster: Event NodeCreated with state SyncConnected with path /hbase-master/UNASSIGNED/4f30002dbe81eeb4c0a11892f4af8033
2010-07-28 15:05:18,588 DEBUG org.apache.hadoop.hbase.master.ZKMasterAddressWatcher: Got event NodeCreated with path /hbase-master/UNASSIGNED/4f30002dbe81eeb4c0a11892f4af8033
2010-07-28 15:05:18,588 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: <sv2borg169,sv2borg167,sv2borg166,sv2borg165,sv2borg164:/hbase-master,org.apache.hadoop.hbase.master.HMaster>Created ZNode /hbase-master/UNASSIGNED/4f30002dbe81eeb4c0a11892f4af8033 in ZooKeeper
2010-07-28 15:05:18,588 DEBUG org.apache.hadoop.hbase.master.ZKUnassignedWatcher: ZK-EVENT-PROCESS: Got zkEvent NodeCreated state:SyncConnected path:/hbase-master/UNASSIGNED/4f30002dbe81eeb4c0a11892f4af8033
2010-07-28 15:05:18,590 DEBUG org.apache.hadoop.hbase.master.RegionManager: Created/updated UNASSIGNED zNode TestTable,0030543447,1277496465702.4f30002dbe81eeb4c0a11892f4af8033. in state M2ZK_REGION_OFFLINE
2010-07-28 15:05:18,590 DEBUG org.apache.hadoop.hbase.master.ZKUnassignedWatcher: Got event type [ M2ZK_REGION_OFFLINE ] for region 4f30002dbe81eeb4c0a11892f4af8033
2010-07-28 15:05:44,314 INFO org.apache.hadoop.hbase.master.RegionManager: Assigning region TestTable,0030543447,1277496465702.4f30002dbe81eeb4c0a11892f4af8033. to sv2borg173,60020,1280354711806
{noformat}
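
The scan interval I mentioned is governed by the master's META rescan frequency. As a minimal sketch of my 30-second setting, assuming the pre-0.90 property name hbase.master.meta.thread.rescanfrequency (normally set in hbase-site.xml; shown in code here only for brevity):

{code}
import org.apache.hadoop.conf.Configuration;

public class ScanFrequencySketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Assumption: hbase.master.meta.thread.rescanfrequency is the pre-0.90
    // property for how often the master's BaseScanner runs, in milliseconds.
    // The default is 60000; this cluster runs at 30000.
    conf.setInt("hbase.master.meta.thread.rescanfrequency", 30 * 1000);
    System.out.println(conf.getInt("hbase.master.meta.thread.rescanfrequency", 60000));
  }
}
{code}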

After that, the actual update of .META. happened at almost the same time as the BaseScanner was reading that row, and double assignment followed. This race was always there, but it looks like assigning regions takes a lot longer than before (I've run a lot of clusters with the base scanner scanning every 5 seconds in the past). I would guess this is because of the new ZK calls, which are done synchronously at the moment.
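
To illustrate why synchronous ZK calls hurt here, a sketch using the plain ZooKeeper client API (this is not the actual ZooKeeperWrapper code): a synchronous create blocks the master thread for a full round trip per region, while the asynchronous overload returns immediately and lets creates for many regions overlap.

{code}
import org.apache.zookeeper.AsyncCallback;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class UnassignedZNodeSketch {

  // Synchronous create: blocks for a full ZK round trip per region,
  // so ~550 regions mean ~550 serialized RPCs from the master.
  static void createSync(ZooKeeper zk, String region, byte[] state)
      throws KeeperException, InterruptedException {
    zk.create("/hbase-master/UNASSIGNED/" + region, state,
        ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
  }

  // Asynchronous create: returns immediately; the result is delivered to
  // the callback, so many creates can be in flight at once.
  static void createAsync(ZooKeeper zk, String region, byte[] state) {
    zk.create("/hbase-master/UNASSIGNED/" + region, state,
        ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT,
        new AsyncCallback.StringCallback() {
          public void processResult(int rc, String path, Object ctx, String name) {
            // In real code, check rc (e.g. retry on connection loss).
          }
        }, null);
  }
}
{code}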

I expect that anyone running with the default 60-second sleep but with 2-3x the number of regions will hit double assignment too.

> Duplicate assignment of a region after region server recovery
> -------------------------------------------------------------
>
>                 Key: HBASE-2755
>                 URL: https://issues.apache.org/jira/browse/HBASE-2755
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: Kannan Muthukkaruppan
>            Assignee: Jonathan Gray
>            Priority: Blocker
>             Fix For: 0.90.0
>
>
> After a region server recovery, some regions may get assigned to duplicate region servers.
> Note: I am running a slightly older trunk (prior to HBASE-2694). Nevertheless, I think HBASE-2694 doesn't address this case.
> Scenario:
> * Three-region-server setup (store285, store286, store287), with about 500 regions in the table overall.
> * kill -9 and restart one of the region servers (store286).
> * The 170-odd regions on the failed region server got assigned out, but two of the regions got assigned to multiple region servers.
> * Looking at the log entries for one such region, it appears that there is a race condition between ProcessRegionOpen (a RegionServerOperation) and the BaseScanner that causes the BaseScanner to think this region needs to be reassigned.
> Relevant Logs:
> Master detects that the server start message (from the restarted RS) is from a server it already knows about, but with a different startcode, so it triggers server recovery. Alternatively, in some cases the recovery is triggered by znode expiry instead, depending on whichever event (restart of the RS or znode expiry) happens first. After that it does log splits etc. for the failed RS; it then also removes the old region server/startcode from the dead servers map. A sketch of the startcode check follows the log.
> {code}
> 2010-06-17 10:26:06,420 INFO org.apache.hadoop.hbase.master.ServerManager: Server start rejected; we already have 10.138.95.182:60020 registered; existingServer=serverName=store286.xyz.com,60020,1276629467680, load=(requests=22, regions=171, usedHeap=6549, maxHeap=11993), newServer=serverName=store286.xyz.com,60020,1276795566511, load=(requests=0, regions=0, usedHeap=0, maxHeap=0)
> 2010-06-17 10:26:06,420 INFO org.apache.hadoop.hbase.master.ServerManager: Triggering server recovery; existingServer looks stale
> 2010-06-17 10:26:06,420 DEBUG org.apache.hadoop.hbase.master.ServerManager: Added=store286.xyz.com,60020,1276629467680 to dead servers, added shutdown processing operation
> ... split log processing...
> 2010-06-17 10:29:51,317 DEBUG org.apache.hadoop.hbase.master.RegionServerOperation: Removed store286.xyz.com,60020,1276629467680 from deadservers Map
> {code}
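>
> To make the trigger concrete, here is a sketch of the startcode comparison reflected in the log above; it uses plain data structures, and all names are illustrative rather than the actual ServerManager code:
> {code}
> import java.util.Map;
> import java.util.concurrent.ConcurrentHashMap;
>
> public class StaleServerCheckSketch {
>   // Known region servers keyed by host:port; the value is the startcode
>   // registered at startup.
>   private final Map<String, Long> serversByAddress =
>       new ConcurrentHashMap<String, Long>();
>
>   void regionServerStartup(String hostPort, long startCode) {
>     Long existing = serversByAddress.get(hostPort);
>     if (existing != null && existing.longValue() != startCode) {
>       // Same host:port but a different startcode: the old process is gone.
>       // Trigger recovery (dead-servers map, shutdown processing, log
>       // splitting) for the stale instance before registering the new one.
>       triggerServerRecovery(hostPort, existing.longValue());
>     }
>     serversByAddress.put(hostPort, startCode);
>   }
>
>   private void triggerServerRecovery(String hostPort, long staleStartCode) {
>     // Hypothetical hook: queue a ProcessServerShutdown-style operation.
>   }
> }
> {code}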
> What follows is the relevant log snippet for one of the regions that gets double assigned.
> Master tries to assign the region to store285. 
> At 10:30:20,006, in ProcessRegionOpen, we update .META. with information about the new assignment. However, at almost the same time (10:30:20,009), the BaseScanner processes this entry but finds the region still assigned to the old region server. There have been some fixes for double assignment in the BaseScanner, because it might be doing a stale read depending on when it started, but it looks like there is still another hole left (see the sketch after the log).
> {code}
> 2010-06-17 10:30:10,186 INFO org.apache.hadoop.hbase.master.RegionManager: Assigning region test1,9301760000,1276657884012.acff724037d739bab9af61e3edef0cc9. to store285.xyz.com,60020,1276629468460
> 2010-06-17 10:30:11,701 INFO org.apache.hadoop.hbase.master.ServerManager: Processing MSG_REPORT_PROCESS_OPEN: test1,9301760000,1276657884012.acff724037d739bab9af61e3edef0cc9. from store285.xyz.com,60020,1276629468460; 8 of
> 2010-06-17 10:30:12,800 INFO org.apache.hadoop.hbase.master.ServerManager: Processing MSG_REPORT_PROCESS_OPEN: test1,9301760000,1276657884012.acff724037d739bab9af61e3edef0cc9. from store285.xyz.com,60020,1276629468460; 7 of
> 2010-06-17 10:30:13,905 INFO org.apache.hadoop.hbase.master.ServerManager: Processing MSG_REPORT_PROCESS_OPEN: test1,9301760000,1276657884012.acff724037d739bab9af61e3edef0cc9. from store285.xyz.com,60020,1276629468460; 6 of
> ...
> 2010-06-17 10:30:20,001 INFO org.apache.hadoop.hbase.master.ServerManager: Processing MSG_REPORT_OPEN: test1,9301760000,1276657884012.acff724037d739bab9af61e3edef0cc9. from store285.xyz.com,60020,1276629468460; 1 of 3
> 2010-06-17 10:30:20,001 INFO org.apache.hadoop.hbase.master.RegionServerOperation: test1,9301760000,1276657884012.acff724037d739bab9af61e3edef0cc9. open on store285.xyz.com,60020,1276629468460
> 2010-06-17 10:30:20,006 INFO org.apache.hadoop.hbase.master.RegionServerOperation: Updated row test1,9301760000,1276657884012.acff724037d739bab9af61e3edef0cc9. in region .META.,,1 with startcode=1276629468460, server=store285.xyz.com:60020
> 2010-06-17 10:30:20,009 DEBUG org.apache.hadoop.hbase.master.BaseScanner: Current assignment of test1,9301760000,1276657884012.acff724037d739bab9af61e3edef0cc9. is not valid;  serverAddress=store286.xyz.com:60020, startCode=1276629467680 unknown.
> {code}
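>
> For clarity, here is a sketch of the validity check behind that last DEBUG line; it is illustrative only, using plain data structures instead of the actual master classes:
> {code}
> import java.util.Set;
>
> public class AssignmentCheckSketch {
>   // Live servers keyed by "host:port,startcode", maintained in memory by
>   // the master as region servers register.
>   private final Set<String> liveServers;
>
>   public AssignmentCheckSketch(Set<String> liveServers) {
>     this.liveServers = liveServers;
>   }
>
>   // serverAddress and startCode come from the scanned .META. row. If the
>   // row was read before ProcessRegionOpen rewrote it (10:30:20,009 vs
>   // 10:30:20,006 above), they still describe the dead store286 instance,
>   // so this returns true even though store285 already opened the region.
>   boolean assignmentLooksInvalid(String serverAddress, long startCode) {
>     return !liveServers.contains(serverAddress + "," + startCode);
>   }
> }
> {code}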
> At this point the BaseScanner calls "this.master.getRegionManager().setUnassigned(info, true)" to mark the region unassigned (even though it is actually assigned to store285). Later, the region is given to another region server (store287); one possible guard against this is sketched after the log.
> {code}
> 2010-06-17 10:30:20,581 INFO org.apache.hadoop.hbase.master.RegionManager: Assigning region test1,9301760000,1276657884012.acff724037d739bab9af61e3edef0cc9. to store287.xyz.com,60020,1276629468678
> 2010-06-17 10:30:25,525 INFO org.apache.hadoop.hbase.master.ServerManager: Processing MSG_REPORT_OPEN: test1,9301760000,1276657884012.acff724037d739bab9af61e3edef0cc9. from store287.xyz.com,60020,1276629468678; 6 of 6
> 2010-06-17 10:30:25,531 INFO org.apache.hadoop.hbase.master.RegionServerOperation: test1,9301760000,1276657884012.acff724037d739bab9af61e3edef0cc9. open on store287.xyz.com,60020,1276629468678
> 2010-06-17 10:30:25,534 INFO org.apache.hadoop.hbase.master.RegionServerOperation: Updated row test1,9301760000,1276657884012.acff724037d739bab9af61e3edef0cc9. in region .META.,,1 with startcode=1276629468678, server=store287.xyz.com:60020
> {code}
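>
> One possible guard (illustrative only, not a proposed patch): before re-queuing a region, the BaseScanner could consult the master's in-memory transition state, which, unlike the scanned .META. row, cannot be a stale read:
> {code}
> import java.util.Set;
>
> public class UnassignGuardSketch {
>   // Region names the master currently has in transition (e.g. an open is
>   // pending); maintained in memory by the RegionManager.
>   private final Set<String> regionsInTransition;
>
>   public UnassignGuardSketch(Set<String> regionsInTransition) {
>     this.regionsInTransition = regionsInTransition;
>   }
>
>   boolean safeToUnassign(String regionName) {
>     // If a ProcessRegionOpen is (or just was) in flight, the .META. row we
>     // read predates its update; skip this scan cycle and let the next one
>     // see the fresh row instead of double-assigning.
>     return !regionsInTransition.contains(regionName);
>   }
> }
> {code}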

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.