Posted to issues@hbase.apache.org by "Ted Yu (JIRA)" <ji...@apache.org> on 2011/07/04 05:41:21 UTC

[jira] [Updated] (HBASE-4053) Most of the regions were added into AssignmentManager#servers twice

     [ https://issues.apache.org/jira/browse/HBASE-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-4053:
--------------------------

    Attachment: 4053.txt

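This patch replaces the List<HRegionInfo> values in AssignmentManager#servers with Set<HRegionInfo>, so a region that is added twice for the same server is stored only once.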
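
As a rough illustration of the idea (not the contents of 4053.txt), the sketch below keeps a set of regions per server. Plain strings stand in for HServerInfo and HRegionInfo, and the class name ServerRegions and the method regionCount are hypothetical. With the real types, this relies on HRegionInfo providing consistent equals() and hashCode().

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the bookkeeping in AssignmentManager#servers.
// Strings replace HServerInfo and HRegionInfo purely for illustration.
class ServerRegions {
  private final Map<String, Set<String>> servers =
      new HashMap<String, Set<String>>();

  /** Records the region under the server; returns false if it was already there. */
  boolean addToServers(String server, String region) {
    Set<String> regions = servers.get(server);
    if (regions == null) {
      regions = new HashSet<String>();
      servers.put(server, regions);
    }
    // Set#add ignores a value that is already present, so a second
    // notification for the same region cannot inflate the count.
    return regions.add(region);
  }

  /** Total number of distinct regions recorded across all servers. */
  int regionCount() {
    int total = 0;
    for (Set<String> regions : servers.values()) {
      total += regions.size();
    }
    return total;
  }
}
```

Under these assumptions, calling addToServers twice with the same server and region leaves regionCount() at 1, whereas the list-based version would report 2.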

> Most of the regions were added into AssignmentManager#servers twice
> -------------------------------------------------------------------
>
>                 Key: HBASE-4053
>                 URL: https://issues.apache.org/jira/browse/HBASE-4053
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.90.3
>            Reporter: Jieshan Bean
>             Fix For: 0.90.4
>
>         Attachments: 4053.txt
>
>
> Here is the scenario in which the problem occurred:
> 1. When HMaster starts, all regionservers check in OK, and the count of regions out on the cluster is 10083, which is the actual number of regions.
> 2. Then OpenedRegionHandler#process receives ZooKeeper events and adds 9923 regions to the hris list,
>    even though those 9923 regions already exist there; they are added unconditionally.
> 3. The LoadBalancer then gets the wrong region count of 20006 (10083 + 9923).
> AssignmentManager#addToServers method:
> private void addToServers(final HServerInfo hsi, final HRegionInfo hri) {
>   List<HRegionInfo> hris = servers.get(hsi);
>   if (hris == null) {
>     hris = new ArrayList<HRegionInfo>();
>     servers.put(hsi, hris);
>   }
>   hris.add(hri); // Same region was double added here
> }
> logs:
> 2011-06-27 16:13:06,845 INFO org.apache.hadoop.hbase.master.ServerManager: Exiting wait on regionserver(s) to checkin; count=3, stopped=false, count of regions out on cluster=10083
> 2011-06-27 16:13:17,334 INFO org.apache.hadoop.hbase.master.AssignmentManager: Failed-over master needs to process 9923 regions in transition
> 2011-06-27 16:21:45,135 DEBUG org.apache.hadoop.hbase.master.LoadBalancer: Balance parameter: numRegions=20006, numServers=3, max=6669, min=6668
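
The min and max values in the last log line match what you get by splitting numRegions evenly across numServers and rounding down and up. The sketch below reproduces that arithmetic; the class BalanceBounds and its methods are hypothetical and are not taken from the LoadBalancer source.

```java
// Hypothetical helper reproducing the arithmetic seen in the log line;
// this is an assumption about how the bounds relate to the counts, not
// a copy of the LoadBalancer implementation.
class BalanceBounds {
  /** Floor of the average number of regions per server. */
  static int min(int numRegions, int numServers) {
    return numRegions / numServers;
  }

  /** Ceiling of the average number of regions per server. */
  static int max(int numRegions, int numServers) {
    return (numRegions + numServers - 1) / numServers;
  }
}
```

With the inflated count, min(20006, 3) is 6668 and max(20006, 3) is 6669, as logged. With the actual count of 10083 regions, both bounds would be 3361.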

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira