Posted to dev@hbase.apache.org by "stack (JIRA)" <ji...@apache.org> on 2008/04/01 01:40:28 UTC

[jira] Commented: (HBASE-555) Only one Worker in HRS; on startup, if assigned tens of regions, havoc of reassignments because open processing is done in series

    [ https://issues.apache.org/jira/browse/HBASE-555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12583932#action_12583932 ] 

stack commented on HBASE-555:
-----------------------------

Couple of ideas:

Create a worker thread for every message.  That'd be a worker per region to open.  If hundreds, not so smart as all would be contending.  So perhaps an upper bound on threads created.  But then we'd just have same issue again where we'd have queued opens that were not being serviced?
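The bounded-threads idea might look roughly like the sketch below. It assumes a fixed-size pool from java.util.concurrent; the names BoundedOpenSketch, openAll, openRegion and maxThreads are hypothetical and are not taken from HRegionServer. Opens beyond the bound still wait in the pool's queue, which is the concern raised above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch only; not the regionserver's actual Worker code.
final class BoundedOpenSketch {

    // Opens every named region on a pool capped at maxThreads threads
    // and returns how many opens completed without throwing.
    static int openAll(List<String> regionNames, int maxThreads) {
        ExecutorService pool = Executors.newFixedThreadPool(maxThreads);
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (String name : regionNames) {
                Callable<String> task = () -> openRegion(name);
                // Tasks beyond maxThreads sit in the pool's internal queue.
                pending.add(pool.submit(task));
            }
            int opened = 0;
            for (Future<String> f : pending) {
                try {
                    f.get(); // block until this open finishes
                    opened++;
                } catch (ExecutionException e) {
                    // The open failed; leave it uncounted.
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    break;
                }
            }
            return opened;
        } finally {
            pool.shutdown();
        }
    }

    // Stand-in for the real region-open work (assumption).
    static String openRegion(String name) {
        return name;
    }
}
```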

So maybe single Worker ain't so bad.  Issue then is making it so we report to the master that regions are being worked on.  Could take out an iterator on the queue and queue a PROCESSING message per queued region.
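A minimal sketch of the iterator idea, assuming the pending opens sit in a java.util.concurrent.BlockingQueue of region names. The class ProcessingReportSketch, the method reportQueued, and the plain "PROCESSING <region>" strings are hypothetical stand-ins, not the actual HBase message types.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch only; message format and names are assumptions.
final class ProcessingReportSketch {

    // Builds one "PROCESSING <region>" message per region still waiting
    // to be opened. The queue itself is left untouched, so the single
    // worker keeps draining it in order.
    static List<String> reportQueued(BlockingQueue<String> toOpen) {
        List<String> outbound = new ArrayList<>();
        // Iterating a java.util.concurrent queue does not remove elements.
        for (String region : toOpen) {
            outbound.add("PROCESSING " + region);
        }
        return outbound;
    }
}
```

The returned messages would then be sent to the master on the next report, so that queued regions are not reassigned while they wait their turn.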

Trying to come up w/ minimal patch for 0.1.  We can fix better in 0.2.

> Only one Worker in HRS; on startup, if assigned tens of regions, havoc of reassignments because open processing is done in series
> ---------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-555
>                 URL: https://issues.apache.org/jira/browse/HBASE-555
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.16.0, 0.2.0, 0.1.0
>            Reporter: stack
>            Assignee: stack
>            Priority: Blocker
>
> On the Lars clusters, he's up into the thousands of regions.  Starting this cluster, there is a load of churn in the master log as we assign regions, they report their opening and then after the hbase.hbasemaster.maxregionopen of one minute elapses, we assign the region elsewhere.
> Problem seems to be the fact that we only run a single Worker thread in our regionserver; means that region opens are processed in series.
> For example, the below shows when a master assigned a region and then the regionserver side log when it got around to opening it:
> {code}
> 2008-03-29 04:48:51,638 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US20060158177_20060720,1205765009844 to server 192.168.105.19:60020
> ..
> 2008-03-29 04:50:58,124 INFO org.apache.hadoop.hbase.HRegionServer: MSG_REGION_OPEN : pdc-docs,US20060158177_20060720,1205765009844
> {code}
> There are more than 2 minutes between the two log entries (I checked clocks on this cluster and they are synced).
> Looking in the regionserver log, it's just filled with logging on the opening of regions.  The region opens are running pretty fast at about a second each, but there are hundreds of regions to open in this case, so it's easy to go over our default of 60 seconds.
