
Posted to dev@hbase.apache.org by "stack (JIRA)" <ji...@apache.org> on 2008/04/01 01:06:25 UTC

[jira] Created: (HBASE-555) Only one Worker in HRS; on startup, if assigned tens of regions, havoc of reassignments because open processing is done in series

Only one Worker in HRS; on startup, if assigned tens of regions, havoc of reassignments because open processing is done in series
---------------------------------------------------------------------------------------------------------------------------------

                 Key: HBASE-555
                 URL: https://issues.apache.org/jira/browse/HBASE-555
             Project: Hadoop HBase
          Issue Type: Bug
          Components: regionserver
    Affects Versions: 0.1.0, 0.16.0, 0.2.0
            Reporter: stack
            Priority: Blocker


On the Lars clusters, he's up into the thousands of regions.  On starting this cluster, there is a load of churn in the master log: we assign regions, they report their opening, and then after the hbase.hbasemaster.maxregionopen timeout of one minute elapses, we assign the region elsewhere.

The problem seems to be that we only run a single Worker thread in our regionserver, which means that region opens are processed in series.

For example, the below shows when a master assigned a region and then the regionserver side log when it got around to opening it:

{code}
2008-03-29 04:48:51,638 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US20060158177_20060720,1205765009844 to server 192.168.105.19:60020
..
2008-03-29 04:50:58,124 INFO org.apache.hadoop.hbase.HRegionServer: MSG_REGION_OPEN : pdc-docs,US20060158177_20060720,1205765009844
{code}

There are more than 2 minutes between the two log entries (I checked clocks on this cluster and they are synced).

Looking in the regionserver log, it's just filled with logging on the opening of regions.  The region opens are running pretty fast at about a second each, but there are hundreds of regions to open in this case, so it's easy to go over our default of 60 seconds.
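
A back-of-the-envelope sketch of the arithmetic above may help. It assumes one worker that opens queued regions strictly in series. The class `SerialOpenModel`, its method names, and the figure of 100 queued regions are made up for illustration; only the roughly one-second open time and the 60-second default come from the report.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model, not HBase code: one worker opening regions in series. */
class SerialOpenModel {
  /** For each queued region, the seconds elapsed before its open completes. */
  static List<Long> completionTimes(int queuedRegions, long secondsPerOpen) {
    List<Long> times = new ArrayList<>();
    for (int i = 1; i <= queuedRegions; i++) {
      // The i-th region must wait for all i-1 regions ahead of it.
      times.add(i * secondsPerOpen);
    }
    return times;
  }

  /** Counts regions whose open finishes only after the timeout has elapsed. */
  static long countTimedOut(int queuedRegions, long secondsPerOpen, long timeoutSeconds) {
    return completionTimes(queuedRegions, secondsPerOpen).stream()
        .filter(t -> t > timeoutSeconds)
        .count();
  }

  public static void main(String[] args) {
    // 100 regions is an assumed figure; the report only says "hundreds".
    System.out.println(countTimedOut(100, 1, 60));
  }
}
```

Under these assumptions, every region queued behind the first sixty would miss the one-minute window and be a candidate for reassignment.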

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (HBASE-555) Only one Worker in HRS; on startup, if assigned tens of regions, havoc of reassignments because open processing is done in series

Posted by "stack (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-555:
------------------------

    Attachment: 555-0.1-v2.patch

Add to original patch assigning max of ten regions at a time.
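
For illustration only, capping assignments per check-in could look roughly like the sketch below. This is not the code in 555-0.1-v2.patch; `BoundedAssigner`, `assignOnCheckin`, and the region names are hypothetical, and only the limit of ten comes from the note above.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical sketch: hand out at most ten regions each time a server checks in. */
class BoundedAssigner {
  static final int MAX_REGIONS_PER_CHECKIN = 10; // the "ten" from the patch note

  private final Deque<String> unassigned = new ArrayDeque<>();

  BoundedAssigner(List<String> regions) {
    unassigned.addAll(regions);
  }

  /** Returns the next batch to assign; never more than the cap. */
  List<String> assignOnCheckin() {
    List<String> batch = new ArrayList<>();
    while (batch.size() < MAX_REGIONS_PER_CHECKIN && !unassigned.isEmpty()) {
      batch.add(unassigned.poll());
    }
    return batch;
  }

  int remaining() {
    return unassigned.size();
  }

  public static void main(String[] args) {
    List<String> regions = new ArrayList<>();
    for (int i = 0; i < 25; i++) {
      regions.add("region-" + i); // made-up region names
    }
    BoundedAssigner assigner = new BoundedAssigner(regions);
    while (assigner.remaining() > 0) {
      System.out.println("assigned " + assigner.assignOnCheckin().size());
    }
  }
}
```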


[jira] Updated: (HBASE-555) Only one Worker in HRS; on startup, if assigned tens of regions, havoc of reassignments because open processing is done in series

Posted by "stack (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-555:
------------------------

    Attachment: 555-0.1.patch

{code}
M  src/java/org/apache/hadoop/hbase/HRegionServer.java
   (housekeeping): Call a housekeeping method before we go into
   hibernation.  Currently its only task is to review the todo
   list and add MSG_REPORT_PROCESS_OPEN if any regions are waiting
   in the queue to be opened.
   (addProcessMessage): Method to add MSG_REPORT_PROCESS_OPEN to
   the messages to send to the server.
{code}
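
The idea in the commit note can be pictured with the rough sketch below. It is not the patch itself: `HousekeepingSketch`, its fields, and the "PROCESS_OPEN" string are hypothetical stand-ins for the regionserver's to-do queue and its outbound messages. The point is only that housekeeping walks the queue without removing anything and emits one "still working on it" message per region waiting to be opened.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

/** Hypothetical sketch, not the code in 555-0.1.patch. */
class HousekeepingSketch {
  // Regions the single worker has been asked to open but has not reached yet.
  private final Queue<String> toDo = new ConcurrentLinkedQueue<>();
  // Messages to include in the next report.
  private final List<String> outbound = new ArrayList<>();

  void queueOpen(String regionName) {
    toDo.add(regionName);
  }

  /** The worker takes the next region to open, or null if the queue is empty. */
  String takeNext() {
    return toDo.poll();
  }

  /** Before sleeping: add one processing message per region still queued. */
  void housekeeping() {
    for (String regionName : toDo) { // iterates without removing entries
      outbound.add("PROCESS_OPEN " + regionName);
    }
  }

  /** Returns the pending messages and clears the outbound list. */
  List<String> drainOutbound() {
    List<String> copy = new ArrayList<>(outbound);
    outbound.clear();
    return copy;
  }

  public static void main(String[] args) {
    HousekeepingSketch rs = new HousekeepingSketch();
    rs.queueOpen("region-a");
    rs.queueOpen("region-b");
    rs.queueOpen("region-c");
    rs.takeNext();     // the worker starts on region-a
    rs.housekeeping(); // region-b and region-c are still waiting
    System.out.println(rs.drainOutbound());
  }
}
```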


[jira] Commented: (HBASE-555) Only one Worker in HRS; on startup, if assigned tens of regions, havoc of reassignments because open processing is done in series

Posted by "stack (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12584318#action_12584318 ] 

stack commented on HBASE-555:
-----------------------------

Going to commit this thing.  Just tested it over on the Lars cluster (2100 regions on 20 servers).  The two fixes in this patch made it so on restart, there were no more "should not have opened region's" -- all came up smoothly and regions are spread pretty evenly (looks better than what it used to be but I only did one restart).


[jira] Commented: (HBASE-555) Only one Worker in HRS; on startup, if assigned tens of regions, havoc of reassignments because open processing is done in series

Posted by "stack (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12583932#action_12583932 ] 

stack commented on HBASE-555:
-----------------------------

Couple of ideas:

Create a worker thread for every message.  That'd be a worker per region to open.  If hundreds, not so smart as all would be contending.  So perhaps an upper bound on threads created.  But then we'd just have same issue again where we'd have queued opens that were not being serviced?

So maybe a single Worker ain't so bad.  The issue then is making it so we report to the master that regions are being worked on.  Could take out an iterator and queue a PROCESSING message per queued region.

Trying to come up w/ minimal patch for 0.1.  We can fix better in 0.2.
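
For reference, the bounded-threads idea from the first paragraph could be sketched with a fixed-size pool from `java.util.concurrent`, as below. This is an illustration of that alternative, not what the patch does; `BoundedOpenPool`, `openAll`, the thread count of 4, and the region names are all hypothetical. Note that a fixed pool still queues whatever exceeds its thread count, which is the objection raised above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of opening regions with an upper bound on worker threads. */
class BoundedOpenPool {
  /** Opens every region using at most maxThreads workers; returns how many opened. */
  static int openAll(List<String> regions, int maxThreads) {
    ExecutorService pool = Executors.newFixedThreadPool(maxThreads);
    Set<String> opened = ConcurrentHashMap.newKeySet();
    for (String region : regions) {
      // Tasks beyond maxThreads wait in the pool's internal queue.
      pool.submit(() -> {
        opened.add(region); // placeholder for the real region open
      });
    }
    pool.shutdown();
    try {
      if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
        throw new IllegalStateException("region opens did not finish in time");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting for opens", e);
    }
    return opened.size();
  }

  public static void main(String[] args) {
    List<String> regions = new ArrayList<>();
    for (int i = 0; i < 25; i++) {
      regions.add("region-" + i); // made-up region names
    }
    System.out.println(openAll(regions, 4));
  }
}
```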


[jira] Commented: (HBASE-555) Only one Worker in HRS; on startup, if assigned tens of regions, havoc of reassignments because open processing is done in series

Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12584302#action_12584302 ] 

Jim Kellerman commented on HBASE-555:
-------------------------------------

Reviewed new patch. +1


[jira] Assigned: (HBASE-555) Only one Worker in HRS; on startup, if assigned tens of regions, havoc of reassignments because open processing is done in series

Posted by "stack (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack reassigned HBASE-555:
---------------------------

    Assignee: stack


[jira] Commented: (HBASE-555) Only one Worker in HRS; on startup, if assigned tens of regions, havoc of reassignments because open processing is done in series

Posted by "stack (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12584269#action_12584269 ] 

stack commented on HBASE-555:
-----------------------------

This patch should include upper bound on how many regions we assign at a time.  Here is what poor old server .15 gets when he innocently reports in for duty:

{code}
2008-04-01 00:06:58,851 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US20070223009_20070927,1205860531876 to server 192.168.105.19:60020
2008-04-01 00:06:58,853 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US20070176389_20070802,1205827644321 to server 192.168.105.19:60020
2008-04-01 00:06:58,853 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US5980569_19991109,1205804361124 to server 192.168.105.19:60020
2008-04-01 00:06:58,853 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US5039934_19910813,1205720875870 to server 192.168.105.19:60020
2008-04-01 00:06:58,853 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US6294754_20010925,1205782333675 to server 192.168.105.19:60020
2008-04-01 00:06:58,853 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US6263923_20010724,1206042922873 to server 192.168.105.19:60020
2008-04-01 00:06:58,854 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US5932059_19990803,1205808352696 to server 192.168.105.19:60020
2008-04-01 00:06:58,854 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,EP00940773NWA1,1206819926889 to server 192.168.105.19:60020
2008-04-01 00:06:58,854 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US20070070734_20070329,1205797891897 to server 192.168.105.19:60020
2008-04-01 00:06:58,854 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US6651733_20031125,1205799265993 to server 192.168.105.19:60020
2008-04-01 00:06:58,854 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US20030084507_20030508,1205783548816 to server 192.168.105.19:60020
2008-04-01 00:06:58,854 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US5750981_19980512,1205695699495 to server 192.168.105.19:60020
2008-04-01 00:06:58,854 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US20040244841_20041209,1205813956207 to server 192.168.105.19:60020
2008-04-01 00:06:58,854 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US6341961_20020129,1205838276676 to server 192.168.105.19:60020
2008-04-01 00:06:58,855 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US20040178030_20040916,1205745073891 to server 192.168.105.19:60020
2008-04-01 00:06:58,855 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US20070053028_20070308,1205799199459 to server 192.168.105.19:60020
2008-04-01 00:06:58,855 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US5956468_19990921,1205807569500 to server 192.168.105.19:60020
2008-04-01 00:06:58,855 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US6273754_20010814,1205780845479 to server 192.168.105.19:60020
2008-04-01 00:06:58,855 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US20040027221_20040212,1205768933859 to server 192.168.105.19:60020
2008-04-01 00:06:58,855 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US20020168470_20021114,1205839625070 to server 192.168.105.19:60020
2008-04-01 00:06:58,855 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US6433670_20020813,1205883860520 to server 192.168.105.19:60020
2008-04-01 00:06:58,855 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US20080016321_20080117,1205710780019 to server 192.168.105.19:60020
2008-04-01 00:06:58,855 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US20020193527_20021219,1205835604483 to server 192.168.105.19:60020
2008-04-01 00:06:58,856 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,EP95306226NWA1,1206744544751 to server 192.168.105.19:60020
2008-04-01 00:06:58,856 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,EP97305470NWA2,1206746225615 to server 192.168.105.19:60020
2008-04-01 00:06:58,856 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US20010053170_20011220,1205853437060 to server 192.168.105.19:60020
2008-04-01 00:06:58,856 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US20050183021_20050818,1205844147766 to server 192.168.105.19:60020
2008-04-01 00:06:58,856 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US20020155447_20021024,1205837665094 to server 192.168.105.19:60020
2008-04-01 00:06:58,856 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US5781595_19980714,1205694512635 to server 192.168.105.19:60020
2008-04-01 00:06:58,856 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,EP00303162NWA1,1206745983715 to server 192.168.105.19:60020
2008-04-01 00:06:58,856 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US20010047773_20011206,1205853775465 to server 192.168.105.19:60020
2008-04-01 00:06:58,856 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US5485829_19960123,1205788999315 to server 192.168.105.19:60020
2008-04-01 00:06:58,857 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US6457870_20021001,1205883490706 to server 192.168.105.19:60020
2008-04-01 00:06:58,857 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US20030036046_20030220,1205786541377 to server 192.168.105.19:60020
2008-04-01 00:06:58,857 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US5381387_19950110,1205758160021 to server 192.168.105.19:60020
2008-04-01 00:06:58,857 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US20040226216_20041118,1205814276434 to server 192.168.105.19:60020
2008-04-01 00:06:58,857 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US5093930_19920303,1205714978791 to server 192.168.105.19:60020
2008-04-01 00:06:58,857 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US20020142901_20021003,1205834554230 to server 192.168.105.19:60020
2008-04-01 00:06:58,857 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US20080023488_20080131,1205705334720 to server 192.168.105.19:60020
2008-04-01 00:06:58,857 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,EP94107781NWA1,1206821060442 to server 192.168.105.19:60020
2008-04-01 00:06:58,857 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US20020161344_20021031,1205837872228 to server 192.168.105.19:60020
2008-04-01 00:06:58,858 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US20070049359_20070301,1205799262064 to server 192.168.105.19:60020
2008-04-01 00:06:58,858 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US20060132645_20060622,1205723325691 to server 192.168.105.19:60020
2008-04-01 00:06:58,858 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US6944277_20050913,1205811736641 to server 192.168.105.19:60020
2008-04-01 00:06:58,858 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,EP05251320NWA2,1206745588948 to server 192.168.105.19:60020
2008-04-01 00:06:58,858 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US20050045317_20050303,1205775329018 to server 192.168.105.19:60020
2008-04-01 00:06:58,858 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US20050015798_20050120,1205775930084 to server 192.168.105.19:60020
2008-04-01 00:06:58,858 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US5586783_19961224,1205768599392 to server 192.168.105.19:60020
2008-04-01 00:06:58,858 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US5951449_19990914,1205805748422 to server 192.168.105.19:60020
2008-04-01 00:06:58,858 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US6074923_20000613,1205828088350 to server 192.168.105.19:60020
2008-04-01 00:06:58,859 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US6874596_20050405,1205873222614 to server 192.168.105.19:60020
2008-04-01 00:06:58,859 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US6334752_20020101,1205834669245 to server 192.168.105.19:60020
2008-04-01 00:06:58,859 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US20020132259_20020919,1205835048047 to server 192.168.105.19:60020
2008-04-01 00:06:58,859 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,EP99923891NWA1,1206819796374 to server 192.168.105.19:60020
2008-04-01 00:06:58,859 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US20050033656_20050210,1205774486855 to server 192.168.105.19:60020
2008-04-01 00:06:58,859 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US6901250_20050531,1205838762335 to server 192.168.105.19:60020
2008-04-01 00:06:58,859 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US20030162836_20030828,1205729556792 to server 192.168.105.19:60020
2008-04-01 00:06:58,859 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US5964518_19991012,1205807734920 to server 192.168.105.19:60020
2008-04-01 00:06:58,859 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US20050021366_20050127,1205776780745 to server 192.168.105.19:60020
2008-04-01 00:06:58,860 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US20070225042_20070927,1205858223863 to server 192.168.105.19:60020
2008-04-01 00:06:58,860 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US5368612_19941129,1205728257271 to server 192.168.105.19:60020
2008-04-01 00:06:58,860 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US20050237098_20051027,1205752084253 to server 192.168.105.19:60020
2008-04-01 00:06:58,860 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US6252229_20010626,1205783728462 to server 192.168.105.19:60020
2008-04-01 00:06:58,860 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US6501540_20021231,1205864683038 to server 192.168.105.19:60020
2008-04-01 00:06:58,860 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US6088180_20000711,1205826753254 to server 192.168.105.19:60020
2008-04-01 00:06:58,860 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US5717290_19980210,1205695728836 to server 192.168.105.19:60020
2008-04-01 00:06:58,860 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,EP99905280NWA1,1206819796374 to server 192.168.105.19:60020
2008-04-01 00:06:58,861 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US20030006741_20030109,1205786460019 to server 192.168.105.19:60020
2008-04-01 00:06:58,861 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US7227408_20070605,1205693336841 to server 192.168.105.19:60020
2008-04-01 00:06:58,861 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US20070191249_20070816,1205827681299 to server 192.168.105.19:60020
2008-04-01 00:06:58,861 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US5726633_19980310,1205695056643 to server 192.168.105.19:60020
2008-04-01 00:06:58,861 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US4421936_19831220,1205691633404 to server 192.168.105.19:60020
2008-04-01 00:06:58,861 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US6883063_20050419,1205876207585 to server 192.168.105.19:60020
2008-04-01 00:06:58,861 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US6993367_20060131,1205812286647 to server 192.168.105.19:60020
2008-04-01 00:06:58,861 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US20040032680_20040219,1205767506895 to server 192.168.105.19:60020
2008-04-01 00:06:58,862 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US5074216_19911224,1205714927547 to server 192.168.105.19:60020
2008-04-01 00:06:58,862 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US20070288792_20071213,1205859526643 to server 192.168.105.19:60020
2008-04-01 00:06:58,862 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US4606287_19860819,1205675999782 to server 192.168.105.19:60020
2008-04-01 00:06:58,862 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US5340031_19940823,1205726837938 to server 192.168.105.19:60020
2008-04-01 00:06:58,862 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US5426911_19950627,1205790317865 to server 192.168.105.19:60020
2008-04-01 00:06:58,862 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US6055976_20000502,1205828098128 to server 192.168.105.19:60020
2008-04-01 00:06:58,862 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US6728058_20040427,1205855011163 to server 192.168.105.19:60020
2008-04-01 00:06:58,862 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US20060198313_20060907,1205751115734 to server 192.168.105.19:60020
2008-04-01 00:06:58,862 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US6104363_20000815,1205793128469 to server 192.168.105.19:60020
2008-04-01 00:06:58,863 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US20070213898_20070913,1205862269967 to server 192.168.105.19:60020
2008-04-01 00:06:58,863 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,EP96901122NWA1,1206819790225 to server 192.168.105.19:60020
2008-04-01 00:06:58,863 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US5029810_19910709,1205720959956 to server 192.168.105.19:60020
2008-04-01 00:06:58,863 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US5816096_19981006,1205709691980 to server 192.168.105.19:60020
2008-04-01 00:06:58,863 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US7146851_20061212,1205679297543 to server 192.168.105.19:60020
2008-04-01 00:06:58,863 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US20030088471_20030508,1205788707432 to server 192.168.105.19:60020
2008-04-01 00:06:58,863 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US4495261_19850122,1205690337442 to server 192.168.105.19:60020
2008-04-01 00:06:58,863 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US7183766_20070227,1205681384269 to server 192.168.105.19:60020
2008-04-01 00:06:58,864 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US20020053547_20020509,1205810627716 to server 192.168.105.19:60020
2008-04-01 00:06:58,864 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US4672707_19870616,1205674955927 to server 192.168.105.19:60020
2008-04-01 00:06:58,864 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US6702864_20040309,1205855019624 to server 192.168.105.19:60020
2008-04-01 00:06:58,864 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,EP97307534NWA2,1206746225615 to server 192.168.105.19:60020
2008-04-01 00:06:58,864 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US20060233634_20061019,1205760231971 to server 192.168.105.19:60020
2008-04-01 00:06:58,864 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,EP94201405NWA2,1205775922023 to server 192.168.105.19:60020
2008-04-01 00:06:58,864 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US20040152203_20040805,1205745590108 to server 192.168.105.19:60020
2008-04-01 00:06:58,864 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US20020073831_20020620,1205807980345 to server 192.168.105.19:60020
2008-04-01 00:06:58,864 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US20050097179_20050505,1205777885543 to server 192.168.105.19:60020
2008-04-01 00:06:58,865 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US20070255107_20071101,1205858558826 to server 192.168.105.19:60020
2008-04-01 00:06:58,865 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US6543920_20030408,1205868417155 to server 192.168.105.19:60020
2008-04-01 00:06:58,865 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US20050146279_20050707,1205850390258 to server 192.168.105.19:60020
2008-04-01 00:06:58,865 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US6404147_20020611,1205882087032 to server 192.168.105.19:60020
2008-04-01 00:06:58,865 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US7271671_20070918,1205692997730 to server 192.168.105.19:60020
2008-04-01 00:06:58,865 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US20060252822_20061109,1205755072012 to server 192.168.105.19:60020
2008-04-01 00:06:58,865 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US5490702_19960213,1205788999315 to server 192.168.105.19:60020
2008-04-01 00:06:58,865 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US6210006_20010403,1205780665633 to server 192.168.105.19:60020
2008-04-01 00:06:58,865 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US5090537_19920225,1205720385674 to server 192.168.105.19:60020
2008-04-01 00:06:58,866 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US20050165964_20050728,1205845409575 to server 192.168.105.19:60020
2008-04-01 00:06:58,866 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US5760377_19980602,1205692663986 to server 192.168.105.19:60020
2008-04-01 00:06:58,866 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US20020154020_20021024,1205837665094 to server 192.168.105.19:60020
2008-04-01 00:06:58,866 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US6838426_20050104,1205873519309 to server 192.168.105.19:60020
2008-04-01 00:06:58,866 INFO 
org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US6630043_20031007,1205800273625 to server 192.168.105.19:600202008-04-01 00:06:58,866 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US20070000242_20070104,1205799280126 to server 192.168.105.19:60020
{code}

A hundred plus regions...
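
For anyone wanting to check a count like this against their own master log, here is a rough sketch that tallies "assigning region" entries per server. It assumes one log entry per line in the format shown above; AssignmentCounter and countByServer are made-up names for illustration, not anything in HBase.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: counts master "assigning region" log entries per server.
final class AssignmentCounter {
    // Captures the region name and the host:port from an assignment log line.
    private static final Pattern ASSIGN =
        Pattern.compile("assigning region (\\S+) to server (\\S+)");

    /** Returns server -> number of assignment entries seen for it. */
    static Map<String, Integer> countByServer(List<String> logLines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : logLines) {
            Matcher m = ASSIGN.matcher(line);
            if (m.find()) {
                counts.merge(m.group(2), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
            "2008-04-01 00:06:58,860 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US6252229_20010626,1205783728462 to server 192.168.105.19:60020",
            "2008-04-01 00:06:58,860 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US6501540_20021231,1205864683038 to server 192.168.105.19:60020");
        System.out.println(countByServer(sample));
    }
}
```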

> Only one Worker in HRS; on startup, if assigned tens of regions, havoc of reassignments because open processing is done in series
> ---------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-555
>                 URL: https://issues.apache.org/jira/browse/HBASE-555
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.16.0, 0.2.0, 0.1.0
>            Reporter: stack
>            Assignee: stack
>            Priority: Blocker
>         Attachments: 555-0.1.patch
>
>
> On the Lars clusters, he's up into the thousands of regions.  Starting this cluster, there is a load of churn in the master log as we assign regions, they report their opening and then after the hbase.hbasemaster.maxregionopen of one minute elapses, we assign the region elsewhere.
> Problem seems to be the fact that we only run a single Worker thread in our regionserver; means that region opens are processed in series.
> For example, the below shows when a master assigned a region and then the regionserver side log when it got around to opening it:
> {code}
> 2008-03-29 04:48:51,638 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US20060158177_20060720,1205765009844 to server 192.168.105.19:60020
> ..
> 2008-03-29 04:50:58,124 INFO org.apache.hadoop.hbase.HRegionServer: MSG_REGION_OPEN : pdc-docs,US20060158177_20060720,1205765009844
> {code}
> There are more than 2 minutes between the two log entries (I checked clocks on this cluster and they are synced).
> Looking in the regionserver log, it's just filled with logging on the opening of regions.  The region opens are running pretty fast at about a second each, but there are hundreds of regions to open in this case, so it's easy to go over our default of 60 seconds.
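
The arithmetic in the description can be sketched as below. This is only a back-of-the-envelope model, not HBase code: it assumes opens are strictly serial, each takes a fixed time, and the master's timeout clock starts at assignment. SerialOpenEstimate and regionsOpenedTooLate are hypothetical names.

```java
// Rough model of serial region opens on a single regionserver Worker thread.
// Assumptions (not taken from the HBase source): fixed per-open cost, and the
// master starts counting toward the timeout when it assigns the region.
final class SerialOpenEstimate {

    /** Number of regions whose open finishes only after the timeout has expired. */
    static int regionsOpenedTooLate(int assigned, long openMillis, long timeoutMillis) {
        int late = 0;
        for (int i = 1; i <= assigned; i++) {
            long finishedAt = i * openMillis; // the i-th region waits for i serial opens
            if (finishedAt > timeoutMillis) {
                late++;
            }
        }
        return late;
    }

    public static void main(String[] args) {
        long openMillis = 1_000L;     // "about a second each"
        long timeoutMillis = 60_000L; // the default of 60 seconds
        for (int assigned : new int[] {30, 60, 100, 300}) {
            System.out.printf("%d assigned -> %d opened after the timeout%n",
                assigned, regionsOpenedTooLate(assigned, openMillis, timeoutMillis));
        }
    }
}
```

Under these assumptions, anything past the first sixty regions handed to one server misses the deadline and becomes a candidate for reassignment.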

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (HBASE-555) Only one Worker in HRS; on startup, if assigned tens of regions, havoc of reassignments because open processing is done in series

Posted by "stack (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-555:
------------------------

    Comment: was deleted


[jira] Commented: (HBASE-555) Only one Worker in HRS; on startup, if assigned tens of regions, havoc of reassignments because open processing is done in series

Posted by "stack (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12584274#action_12584274 ] 

stack commented on HBASE-555:
-----------------------------

When the .19 server shows up, it gets 100+ regions.  This patch should include an upper bound on how many regions we assign at a time.
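
One way such an upper bound could look, as a minimal sketch only: the master hands out at most a fixed number of regions each time a server checks in. BoundedAssigner, nextBatch and maxPerCheckIn are hypothetical names, and this is not what the attached patch does.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical sketch: cap the number of regions handed out per server check-in.
final class BoundedAssigner {
    private final Queue<String> unassigned = new ArrayDeque<>();
    private final int maxPerCheckIn;

    BoundedAssigner(List<String> regions, int maxPerCheckIn) {
        if (maxPerCheckIn < 1) {
            throw new IllegalArgumentException("maxPerCheckIn must be at least 1");
        }
        this.unassigned.addAll(regions);
        this.maxPerCheckIn = maxPerCheckIn;
    }

    /** Called when a server reports in; returns at most maxPerCheckIn regions. */
    List<String> nextBatch() {
        List<String> batch = new ArrayList<>();
        while (batch.size() < maxPerCheckIn && !unassigned.isEmpty()) {
            batch.add(unassigned.poll());
        }
        return batch;
    }

    /** Regions still waiting to be handed out. */
    int remaining() {
        return unassigned.size();
    }

    public static void main(String[] args) {
        List<String> regions = new ArrayList<>();
        for (int i = 0; i < 25; i++) {
            regions.add("region-" + i);
        }
        BoundedAssigner assigner = new BoundedAssigner(regions, 10);
        while (assigner.remaining() > 0) {
            System.out.println("batch of " + assigner.nextBatch().size());
        }
    }
}
```

With a cap like this, a server that shows up first would take several check-ins to accumulate a hundred regions instead of getting them all at once.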

Looking at the Lars cluster, this patch seems to be doing what it's supposed to. It's ugly in that currently there is a log entry for every MSG_REPORT_PROCESS_OPEN message -- one per region every time it reports in -- but it's just during startup (previously our startup logs were clogged with reassignments of regions already assigned).

Here is an illustration that the patch is basically working.  About 7 seconds after the last region assignment message is logged, we see this:

{code}
2008-04-01 00:07:05,460 DEBUG org.apache.hadoop.hbase.HMaster: Received MSG_REPORT_PROCESS_OPEN : pdc-docs,US20070223009_20070927,1205860531876 from 192.168.105.19:60020
{code}

A few regions open, then we get this:

{code}
2008-04-01 00:07:11,528 DEBUG org.apache.hadoop.hbase.HMaster: Received MSG_REPORT_PROCESS_OPEN : pdc-docs,US4881767_19891121,1205704528908 from 192.168.105.19:60020
{code}

... about 6 seconds after one from previous batch... then...

{code}
2008-04-01 00:07:14,534 DEBUG org.apache.hadoop.hbase.HMaster: Received MSG_REPORT_PROCESS_OPEN : pdc-docs,US5399923_19950321,1205793104251 from 192.168.105.19:60020
{code}

later....

{code}
2008-04-01 00:07:23,614 DEBUG org.apache.hadoop.hbase.HMaster: Received MSG_REPORT_PROCESS_OPEN : pdc-docs,EP04011653NWA1,1205771873299 from 192.168.105.19:60020
{code}

etc.
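
For readers without the patch in front of them, the behaviour illustrated above can be sketched roughly as follows. This is my reading of the logs, not the patch itself: the master is assumed to keep a last-heard time per pending region, refresh it on each MSG_REPORT_PROCESS_OPEN, and only treat a region as overdue when nothing has been heard for longer than the open timeout. PendingOpenTracker and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical master-side bookkeeping for regions that are assigned but not yet open.
final class PendingOpenTracker {
    private final long maxRegionOpenMillis;
    private final Map<String, Long> lastHeard = new HashMap<>();

    PendingOpenTracker(long maxRegionOpenMillis) {
        this.maxRegionOpenMillis = maxRegionOpenMillis;
    }

    /** The master has just assigned the region to a server. */
    void assigned(String region, long now) {
        lastHeard.put(region, now);
    }

    /** The server reported that the region is still queued for opening. */
    void reportProcessOpen(String region, long now) {
        if (lastHeard.containsKey(region)) {
            lastHeard.put(region, now); // refresh the deadline
        }
    }

    /** The server reported the region as open; stop tracking it. */
    void opened(String region) {
        lastHeard.remove(region);
    }

    /** Regions not heard about for longer than the timeout; candidates for reassignment. */
    List<String> overdue(long now) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Long> e : lastHeard.entrySet()) {
            if (now - e.getValue() > maxRegionOpenMillis) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        PendingOpenTracker tracker = new PendingOpenTracker(60_000L);
        tracker.assigned("regionA", 0L);
        tracker.assigned("regionB", 0L);
        tracker.reportProcessOpen("regionA", 50_000L); // regionA keeps reporting in
        System.out.println("overdue at 70s: " + tracker.overdue(70_000L));
    }
}
```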



[jira] Resolved: (HBASE-555) Only one Worker in HRS; on startup, if assigned tens of regions, havoc of reassignments because open processing is done in series

Posted by "stack (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack resolved HBASE-555.
-------------------------

       Resolution: Fixed
    Fix Version/s: 0.1.1

Committed to branch and trunk.

> Only one Worker in HRS; on startup, if assigned tens of regions, havoc of reassignments because open processing is done in series
> ---------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-555
>                 URL: https://issues.apache.org/jira/browse/HBASE-555
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.16.0, 0.2.0, 0.1.0
>            Reporter: stack
>            Assignee: stack
>            Priority: Blocker
>             Fix For: 0.1.1
>
>         Attachments: 555-0.1-v2.patch, 555-0.1.patch
>
>
> On the Lars clusters, he's up into the thousands of regions.  Starting this cluster, there is a load of churn in the master log as we assign regions, they report their opening and then after the hbase.hbasemaster.maxregionopen of one minute elapses, we assign the region elsewhere.
> Problem seems to be the fact that we only run a single Worker thread in our regionserver; means that region opens are processed in series.
> For example, the below shows when a master assigned a region and then the regionserver side log when it got around to opening it:
> {code}
> 2008-03-29 04:48:51,638 INFO org.apache.hadoop.hbase.HMaster: assigning region pdc-docs,US20060158177_20060720,1205765009844 to server 192.168.105.19:60020
> ..
> 2008-03-29 04:50:58,124 INFO org.apache.hadoop.hbase.HRegionServer: MSG_REGION_OPEN : pdc-docs,US20060158177_20060720,1205765009844
> {code}
> There are more than 2 minutes between the two log entries (I checked clocks on this cluster and they are synced).
> Looking in the regionserver log, it's just filled with logging on the opening of regions.  The region opens are running pretty fast at about a second each, but there are hundreds of regions to open in this case, so it's easy to go over our default of 60 seconds.
