Posted to dev@hbase.apache.org by "Andrew Purtell (JIRA)" <ji...@apache.org> on 2009/01/13 08:38:59 UTC

[jira] Created: (HBASE-1124) Balancer kicks in way too early

Balancer kicks in way too early
-------------------------------

                 Key: HBASE-1124
                 URL: https://issues.apache.org/jira/browse/HBASE-1124
             Project: Hadoop HBase
          Issue Type: Bug
            Reporter: Andrew Purtell
             Fix For: 0.19.0


Balancer kicks in before all regions are assigned out. Causes confusion. Master won't accept OPENs from "overloaded" HRS. Master is slow to respond to the UI and to HRS while this is happening. Master sometimes takes too long to respond to an HRS heartbeat and so the HRS will reinit. This causes more confusion.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HBASE-1124) Balancer kicks in way too early

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663267#action_12663267 ] 

Andrew Purtell commented on HBASE-1124:
---------------------------------------

Perhaps HBASE-1104 changed that?



[jira] Updated: (HBASE-1124) Balancer kicks in way too early

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-1124:
-------------------------

    Attachment: 1124.patch

This patch makes it so that if a regionserver report includes more than N 'opening' messages, we do not let the balancer run and add close messages to the response. Currently N is set to 3. If there are > 3 opening messages, the balancer doesn't get a say; we just skip it.

Testing it on my small cluster with 1400 regions on 4 nodes, it started without balancer running.
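
A minimal sketch of the guard described in the patch note above, for readers who do not have 1124.patch to hand. It is not the patch itself: the class, the enum, and the method name are hypothetical, and only the threshold of 3 and the MSG_REPORT_PROCESS_OPEN message name come from this thread.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch; not the code from 1124.patch. */
final class BalancerGuard {
    /** Threshold N; the patch note says it is currently 3. */
    static final int MAX_OPENING_MESSAGES = 3;

    /** Hypothetical stand-in for the message types a regionserver report carries. */
    enum MsgType { MSG_REPORT_PROCESS_OPEN, MSG_REPORT_OPEN, MSG_REPORT_CLOSE }

    /** Returns true when the report carries more than N 'opening' messages. */
    static boolean shouldSkipBalancing(List<MsgType> report) {
        int opening = 0;
        for (MsgType m : report) {
            if (m == MsgType.MSG_REPORT_PROCESS_OPEN) {
                opening++;
            }
        }
        return opening > MAX_OPENING_MESSAGES;
    }

    public static void main(String[] args) {
        // A server still working through a queue of opens: balancer is skipped.
        List<MsgType> busy = Arrays.asList(
            MsgType.MSG_REPORT_PROCESS_OPEN, MsgType.MSG_REPORT_PROCESS_OPEN,
            MsgType.MSG_REPORT_PROCESS_OPEN, MsgType.MSG_REPORT_PROCESS_OPEN);
        // A server with nothing in flight: balancer may run.
        List<MsgType> quiet = Arrays.asList(MsgType.MSG_REPORT_OPEN);
        System.out.println("busy  -> skip balancing? " + shouldSkipBalancing(busy));
        System.out.println("quiet -> skip balancing? " + shouldSkipBalancing(quiet));
    }
}
```

The point of keying on the report rather than on global state is that the decision is made per regionserver, per heartbeat, so a server that has finished its opens can be balanced while others are still busy.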



+1 on 0.19.0 RC (was: Re: [jira] Commented: (HBASE-1124) Balancer kicks in way too early)

Posted by Andrew Purtell <ap...@apache.org>.
+1 on RC if it includes patches for 1124 and 1125. 

Thanks Stack and Jim!

   - Andy



      

[jira] Commented: (HBASE-1124) Balancer kicks in way too early

Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663821#action_12663821 ] 

Jim Kellerman commented on HBASE-1124:
--------------------------------------

Nice patch. +1



[jira] Commented: (HBASE-1124) Balancer kicks in way too early

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663262#action_12663262 ] 

Andrew Purtell commented on HBASE-1124:
---------------------------------------

Initial region assignment is lumpy even with all HRS known (all check in while master is in safe mode). Why?



[jira] Assigned: (HBASE-1124) Balancer kicks in way too early

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack reassigned HBASE-1124:
----------------------------

    Assignee: stack



[jira] Commented: (HBASE-1124) Balancer kicks in way too early

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663598#action_12663598 ] 

Andrew Purtell commented on HBASE-1124:
---------------------------------------

Ok, I'll wait.



[jira] Commented: (HBASE-1124) Balancer kicks in way too early

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663421#action_12663421 ] 

stack commented on HBASE-1124:
------------------------------

Looking at Andrew's logs, you're both 'right'.

Yes, the balancer doesn't cut in till all regions are assigned; only, on a big cluster there is a big gap between all assigned and all open. In this gap, I see the balancer cutting in in Andrew's log. We don't want it working here while all regionservers have a big queue of region opens that they are currently working on.

Here is an example.

All regions have been handed out and master is just waiting on the opens to come in.

{code}
....
2009-01-13 06:57:09,006 INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_PROCESS_OPEN: result_domain,com.chawlk,1231796870012 from XX.XX.XX.37:60020
2009-01-13 06:57:09,006 INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_PROCESS_OPEN: content,28e2ec17934b05f11a77a88b1528d905,1231822159077 from XX.XX.XX.37:60020
2009-01-13 06:57:09,006 DEBUG org.apache.hadoop.hbase.master.RegionManager: Server 10.30.94.37:60020 is overloaded. Server load: 26 avg: 21.0, slop: 0.2
2009-01-13 06:57:09,006 DEBUG org.apache.hadoop.hbase.master.RegionManager: Choosing to reassign 5 regions. mostLoadedRegions has 10 regions in it.
2009-01-13 06:57:09,006 DEBUG org.apache.hadoop.hbase.master.RegionManager: Going to close region content,afebbf5e615585830ebe6f74e1014f3d,1231766212960
2009-01-13 06:57:09,006 INFO org.apache.hadoop.hbase.master.RegionManager: Skipped 9 region(s) that are in transition states
...
{code}

Above we are closing 'content,afebbf5e615585830ebe6f74e1014f3d,1231766212960' which had just opened 3 seconds earlier.  About 5% of all regions assigned have reported back as opened.  We shouldn't be balancing at this time.
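
To make the numbers in that log excerpt concrete, here is a small sketch of the arithmetic. The exact rule RegionManager uses is not shown in this thread, so both formulas below are assumptions chosen only because they are consistent with the logged values (load 26, avg 21.0, slop 0.2, 5 regions chosen for reassignment). The class and method names are hypothetical.

```java
/** Hypothetical sketch of the overload arithmetic; the formulas are assumptions. */
final class OverloadCheck {
    /** Assumed rule: a server is overloaded when its load exceeds avg * (1 + slop). */
    static boolean isOverloaded(int load, double avg, double slop) {
        return load > avg * (1.0 + slop);
    }

    /** Assumed rule: shed enough regions to come back down to the average. */
    static int regionsToShed(int load, double avg) {
        return Math.max(0, load - (int) Math.floor(avg));
    }

    public static void main(String[] args) {
        // Values taken from the log line: "Server load: 26 avg: 21.0, slop: 0.2"
        int load = 26;
        double avg = 21.0;
        double slop = 0.2;
        System.out.println("overloaded: " + isOverloaded(load, avg, slop));
        System.out.println("regions to shed: " + regionsToShed(load, avg));
    }
}
```

Under these assumptions the threshold is 21.0 * 1.2 = 25.2, so a load of 26 trips it by less than one region. With most regions still opening, the average is computed from a moving target, which is why balancing during this window produces churn.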



[jira] Commented: (HBASE-1124) Balancer kicks in way too early

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663427#action_12663427 ] 

stack commented on HBASE-1124:
------------------------------

What if, when an HRS reports in and its report includes a MSG_REPORT_PROCESS_OPEN, we do not do balancing?



[jira] Commented: (HBASE-1124) Balancer kicks in way too early

Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663370#action_12663370 ] 

Jim Kellerman commented on HBASE-1124:
--------------------------------------

Rong-En is correct. The region balancer does not kick in until all regions have been assigned. See master.RegionManager.assignRegions lines 187 - 204.

HBASE-1104 should have had no effect on region assignment other than preventing assignment to multiple servers.

During any region assignment, the master will assign up to RegionManager.maxAssignInOneGo (default = 10). Depending on the number of regions and the responsiveness of the region servers, some may get (a lot) more regions assigned than others. After the initial assignment of all regions, region balancing then kicks in and may result in a lot of churn until it has balanced the region load.
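
A toy simulation of why batch assignment comes out lumpy. This is only an illustration of the explanation above, not the master's code: the class, the tick-based loop, and the check-in periods are all made up, and only the batch limit of 10 (maxAssignInOneGo) comes from the comment. Each server that checks in is handed up to 10 unassigned regions, so servers that check in more often end up with more.

```java
import java.util.Arrays;

/** Hypothetical simulation; not RegionManager.assignRegions. */
final class LumpyAssignment {
    /** Batch limit per check-in, after RegionManager.maxAssignInOneGo (default 10). */
    static final int MAX_ASSIGN_IN_ONE_GO = 10;

    /**
     * Hands out totalRegions in batches. Server s checks in every
     * checkInPeriods[s] ticks (an invented stand-in for responsiveness).
     */
    static int[] simulate(int totalRegions, int[] checkInPeriods) {
        if (checkInPeriods.length == 0) {
            throw new IllegalArgumentException("need at least one server");
        }
        for (int p : checkInPeriods) {
            if (p <= 0) {
                throw new IllegalArgumentException("periods must be positive");
            }
        }
        int[] assigned = new int[checkInPeriods.length];
        int remaining = totalRegions;
        for (int tick = 1; remaining > 0; tick++) {
            for (int s = 0; s < checkInPeriods.length && remaining > 0; s++) {
                if (tick % checkInPeriods[s] == 0) {
                    int batch = Math.min(MAX_ASSIGN_IN_ONE_GO, remaining);
                    assigned[s] += batch;
                    remaining -= batch;
                }
            }
        }
        return assigned;
    }

    public static void main(String[] args) {
        // 1400 regions on 4 nodes; the fourth node checks in four times less often.
        int[] result = simulate(1400, new int[] {1, 1, 1, 4});
        System.out.println(Arrays.toString(result));
    }
}
```

With one slow server in the mix, the other three absorb most of the regions, and the balancer then has a large imbalance to correct as soon as it is allowed to run.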



[jira] Commented: (HBASE-1124) Balancer kicks in way too early

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663585#action_12663585 ] 

Andrew Purtell commented on HBASE-1124:
---------------------------------------

The lease timeouts got me wondering so I ran Wireshark and looked over some packet traces. The lease timeouts are legit. Can't blame master if HRS are not contacting it in time.

HRPC requires prompt name resolution when (re)establishing connections for IPC. Affects all aspects of system operation: HBase heartbeats, DFS block shuffling and replication, etc. Increase DNS resolver latency and HDFS and HBase become unstable.

Root cause might just be overloaded DNS servers -- BIND cache too large, swapping. Taking steps now, will monitor to see what happens. 
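
For anyone wanting to check resolver latency from a cluster node without a packet trace, a rough sketch follows. It only times `InetAddress.getByName`, which is an assumption about what is worth measuring; it is not how HRPC itself resolves names. The class and method names are hypothetical. Note that the JVM caches successful lookups, so only the first call for a given name in a process is likely to reflect the resolver.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Hypothetical diagnostic; times a single name lookup from the JVM. */
final class ResolverLatency {
    /** Returns the nanoseconds taken to resolve host, or -1 if resolution fails. */
    static long resolveNanos(String host) {
        long start = System.nanoTime();
        try {
            InetAddress.getByName(host);
        } catch (UnknownHostException e) {
            return -1L;
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // Pass the hostnames of the master and regionservers as arguments.
        String[] hosts = args.length > 0 ? args : new String[] {"localhost"};
        for (String host : hosts) {
            long nanos = resolveNanos(host);
            String result = nanos < 0 ? "failed" : (nanos / 1_000_000.0) + " ms";
            System.out.println(host + ": " + result);
        }
    }
}
```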



[jira] Commented: (HBASE-1124) Balancer kicks in way too early

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663655#action_12663655 ] 

Andrew Purtell commented on HBASE-1124:
---------------------------------------

+1

Great patch, stack. Seems to work as advertised. Cluster comes up fast. I see some reassignment activity toward the end but it's not like the churn before. I have ~500 regions now. All regions are assigned out and opened before any HRS leaves safe mode and begins to compact/split. 



[jira] Reopened: (HBASE-1124) Balancer kicks in way too early

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell reopened HBASE-1124:
-----------------------------------


Whoops, reopen. 



[jira] Commented: (HBASE-1124) Balancer kicks in way too early

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663642#action_12663642 ] 

stack commented on HBASE-1124:
------------------------------

I can't get on the cluster this evening, Andrew (I'm down a hole and can't get out). Since you're leaving in the morning, and if you are currently twiddling your thumbs -- can you even do that? -- it would be interesting to hear what happens when you start up with this patch in place, and whether it helps.



[jira] Resolved: (HBASE-1124) Balancer kicks in way too early

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack resolved HBASE-1124.
--------------------------

    Resolution: Fixed



[jira] Commented: (HBASE-1124) Balancer kicks in way too early

Posted by "Rong-En Fan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663264#action_12663264 ] 

Rong-En Fan commented on HBASE-1124:
------------------------------------

I *thought* Jim changed the code (HBASE-918) so that the balancer won't kick in until all regions are assigned...



[jira] Resolved: (HBASE-1124) Balancer kicks in way too early

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell resolved HBASE-1124.
-----------------------------------

    Resolution: Invalid

Thanks Jim. You definitely know the master better than I. I'll close this issue as invalid but there's definitely something here. Will work with stack to file new issues. 



[jira] Commented: (HBASE-1124) Balancer kicks in way too early

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663592#action_12663592 ] 

stack commented on HBASE-1124:
------------------------------

I think this patch turns off load balancing during startup and after a server crash (just tried it). Need to make sure it doesn't break anything else.



[jira] Commented: (HBASE-1124) Balancer kicks in way too early

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663590#action_12663590 ] 

Andrew Purtell commented on HBASE-1124:
---------------------------------------

Deploying patch now.



[jira] Commented: (HBASE-1124) Balancer kicks in way too early

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663259#action_12663259 ] 

Andrew Purtell commented on HBASE-1124:
---------------------------------------

In some tests, I observed that master did not respond in time to a heartbeat from the HRS carrying META, so it would reinitialize. Master would then try to update META from ProcessRegionOpen but update would fail (processBatchOfRows complains about HRS assigned to META not accepting writes). Observed warnings about META being partially offline. This was a terminal state. 
