You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@hbase.apache.org by ji...@apache.org on 2008/10/11 21:04:10 UTC
svn commit: r703711 - in /hadoop/hbase/trunk: CHANGES.txt
src/java/org/apache/hadoop/hbase/RegionHistorian.java
src/java/org/apache/hadoop/hbase/master/RegionManager.java
Author: jimk
Date: Sat Oct 11 12:04:09 2008
New Revision: 703711
URL: http://svn.apache.org/viewvc?rev=703711&view=rev
Log:
HBASE-918 Region balancing during startup makes cluster unstable
Modified:
hadoop/hbase/trunk/CHANGES.txt
hadoop/hbase/trunk/src/java/org/apache/hadoop/hbase/RegionHistorian.java
hadoop/hbase/trunk/src/java/org/apache/hadoop/hbase/master/RegionManager.java
Modified: hadoop/hbase/trunk/CHANGES.txt
URL: http://svn.apache.org/viewvc/hadoop/hbase/trunk/CHANGES.txt?rev=703711&r1=703710&r2=703711&view=diff
==============================================================================
--- hadoop/hbase/trunk/CHANGES.txt (original)
+++ hadoop/hbase/trunk/CHANGES.txt Sat Oct 11 12:04:09 2008
@@ -20,6 +20,7 @@
HBASE-837 Add unit tests for ThriftServer.HBaseHandler (Izaak Rubin via Stack)
HBASE-913 Classes using log4j directly
HBASE-914 MSG_REPORT_CLOSE has a byte array for a message
+ HBASE-918 Region balancing during startup makes cluster unstable
IMPROVEMENTS
HBASE-901 Add a limit to key length, check key and value length on client side
Modified: hadoop/hbase/trunk/src/java/org/apache/hadoop/hbase/RegionHistorian.java
URL: http://svn.apache.org/viewvc/hadoop/hbase/trunk/src/java/org/apache/hadoop/hbase/RegionHistorian.java?rev=703711&r1=703710&r2=703711&view=diff
==============================================================================
--- hadoop/hbase/trunk/src/java/org/apache/hadoop/hbase/RegionHistorian.java (original)
+++ hadoop/hbase/trunk/src/java/org/apache/hadoop/hbase/RegionHistorian.java Sat Oct 11 12:04:09 2008
@@ -86,7 +86,7 @@
* Get the RegionHistorian Singleton instance.
* @return The region historian
*/
- public static RegionHistorian getInstance() {
+ public synchronized static RegionHistorian getInstance() {
if (historian == null) {
historian = new RegionHistorian();
}
Modified: hadoop/hbase/trunk/src/java/org/apache/hadoop/hbase/master/RegionManager.java
URL: http://svn.apache.org/viewvc/hadoop/hbase/trunk/src/java/org/apache/hadoop/hbase/master/RegionManager.java?rev=703711&r1=703710&r2=703711&view=diff
==============================================================================
--- hadoop/hbase/trunk/src/java/org/apache/hadoop/hbase/master/RegionManager.java (original)
+++ hadoop/hbase/trunk/src/java/org/apache/hadoop/hbase/master/RegionManager.java Sat Oct 11 12:04:09 2008
@@ -178,16 +178,19 @@
// worked on elsewhere.
Set<HRegionInfo> regionsToAssign = regionsAwaitingAssignment();
if (regionsToAssign.size() == 0) {
- // There are no regions waiting to be assigned. This is an opportunity
- // for us to check if this server is overloaded.
- double avgLoad = master.serverManager.getAverageLoad();
- if (avgLoad > 2.0 && thisServersLoad.getNumberOfRegions() > avgLoad) {
- if (LOG.isDebugEnabled()) {
- LOG.debug("Server " + serverName + " is overloaded. Server load: " +
- thisServersLoad.getNumberOfRegions() + " avg: " + avgLoad);
+ // There are no regions waiting to be assigned.
+ if (allRegionsAssigned()) {
+ // We only do load balancing once all regions are assigned.
+ // This prevents churn while the cluster is starting up.
+ double avgLoad = master.serverManager.getAverageLoad();
+ if (avgLoad > 2.0 && thisServersLoad.getNumberOfRegions() > avgLoad) {
+ if (LOG.isDebugEnabled()) {
+ LOG.debug("Server " + serverName + " is overloaded. Server load: " +
+ thisServersLoad.getNumberOfRegions() + " avg: " + avgLoad);
+ }
+ unassignSomeRegions(thisServersLoad, avgLoad, mostLoadedRegions,
+ returnMsgs);
}
- unassignSomeRegions(thisServersLoad, avgLoad, mostLoadedRegions,
- returnMsgs);
}
} else {
// if there's only one server, just give it all the regions
@@ -843,6 +846,15 @@
}
/**
+ * @return true if the initial meta scan is complete and there are no
+ * unassigned or pending regions
+ */
+ public boolean allRegionsAssigned() {
+ return isInitialMetaScanComplete() && unassignedRegions.size() == 0 &&
+ pendingRegions.size() == 0;
+ }
+
+ /**
* Get the root region location.
* @return HServerAddress describing root region server.
*/
RE: svn commit: r703711 - in /hadoop/hbase/trunk: CHANGES.txt
src/java/org/apache/hadoop/hbase/RegionHistorian.java
src/java/org/apache/hadoop/hbase/master/RegionManager.java
Posted by "Jim Kellerman (POWERSET)" <Ji...@microsoft.com>.
919 was not fixed in the patch for 918 ultimately. 918 only
addresses not doing load balancing until all regions are
assigned.
919 is more related to "safe mode" in which the master does
not return a root region address until all regions are
assigned. This behavior prevented the cluster from starting
since the servers make use of HTable rather than the
HRegionInterface API.
---
Jim Kellerman, Powerset (Live Search, Microsoft Corporation)
> -----Original Message-----
> From: stack [mailto:stack@duboce.net]
> Sent: Sunday, October 12, 2008 5:33 PM
> To: hbase-dev@hadoop.apache.org
> Subject: Re: svn commit: r703711 - in /hadoop/hbase/trunk: CHANGES.txt
> src/java/org/apache/hadoop/hbase/RegionHistorian.java
> src/java/org/apache/hadoop/hbase/master/RegionManager.java
>
> I was just interested in what in the patch addressed HBASE-919? (Am I
> wrong thinking that 919 got in the way of the fixing of this issue?)
> Thanks,
> St.Ack
>
> Jim Kellerman (POWERSET) wrote:
> > No, the change to RegionHistorian was only to make the instantiation
> > of the singleton thread safe, which it was not before.
> >
> > ---
> > Jim Kellerman, Powerset (Live Search, Microsoft Corporation)
> >
> >
> >
> >> -----Original Message-----
> >> From: stack [mailto:stack@duboce.net]
> >> Sent: Saturday, October 11, 2008 4:02 PM
> >> To: hbase-dev@hadoop.apache.org
> >> Subject: Re: svn commit: r703711 - in /hadoop/hbase/trunk: CHANGES.txt
> >> src/java/org/apache/hadoop/hbase/RegionHistorian.java
> >> src/java/org/apache/hadoop/hbase/master/RegionManager.java
> >>
> >> jimk@apache.org wrote:
> >>
> >>> + HBASE-918 Region balancing during startup makes cluster unstable
> >>> ...
> >>> URL:
> >>>
> >>
> http://svn.apache.org/viewvc/hadoop/hbase/trunk/src/java/org/apache/hadoop
> >> /hbase/RegionHistorian.java?rev=703711&r1=703710&r2=703711&view=diff
> >>
> >>
> ==========================================================================
> >> ====
> >>
> >>> ---
> >>>
> >>
> hadoop/hbase/trunk/src/java/org/apache/hadoop/hbase/RegionHistorian.java
> >> (original)
> >>
> >>> +++
> >>>
> >>
> hadoop/hbase/trunk/src/java/org/apache/hadoop/hbase/RegionHistorian.java
> >> Sat Oct 11 12:04:09 2008
> >>
> >>> @@ -86,7 +86,7 @@
> >>> * Get the RegionHistorian Singleton instance.
> >>> * @return The region historian
> >>> */
> >>> - public static RegionHistorian getInstance() {
> >>> + public synchronized static RegionHistorian getInstance() {
> >>>
> >>>
> >> Was it adding the synchronized here that made it so regionhistorian
> >> could continue to use HTable? i.e. fixes HBASE-919?
> >> St.Ack
> >>
> >>
> >
> >
>
Re: svn commit: r703711 - in /hadoop/hbase/trunk: CHANGES.txt src/java/org/apache/hadoop/hbase/RegionHistorian.java
src/java/org/apache/hadoop/hbase/master/RegionManager.java
Posted by stack <st...@duboce.net>.
I was just interested in what in the patch addressed HBASE-919? (Am I
wrong thinking that 919 got in the way of the fixing of this issue?)
Thanks,
St.Ack
Jim Kellerman (POWERSET) wrote:
> No, the change to RegionHistorian was only to make the instantiation
> of the singleton thread safe, which it was not before.
>
> ---
> Jim Kellerman, Powerset (Live Search, Microsoft Corporation)
>
>
>
>> -----Original Message-----
>> From: stack [mailto:stack@duboce.net]
>> Sent: Saturday, October 11, 2008 4:02 PM
>> To: hbase-dev@hadoop.apache.org
>> Subject: Re: svn commit: r703711 - in /hadoop/hbase/trunk: CHANGES.txt
>> src/java/org/apache/hadoop/hbase/RegionHistorian.java
>> src/java/org/apache/hadoop/hbase/master/RegionManager.java
>>
>> jimk@apache.org wrote:
>>
>>> + HBASE-918 Region balancing during startup makes cluster unstable
>>> ...
>>> URL:
>>>
>> http://svn.apache.org/viewvc/hadoop/hbase/trunk/src/java/org/apache/hadoop
>> /hbase/RegionHistorian.java?rev=703711&r1=703710&r2=703711&view=diff
>>
>> ==========================================================================
>> ====
>>
>>> ---
>>>
>> hadoop/hbase/trunk/src/java/org/apache/hadoop/hbase/RegionHistorian.java
>> (original)
>>
>>> +++
>>>
>> hadoop/hbase/trunk/src/java/org/apache/hadoop/hbase/RegionHistorian.java
>> Sat Oct 11 12:04:09 2008
>>
>>> @@ -86,7 +86,7 @@
>>> * Get the RegionHistorian Singleton instance.
>>> * @return The region historian
>>> */
>>> - public static RegionHistorian getInstance() {
>>> + public synchronized static RegionHistorian getInstance() {
>>>
>>>
>> Was it adding the synchronized here that made it so regionhistorian
>> could continue to use HTable? i.e. fixes HBASE-919?
>> St.Ack
>>
>>
>
>
Re: svn commit: r703711 - in /hadoop/hbase/trunk: CHANGES.txt src/java/org/apache/hadoop/hbase/RegionHistorian.java
src/java/org/apache/hadoop/hbase/master/RegionManager.java
Posted by stack <st...@duboce.net>.
Doğacan Güney wrote:
> On Mon, Oct 13, 2008 at 12:43 AM, Jim Kellerman (POWERSET)
> <Ji...@microsoft.com> wrote:
>
>> No, the change to RegionHistorian was only to make the instantiation
>> of the singleton thread safe, which it was not before.
>>
>>
>
> IIRC what I read about java memory model, you do not need synchronized
> there if you make a few changes:
>
> /** Singleton reference */
> private static RegionHistorian historian = new RegionHistorian();
>
> I think if you do this, Java guarantees that historian won't be
> initialized until it is accessed
> (because static members are only initialized when a class is loaded)
> and it will be thread-safe.
> So you can get rid of "synchronized" in getInstance.
Agreed. The above is more the idiom but for some reason, the
RegionHistorian needs to be lazily instantiated (I looked into this
before but don't remember why -- smile). I tried changing it yesterday
and ran unit tests. They failed. It seems the lazy-instantiation is
still necessary.
St.Ack
Re: svn commit: r703711 - in /hadoop/hbase/trunk: CHANGES.txt src/java/org/apache/hadoop/hbase/RegionHistorian.java src/java/org/apache/hadoop/hbase/master/RegionManager.java
Posted by Doğacan Güney <do...@gmail.com>.
On Mon, Oct 13, 2008 at 12:43 AM, Jim Kellerman (POWERSET)
<Ji...@microsoft.com> wrote:
> No, the change to RegionHistorian was only to make the instantiation
> of the singleton thread safe, which it was not before.
>
IIRC what I read about java memory model, you do not need synchronized
there if you make a few changes:
/** Singleton reference */
private static RegionHistorian historian = new RegionHistorian();
I think if you do this, Java guarantees that historian won't be
initialized until it is accessed
(because static members are only initialized when a class is loaded)
and it will be thread-safe.
So you can get rid of "synchronized" in getInstance.
> ---
> Jim Kellerman, Powerset (Live Search, Microsoft Corporation)
>
>
>> -----Original Message-----
>> From: stack [mailto:stack@duboce.net]
>> Sent: Saturday, October 11, 2008 4:02 PM
>> To: hbase-dev@hadoop.apache.org
>> Subject: Re: svn commit: r703711 - in /hadoop/hbase/trunk: CHANGES.txt
>> src/java/org/apache/hadoop/hbase/RegionHistorian.java
>> src/java/org/apache/hadoop/hbase/master/RegionManager.java
>>
>> jimk@apache.org wrote:
>> > + HBASE-918 Region balancing during startup makes cluster unstable
>> > ...
>> > URL:
>> http://svn.apache.org/viewvc/hadoop/hbase/trunk/src/java/org/apache/hadoop
>> /hbase/RegionHistorian.java?rev=703711&r1=703710&r2=703711&view=diff
>> >
>> ==========================================================================
>> ====
>> > ---
>> hadoop/hbase/trunk/src/java/org/apache/hadoop/hbase/RegionHistorian.java
>> (original)
>> > +++
>> hadoop/hbase/trunk/src/java/org/apache/hadoop/hbase/RegionHistorian.java
>> Sat Oct 11 12:04:09 2008
>> > @@ -86,7 +86,7 @@
>> > * Get the RegionHistorian Singleton instance.
>> > * @return The region historian
>> > */
>> > - public static RegionHistorian getInstance() {
>> > + public synchronized static RegionHistorian getInstance() {
>> >
>>
>> Was it adding the synchronized here that made it so regionhistorian
>> could continue to use HTable? i.e. fixes HBASE-919?
>> St.Ack
>>
>
>
--
Doğacan Güney
RE: svn commit: r703711 - in /hadoop/hbase/trunk: CHANGES.txt
src/java/org/apache/hadoop/hbase/RegionHistorian.java
src/java/org/apache/hadoop/hbase/master/RegionManager.java
Posted by "Jim Kellerman (POWERSET)" <Ji...@microsoft.com>.
No, the change to RegionHistorian was only to make the instantiation
of the singleton thread safe, which it was not before.
---
Jim Kellerman, Powerset (Live Search, Microsoft Corporation)
> -----Original Message-----
> From: stack [mailto:stack@duboce.net]
> Sent: Saturday, October 11, 2008 4:02 PM
> To: hbase-dev@hadoop.apache.org
> Subject: Re: svn commit: r703711 - in /hadoop/hbase/trunk: CHANGES.txt
> src/java/org/apache/hadoop/hbase/RegionHistorian.java
> src/java/org/apache/hadoop/hbase/master/RegionManager.java
>
> jimk@apache.org wrote:
> > + HBASE-918 Region balancing during startup makes cluster unstable
> > ...
> > URL:
> http://svn.apache.org/viewvc/hadoop/hbase/trunk/src/java/org/apache/hadoop
> /hbase/RegionHistorian.java?rev=703711&r1=703710&r2=703711&view=diff
> >
> ==========================================================================
> ====
> > ---
> hadoop/hbase/trunk/src/java/org/apache/hadoop/hbase/RegionHistorian.java
> (original)
> > +++
> hadoop/hbase/trunk/src/java/org/apache/hadoop/hbase/RegionHistorian.java
> Sat Oct 11 12:04:09 2008
> > @@ -86,7 +86,7 @@
> > * Get the RegionHistorian Singleton instance.
> > * @return The region historian
> > */
> > - public static RegionHistorian getInstance() {
> > + public synchronized static RegionHistorian getInstance() {
> >
>
> Was it adding the synchronized here that made it so regionhistorian
> could continue to use HTable? i.e. fixes HBASE-919?
> St.Ack
>
Re: svn commit: r703711 - in /hadoop/hbase/trunk: CHANGES.txt src/java/org/apache/hadoop/hbase/RegionHistorian.java
src/java/org/apache/hadoop/hbase/master/RegionManager.java
Posted by stack <st...@duboce.net>.
jimk@apache.org wrote:
> + HBASE-918 Region balancing during startup makes cluster unstable
> ...
> URL: http://svn.apache.org/viewvc/hadoop/hbase/trunk/src/java/org/apache/hadoop/hbase/RegionHistorian.java?rev=703711&r1=703710&r2=703711&view=diff
> ==============================================================================
> --- hadoop/hbase/trunk/src/java/org/apache/hadoop/hbase/RegionHistorian.java (original)
> +++ hadoop/hbase/trunk/src/java/org/apache/hadoop/hbase/RegionHistorian.java Sat Oct 11 12:04:09 2008
> @@ -86,7 +86,7 @@
> * Get the RegionHistorian Singleton instance.
> * @return The region historian
> */
> - public static RegionHistorian getInstance() {
> + public synchronized static RegionHistorian getInstance() {
>
Was it adding the synchronized here that made it so regionhistorian
could continue to use HTable? i.e. fixes HBASE-919?
St.Ack