Posted to user@hbase.apache.org by Clint Morgan <cm...@troove.net> on 2008/07/08 02:23:32 UTC

hbase slow to startup on copy of hbase directory

Hi all,

I'm having a little problem with our tests that use hbase.

First, I run a test which generates all of the hbase tables, and exits.

Then for each test, I copy over the hbase directory, and then start up hbase.

So far, so good, hbase quickly starts up and finds all my tables.
However, I then get NotServingRegion exceptions for the next minute or
so. Afterwards the regions get assigned, and everything is fine.

Looking at the logs, just before the regions start to come online, I see:

[07/07/08 17:07:40] 60780  [ger.metaScanner] INFO
adoop.hbase.master.BaseScanner  - RegionManager.metaScanner scanning
meta region {regionname: .META.,,1, startKey: <>, server:
127.0.0.1:60012}
[07/07/08 17:07:40] 60746  [dler 0 on 60001] DEBUG
oop.hbase.master.ServerManager  - Total Load: 2, Num Servers: 1, Avg
Load: 2.0
[07/07/08 17:07:40] 60812  [ger.metaScanner] DEBUG
adoop.hbase.master.BaseScanner  - RegionManager.metaScannerREGION =>
{NAME => '__DDBC_META_TABLE__,,1215473430109', STARTKEY => '', ENDKEY
=> '', ENCODED => 928348903, TABLE => {NAME => '__DDBC_META_TABLE__',
FAMILIES => [{NAME => 'Meta', VERSIONS => 3, COMPRESSION => 'NONE',
IN_MEMORY => false, BLOCKCACHE => false, LENGTH => 2147483647, TTL =>
FOREVER, BLOOMFILTER => NONE}]}}, SERVER => '127.0.0.1:60012',
STARTCODE => 1215473423349
[07/07/08 17:07:40] 60812  [ger.metaScanner] DEBUG
adoop.hbase.master.BaseScanner  - Current assignment of
__DDBC_META_TABLE__,,1215473430109 is not valid: serverInfo: address:
127.0.0.1:60012, startcode: 1215475600737, load: (requests: 2 regions:
2), passed startCode: 1215473423349, storedInfo.startCode:
1215475600737, unassignedRegions: false, pendingRegions: false
...

My question: what is going on here, and how can I speed it up?

cheers,
-clint

Re: hbase slow to startup on copy of hbase directory

Posted by Jean-Daniel Cryans <jd...@gmail.com>.
Clint,

I just uploaded a patch. See if it works better for you!

J-D

On Tue, Jul 8, 2008 at 2:30 PM, Clint Morgan <cm...@troove.net> wrote:

> > I'll open a JIRA.
>
> Nevermind, I see you guys are one step ahead:
> https://issues.apache.org/jira/browse/HBASE-730
>

Re: hbase slow to startup on copy of hbase directory

Posted by Clint Morgan <cm...@troove.net>.
> I'll open a JIRA.

Nevermind, I see you guys are one step ahead:
https://issues.apache.org/jira/browse/HBASE-730

Re: hbase slow to startup on copy of hbase directory

Posted by Clint Morgan <cm...@troove.net>.
Thanks for the response, that sounds good. I had a quick peek at the
code, but I don't understand what is going on there well enough to
implement the proposed solution...

I'll open a JIRA.

cheers,
-clint

On Tue, Jul 8, 2008 at 9:14 AM, Jim Kellerman <ji...@powerset.com> wrote:
>> -----Original Message-----
>> From: stack [mailto:stack@duboce.net]
>> Sent: Monday, July 07, 2008 8:41 PM
>> To: hbase-user@hadoop.apache.org
>> Subject: Re: hbase slow to startup on copy of hbase directory
>>
>> Looking at code, we have the concept of an 'initial' scan.  I
>> wonder if things would run faster for you if on the initial
>> scan we just cleared all SERVER and STARTCODE entries in
>> .META. rather than wait on regionserver reports?
>
> +1
>

RE: hbase slow to startup on copy of hbase directory

Posted by Jim Kellerman <ji...@powerset.com>.
> -----Original Message-----
> From: stack [mailto:stack@duboce.net]
> Sent: Monday, July 07, 2008 8:41 PM
> To: hbase-user@hadoop.apache.org
> Subject: Re: hbase slow to startup on copy of hbase directory
>
> Looking at code, we have the concept of an 'initial' scan.  I
> wonder if things would run faster for you if on the initial
> scan we just cleared all SERVER and STARTCODE entries in
> .META. rather than wait on regionserver reports?

+1

Re: hbase slow to startup on copy of hbase directory

Posted by stack <st...@duboce.net>.
When a regionserver starts up, it chooses a random number, its
STARTCODE.  Every time it reports in to the master, it volunteers its
start code as part of its message.

The server a region is assigned to (its SERVER in your log snippet) and
that server's STARTCODE are kept in distinct columns in the .META.
table; i.e. each SERVER has an associated STARTCODE.

The master scans the .META. table periodically to ensure all regions are
allocated.  Part of its check ensures that the STARTCODEs the
regionservers have volunteered match what it has stamped into .META.  If
there is a discrepancy, something has happened: a restart of the cluster
or, at a minimum, a crash or restart of that regionserver.  The region is
marked 'not valid' and the master goes about the business of cleanup and
reassignment of the region.
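
To make the check concrete, here is a rough sketch in Java of the
comparison described above. It is not HBase's actual code: the class and
method names (StartCodeCheck, serverReport, isAssignmentValid) are
invented for illustration, and the only real values are the address and
the two start codes taken from the log in this thread.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the start code comparison; not HBase's real classes.
final class StartCodeCheck {
    // Start code most recently volunteered by each live regionserver, keyed by address.
    private final Map<String, Long> reportedStartCodes = new HashMap<>();

    // Called whenever a regionserver reports in to the master.
    void serverReport(String address, long startCode) {
        reportedStartCodes.put(address, startCode);
    }

    // An assignment read from .META. is valid only if the named server is known
    // and its reported start code equals the one stamped into .META.
    boolean isAssignmentValid(String metaServer, long metaStartCode) {
        Long reported = reportedStartCodes.get(metaServer);
        return reported != null && reported.longValue() == metaStartCode;
    }

    public static void main(String[] args) {
        StartCodeCheck master = new StartCodeCheck();
        // The freshly started regionserver reports its new start code.
        master.serverReport("127.0.0.1:60012", 1215475600737L);
        // .META. was copied from the earlier run, so it still holds the old start code.
        System.out.println(master.isAssignmentValid("127.0.0.1:60012", 1215473423349L)); // prints false
    }
}
```

In the copied-directory case every region in .META. carries the old start
code, so each one fails this comparison and has to go through
reassignment.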

Looking at code, we have the concept of an 'initial' scan.  I wonder if 
things would run faster for you if on the initial scan we just cleared 
all SERVER and STARTCODE entries in .META. rather than wait on 
regionserver reports?
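
As a sketch of that idea only (assuming .META. can be modelled as a map
of region name to column map, which is a simplification, and with all
names here invented for illustration), the initial scan would drop the
stale columns up front and hand every region straight to assignment:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of "clear SERVER and STARTCODE on the initial scan".
// The in-memory map stands in for .META.; this is not HBase's real API.
final class InitialScanSketch {
    // Removes the assignment columns from every row and returns the regions
    // that now need to be assigned, in scan order.
    static List<String> clearAssignments(Map<String, Map<String, String>> metaRows) {
        List<String> unassigned = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> row : metaRows.entrySet()) {
            Map<String, String> columns = row.getValue();
            columns.remove("SERVER");
            columns.remove("STARTCODE");
            unassigned.add(row.getKey());
        }
        return unassigned;
    }

    public static void main(String[] args) {
        Map<String, String> columns = new HashMap<>();
        columns.put("SERVER", "127.0.0.1:60012");
        columns.put("STARTCODE", "1215473423349");

        Map<String, Map<String, String>> meta = new LinkedHashMap<>();
        meta.put("__DDBC_META_TABLE__,,1215473430109", columns);

        List<String> toAssign = clearAssignments(meta);
        System.out.println(toAssign.size());               // prints 1
        System.out.println(columns.containsKey("SERVER")); // prints false
    }
}
```

The point of doing it this way is that nothing waits on a regionserver
report before a region is treated as unassigned.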

St.Ack


