Posted to user@hbase.apache.org by leith <el...@diffbot.com> on 2008/07/23 21:25:28 UTC

advice in overcoming our hbase roadblocks

we've been trying for a couple of days (without success) to import our data into 
hbase.

initially we ran into quite a few OOME errors, but we seem to have overcome that 
by adjusting our jvm memory heap sizes.

however, we're still running into many other roadblocks, and in my opinion we 
just don't have the right configuration options in our conf files (or maybe not 
enough resources to get the job done)

in the end, this is a one time task. if we can be successful, we believe this 
will be a good introduction to hbase for us, and we can continue by 
integrating it further into our project.

i'd appreciate it if someone would offer us some advice for the following 
task/setup that we are trying to accomplish. here are the details:

---------------------------------------------------------------

1) everything (hdfs/hbase) is running on one machine currently (short term)

2) we are importing 60k files, each ranging between 100k and 64MB along with 
necessary meta-data in other column-families

3) our machine has 2GB, amd64 dual core, dedicated to the import task, hbase 
heapsize is set to 1000

4) our import program is a single-threaded java program, iterating through our 
files and doing batchoperations for each file into hbase

5) after about 15 minutes of successful importing, we see

'INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC 
Server handler 8 on 60020' on region dmls,,1216768730386: Memcache size 64.0m is 
 >= than blocking 64.0m size '

after another 15 minutes of inactivity, we see the threads slowly get unblocked, 
and importing continues (albeit much slower)

6) we've then hit only one 'FileNotFoundErrors', and for the rest of the import, 
it continually runs into 'org.apache.hadoop.hbase.NotServingRegionException: 
org.apache.hadoop.hbase.NotServingRegionException:'

occasionally a file or two will import, but generally we hit the 
NotServingRegionException, and the majority of files just don't get imported

------------------------------------------------------------------

thanks for the support, we appreciate it!

/leith


Re: advice in overcoming our hbase roadblocks

Posted by stack <st...@duboce.net>.
Jean-Daniel Cryans wrote:
> ...
>> 6) we've then hit only one 'FileNotFoundErrors', and for the rest of the
>> import, it continually runs into
>> 'org.apache.hadoop.hbase.NotServingRegionException:
>> org.apache.hadoop.hbase.NotServingRegionException:'
>
> The NotServingRegionException is also normal. After a split, the client
> still has old metadata so when hitting this exception it means to the client
> "hey, refresh your cache because regions changed".
>
>   
One thing to add is that the client may also get an NSRE 
(NotServingRegionException) when a region is never successfully deployed. 
Please see recent notes on this list where a user's region would not deploy 
because of FileNotFoundExceptions; see 
https://issues.apache.org/jira/browse/HBASE-766 and, if you can, try the 
patch and report back your findings.

St.Ack

Re: advice in overcoming our hbase roadblocks

Posted by Jean-Daniel Cryans <jd...@gmail.com>.
> 'INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for
> 'IPC Server handler 8 on 60020' on region dmls,,1216768730386: Memcache size
> 64.0m is >= than blocking 64.0m size '


This is normal. Have a look at the following documentation to understand how
HBase works: http://wiki.apache.org/hadoop/Hbase/HbaseArchitecture

> 6) we've then hit only one 'FileNotFoundErrors', and for the rest of the
> import, it continually runs into
> 'org.apache.hadoop.hbase.NotServingRegionException:
> org.apache.hadoop.hbase.NotServingRegionException:'


The NotServingRegionException is also normal. After a split, the client
still has old metadata so when hitting this exception it means to the client
"hey, refresh your cache because regions changed".
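To make the "refresh and retry" idea concrete, here is a minimal sketch of a bounded retry loop with backoff. It is only an illustration and not HBase's client code: StaleRegionException is a hypothetical stand-in for org.apache.hadoop.hbase.NotServingRegionException, and Attempt and withRetries are names invented for this example. The attempt is assumed to look up the region location again each time it runs.

```java
import java.io.IOException;

final class RetryOnStaleRegion {

    /** Hypothetical stand-in for org.apache.hadoop.hbase.NotServingRegionException. */
    static class StaleRegionException extends IOException {
        StaleRegionException(String message) {
            super(message);
        }
    }

    /** One try of the operation; assumed to re-resolve the region location itself. */
    interface Attempt<T> {
        T run() throws IOException;
    }

    /**
     * Runs the attempt, retrying only when the region was reported as not served.
     * The wait doubles after each failed try; other IOExceptions propagate at once.
     */
    static <T> T withRetries(Attempt<T> attempt, int maxAttempts, long initialBackoffMillis)
            throws IOException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long backoff = initialBackoffMillis;
        StaleRegionException last = null;
        for (int i = 1; i <= maxAttempts; i++) {
            try {
                return attempt.run();
            } catch (StaleRegionException e) {
                last = e;
                if (i == maxAttempts) {
                    break; // give up and report the last failure
                }
                try {
                    Thread.sleep(backoff);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new IOException("interrupted while waiting to retry", ie);
                }
                backoff *= 2;
            }
        }
        throw last;
    }
}
```

The attempt limit matters: if the region never comes online, retrying forever only hides the problem, so the sketch gives up and rethrows the last exception.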


> 1) everything (hdfs/hbase) is running on one machine currently (short term)

This is the bulk of your problem. HBase is designed to distribute load
across a lot of machines so when a compaction or a flush occurs it normally
won't affect the performance too much. The short term solution for your
short term situation is to slow down the inserts. For example, first try
inserting batches of 60 MB of data and wait after each batch for the flush
(or compaction or split) to finish.
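A rough sketch of that throttling, for a single-threaded importer like the one described: group writes into batches under a byte budget and pause between batches. Everything here is hypothetical and for illustration only. ThrottledImporter is not part of HBase, the writer callback stands in for whatever call actually commits a file to the table, and a fixed pause is used as a crude substitute for really waiting on the flush, since no API for observing the flush is assumed.

```java
import java.util.function.Consumer;
import java.util.function.LongConsumer;

/** Hypothetical helper: byte-budgeted batches with a pause between them. */
final class ThrottledImporter {
    private final long batchBudgetBytes;   // e.g. 60 MB, just under the 64 MB blocking size
    private final long pauseMillis;        // fixed pause standing in for "wait for the flush"
    private final Consumer<byte[]> writer; // stand-in for the real write into the table
    private final LongConsumer sleeper;    // injectable so the pause can be faked in tests
    private long bytesInBatch = 0;

    ThrottledImporter(long batchBudgetBytes, long pauseMillis,
                      Consumer<byte[]> writer, LongConsumer sleeper) {
        this.batchBudgetBytes = batchBudgetBytes;
        this.pauseMillis = pauseMillis;
        this.writer = writer;
        this.sleeper = sleeper;
    }

    /** Writes one file, pausing first if it would push the current batch over budget. */
    void write(byte[] fileContents) {
        if (bytesInBatch > 0 && bytesInBatch + fileContents.length > batchBudgetBytes) {
            sleeper.accept(pauseMillis); // give the region server time to flush
            bytesInBatch = 0;
        }
        // A file larger than the whole budget still goes through, as a batch of its own.
        writer.accept(fileContents);
        bytesInBatch += fileContents.length;
    }

    /** Default sleeper built on Thread.sleep; restores the interrupt flag if interrupted. */
    static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

In an import loop this could be constructed once with ThrottledImporter::sleepQuietly as the sleeper, then write(...) called per file. The pause length is a guess to be tuned against the region server log until the "Blocking updates" messages stop appearing.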

J-D



Re: advice in overcoming our hbase roadblocks

Posted by Billy Pearson <sa...@pearsonwholesale.com>.
I have also seen the blocking problem in some of my imports to a fresh install, 
but it seems to go away once the region servers host more regions. 
In the past, what I had to do was import one file at a time and sleep between 
jobs. But some patches have been added that should make hbase much more stable 
in version 0.2.0.

