Posted to user@hbase.apache.org by leith <el...@diffbot.com> on 2008/07/23 21:25:28 UTC
advice in overcoming our hbase roadblocks
we've been trying for a couple of days (without success) to import our data into
hbase.
initially we ran into quite a few OOMEs, but we seem to have overcome that
by adjusting our jvm heap sizes.
however, we're still running into many other roadblocks, and in my opinion we
just don't have the right configuration options in our conf files (or maybe not
enough resources to get the job done)
in the end, this is a one-time task. if we can be successful, we believe this
will be a good introduction to hbase for us, and we can continue by
integrating it further into our project.
i'd appreciate it if someone could offer us some advice on the following
task/setup that we are trying to accomplish. here are the details:
---------------------------------------------------------------
1) everything (hdfs/hbase) is running on one machine currently (short term)
2) we are importing 60k files, each ranging between 100k and 64MB along with
necessary meta-data in other column-families
3) our machine has 2GB of RAM, amd64 dual core, dedicated to the import task,
hbase heapsize is set to 1000 (MB)
4) our import program is a single-threaded java program, iterating through our
files and doing batchoperations for each file into hbase
5) after about 15 minutes of successful importing, we see
'INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC
Server handler 8 on 60020' on region dmls,,1216768730386: Memcache size 64.0m is
>= than blocking 64.0m size '
after another 15 minutes of inactivity, we see the threads slowly get unblocked,
and importing continues (albeit much slower)
6) we've then hit only one 'FileNotFoundErrors', and for the rest of the import,
it continually runs into 'org.apache.hadoop.hbase.NotServingRegionException:
org.apache.hadoop.hbase.NotServingRegionException:'
occasionally a file or two will import, but generally we hit the
NotServingRegionException, and the majority of files just don't get imported
------------------------------------------------------------------
thanks for the support, we appreciate it!
/leith
Re: advice in overcoming our hbase roadblocks
Posted by stack <st...@duboce.net>.
Jean-Daniel Cryans wrote:
> ...
>> 6) we've then hit only one 'FileNotFoundErrors', and for the rest of the
>> import, it continually runs into
>> 'org.apache.hadoop.hbase.NotServingRegionException:
>> org.apache.hadoop.hbase.NotServingRegionException:'
>
> The NotServingRegionException is also normal. After a split, the client
> still has old metadata so when hitting this exception it means to the client
> "hey, refresh your cache because regions changed".
>
One thing to add: the client may also get an NSRE (NotServingRegionException)
when the region is never successfully deployed. Please see recent notes
on this list where a user's region would not deploy because of
FileNotFoundExceptions; see
https://issues.apache.org/jira/browse/HBASE-766 and, if you can, try the
patch and report back your findings.
St.Ack
Re: advice in overcoming our hbase roadblocks
Posted by Jean-Daniel Cryans <jd...@gmail.com>.
> 'INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for
> 'IPC Server handler 8 on 60020' on region dmls,,1216768730386: Memcache size
> 64.0m is >= than blocking 64.0m size '
This is normal: the region server blocks updates while the memcache is being
flushed to disk. Have a look at the following documentation to understand how
HBase works: http://wiki.apache.org/hadoop/Hbase/HbaseArchitecture
> 6) we've then hit only one 'FileNotFoundErrors', and for the rest of the
> import, it continually runs into
> 'org.apache.hadoop.hbase.NotServingRegionException:
> org.apache.hadoop.hbase.NotServingRegionException:'
The NotServingRegionException is also normal. After a split, the client
still has old metadata so when hitting this exception it means to the client
"hey, refresh your cache because regions changed".
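For an importer that wants to ride out these exceptions at the application level, a minimal sketch of retrying with a growing pause might look like the following. RetryWithBackoff is a hypothetical helper, not part of HBase, and it is independent of whatever retrying the HBase client does internally. It catches every Exception for brevity; a real importer would presumably retry only on NotServingRegionException and give up after a bounded number of attempts, since a region that never deploys will not recover on its own.

```java
import java.util.concurrent.Callable;

/** Hypothetical helper (not an HBase class): retries an operation, doubling the pause each time. */
final class RetryWithBackoff {
    private RetryWithBackoff() {}

    /**
     * Runs op up to maxAttempts times. After each failure except the last,
     * sleeps for the current pause and then doubles it.
     * Rethrows the last failure if every attempt fails.
     */
    static <T> T call(Callable<T> op, int maxAttempts, long initialPauseMs) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long pauseMs = initialPauseMs;
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (Exception e) {
                // Assumption: any exception is treated as retryable in this sketch.
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(pauseMs);
                    pauseMs *= 2;
                }
            }
        }
        throw last;
    }
}
```

The per-file write would be wrapped in the Callable, so a region split in the middle of the import costs a short pause rather than a lost file.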
> 1) everything (hdfs/hbase) is running on one machine currently (short term)
This is the bulk of your problem. HBase is designed to distribute load
across a lot of machines so when a compaction or a flush occurs it normally
won't affect the performance too much. The short term solution for your
short term situation is to slow down the inserts. For example, first try
inserting batches of about 60MB of data and wait after each batch for the
flush (or compaction or split) to finish.
J-D
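The batching suggestion above can be sketched roughly as follows, in plain Java with no HBase calls. ThrottledImport, FileImporter, and the fixed pause are all assumptions made for illustration: the sketch simply sleeps between batches as a stand-in for "wait for the flush to finish", because how to detect that a flush, compaction, or split has completed is not covered in this thread.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: import files in size-bounded batches with a pause after each batch. */
final class ThrottledImport {
    private ThrottledImport() {}

    /** Hypothetical callback that writes one file (by index) into HBase. */
    interface FileImporter {
        void importFile(int fileIndex) throws Exception;
    }

    /**
     * Groups file indices into consecutive batches whose total size stays at or
     * under maxBatchBytes. A single file larger than the limit gets its own batch.
     */
    static List<List<Integer>> planBatches(long[] fileSizes, long maxBatchBytes) {
        List<List<Integer>> batches = new ArrayList<>();
        List<Integer> current = new ArrayList<>();
        long currentBytes = 0;
        for (int i = 0; i < fileSizes.length; i++) {
            if (!current.isEmpty() && currentBytes + fileSizes[i] > maxBatchBytes) {
                batches.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(i);
            currentBytes += fileSizes[i];
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }

    /** Imports every batch in order, sleeping pauseMs after each one. */
    static void run(long[] fileSizes, long maxBatchBytes, long pauseMs, FileImporter importer)
            throws Exception {
        for (List<Integer> batch : planBatches(fileSizes, maxBatchBytes)) {
            for (int fileIndex : batch) {
                importer.importFile(fileIndex);
            }
            // Assumption: a fixed sleep approximates waiting for the flush to finish.
            Thread.sleep(pauseMs);
        }
    }
}
```

With files of up to 64MB, many batches will hold a single file, which comes close to importing one file at a time and sleeping in between.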
Re: advice in overcoming our hbase roadblocks
Posted by Billy Pearson <sa...@pearsonwholesale.com>.
I have seen the blocking problem also in some of my imports to a fresh install,
but it seems to go away once the region servers have more regions hosted.
In the past what I had to do was import one file at a time and sleep between
jobs. But some patches were added that should make hbase much more stable
in version 0.2.0.