Posted to user@hbase.apache.org by Vishal Kapoor <vi...@gmail.com> on 2011/02/16 04:24:49 UTC

Hbase inserts very slow

All was working fine and suddenly I see a lot of log lines like the ones below:

2011-02-15 22:19:04,023 DEBUG
org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
started; Attempting to free 19.88 MB of total=168.64 MB
2011-02-15 22:19:04,025 DEBUG
org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
completed; freed=19.91 MB, total=148.73 MB, single=74.47 MB, multi=92.37 MB,
memory=166.09 KB
2011-02-15 22:19:11,207 DEBUG
org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
started; Attempting to free 19.88 MB of total=168.64 MB
2011-02-15 22:19:11,444 DEBUG
org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
completed; freed=19.93 MB, total=149.09 MB, single=73.91 MB, multi=93.32 MB,
memory=166.09 KB
2011-02-15 22:19:21,494 DEBUG
org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
started; Attempting to free 19.87 MB of total=168.62 MB
2011-02-15 22:19:21,760 DEBUG
org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
completed; freed=19.91 MB, total=148.84 MB, single=74.22 MB, multi=92.73 MB,
memory=166.09 KB
2011-02-15 22:19:39,838 DEBUG
org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
started; Attempting to free 19.87 MB of total=168.62 MB
2011-02-15 22:19:39,852 DEBUG
org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
completed; freed=19.91 MB, total=148.71 MB, single=75.35 MB, multi=91.48 MB,
memory=166.09 KB
2011-02-15 22:19:49,768 DEBUG
org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
started; Attempting to free 19.87 MB of total=168.62 MB
2011-02-15 22:19:49,770 DEBUG
org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
completed; freed=19.91 MB, total=148.71 MB, single=76.48 MB, multi=90.35 MB,
memory=166.09 KB


I haven't changed anything, including the table definitions.
Please let me know where to look...

thanks,
Vishal Kapoor

Re: Hbase inserts very slow

Posted by Jean-Daniel Cryans <jd...@apache.org>.
Good to know, and yeah cluster performance will definitely be
different. Optimizing on a pseudo-distributed setup only gets you so
far.

To answer your other question, use
http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/KeyOnlyFilter.html
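
Something along these lines should do it (just a sketch against the 0.90 client
API; the class name, table name and caching value are only examples):

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.KeyOnlyFilter;

public class RowKeyScan {
  public static void main(String[] args) throws IOException {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "LIVE_TABLE");

    // KeyOnlyFilter strips the values server-side, so only the keys come back.
    Scan scan = new Scan();
    scan.setFilter(new KeyOnlyFilter());
    scan.setCacheBlocks(false); // don't churn the block cache for a one-off scan
    scan.setCaching(500);       // rows fetched per RPC

    ResultScanner scanner = table.getScanner(scan);
    try {
      for (Result result : scanner) {
        byte[] rowKey = result.getRow(); // the composite reverseTimeStamp/rowidA/rowIdB key
        // ... attach the master data using only the row key
      }
    } finally {
      scanner.close();
      table.close();
    }
  }
}

The same Scan can also be handed to TableMapReduceUtil.initTableMapperJob if
you'd rather do the attaching in a map/reduce job.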

J-D

On Thu, Feb 17, 2011 at 4:26 PM, Vishal Kapoor
<vi...@gmail.com> wrote:
> J-D,
>
> setCacheBlocks false improved the variance in the time it took for my
> explode, now its consistently reporting similar times.
>
> I have also tested inserting the master data to the other two families with
> separate map reduce jobs and I like the results so far.
> since I am still on pseudo distributed, it freaks my intel i5 but I my gut
> says anything processor intensive will work much better on the cluster, I
> can be wrong!
>
> I know I should ask this in a separate email thread so that others can
> benefit from it as well
> but can I add some sort of filter to scan so that I only see the row Ids, I
> don't care about the data in the table, to attach master data to this table
> I only care about the composite row key.
>
>
> thanks for all your help...
>
> Vishal
>
> On Thu, Feb 17, 2011 at 12:44 PM, Jean-Daniel Cryans <jd...@apache.org>wrote:
>
>> Bummer, well have you tried the other thing about setCacheBlocks? This
>> at least would get rid of the block cache churning and give us a
>> better picture of what's going on in the logs.
>>
>> J-D
>>
>> On Thu, Feb 17, 2011 at 9:20 AM, Vishal Kapoor
>> <vi...@gmail.com> wrote:
>> > J-D,
>> > I do not see any significant improvement on combining the data into a
>> single
>> > family,
>> > maybe its because my data in the families are spread as 1:1:1, (single
>> > dimension for all CF )
>> > the next iteration I am planning is to write to these three families from
>> > three different map/reduce jobs.
>> > will keep all posted on my findings...
>> > Vishal
>> > On Wed, Feb 16, 2011 at 8:00 PM, Jean-Daniel Cryans <jdcryans@apache.org
>> >
>> > wrote:
>> >>
>> >> It's best to have different families for data of different nature and
>> >> when you usually don't read/write them together. For sure it shouldn't
>> >> slow you down as much as it does (because of HBASE-3149), but given
>> >> the current situation it's hard to recommend multiple families.
>> >>
>> >> J-D
>> >>
>> >> On Wed, Feb 16, 2011 at 4:32 PM, Vishal Kapoor
>> >> <vi...@gmail.com> wrote:
>> >> > thanks J-D. for all your help, I will combine the three families and
>> >> > re-baseline the performance.
>> >> > but I was just wondering if I was using the family as they were
>> suppose
>> >> > to
>> >> > be used or not.
>> >> > the data in these three families are different, one of them is live
>> feed
>> >> > and
>> >> > the two other two are master (static kind) data and it made a lot of
>> >> > logical
>> >> > sense to separate them in different families.
>> >> > maybe if updating a family in a different map/reduce operation works
>> >> > fine
>> >> > then I will go that route.
>> >> > but the critical to quality factor here is speed for inserts and I am
>> >> > going
>> >> > to definitely give a try to single family approach.
>> >> >
>> >> > Vishal
>> >> >
>> >> >
>> >> > On Wed, Feb 16, 2011 at 6:53 PM, Jean-Daniel Cryans
>> >> > <jd...@apache.org>wrote:
>> >> >
>> >> >> I don't understand... is having the same qualifier a hard
>> requirement?
>> >> >> Worst case you could have a prefix.
>> >> >>
>> >> >> J-D
>> >> >>
>> >> >> On Wed, Feb 16, 2011 at 3:29 PM, Vishal Kapoor
>> >> >> <vi...@gmail.com> wrote:
>> >> >> > J-D,
>> >> >> > I also should mention that my data distribution in the three
>> families
>> >> >> > are
>> >> >> > 1:1:1
>> >> >> > I have three families so that I can have same qualifiers in them.
>> and
>> >> >> also
>> >> >> > the data in those families are LIVE:MasterA:MasterB
>> >> >> >
>> >> >> > Vishal
>> >> >> >
>> >> >> > On Wed, Feb 16, 2011 at 6:22 PM, Jean-Daniel Cryans
>> >> >> > <jdcryans@apache.org
>> >> >> >wrote:
>> >> >> >
>> >> >> >> Very often there's no need for more than 1 family, I would suggest
>> >> >> >> you
>> >> >> >> explore that possibility first.
>> >> >> >>
>> >> >> >> J-D
>> >> >> >>
>> >> >> >> On Wed, Feb 16, 2011 at 3:13 PM, Vishal Kapoor
>> >> >> >> <vi...@gmail.com> wrote:
>> >> >> >> > does that mean I am only left with the choice of writing to the
>> >> >> >> > three
>> >> >> >> > families in three different map jobs?
>> >> >> >> > or can I do it any other way?
>> >> >> >> > thanks,
>> >> >> >> > Vishal
>> >> >> >> >
>> >> >> >> > On Wed, Feb 16, 2011 at 12:56 PM, Jean-Daniel Cryans <
>> >> >> >> jdcryans@apache.org>
>> >> >> >> > wrote:
>> >> >> >> >>
>> >> >> >> >> First, loading into 3 families is currently a bad idea and is
>> >> >> >> >> bound
>> >> >> to
>> >> >> >> >> be inefficient, here's the reason why:
>> >> >> >> >> https://issues.apache.org/jira/browse/HBASE-3149
>> >> >> >> >>
>> >> >> >> >> Those log lines mean that your scanning of the first table is
>> >> >> >> >> generating a log of block cache churn. When setting up the Map,
>> >> >> >> >> set
>> >> >> >> >> your scanner to setCacheBlocks(false) before passing it to
>> >> >> >> >> TableMapReduceUtil.initTableMapperJob
>> >> >> >> >>
>> >> >> >> >> Finally, you may want to give more memory to the region server.
>> >> >> >> >>
>> >> >> >> >> J-D
>> >> >> >> >>
>> >> >> >> >> On Wed, Feb 16, 2011 at 7:35 AM, Vishal Kapoor
>> >> >> >> >> <vi...@gmail.com> wrote:
>> >> >> >> >> > Lars,
>> >> >> >> >> >
>> >> >> >> >> > I am still working on pseudo distributed.
>> >> >> >> >> > hadoop-0.20.2+737/
>> >> >> >> >> > and hbase-0.90.0 with the hadoop jar from the hadoop install.
>> >> >> >> >> >
>> >> >> >> >> > I have a LIVE_RAW_TABLE table, which gets values from a live
>> >> >> >> >> > system
>> >> >> >> >> > I go through each row of that table and get the row ids of
>> two
>> >> >> >> reference
>> >> >> >> >> > tables from it.
>> >> >> >> >> > TABLE_A and TABLE_B, then I explode this to a new table
>> >> >> >> >> > LIVE_TABLE
>> >> >> >> >> > I use
>> >> >> >> >> > TableMapReduceUtil.initTableReducerJob("LIVE_TABLE", null,
>> >> >> >> >> > job);
>> >> >> >> >> >
>> >> >> >> >> >
>> >> >> >> >> > LIVE_TABLE has three families, LIVE, A, B and the row id is a
>> >> >> >> composite
>> >> >> >> >> > key
>> >> >> >> >> > reverseTimeStamp/rowidA/rowIdB
>> >> >> >> >> > after that a run a bunch of map reduce to consolidate the
>> data,
>> >> >> >> >> > to start with I have around 15000 rows in LIVE_RAW_TABLE.
>> >> >> >> >> >
>> >> >> >> >> > when I start with my job, i see it running quite well till i
>> am
>> >> >> almost
>> >> >> >> >> > done
>> >> >> >> >> > with 5000 rows
>> >> >> >> >> > then it starts printing the message in the logs, which I use
>> to
>> >> >> >> >> > not
>> >> >> >> see
>> >> >> >> >> > before.
>> >> >> >> >> > the job use to run for around 900 sec ( I have a lot of data
>> >> >> parsing
>> >> >> >> >> > while
>> >> >> >> >> > exploding )
>> >> >> >> >> > 15000 rows from LIVE_RAW_TABLE explodes to around 500,000
>> rows
>> >> >> >> >> > in
>> >> >> >> >> > LIVE_TABLE.
>> >> >> >> >> >
>> >> >> >> >> > after those debug messages, the job runs for around 2500 sec,
>> >> >> >> >> > I have not changed anything, including the table design.
>> >> >> >> >> >
>> >> >> >> >> > here is my table description.
>> >> >> >> >> >
>> >> >> >> >> > {NAME => 'LIVE_TABLE', FAMILIES => [{NAME => 'LIVE',
>> >> >> >> >> > BLOOMFILTER =>
>> >> >> >> >> > 'NONE',
>> >> >> >> >> > REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION =>
>> >> >> >> >> > 'NONE',
>> >> >> TTL
>> >> >> >> =>
>> >> >> >> >> > '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false',
>> >> >> BLOCKCACHE
>> >> >> >> =>
>> >> >> >> >> > 'true'}, {NAME => 'A', BLOOMFILTER => 'NONE',
>> REPLICATION_SCOPE
>> >> >> >> >> > =>
>> >> >> >> '0',
>> >> >> >> >> > VERSIONS => '1', COMPRESSION => 'NONE', TTL => '2147483647',
>> >> >> BLOCKSIZE
>> >> >> >> >> > =>
>> >> >> >> >> > '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME
>> =>
>> >> >> >> >> > 'B',
>> >> >> >> >> > BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', VERSIONS =>
>> >> >> >> >> > '1',
>> >> >> >> >> > COMPRESSION => 'NONE', TTL => '2147483647', BLOCKSIZE =>
>> >> >> >> >> > '65536',
>> >> >> >> >> > IN_MEMORY
>> >> >> >> >> > => 'false', BLOCKCACHE => 'true'}]}
>> >> >> >> >> >
>> >> >> >> >> > thanks for all your help.
>> >> >> >> >> >
>> >> >> >> >> > Vishal
>> >> >> >> >> >
>> >> >> >> >> > On Wed, Feb 16, 2011 at 4:26 AM, Lars George <
>> >> >> lars.george@gmail.com>
>> >> >> >> >> > wrote:
>> >> >> >> >> >
>> >> >> >> >> >> Hi Vishal,
>> >> >> >> >> >>
>> >> >> >> >> >> These are DEBUG level messages and are from the block cache,
>> >> >> >> >> >> there
>> >> >> is
>> >> >> >> >> >> nothing wrong with that. Can you explain more what you do
>> and
>> >> >> >> >> >> see?
>> >> >> >> >> >>
>> >> >> >> >> >> Lars
>> >> >> >> >> >>
>> >> >> >> >> >> On Wed, Feb 16, 2011 at 4:24 AM, Vishal Kapoor
>> >> >> >> >> >> <vi...@gmail.com> wrote:
>> >> >> >> >> >> > all was working fine and suddenly I see a lot of logs like
>> >> >> >> >> >> > below
>> >> >> >> >> >> >
>> >> >> >> >> >> > 2011-02-15 22:19:04,023 DEBUG
>> >> >> >> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block
>> cache
>> >> >> >> >> >> > LRU
>> >> >> >> >> >> > eviction
>> >> >> >> >> >> > started; Attempting to free 19.88 MB of total=168.64 MB
>> >> >> >> >> >> > 2011-02-15 22:19:04,025 DEBUG
>> >> >> >> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block
>> cache
>> >> >> >> >> >> > LRU
>> >> >> >> >> >> > eviction
>> >> >> >> >> >> > completed; freed=19.91 MB, total=148.73 MB, single=74.47
>> MB,
>> >> >> >> >> >> > multi=92.37
>> >> >> >> >> >> MB,
>> >> >> >> >> >> > memory=166.09 KB
>> >> >> >> >> >> > 2011-02-15 22:19:11,207 DEBUG
>> >> >> >> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block
>> cache
>> >> >> >> >> >> > LRU
>> >> >> >> >> >> > eviction
>> >> >> >> >> >> > started; Attempting to free 19.88 MB of total=168.64 MB
>> >> >> >> >> >> > 2011-02-15 22:19:11,444 DEBUG
>> >> >> >> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block
>> cache
>> >> >> >> >> >> > LRU
>> >> >> >> >> >> > eviction
>> >> >> >> >> >> > completed; freed=19.93 MB, total=149.09 MB, single=73.91
>> MB,
>> >> >> >> >> >> > multi=93.32
>> >> >> >> >> >> MB,
>> >> >> >> >> >> > memory=166.09 KB
>> >> >> >> >> >> > 2011-02-15 22:19:21,494 DEBUG
>> >> >> >> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block
>> cache
>> >> >> >> >> >> > LRU
>> >> >> >> >> >> > eviction
>> >> >> >> >> >> > started; Attempting to free 19.87 MB of total=168.62 MB
>> >> >> >> >> >> > 2011-02-15 22:19:21,760 DEBUG
>> >> >> >> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block
>> cache
>> >> >> >> >> >> > LRU
>> >> >> >> >> >> > eviction
>> >> >> >> >> >> > completed; freed=19.91 MB, total=148.84 MB, single=74.22
>> MB,
>> >> >> >> >> >> > multi=92.73
>> >> >> >> >> >> MB,
>> >> >> >> >> >> > memory=166.09 KB
>> >> >> >> >> >> > 2011-02-15 22:19:39,838 DEBUG
>> >> >> >> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block
>> cache
>> >> >> >> >> >> > LRU
>> >> >> >> >> >> > eviction
>> >> >> >> >> >> > started; Attempting to free 19.87 MB of total=168.62 MB
>> >> >> >> >> >> > 2011-02-15 22:19:39,852 DEBUG
>> >> >> >> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block
>> cache
>> >> >> >> >> >> > LRU
>> >> >> >> >> >> > eviction
>> >> >> >> >> >> > completed; freed=19.91 MB, total=148.71 MB, single=75.35
>> MB,
>> >> >> >> >> >> > multi=91.48
>> >> >> >> >> >> MB,
>> >> >> >> >> >> > memory=166.09 KB
>> >> >> >> >> >> > 2011-02-15 22:19:49,768 DEBUG
>> >> >> >> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block
>> cache
>> >> >> >> >> >> > LRU
>> >> >> >> >> >> > eviction
>> >> >> >> >> >> > started; Attempting to free 19.87 MB of total=168.62 MB
>> >> >> >> >> >> > 2011-02-15 22:19:49,770 DEBUG
>> >> >> >> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block
>> cache
>> >> >> >> >> >> > LRU
>> >> >> >> >> >> > eviction
>> >> >> >> >> >> > completed; freed=19.91 MB, total=148.71 MB, single=76.48
>> MB,
>> >> >> >> >> >> > multi=90.35
>> >> >> >> >> >> MB,
>> >> >> >> >> >> > memory=166.09 KB
>> >> >> >> >> >> >
>> >> >> >> >> >> >
>> >> >> >> >> >> > I haven't changed anything including the table
>> definitions.
>> >> >> >> >> >> > please let me know where to look...
>> >> >> >> >> >> >
>> >> >> >> >> >> > thanks,
>> >> >> >> >> >> > Vishal Kapoor
>> >> >> >> >> >> >
>> >> >> >> >> >>
>> >> >> >> >> >
>> >> >> >> >
>> >> >> >> >
>> >> >> >>
>> >> >> >
>> >> >>
>> >> >
>> >
>> >
>>
>

Re: Hbase inserts very slow

Posted by Vishal Kapoor <vi...@gmail.com>.
J-D,

setCacheBlocks(false) improved the variance in the time it takes for my
explode; it now consistently reports similar times.

I have also tested inserting the master data into the other two families with
separate map/reduce jobs, and I like the results so far.
Since I am still on pseudo-distributed, it freaks out my Intel i5, but my gut
says anything processor-intensive will work much better on the cluster; I
could be wrong!

I know I should ask this in a separate email thread so that others can
benefit from it as well, but can I add some sort of filter to the scan so
that I only see the row IDs? I don't care about the data in the table; to
attach master data to this table I only care about the composite row key.


thanks for all your help...

Vishal

On Thu, Feb 17, 2011 at 12:44 PM, Jean-Daniel Cryans <jd...@apache.org>wrote:

> Bummer, well have you tried the other thing about setCacheBlocks? This
> at least would get rid of the block cache churning and give us a
> better picture of what's going on in the logs.
>
> J-D
>
> On Thu, Feb 17, 2011 at 9:20 AM, Vishal Kapoor
> <vi...@gmail.com> wrote:
> > J-D,
> > I do not see any significant improvement on combining the data into a
> single
> > family,
> > maybe its because my data in the families are spread as 1:1:1, (single
> > dimension for all CF )
> > the next iteration I am planning is to write to these three families from
> > three different map/reduce jobs.
> > will keep all posted on my findings...
> > Vishal
> > On Wed, Feb 16, 2011 at 8:00 PM, Jean-Daniel Cryans <jdcryans@apache.org
> >
> > wrote:
> >>
> >> It's best to have different families for data of different nature and
> >> when you usually don't read/write them together. For sure it shouldn't
> >> slow you down as much as it does (because of HBASE-3149), but given
> >> the current situation it's hard to recommend multiple families.
> >>
> >> J-D
> >>
> >> On Wed, Feb 16, 2011 at 4:32 PM, Vishal Kapoor
> >> <vi...@gmail.com> wrote:
> >> > thanks J-D. for all your help, I will combine the three families and
> >> > re-baseline the performance.
> >> > but I was just wondering if I was using the family as they were
> suppose
> >> > to
> >> > be used or not.
> >> > the data in these three families are different, one of them is live
> feed
> >> > and
> >> > the two other two are master (static kind) data and it made a lot of
> >> > logical
> >> > sense to separate them in different families.
> >> > maybe if updating a family in a different map/reduce operation works
> >> > fine
> >> > then I will go that route.
> >> > but the critical to quality factor here is speed for inserts and I am
> >> > going
> >> > to definitely give a try to single family approach.
> >> >
> >> > Vishal
> >> >
> >> >
> >> > On Wed, Feb 16, 2011 at 6:53 PM, Jean-Daniel Cryans
> >> > <jd...@apache.org>wrote:
> >> >
> >> >> I don't understand... is having the same qualifier a hard
> requirement?
> >> >> Worst case you could have a prefix.
> >> >>
> >> >> J-D
> >> >>
> >> >> On Wed, Feb 16, 2011 at 3:29 PM, Vishal Kapoor
> >> >> <vi...@gmail.com> wrote:
> >> >> > J-D,
> >> >> > I also should mention that my data distribution in the three
> families
> >> >> > are
> >> >> > 1:1:1
> >> >> > I have three families so that I can have same qualifiers in them.
> and
> >> >> also
> >> >> > the data in those families are LIVE:MasterA:MasterB
> >> >> >
> >> >> > Vishal
> >> >> >
> >> >> > On Wed, Feb 16, 2011 at 6:22 PM, Jean-Daniel Cryans
> >> >> > <jdcryans@apache.org
> >> >> >wrote:
> >> >> >
> >> >> >> Very often there's no need for more than 1 family, I would suggest
> >> >> >> you
> >> >> >> explore that possibility first.
> >> >> >>
> >> >> >> J-D
> >> >> >>
> >> >> >> On Wed, Feb 16, 2011 at 3:13 PM, Vishal Kapoor
> >> >> >> <vi...@gmail.com> wrote:
> >> >> >> > does that mean I am only left with the choice of writing to the
> >> >> >> > three
> >> >> >> > families in three different map jobs?
> >> >> >> > or can I do it any other way?
> >> >> >> > thanks,
> >> >> >> > Vishal
> >> >> >> >
> >> >> >> > On Wed, Feb 16, 2011 at 12:56 PM, Jean-Daniel Cryans <
> >> >> >> jdcryans@apache.org>
> >> >> >> > wrote:
> >> >> >> >>
> >> >> >> >> First, loading into 3 families is currently a bad idea and is
> >> >> >> >> bound
> >> >> to
> >> >> >> >> be inefficient, here's the reason why:
> >> >> >> >> https://issues.apache.org/jira/browse/HBASE-3149
> >> >> >> >>
> >> >> >> >> Those log lines mean that your scanning of the first table is
> >> >> >> >> generating a log of block cache churn. When setting up the Map,
> >> >> >> >> set
> >> >> >> >> your scanner to setCacheBlocks(false) before passing it to
> >> >> >> >> TableMapReduceUtil.initTableMapperJob
> >> >> >> >>
> >> >> >> >> Finally, you may want to give more memory to the region server.
> >> >> >> >>
> >> >> >> >> J-D
> >> >> >> >>
> >> >> >> >> On Wed, Feb 16, 2011 at 7:35 AM, Vishal Kapoor
> >> >> >> >> <vi...@gmail.com> wrote:
> >> >> >> >> > Lars,
> >> >> >> >> >
> >> >> >> >> > I am still working on pseudo distributed.
> >> >> >> >> > hadoop-0.20.2+737/
> >> >> >> >> > and hbase-0.90.0 with the hadoop jar from the hadoop install.
> >> >> >> >> >
> >> >> >> >> > I have a LIVE_RAW_TABLE table, which gets values from a live
> >> >> >> >> > system
> >> >> >> >> > I go through each row of that table and get the row ids of
> two
> >> >> >> reference
> >> >> >> >> > tables from it.
> >> >> >> >> > TABLE_A and TABLE_B, then I explode this to a new table
> >> >> >> >> > LIVE_TABLE
> >> >> >> >> > I use
> >> >> >> >> > TableMapReduceUtil.initTableReducerJob("LIVE_TABLE", null,
> >> >> >> >> > job);
> >> >> >> >> >
> >> >> >> >> >
> >> >> >> >> > LIVE_TABLE has three families, LIVE, A, B and the row id is a
> >> >> >> composite
> >> >> >> >> > key
> >> >> >> >> > reverseTimeStamp/rowidA/rowIdB
> >> >> >> >> > after that a run a bunch of map reduce to consolidate the
> data,
> >> >> >> >> > to start with I have around 15000 rows in LIVE_RAW_TABLE.
> >> >> >> >> >
> >> >> >> >> > when I start with my job, i see it running quite well till i
> am
> >> >> almost
> >> >> >> >> > done
> >> >> >> >> > with 5000 rows
> >> >> >> >> > then it starts printing the message in the logs, which I use
> to
> >> >> >> >> > not
> >> >> >> see
> >> >> >> >> > before.
> >> >> >> >> > the job use to run for around 900 sec ( I have a lot of data
> >> >> parsing
> >> >> >> >> > while
> >> >> >> >> > exploding )
> >> >> >> >> > 15000 rows from LIVE_RAW_TABLE explodes to around 500,000
> rows
> >> >> >> >> > in
> >> >> >> >> > LIVE_TABLE.
> >> >> >> >> >
> >> >> >> >> > after those debug messages, the job runs for around 2500 sec,
> >> >> >> >> > I have not changed anything, including the table design.
> >> >> >> >> >
> >> >> >> >> > here is my table description.
> >> >> >> >> >
> >> >> >> >> > {NAME => 'LIVE_TABLE', FAMILIES => [{NAME => 'LIVE',
> >> >> >> >> > BLOOMFILTER =>
> >> >> >> >> > 'NONE',
> >> >> >> >> > REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION =>
> >> >> >> >> > 'NONE',
> >> >> TTL
> >> >> >> =>
> >> >> >> >> > '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false',
> >> >> BLOCKCACHE
> >> >> >> =>
> >> >> >> >> > 'true'}, {NAME => 'A', BLOOMFILTER => 'NONE',
> REPLICATION_SCOPE
> >> >> >> >> > =>
> >> >> >> '0',
> >> >> >> >> > VERSIONS => '1', COMPRESSION => 'NONE', TTL => '2147483647',
> >> >> BLOCKSIZE
> >> >> >> >> > =>
> >> >> >> >> > '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME
> =>
> >> >> >> >> > 'B',
> >> >> >> >> > BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', VERSIONS =>
> >> >> >> >> > '1',
> >> >> >> >> > COMPRESSION => 'NONE', TTL => '2147483647', BLOCKSIZE =>
> >> >> >> >> > '65536',
> >> >> >> >> > IN_MEMORY
> >> >> >> >> > => 'false', BLOCKCACHE => 'true'}]}
> >> >> >> >> >
> >> >> >> >> > thanks for all your help.
> >> >> >> >> >
> >> >> >> >> > Vishal
> >> >> >> >> >
> >> >> >> >> > On Wed, Feb 16, 2011 at 4:26 AM, Lars George <
> >> >> lars.george@gmail.com>
> >> >> >> >> > wrote:
> >> >> >> >> >
> >> >> >> >> >> Hi Vishal,
> >> >> >> >> >>
> >> >> >> >> >> These are DEBUG level messages and are from the block cache,
> >> >> >> >> >> there
> >> >> is
> >> >> >> >> >> nothing wrong with that. Can you explain more what you do
> and
> >> >> >> >> >> see?
> >> >> >> >> >>
> >> >> >> >> >> Lars
> >> >> >> >> >>
> >> >> >> >> >> On Wed, Feb 16, 2011 at 4:24 AM, Vishal Kapoor
> >> >> >> >> >> <vi...@gmail.com> wrote:
> >> >> >> >> >> > all was working fine and suddenly I see a lot of logs like
> >> >> >> >> >> > below
> >> >> >> >> >> >
> >> >> >> >> >> > 2011-02-15 22:19:04,023 DEBUG
> >> >> >> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block
> cache
> >> >> >> >> >> > LRU
> >> >> >> >> >> > eviction
> >> >> >> >> >> > started; Attempting to free 19.88 MB of total=168.64 MB
> >> >> >> >> >> > 2011-02-15 22:19:04,025 DEBUG
> >> >> >> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block
> cache
> >> >> >> >> >> > LRU
> >> >> >> >> >> > eviction
> >> >> >> >> >> > completed; freed=19.91 MB, total=148.73 MB, single=74.47
> MB,
> >> >> >> >> >> > multi=92.37
> >> >> >> >> >> MB,
> >> >> >> >> >> > memory=166.09 KB
> >> >> >> >> >> > 2011-02-15 22:19:11,207 DEBUG
> >> >> >> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block
> cache
> >> >> >> >> >> > LRU
> >> >> >> >> >> > eviction
> >> >> >> >> >> > started; Attempting to free 19.88 MB of total=168.64 MB
> >> >> >> >> >> > 2011-02-15 22:19:11,444 DEBUG
> >> >> >> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block
> cache
> >> >> >> >> >> > LRU
> >> >> >> >> >> > eviction
> >> >> >> >> >> > completed; freed=19.93 MB, total=149.09 MB, single=73.91
> MB,
> >> >> >> >> >> > multi=93.32
> >> >> >> >> >> MB,
> >> >> >> >> >> > memory=166.09 KB
> >> >> >> >> >> > 2011-02-15 22:19:21,494 DEBUG
> >> >> >> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block
> cache
> >> >> >> >> >> > LRU
> >> >> >> >> >> > eviction
> >> >> >> >> >> > started; Attempting to free 19.87 MB of total=168.62 MB
> >> >> >> >> >> > 2011-02-15 22:19:21,760 DEBUG
> >> >> >> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block
> cache
> >> >> >> >> >> > LRU
> >> >> >> >> >> > eviction
> >> >> >> >> >> > completed; freed=19.91 MB, total=148.84 MB, single=74.22
> MB,
> >> >> >> >> >> > multi=92.73
> >> >> >> >> >> MB,
> >> >> >> >> >> > memory=166.09 KB
> >> >> >> >> >> > 2011-02-15 22:19:39,838 DEBUG
> >> >> >> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block
> cache
> >> >> >> >> >> > LRU
> >> >> >> >> >> > eviction
> >> >> >> >> >> > started; Attempting to free 19.87 MB of total=168.62 MB
> >> >> >> >> >> > 2011-02-15 22:19:39,852 DEBUG
> >> >> >> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block
> cache
> >> >> >> >> >> > LRU
> >> >> >> >> >> > eviction
> >> >> >> >> >> > completed; freed=19.91 MB, total=148.71 MB, single=75.35
> MB,
> >> >> >> >> >> > multi=91.48
> >> >> >> >> >> MB,
> >> >> >> >> >> > memory=166.09 KB
> >> >> >> >> >> > 2011-02-15 22:19:49,768 DEBUG
> >> >> >> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block
> cache
> >> >> >> >> >> > LRU
> >> >> >> >> >> > eviction
> >> >> >> >> >> > started; Attempting to free 19.87 MB of total=168.62 MB
> >> >> >> >> >> > 2011-02-15 22:19:49,770 DEBUG
> >> >> >> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block
> cache
> >> >> >> >> >> > LRU
> >> >> >> >> >> > eviction
> >> >> >> >> >> > completed; freed=19.91 MB, total=148.71 MB, single=76.48
> MB,
> >> >> >> >> >> > multi=90.35
> >> >> >> >> >> MB,
> >> >> >> >> >> > memory=166.09 KB
> >> >> >> >> >> >
> >> >> >> >> >> >
> >> >> >> >> >> > I haven't changed anything including the table
> definitions.
> >> >> >> >> >> > please let me know where to look...
> >> >> >> >> >> >
> >> >> >> >> >> > thanks,
> >> >> >> >> >> > Vishal Kapoor
> >> >> >> >> >> >
> >> >> >> >> >>
> >> >> >> >> >
> >> >> >> >
> >> >> >> >
> >> >> >>
> >> >> >
> >> >>
> >> >
> >
> >
>

Re: Hbase inserts very slow

Posted by Jean-Daniel Cryans <jd...@apache.org>.
Bummer. Well, have you tried the other thing about setCacheBlocks? This
at least would get rid of the block cache churning and give us a
better picture of what's going on in the logs.
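
For reference, the job setup would look roughly like this (a sketch only;
ExplodeJob and ExplodeMapper are stand-ins for whatever your classes actually
do):

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.mapreduce.Job;

public class ExplodeJob {

  // Stand-in mapper: parse the source row, look up TABLE_A/TABLE_B, emit exploded Puts.
  static class ExplodeMapper extends TableMapper<ImmutableBytesWritable, Put> {
    @Override
    protected void map(ImmutableBytesWritable row, Result value, Context context)
        throws IOException, InterruptedException {
      // ... parsing/exploding logic goes here
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Job job = new Job(conf, "explode LIVE_RAW_TABLE");
    job.setJarByClass(ExplodeJob.class);

    Scan scan = new Scan();
    scan.setCacheBlocks(false); // a full MR scan shouldn't evict hot blocks from the cache
    scan.setCaching(500);       // rows fetched per RPC

    TableMapReduceUtil.initTableMapperJob("LIVE_RAW_TABLE", scan, ExplodeMapper.class,
        ImmutableBytesWritable.class, Put.class, job);
    // null reducer: the identity table reducer writes the Puts straight into LIVE_TABLE
    TableMapReduceUtil.initTableReducerJob("LIVE_TABLE", null, job);

    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}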

J-D

On Thu, Feb 17, 2011 at 9:20 AM, Vishal Kapoor
<vi...@gmail.com> wrote:
> J-D,
> I do not see any significant improvement on combining the data into a single
> family,
> maybe its because my data in the families are spread as 1:1:1, (single
> dimension for all CF )
> the next iteration I am planning is to write to these three families from
> three different map/reduce jobs.
> will keep all posted on my findings...
> Vishal
> On Wed, Feb 16, 2011 at 8:00 PM, Jean-Daniel Cryans <jd...@apache.org>
> wrote:
>>
>> It's best to have different families for data of different nature and
>> when you usually don't read/write them together. For sure it shouldn't
>> slow you down as much as it does (because of HBASE-3149), but given
>> the current situation it's hard to recommend multiple families.
>>
>> J-D
>>
>> On Wed, Feb 16, 2011 at 4:32 PM, Vishal Kapoor
>> <vi...@gmail.com> wrote:
>> > thanks J-D. for all your help, I will combine the three families and
>> > re-baseline the performance.
>> > but I was just wondering if I was using the family as they were suppose
>> > to
>> > be used or not.
>> > the data in these three families are different, one of them is live feed
>> > and
>> > the two other two are master (static kind) data and it made a lot of
>> > logical
>> > sense to separate them in different families.
>> > maybe if updating a family in a different map/reduce operation works
>> > fine
>> > then I will go that route.
>> > but the critical to quality factor here is speed for inserts and I am
>> > going
>> > to definitely give a try to single family approach.
>> >
>> > Vishal
>> >
>> >
>> > On Wed, Feb 16, 2011 at 6:53 PM, Jean-Daniel Cryans
>> > <jd...@apache.org>wrote:
>> >
>> >> I don't understand... is having the same qualifier a hard requirement?
>> >> Worst case you could have a prefix.
>> >>
>> >> J-D
>> >>
>> >> On Wed, Feb 16, 2011 at 3:29 PM, Vishal Kapoor
>> >> <vi...@gmail.com> wrote:
>> >> > J-D,
>> >> > I also should mention that my data distribution in the three families
>> >> > are
>> >> > 1:1:1
>> >> > I have three families so that I can have same qualifiers in them. and
>> >> also
>> >> > the data in those families are LIVE:MasterA:MasterB
>> >> >
>> >> > Vishal
>> >> >
>> >> > On Wed, Feb 16, 2011 at 6:22 PM, Jean-Daniel Cryans
>> >> > <jdcryans@apache.org
>> >> >wrote:
>> >> >
>> >> >> Very often there's no need for more than 1 family, I would suggest
>> >> >> you
>> >> >> explore that possibility first.
>> >> >>
>> >> >> J-D
>> >> >>
>> >> >> On Wed, Feb 16, 2011 at 3:13 PM, Vishal Kapoor
>> >> >> <vi...@gmail.com> wrote:
>> >> >> > does that mean I am only left with the choice of writing to the
>> >> >> > three
>> >> >> > families in three different map jobs?
>> >> >> > or can I do it any other way?
>> >> >> > thanks,
>> >> >> > Vishal
>> >> >> >
>> >> >> > On Wed, Feb 16, 2011 at 12:56 PM, Jean-Daniel Cryans <
>> >> >> jdcryans@apache.org>
>> >> >> > wrote:
>> >> >> >>
>> >> >> >> First, loading into 3 families is currently a bad idea and is
>> >> >> >> bound
>> >> to
>> >> >> >> be inefficient, here's the reason why:
>> >> >> >> https://issues.apache.org/jira/browse/HBASE-3149
>> >> >> >>
>> >> >> >> Those log lines mean that your scanning of the first table is
>> >> >> >> generating a log of block cache churn. When setting up the Map,
>> >> >> >> set
>> >> >> >> your scanner to setCacheBlocks(false) before passing it to
>> >> >> >> TableMapReduceUtil.initTableMapperJob
>> >> >> >>
>> >> >> >> Finally, you may want to give more memory to the region server.
>> >> >> >>
>> >> >> >> J-D
>> >> >> >>
>> >> >> >> On Wed, Feb 16, 2011 at 7:35 AM, Vishal Kapoor
>> >> >> >> <vi...@gmail.com> wrote:
>> >> >> >> > Lars,
>> >> >> >> >
>> >> >> >> > I am still working on pseudo distributed.
>> >> >> >> > hadoop-0.20.2+737/
>> >> >> >> > and hbase-0.90.0 with the hadoop jar from the hadoop install.
>> >> >> >> >
>> >> >> >> > I have a LIVE_RAW_TABLE table, which gets values from a live
>> >> >> >> > system
>> >> >> >> > I go through each row of that table and get the row ids of two
>> >> >> reference
>> >> >> >> > tables from it.
>> >> >> >> > TABLE_A and TABLE_B, then I explode this to a new table
>> >> >> >> > LIVE_TABLE
>> >> >> >> > I use
>> >> >> >> > TableMapReduceUtil.initTableReducerJob("LIVE_TABLE", null,
>> >> >> >> > job);
>> >> >> >> >
>> >> >> >> >
>> >> >> >> > LIVE_TABLE has three families, LIVE, A, B and the row id is a
>> >> >> composite
>> >> >> >> > key
>> >> >> >> > reverseTimeStamp/rowidA/rowIdB
>> >> >> >> > after that a run a bunch of map reduce to consolidate the data,
>> >> >> >> > to start with I have around 15000 rows in LIVE_RAW_TABLE.
>> >> >> >> >
>> >> >> >> > when I start with my job, i see it running quite well till i am
>> >> almost
>> >> >> >> > done
>> >> >> >> > with 5000 rows
>> >> >> >> > then it starts printing the message in the logs, which I use to
>> >> >> >> > not
>> >> >> see
>> >> >> >> > before.
>> >> >> >> > the job use to run for around 900 sec ( I have a lot of data
>> >> parsing
>> >> >> >> > while
>> >> >> >> > exploding )
>> >> >> >> > 15000 rows from LIVE_RAW_TABLE explodes to around 500,000 rows
>> >> >> >> > in
>> >> >> >> > LIVE_TABLE.
>> >> >> >> >
>> >> >> >> > after those debug messages, the job runs for around 2500 sec,
>> >> >> >> > I have not changed anything, including the table design.
>> >> >> >> >
>> >> >> >> > here is my table description.
>> >> >> >> >
>> >> >> >> > {NAME => 'LIVE_TABLE', FAMILIES => [{NAME => 'LIVE',
>> >> >> >> > BLOOMFILTER =>
>> >> >> >> > 'NONE',
>> >> >> >> > REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION =>
>> >> >> >> > 'NONE',
>> >> TTL
>> >> >> =>
>> >> >> >> > '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false',
>> >> BLOCKCACHE
>> >> >> =>
>> >> >> >> > 'true'}, {NAME => 'A', BLOOMFILTER => 'NONE', REPLICATION_SCOPE
>> >> >> >> > =>
>> >> >> '0',
>> >> >> >> > VERSIONS => '1', COMPRESSION => 'NONE', TTL => '2147483647',
>> >> BLOCKSIZE
>> >> >> >> > =>
>> >> >> >> > '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME =>
>> >> >> >> > 'B',
>> >> >> >> > BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', VERSIONS =>
>> >> >> >> > '1',
>> >> >> >> > COMPRESSION => 'NONE', TTL => '2147483647', BLOCKSIZE =>
>> >> >> >> > '65536',
>> >> >> >> > IN_MEMORY
>> >> >> >> > => 'false', BLOCKCACHE => 'true'}]}
>> >> >> >> >
>> >> >> >> > thanks for all your help.
>> >> >> >> >
>> >> >> >> > Vishal
>> >> >> >> >
>> >> >> >> > On Wed, Feb 16, 2011 at 4:26 AM, Lars George <
>> >> lars.george@gmail.com>
>> >> >> >> > wrote:
>> >> >> >> >
>> >> >> >> >> Hi Vishal,
>> >> >> >> >>
>> >> >> >> >> These are DEBUG level messages and are from the block cache,
>> >> >> >> >> there
>> >> is
>> >> >> >> >> nothing wrong with that. Can you explain more what you do and
>> >> >> >> >> see?
>> >> >> >> >>
>> >> >> >> >> Lars
>> >> >> >> >>
>> >> >> >> >> On Wed, Feb 16, 2011 at 4:24 AM, Vishal Kapoor
>> >> >> >> >> <vi...@gmail.com> wrote:
>> >> >> >> >> > all was working fine and suddenly I see a lot of logs like
>> >> >> >> >> > below
>> >> >> >> >> >
>> >> >> >> >> > 2011-02-15 22:19:04,023 DEBUG
>> >> >> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache
>> >> >> >> >> > LRU
>> >> >> >> >> > eviction
>> >> >> >> >> > started; Attempting to free 19.88 MB of total=168.64 MB
>> >> >> >> >> > 2011-02-15 22:19:04,025 DEBUG
>> >> >> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache
>> >> >> >> >> > LRU
>> >> >> >> >> > eviction
>> >> >> >> >> > completed; freed=19.91 MB, total=148.73 MB, single=74.47 MB,
>> >> >> >> >> > multi=92.37
>> >> >> >> >> MB,
>> >> >> >> >> > memory=166.09 KB
>> >> >> >> >> > 2011-02-15 22:19:11,207 DEBUG
>> >> >> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache
>> >> >> >> >> > LRU
>> >> >> >> >> > eviction
>> >> >> >> >> > started; Attempting to free 19.88 MB of total=168.64 MB
>> >> >> >> >> > 2011-02-15 22:19:11,444 DEBUG
>> >> >> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache
>> >> >> >> >> > LRU
>> >> >> >> >> > eviction
>> >> >> >> >> > completed; freed=19.93 MB, total=149.09 MB, single=73.91 MB,
>> >> >> >> >> > multi=93.32
>> >> >> >> >> MB,
>> >> >> >> >> > memory=166.09 KB
>> >> >> >> >> > 2011-02-15 22:19:21,494 DEBUG
>> >> >> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache
>> >> >> >> >> > LRU
>> >> >> >> >> > eviction
>> >> >> >> >> > started; Attempting to free 19.87 MB of total=168.62 MB
>> >> >> >> >> > 2011-02-15 22:19:21,760 DEBUG
>> >> >> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache
>> >> >> >> >> > LRU
>> >> >> >> >> > eviction
>> >> >> >> >> > completed; freed=19.91 MB, total=148.84 MB, single=74.22 MB,
>> >> >> >> >> > multi=92.73
>> >> >> >> >> MB,
>> >> >> >> >> > memory=166.09 KB
>> >> >> >> >> > 2011-02-15 22:19:39,838 DEBUG
>> >> >> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache
>> >> >> >> >> > LRU
>> >> >> >> >> > eviction
>> >> >> >> >> > started; Attempting to free 19.87 MB of total=168.62 MB
>> >> >> >> >> > 2011-02-15 22:19:39,852 DEBUG
>> >> >> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache
>> >> >> >> >> > LRU
>> >> >> >> >> > eviction
>> >> >> >> >> > completed; freed=19.91 MB, total=148.71 MB, single=75.35 MB,
>> >> >> >> >> > multi=91.48
>> >> >> >> >> MB,
>> >> >> >> >> > memory=166.09 KB
>> >> >> >> >> > 2011-02-15 22:19:49,768 DEBUG
>> >> >> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache
>> >> >> >> >> > LRU
>> >> >> >> >> > eviction
>> >> >> >> >> > started; Attempting to free 19.87 MB of total=168.62 MB
>> >> >> >> >> > 2011-02-15 22:19:49,770 DEBUG
>> >> >> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache
>> >> >> >> >> > LRU
>> >> >> >> >> > eviction
>> >> >> >> >> > completed; freed=19.91 MB, total=148.71 MB, single=76.48 MB,
>> >> >> >> >> > multi=90.35
>> >> >> >> >> MB,
>> >> >> >> >> > memory=166.09 KB
>> >> >> >> >> >
>> >> >> >> >> >
>> >> >> >> >> > I haven't changed anything including the table definitions.
>> >> >> >> >> > please let me know where to look...
>> >> >> >> >> >
>> >> >> >> >> > thanks,
>> >> >> >> >> > Vishal Kapoor
>> >> >> >> >> >
>> >> >> >> >>
>> >> >> >> >
>> >> >> >
>> >> >> >
>> >> >>
>> >> >
>> >>
>> >
>
>

Re: Hbase inserts very slow

Posted by Vishal Kapoor <vi...@gmail.com>.
J-D,

I do not see any significant improvement from combining the data into a single
family; maybe it's because my data in the families is spread as 1:1:1 (a single
dimension for all CFs).
The next iteration I am planning is to write to these three families from
three different map/reduce jobs.
Will keep all posted on my findings...

Vishal

On Wed, Feb 16, 2011 at 8:00 PM, Jean-Daniel Cryans <jd...@apache.org>wrote:

> It's best to have different families for data of different nature and
> when you usually don't read/write them together. For sure it shouldn't
> slow you down as much as it does (because of HBASE-3149), but given
> the current situation it's hard to recommend multiple families.
>
> J-D
>
> On Wed, Feb 16, 2011 at 4:32 PM, Vishal Kapoor
> <vi...@gmail.com> wrote:
> > thanks J-D. for all your help, I will combine the three families and
> > re-baseline the performance.
> > but I was just wondering if I was using the family as they were suppose
> to
> > be used or not.
> > the data in these three families are different, one of them is live feed
> and
> > the two other two are master (static kind) data and it made a lot of
> logical
> > sense to separate them in different families.
> > maybe if updating a family in a different map/reduce operation works fine
> > then I will go that route.
> > but the critical to quality factor here is speed for inserts and I am
> going
> > to definitely give a try to single family approach.
> >
> > Vishal
> >
> >
> > On Wed, Feb 16, 2011 at 6:53 PM, Jean-Daniel Cryans <jdcryans@apache.org
> >wrote:
> >
> >> I don't understand... is having the same qualifier a hard requirement?
> >> Worst case you could have a prefix.
> >>
> >> J-D
> >>
> >> On Wed, Feb 16, 2011 at 3:29 PM, Vishal Kapoor
> >> <vi...@gmail.com> wrote:
> >> > J-D,
> >> > I also should mention that my data distribution in the three families
> are
> >> > 1:1:1
> >> > I have three families so that I can have same qualifiers in them. and
> >> also
> >> > the data in those families are LIVE:MasterA:MasterB
> >> >
> >> > Vishal
> >> >
> >> > On Wed, Feb 16, 2011 at 6:22 PM, Jean-Daniel Cryans <
> jdcryans@apache.org
> >> >wrote:
> >> >
> >> >> Very often there's no need for more than 1 family, I would suggest
> you
> >> >> explore that possibility first.
> >> >>
> >> >> J-D
> >> >>
> >> >> On Wed, Feb 16, 2011 at 3:13 PM, Vishal Kapoor
> >> >> <vi...@gmail.com> wrote:
> >> >> > does that mean I am only left with the choice of writing to the
> three
> >> >> > families in three different map jobs?
> >> >> > or can I do it any other way?
> >> >> > thanks,
> >> >> > Vishal
> >> >> >
> >> >> > On Wed, Feb 16, 2011 at 12:56 PM, Jean-Daniel Cryans <
> >> >> jdcryans@apache.org>
> >> >> > wrote:
> >> >> >>
> >> >> >> First, loading into 3 families is currently a bad idea and is
> bound
> >> to
> >> >> >> be inefficient, here's the reason why:
> >> >> >> https://issues.apache.org/jira/browse/HBASE-3149
> >> >> >>
> >> >> >> Those log lines mean that your scanning of the first table is
> >> >> >> generating a log of block cache churn. When setting up the Map,
> set
> >> >> >> your scanner to setCacheBlocks(false) before passing it to
> >> >> >> TableMapReduceUtil.initTableMapperJob
> >> >> >>
> >> >> >> Finally, you may want to give more memory to the region server.
> >> >> >>
> >> >> >> J-D
> >> >> >>
> >> >> >> On Wed, Feb 16, 2011 at 7:35 AM, Vishal Kapoor
> >> >> >> <vi...@gmail.com> wrote:
> >> >> >> > Lars,
> >> >> >> >
> >> >> >> > I am still working on pseudo distributed.
> >> >> >> > hadoop-0.20.2+737/
> >> >> >> > and hbase-0.90.0 with the hadoop jar from the hadoop install.
> >> >> >> >
> >> >> >> > I have a LIVE_RAW_TABLE table, which gets values from a live
> system
> >> >> >> > I go through each row of that table and get the row ids of two
> >> >> reference
> >> >> >> > tables from it.
> >> >> >> > TABLE_A and TABLE_B, then I explode this to a new table
> LIVE_TABLE
> >> >> >> > I use
> >> >> >> > TableMapReduceUtil.initTableReducerJob("LIVE_TABLE", null, job);
> >> >> >> >
> >> >> >> >
> >> >> >> > LIVE_TABLE has three families, LIVE, A, B and the row id is a
> >> >> composite
> >> >> >> > key
> >> >> >> > reverseTimeStamp/rowidA/rowIdB
> >> >> >> > after that a run a bunch of map reduce to consolidate the data,
> >> >> >> > to start with I have around 15000 rows in LIVE_RAW_TABLE.
> >> >> >> >
> >> >> >> > when I start with my job, i see it running quite well till i am
> >> almost
> >> >> >> > done
> >> >> >> > with 5000 rows
> >> >> >> > then it starts printing the message in the logs, which I use to
> not
> >> >> see
> >> >> >> > before.
> >> >> >> > the job use to run for around 900 sec ( I have a lot of data
> >> parsing
> >> >> >> > while
> >> >> >> > exploding )
> >> >> >> > 15000 rows from LIVE_RAW_TABLE explodes to around 500,000 rows
> in
> >> >> >> > LIVE_TABLE.
> >> >> >> >
> >> >> >> > after those debug messages, the job runs for around 2500 sec,
> >> >> >> > I have not changed anything, including the table design.
> >> >> >> >
> >> >> >> > here is my table description.
> >> >> >> >
> >> >> >> > {NAME => 'LIVE_TABLE', FAMILIES => [{NAME => 'LIVE', BLOOMFILTER
> =>
> >> >> >> > 'NONE',
> >> >> >> > REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION =>
> 'NONE',
> >> TTL
> >> >> =>
> >> >> >> > '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false',
> >> BLOCKCACHE
> >> >> =>
> >> >> >> > 'true'}, {NAME => 'A', BLOOMFILTER => 'NONE', REPLICATION_SCOPE
> =>
> >> >> '0',
> >> >> >> > VERSIONS => '1', COMPRESSION => 'NONE', TTL => '2147483647',
> >> BLOCKSIZE
> >> >> >> > =>
> >> >> >> > '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME =>
> 'B',
> >> >> >> > BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', VERSIONS =>
> '1',
> >> >> >> > COMPRESSION => 'NONE', TTL => '2147483647', BLOCKSIZE =>
> '65536',
> >> >> >> > IN_MEMORY
> >> >> >> > => 'false', BLOCKCACHE => 'true'}]}
> >> >> >> >
> >> >> >> > thanks for all your help.
> >> >> >> >
> >> >> >> > Vishal
> >> >> >> >
> >> >> >> > On Wed, Feb 16, 2011 at 4:26 AM, Lars George <
> >> lars.george@gmail.com>
> >> >> >> > wrote:
> >> >> >> >
> >> >> >> >> Hi Vishal,
> >> >> >> >>
> >> >> >> >> These are DEBUG level messages and are from the block cache,
> there
> >> is
> >> >> >> >> nothing wrong with that. Can you explain more what you do and
> see?
> >> >> >> >>
> >> >> >> >> Lars
> >> >> >> >>
> >> >> >> >> On Wed, Feb 16, 2011 at 4:24 AM, Vishal Kapoor
> >> >> >> >> <vi...@gmail.com> wrote:
> >> >> >> >> > all was working fine and suddenly I see a lot of logs like
> below
> >> >> >> >> >
> >> >> >> >> > 2011-02-15 22:19:04,023 DEBUG
> >> >> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache
> LRU
> >> >> >> >> > eviction
> >> >> >> >> > started; Attempting to free 19.88 MB of total=168.64 MB
> >> >> >> >> > 2011-02-15 22:19:04,025 DEBUG
> >> >> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache
> LRU
> >> >> >> >> > eviction
> >> >> >> >> > completed; freed=19.91 MB, total=148.73 MB, single=74.47 MB,
> >> >> >> >> > multi=92.37
> >> >> >> >> MB,
> >> >> >> >> > memory=166.09 KB
> >> >> >> >> > 2011-02-15 22:19:11,207 DEBUG
> >> >> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache
> LRU
> >> >> >> >> > eviction
> >> >> >> >> > started; Attempting to free 19.88 MB of total=168.64 MB
> >> >> >> >> > 2011-02-15 22:19:11,444 DEBUG
> >> >> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache
> LRU
> >> >> >> >> > eviction
> >> >> >> >> > completed; freed=19.93 MB, total=149.09 MB, single=73.91 MB,
> >> >> >> >> > multi=93.32
> >> >> >> >> MB,
> >> >> >> >> > memory=166.09 KB
> >> >> >> >> > 2011-02-15 22:19:21,494 DEBUG
> >> >> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache
> LRU
> >> >> >> >> > eviction
> >> >> >> >> > started; Attempting to free 19.87 MB of total=168.62 MB
> >> >> >> >> > 2011-02-15 22:19:21,760 DEBUG
> >> >> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache
> LRU
> >> >> >> >> > eviction
> >> >> >> >> > completed; freed=19.91 MB, total=148.84 MB, single=74.22 MB,
> >> >> >> >> > multi=92.73
> >> >> >> >> MB,
> >> >> >> >> > memory=166.09 KB
> >> >> >> >> > 2011-02-15 22:19:39,838 DEBUG
> >> >> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache
> LRU
> >> >> >> >> > eviction
> >> >> >> >> > started; Attempting to free 19.87 MB of total=168.62 MB
> >> >> >> >> > 2011-02-15 22:19:39,852 DEBUG
> >> >> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache
> LRU
> >> >> >> >> > eviction
> >> >> >> >> > completed; freed=19.91 MB, total=148.71 MB, single=75.35 MB,
> >> >> >> >> > multi=91.48
> >> >> >> >> MB,
> >> >> >> >> > memory=166.09 KB
> >> >> >> >> > 2011-02-15 22:19:49,768 DEBUG
> >> >> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache
> LRU
> >> >> >> >> > eviction
> >> >> >> >> > started; Attempting to free 19.87 MB of total=168.62 MB
> >> >> >> >> > 2011-02-15 22:19:49,770 DEBUG
> >> >> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache
> LRU
> >> >> >> >> > eviction
> >> >> >> >> > completed; freed=19.91 MB, total=148.71 MB, single=76.48 MB,
> >> >> >> >> > multi=90.35
> >> >> >> >> MB,
> >> >> >> >> > memory=166.09 KB
> >> >> >> >> >
> >> >> >> >> >
> >> >> >> >> > I haven't changed anything including the table definitions.
> >> >> >> >> > please let me know where to look...
> >> >> >> >> >
> >> >> >> >> > thanks,
> >> >> >> >> > Vishal Kapoor
> >> >> >> >> >
> >> >> >> >>
> >> >> >> >
> >> >> >
> >> >> >
> >> >>
> >> >
> >>
> >
>

Re: Hbase inserts very slow

Posted by Jean-Daniel Cryans <jd...@apache.org>.
It's best to have different families for data of a different nature that
you usually don't read/write together. For sure it shouldn't
slow you down as much as it does (because of HBASE-3149), but given
the current situation it's hard to recommend multiple families.
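
If the only reason for the three families is the clashing qualifiers, a prefix
inside a single family gets you the same thing; roughly like this (the family
and qualifier names below are made up, just to show the idea):

import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class SingleFamilyPut {
  private static final byte[] FAMILY = Bytes.toBytes("d"); // one catch-all family

  // Fold what used to go into the LIVE, A and B families into prefixed qualifiers.
  public static Put buildPut(byte[] rowKey, byte[] live, byte[] masterA, byte[] masterB) {
    Put put = new Put(rowKey);
    put.add(FAMILY, Bytes.toBytes("live:value"), live);
    put.add(FAMILY, Bytes.toBytes("a:value"), masterA);
    put.add(FAMILY, Bytes.toBytes("b:value"), masterB);
    return put;
  }
}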

J-D

On Wed, Feb 16, 2011 at 4:32 PM, Vishal Kapoor
<vi...@gmail.com> wrote:
> thanks J-D. for all your help, I will combine the three families and
> re-baseline the performance.
> but I was just wondering if I was using the family as they were suppose to
> be used or not.
> the data in these three families are different, one of them is live feed and
> the two other two are master (static kind) data and it made a lot of logical
> sense to separate them in different families.
> maybe if updating a family in a different map/reduce operation works fine
> then I will go that route.
> but the critical to quality factor here is speed for inserts and I am going
> to definitely give a try to single family approach.
>
> Vishal
>
>
> On Wed, Feb 16, 2011 at 6:53 PM, Jean-Daniel Cryans <jd...@apache.org>wrote:
>
>> I don't understand... is having the same qualifier a hard requirement?
>> Worst case you could have a prefix.
>>
>> J-D
>>
>> On Wed, Feb 16, 2011 at 3:29 PM, Vishal Kapoor
>> <vi...@gmail.com> wrote:
>> > J-D,
>> > I also should mention that my data distribution in the three families are
>> > 1:1:1
>> > I have three families so that I can have same qualifiers in them. and
>> also
>> > the data in those families are LIVE:MasterA:MasterB
>> >
>> > Vishal
>> >
>> > On Wed, Feb 16, 2011 at 6:22 PM, Jean-Daniel Cryans <jdcryans@apache.org
>> >wrote:
>> >
>> >> Very often there's no need for more than 1 family, I would suggest you
>> >> explore that possibility first.
>> >>
>> >> J-D
>> >>
>> >> On Wed, Feb 16, 2011 at 3:13 PM, Vishal Kapoor
>> >> <vi...@gmail.com> wrote:
>> >> > does that mean I am only left with the choice of writing to the three
>> >> > families in three different map jobs?
>> >> > or can I do it any other way?
>> >> > thanks,
>> >> > Vishal
>> >> >
>> >> > On Wed, Feb 16, 2011 at 12:56 PM, Jean-Daniel Cryans <
>> >> jdcryans@apache.org>
>> >> > wrote:
>> >> >>
>> >> >> First, loading into 3 families is currently a bad idea and is bound
>> to
>> >> >> be inefficient, here's the reason why:
>> >> >> https://issues.apache.org/jira/browse/HBASE-3149
>> >> >>
>> >> >> Those log lines mean that your scanning of the first table is
>> >> >> generating a log of block cache churn. When setting up the Map, set
>> >> >> your scanner to setCacheBlocks(false) before passing it to
>> >> >> TableMapReduceUtil.initTableMapperJob
>> >> >>
>> >> >> Finally, you may want to give more memory to the region server.
>> >> >>
>> >> >> J-D
>> >> >>
>> >> >> On Wed, Feb 16, 2011 at 7:35 AM, Vishal Kapoor
>> >> >> <vi...@gmail.com> wrote:
>> >> >> > Lars,
>> >> >> >
>> >> >> > I am still working on pseudo distributed.
>> >> >> > hadoop-0.20.2+737/
>> >> >> > and hbase-0.90.0 with the hadoop jar from the hadoop install.
>> >> >> >
>> >> >> > I have a LIVE_RAW_TABLE table, which gets values from a live system
>> >> >> > I go through each row of that table and get the row ids of two
>> >> reference
>> >> >> > tables from it.
>> >> >> > TABLE_A and TABLE_B, then I explode this to a new table LIVE_TABLE
>> >> >> > I use
>> >> >> > TableMapReduceUtil.initTableReducerJob("LIVE_TABLE", null, job);
>> >> >> >
>> >> >> >
>> >> >> > LIVE_TABLE has three families, LIVE, A, B and the row id is a
>> >> composite
>> >> >> > key
>> >> >> > reverseTimeStamp/rowidA/rowIdB
>> >> >> > after that a run a bunch of map reduce to consolidate the data,
>> >> >> > to start with I have around 15000 rows in LIVE_RAW_TABLE.
>> >> >> >
>> >> >> > when I start with my job, i see it running quite well till i am
>> almost
>> >> >> > done
>> >> >> > with 5000 rows
>> >> >> > then it starts printing the message in the logs, which I use to not
>> >> see
>> >> >> > before.
>> >> >> > the job use to run for around 900 sec ( I have a lot of data
>> parsing
>> >> >> > while
>> >> >> > exploding )
>> >> >> > 15000 rows from LIVE_RAW_TABLE explodes to around 500,000 rows in
>> >> >> > LIVE_TABLE.
>> >> >> >
>> >> >> > after those debug messages, the job runs for around 2500 sec,
>> >> >> > I have not changed anything, including the table design.
>> >> >> >
>> >> >> > here is my table description.
>> >> >> >
>> >> >> > {NAME => 'LIVE_TABLE', FAMILIES => [{NAME => 'LIVE', BLOOMFILTER =>
>> >> >> > 'NONE',
>> >> >> > REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'NONE',
>> TTL
>> >> =>
>> >> >> > '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false',
>> BLOCKCACHE
>> >> =>
>> >> >> > 'true'}, {NAME => 'A', BLOOMFILTER => 'NONE', REPLICATION_SCOPE =>
>> >> '0',
>> >> >> > VERSIONS => '1', COMPRESSION => 'NONE', TTL => '2147483647',
>> BLOCKSIZE
>> >> >> > =>
>> >> >> > '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'B',
>> >> >> > BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', VERSIONS => '1',
>> >> >> > COMPRESSION => 'NONE', TTL => '2147483647', BLOCKSIZE => '65536',
>> >> >> > IN_MEMORY
>> >> >> > => 'false', BLOCKCACHE => 'true'}]}
>> >> >> >
>> >> >> > thanks for all your help.
>> >> >> >
>> >> >> > Vishal
>> >> >> >
>> >> >> > On Wed, Feb 16, 2011 at 4:26 AM, Lars George <
>> lars.george@gmail.com>
>> >> >> > wrote:
>> >> >> >
>> >> >> >> Hi Vishal,
>> >> >> >>
>> >> >> >> These are DEBUG level messages and are from the block cache, there
>> is
>> >> >> >> nothing wrong with that. Can you explain more what you do and see?
>> >> >> >>
>> >> >> >> Lars
>> >> >> >>
>> >> >> >> On Wed, Feb 16, 2011 at 4:24 AM, Vishal Kapoor
>> >> >> >> <vi...@gmail.com> wrote:
>> >> >> >> > all was working fine and suddenly I see a lot of logs like below
>> >> >> >> >
>> >> >> >> > 2011-02-15 22:19:04,023 DEBUG
>> >> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
>> >> >> >> > eviction
>> >> >> >> > started; Attempting to free 19.88 MB of total=168.64 MB
>> >> >> >> > 2011-02-15 22:19:04,025 DEBUG
>> >> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
>> >> >> >> > eviction
>> >> >> >> > completed; freed=19.91 MB, total=148.73 MB, single=74.47 MB,
>> >> >> >> > multi=92.37
>> >> >> >> MB,
>> >> >> >> > memory=166.09 KB
>> >> >> >> > 2011-02-15 22:19:11,207 DEBUG
>> >> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
>> >> >> >> > eviction
>> >> >> >> > started; Attempting to free 19.88 MB of total=168.64 MB
>> >> >> >> > 2011-02-15 22:19:11,444 DEBUG
>> >> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
>> >> >> >> > eviction
>> >> >> >> > completed; freed=19.93 MB, total=149.09 MB, single=73.91 MB,
>> >> >> >> > multi=93.32
>> >> >> >> MB,
>> >> >> >> > memory=166.09 KB
>> >> >> >> > 2011-02-15 22:19:21,494 DEBUG
>> >> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
>> >> >> >> > eviction
>> >> >> >> > started; Attempting to free 19.87 MB of total=168.62 MB
>> >> >> >> > 2011-02-15 22:19:21,760 DEBUG
>> >> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
>> >> >> >> > eviction
>> >> >> >> > completed; freed=19.91 MB, total=148.84 MB, single=74.22 MB,
>> >> >> >> > multi=92.73
>> >> >> >> MB,
>> >> >> >> > memory=166.09 KB
>> >> >> >> > 2011-02-15 22:19:39,838 DEBUG
>> >> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
>> >> >> >> > eviction
>> >> >> >> > started; Attempting to free 19.87 MB of total=168.62 MB
>> >> >> >> > 2011-02-15 22:19:39,852 DEBUG
>> >> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
>> >> >> >> > eviction
>> >> >> >> > completed; freed=19.91 MB, total=148.71 MB, single=75.35 MB,
>> >> >> >> > multi=91.48
>> >> >> >> MB,
>> >> >> >> > memory=166.09 KB
>> >> >> >> > 2011-02-15 22:19:49,768 DEBUG
>> >> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
>> >> >> >> > eviction
>> >> >> >> > started; Attempting to free 19.87 MB of total=168.62 MB
>> >> >> >> > 2011-02-15 22:19:49,770 DEBUG
>> >> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
>> >> >> >> > eviction
>> >> >> >> > completed; freed=19.91 MB, total=148.71 MB, single=76.48 MB,
>> >> >> >> > multi=90.35
>> >> >> >> MB,
>> >> >> >> > memory=166.09 KB
>> >> >> >> >
>> >> >> >> >
>> >> >> >> > I haven't changed anything including the table definitions.
>> >> >> >> > please let me know where to look...
>> >> >> >> >
>> >> >> >> > thanks,
>> >> >> >> > Vishal Kapoor
>> >> >> >> >
>> >> >> >>
>> >> >> >
>> >> >
>> >> >
>> >>
>> >
>>
>

Re: Hbase inserts very slow

Posted by Vishal Kapoor <vi...@gmail.com>.
Thanks J-D for all your help; I will combine the three families and
re-baseline the performance.
But I was just wondering whether I was using families the way they are
supposed to be used or not.
The data in these three families is different: one of them is a live feed and
the other two are master (static kind) data, and it made a lot of logical
sense to separate them into different families.
Maybe if updating a family in a different map/reduce operation works fine,
then I will go that route.
But the critical-to-quality factor here is speed of inserts, and I am
definitely going to give the single-family approach a try.
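
For the single-family test I am planning to create the table roughly like this
(just a sketch; the new table name and the family name are placeholders):

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class CreateSingleFamilyTable {
  public static void main(String[] args) throws IOException {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);

    // Same layout as LIVE_TABLE, but with a single family instead of LIVE/A/B.
    HTableDescriptor desc = new HTableDescriptor("LIVE_TABLE_SINGLE");
    HColumnDescriptor family = new HColumnDescriptor("d");
    family.setMaxVersions(1); // same as VERSIONS => '1' in the current definition
    desc.addFamily(family);
    admin.createTable(desc);
  }
}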

Vishal


On Wed, Feb 16, 2011 at 6:53 PM, Jean-Daniel Cryans <jd...@apache.org>wrote:

> I don't understand... is having the same qualifier a hard requirement?
> Worst case you could have a prefix.
>
> J-D
>
> On Wed, Feb 16, 2011 at 3:29 PM, Vishal Kapoor
> <vi...@gmail.com> wrote:
> > J-D,
> > I also should mention that my data distribution in the three families are
> > 1:1:1
> > I have three families so that I can have same qualifiers in them. and
> also
> > the data in those families are LIVE:MasterA:MasterB
> >
> > Vishal
> >
> > On Wed, Feb 16, 2011 at 6:22 PM, Jean-Daniel Cryans <jdcryans@apache.org
> >wrote:
> >
> >> Very often there's no need for more than 1 family, I would suggest you
> >> explore that possibility first.
> >>
> >> J-D
> >>
> >> On Wed, Feb 16, 2011 at 3:13 PM, Vishal Kapoor
> >> <vi...@gmail.com> wrote:
> >> > does that mean I am only left with the choice of writing to the three
> >> > families in three different map jobs?
> >> > or can I do it any other way?
> >> > thanks,
> >> > Vishal
> >> >
> >> > On Wed, Feb 16, 2011 at 12:56 PM, Jean-Daniel Cryans <
> >> jdcryans@apache.org>
> >> > wrote:
> >> >>
> >> >> First, loading into 3 families is currently a bad idea and is bound
> to
> >> >> be inefficient, here's the reason why:
> >> >> https://issues.apache.org/jira/browse/HBASE-3149
> >> >>
> >> >> Those log lines mean that your scanning of the first table is
> >> >> generating a log of block cache churn. When setting up the Map, set
> >> >> your scanner to setCacheBlocks(false) before passing it to
> >> >> TableMapReduceUtil.initTableMapperJob
> >> >>
> >> >> Finally, you may want to give more memory to the region server.
> >> >>
> >> >> J-D
> >> >>
> >> >> On Wed, Feb 16, 2011 at 7:35 AM, Vishal Kapoor
> >> >> <vi...@gmail.com> wrote:
> >> >> > Lars,
> >> >> >
> >> >> > I am still working on pseudo distributed.
> >> >> > hadoop-0.20.2+737/
> >> >> > and hbase-0.90.0 with the hadoop jar from the hadoop install.
> >> >> >
> >> >> > I have a LIVE_RAW_TABLE table, which gets values from a live system
> >> >> > I go through each row of that table and get the row ids of two
> >> reference
> >> >> > tables from it.
> >> >> > TABLE_A and TABLE_B, then I explode this to a new table LIVE_TABLE
> >> >> > I use
> >> >> > TableMapReduceUtil.initTableReducerJob("LIVE_TABLE", null, job);
> >> >> >
> >> >> >
> >> >> > LIVE_TABLE has three families, LIVE, A, B and the row id is a
> >> composite
> >> >> > key
> >> >> > reverseTimeStamp/rowidA/rowIdB
> >> >> > after that a run a bunch of map reduce to consolidate the data,
> >> >> > to start with I have around 15000 rows in LIVE_RAW_TABLE.
> >> >> >
> >> >> > when I start with my job, i see it running quite well till i am
> almost
> >> >> > done
> >> >> > with 5000 rows
> >> >> > then it starts printing the message in the logs, which I use to not
> >> see
> >> >> > before.
> >> >> > the job use to run for around 900 sec ( I have a lot of data
> parsing
> >> >> > while
> >> >> > exploding )
> >> >> > 15000 rows from LIVE_RAW_TABLE explodes to around 500,000 rows in
> >> >> > LIVE_TABLE.
> >> >> >
> >> >> > after those debug messages, the job runs for around 2500 sec,
> >> >> > I have not changed anything, including the table design.
> >> >> >
> >> >> > here is my table description.
> >> >> >
> >> >> > {NAME => 'LIVE_TABLE', FAMILIES => [{NAME => 'LIVE', BLOOMFILTER =>
> >> >> > 'NONE',
> >> >> > REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'NONE',
> TTL
> >> =>
> >> >> > '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false',
> BLOCKCACHE
> >> =>
> >> >> > 'true'}, {NAME => 'A', BLOOMFILTER => 'NONE', REPLICATION_SCOPE =>
> >> '0',
> >> >> > VERSIONS => '1', COMPRESSION => 'NONE', TTL => '2147483647',
> BLOCKSIZE
> >> >> > =>
> >> >> > '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'B',
> >> >> > BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', VERSIONS => '1',
> >> >> > COMPRESSION => 'NONE', TTL => '2147483647', BLOCKSIZE => '65536',
> >> >> > IN_MEMORY
> >> >> > => 'false', BLOCKCACHE => 'true'}]}
> >> >> >
> >> >> > thanks for all your help.
> >> >> >
> >> >> > Vishal
> >> >> >
> >> >> > On Wed, Feb 16, 2011 at 4:26 AM, Lars George <
> lars.george@gmail.com>
> >> >> > wrote:
> >> >> >
> >> >> >> Hi Vishal,
> >> >> >>
> >> >> >> These are DEBUG level messages and are from the block cache, there
> is
> >> >> >> nothing wrong with that. Can you explain more what you do and see?
> >> >> >>
> >> >> >> Lars
> >> >> >>
> >> >> >> On Wed, Feb 16, 2011 at 4:24 AM, Vishal Kapoor
> >> >> >> <vi...@gmail.com> wrote:
> >> >> >> > all was working fine and suddenly I see a lot of logs like below
> >> >> >> >
> >> >> >> > 2011-02-15 22:19:04,023 DEBUG
> >> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
> >> >> >> > eviction
> >> >> >> > started; Attempting to free 19.88 MB of total=168.64 MB
> >> >> >> > 2011-02-15 22:19:04,025 DEBUG
> >> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
> >> >> >> > eviction
> >> >> >> > completed; freed=19.91 MB, total=148.73 MB, single=74.47 MB,
> >> >> >> > multi=92.37
> >> >> >> MB,
> >> >> >> > memory=166.09 KB
> >> >> >> > 2011-02-15 22:19:11,207 DEBUG
> >> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
> >> >> >> > eviction
> >> >> >> > started; Attempting to free 19.88 MB of total=168.64 MB
> >> >> >> > 2011-02-15 22:19:11,444 DEBUG
> >> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
> >> >> >> > eviction
> >> >> >> > completed; freed=19.93 MB, total=149.09 MB, single=73.91 MB,
> >> >> >> > multi=93.32
> >> >> >> MB,
> >> >> >> > memory=166.09 KB
> >> >> >> > 2011-02-15 22:19:21,494 DEBUG
> >> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
> >> >> >> > eviction
> >> >> >> > started; Attempting to free 19.87 MB of total=168.62 MB
> >> >> >> > 2011-02-15 22:19:21,760 DEBUG
> >> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
> >> >> >> > eviction
> >> >> >> > completed; freed=19.91 MB, total=148.84 MB, single=74.22 MB,
> >> >> >> > multi=92.73
> >> >> >> MB,
> >> >> >> > memory=166.09 KB
> >> >> >> > 2011-02-15 22:19:39,838 DEBUG
> >> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
> >> >> >> > eviction
> >> >> >> > started; Attempting to free 19.87 MB of total=168.62 MB
> >> >> >> > 2011-02-15 22:19:39,852 DEBUG
> >> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
> >> >> >> > eviction
> >> >> >> > completed; freed=19.91 MB, total=148.71 MB, single=75.35 MB,
> >> >> >> > multi=91.48
> >> >> >> MB,
> >> >> >> > memory=166.09 KB
> >> >> >> > 2011-02-15 22:19:49,768 DEBUG
> >> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
> >> >> >> > eviction
> >> >> >> > started; Attempting to free 19.87 MB of total=168.62 MB
> >> >> >> > 2011-02-15 22:19:49,770 DEBUG
> >> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
> >> >> >> > eviction
> >> >> >> > completed; freed=19.91 MB, total=148.71 MB, single=76.48 MB,
> >> >> >> > multi=90.35
> >> >> >> MB,
> >> >> >> > memory=166.09 KB
> >> >> >> >
> >> >> >> >
> >> >> >> > I haven't changed anything including the table definitions.
> >> >> >> > please let me know where to look...
> >> >> >> >
> >> >> >> > thanks,
> >> >> >> > Vishal Kapoor
> >> >> >> >
> >> >> >>
> >> >> >
> >> >
> >> >
> >>
> >
>

Re: Hbase inserts very slow

Posted by Jean-Daniel Cryans <jd...@apache.org>.
I don't understand... is having the same qualifier a hard requirement?
Worst case you could have a prefix.

J-D

On Wed, Feb 16, 2011 at 3:29 PM, Vishal Kapoor
<vi...@gmail.com> wrote:
> J-D,
> I also should mention that my data distribution in the three families are
> 1:1:1
> I have three families so that I can have same qualifiers in them. and also
> the data in those families are LIVE:MasterA:MasterB
>
> Vishal
>
> On Wed, Feb 16, 2011 at 6:22 PM, Jean-Daniel Cryans <jd...@apache.org>wrote:
>
>> Very often there's no need for more than 1 family, I would suggest you
>> explore that possibility first.
>>
>> J-D
>>
>> On Wed, Feb 16, 2011 at 3:13 PM, Vishal Kapoor
>> <vi...@gmail.com> wrote:
>> > does that mean I am only left with the choice of writing to the three
>> > families in three different map jobs?
>> > or can I do it any other way?
>> > thanks,
>> > Vishal
>> >
>> > On Wed, Feb 16, 2011 at 12:56 PM, Jean-Daniel Cryans <
>> jdcryans@apache.org>
>> > wrote:
>> >>
>> >> First, loading into 3 families is currently a bad idea and is bound to
>> >> be inefficient, here's the reason why:
>> >> https://issues.apache.org/jira/browse/HBASE-3149
>> >>
>> >> Those log lines mean that your scanning of the first table is
>> >> generating a log of block cache churn. When setting up the Map, set
>> >> your scanner to setCacheBlocks(false) before passing it to
>> >> TableMapReduceUtil.initTableMapperJob
>> >>
>> >> Finally, you may want to give more memory to the region server.
>> >>
>> >> J-D
>> >>
>> >> On Wed, Feb 16, 2011 at 7:35 AM, Vishal Kapoor
>> >> <vi...@gmail.com> wrote:
>> >> > Lars,
>> >> >
>> >> > I am still working on pseudo distributed.
>> >> > hadoop-0.20.2+737/
>> >> > and hbase-0.90.0 with the hadoop jar from the hadoop install.
>> >> >
>> >> > I have a LIVE_RAW_TABLE table, which gets values from a live system
>> >> > I go through each row of that table and get the row ids of two
>> reference
>> >> > tables from it.
>> >> > TABLE_A and TABLE_B, then I explode this to a new table LIVE_TABLE
>> >> > I use
>> >> > TableMapReduceUtil.initTableReducerJob("LIVE_TABLE", null, job);
>> >> >
>> >> >
>> >> > LIVE_TABLE has three families, LIVE, A, B and the row id is a
>> composite
>> >> > key
>> >> > reverseTimeStamp/rowidA/rowIdB
>> >> > after that a run a bunch of map reduce to consolidate the data,
>> >> > to start with I have around 15000 rows in LIVE_RAW_TABLE.
>> >> >
>> >> > when I start with my job, i see it running quite well till i am almost
>> >> > done
>> >> > with 5000 rows
>> >> > then it starts printing the message in the logs, which I use to not
>> see
>> >> > before.
>> >> > the job use to run for around 900 sec ( I have a lot of data parsing
>> >> > while
>> >> > exploding )
>> >> > 15000 rows from LIVE_RAW_TABLE explodes to around 500,000 rows in
>> >> > LIVE_TABLE.
>> >> >
>> >> > after those debug messages, the job runs for around 2500 sec,
>> >> > I have not changed anything, including the table design.
>> >> >
>> >> > here is my table description.
>> >> >
>> >> > {NAME => 'LIVE_TABLE', FAMILIES => [{NAME => 'LIVE', BLOOMFILTER =>
>> >> > 'NONE',
>> >> > REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'NONE', TTL
>> =>
>> >> > '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE
>> =>
>> >> > 'true'}, {NAME => 'A', BLOOMFILTER => 'NONE', REPLICATION_SCOPE =>
>> '0',
>> >> > VERSIONS => '1', COMPRESSION => 'NONE', TTL => '2147483647', BLOCKSIZE
>> >> > =>
>> >> > '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'B',
>> >> > BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', VERSIONS => '1',
>> >> > COMPRESSION => 'NONE', TTL => '2147483647', BLOCKSIZE => '65536',
>> >> > IN_MEMORY
>> >> > => 'false', BLOCKCACHE => 'true'}]}
>> >> >
>> >> > thanks for all your help.
>> >> >
>> >> > Vishal
>> >> >
>> >> > On Wed, Feb 16, 2011 at 4:26 AM, Lars George <la...@gmail.com>
>> >> > wrote:
>> >> >
>> >> >> Hi Vishal,
>> >> >>
>> >> >> These are DEBUG level messages and are from the block cache, there is
>> >> >> nothing wrong with that. Can you explain more what you do and see?
>> >> >>
>> >> >> Lars
>> >> >>
>> >> >> On Wed, Feb 16, 2011 at 4:24 AM, Vishal Kapoor
>> >> >> <vi...@gmail.com> wrote:
>> >> >> > all was working fine and suddenly I see a lot of logs like below
>> >> >> >
>> >> >> > 2011-02-15 22:19:04,023 DEBUG
>> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
>> >> >> > eviction
>> >> >> > started; Attempting to free 19.88 MB of total=168.64 MB
>> >> >> > 2011-02-15 22:19:04,025 DEBUG
>> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
>> >> >> > eviction
>> >> >> > completed; freed=19.91 MB, total=148.73 MB, single=74.47 MB,
>> >> >> > multi=92.37
>> >> >> MB,
>> >> >> > memory=166.09 KB
>> >> >> > 2011-02-15 22:19:11,207 DEBUG
>> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
>> >> >> > eviction
>> >> >> > started; Attempting to free 19.88 MB of total=168.64 MB
>> >> >> > 2011-02-15 22:19:11,444 DEBUG
>> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
>> >> >> > eviction
>> >> >> > completed; freed=19.93 MB, total=149.09 MB, single=73.91 MB,
>> >> >> > multi=93.32
>> >> >> MB,
>> >> >> > memory=166.09 KB
>> >> >> > 2011-02-15 22:19:21,494 DEBUG
>> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
>> >> >> > eviction
>> >> >> > started; Attempting to free 19.87 MB of total=168.62 MB
>> >> >> > 2011-02-15 22:19:21,760 DEBUG
>> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
>> >> >> > eviction
>> >> >> > completed; freed=19.91 MB, total=148.84 MB, single=74.22 MB,
>> >> >> > multi=92.73
>> >> >> MB,
>> >> >> > memory=166.09 KB
>> >> >> > 2011-02-15 22:19:39,838 DEBUG
>> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
>> >> >> > eviction
>> >> >> > started; Attempting to free 19.87 MB of total=168.62 MB
>> >> >> > 2011-02-15 22:19:39,852 DEBUG
>> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
>> >> >> > eviction
>> >> >> > completed; freed=19.91 MB, total=148.71 MB, single=75.35 MB,
>> >> >> > multi=91.48
>> >> >> MB,
>> >> >> > memory=166.09 KB
>> >> >> > 2011-02-15 22:19:49,768 DEBUG
>> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
>> >> >> > eviction
>> >> >> > started; Attempting to free 19.87 MB of total=168.62 MB
>> >> >> > 2011-02-15 22:19:49,770 DEBUG
>> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
>> >> >> > eviction
>> >> >> > completed; freed=19.91 MB, total=148.71 MB, single=76.48 MB,
>> >> >> > multi=90.35
>> >> >> MB,
>> >> >> > memory=166.09 KB
>> >> >> >
>> >> >> >
>> >> >> > I haven't changed anything including the table definitions.
>> >> >> > please let me know where to look...
>> >> >> >
>> >> >> > thanks,
>> >> >> > Vishal Kapoor
>> >> >> >
>> >> >>
>> >> >
>> >
>> >
>>
>

Re: Hbase inserts very slow

Posted by Vishal Kapoor <vi...@gmail.com>.
J-D,
I should also mention that my data distribution across the three families is
1:1:1.
I have three families so that I can use the same qualifiers in each of them,
and the data in those families is LIVE:MasterA:MasterB.

Vishal

On Wed, Feb 16, 2011 at 6:22 PM, Jean-Daniel Cryans <jd...@apache.org>wrote:

> Very often there's no need for more than 1 family, I would suggest you
> explore that possibility first.
>
> J-D
>
> On Wed, Feb 16, 2011 at 3:13 PM, Vishal Kapoor
> <vi...@gmail.com> wrote:
> > does that mean I am only left with the choice of writing to the three
> > families in three different map jobs?
> > or can I do it any other way?
> > thanks,
> > Vishal
> >
> > On Wed, Feb 16, 2011 at 12:56 PM, Jean-Daniel Cryans <
> jdcryans@apache.org>
> > wrote:
> >>
> >> First, loading into 3 families is currently a bad idea and is bound to
> >> be inefficient, here's the reason why:
> >> https://issues.apache.org/jira/browse/HBASE-3149
> >>
> >> Those log lines mean that your scanning of the first table is
> >> generating a log of block cache churn. When setting up the Map, set
> >> your scanner to setCacheBlocks(false) before passing it to
> >> TableMapReduceUtil.initTableMapperJob
> >>
> >> Finally, you may want to give more memory to the region server.
> >>
> >> J-D
> >>
> >> On Wed, Feb 16, 2011 at 7:35 AM, Vishal Kapoor
> >> <vi...@gmail.com> wrote:
> >> > Lars,
> >> >
> >> > I am still working on pseudo distributed.
> >> > hadoop-0.20.2+737/
> >> > and hbase-0.90.0 with the hadoop jar from the hadoop install.
> >> >
> >> > I have a LIVE_RAW_TABLE table, which gets values from a live system
> >> > I go through each row of that table and get the row ids of two
> reference
> >> > tables from it.
> >> > TABLE_A and TABLE_B, then I explode this to a new table LIVE_TABLE
> >> > I use
> >> > TableMapReduceUtil.initTableReducerJob("LIVE_TABLE", null, job);
> >> >
> >> >
> >> > LIVE_TABLE has three families, LIVE, A, B and the row id is a
> composite
> >> > key
> >> > reverseTimeStamp/rowidA/rowIdB
> >> > after that a run a bunch of map reduce to consolidate the data,
> >> > to start with I have around 15000 rows in LIVE_RAW_TABLE.
> >> >
> >> > when I start with my job, i see it running quite well till i am almost
> >> > done
> >> > with 5000 rows
> >> > then it starts printing the message in the logs, which I use to not
> see
> >> > before.
> >> > the job use to run for around 900 sec ( I have a lot of data parsing
> >> > while
> >> > exploding )
> >> > 15000 rows from LIVE_RAW_TABLE explodes to around 500,000 rows in
> >> > LIVE_TABLE.
> >> >
> >> > after those debug messages, the job runs for around 2500 sec,
> >> > I have not changed anything, including the table design.
> >> >
> >> > here is my table description.
> >> >
> >> > {NAME => 'LIVE_TABLE', FAMILIES => [{NAME => 'LIVE', BLOOMFILTER =>
> >> > 'NONE',
> >> > REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'NONE', TTL
> =>
> >> > '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE
> =>
> >> > 'true'}, {NAME => 'A', BLOOMFILTER => 'NONE', REPLICATION_SCOPE =>
> '0',
> >> > VERSIONS => '1', COMPRESSION => 'NONE', TTL => '2147483647', BLOCKSIZE
> >> > =>
> >> > '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'B',
> >> > BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', VERSIONS => '1',
> >> > COMPRESSION => 'NONE', TTL => '2147483647', BLOCKSIZE => '65536',
> >> > IN_MEMORY
> >> > => 'false', BLOCKCACHE => 'true'}]}
> >> >
> >> > thanks for all your help.
> >> >
> >> > Vishal
> >> >
> >> > On Wed, Feb 16, 2011 at 4:26 AM, Lars George <la...@gmail.com>
> >> > wrote:
> >> >
> >> >> Hi Vishal,
> >> >>
> >> >> These are DEBUG level messages and are from the block cache, there is
> >> >> nothing wrong with that. Can you explain more what you do and see?
> >> >>
> >> >> Lars
> >> >>
> >> >> On Wed, Feb 16, 2011 at 4:24 AM, Vishal Kapoor
> >> >> <vi...@gmail.com> wrote:
> >> >> > all was working fine and suddenly I see a lot of logs like below
> >> >> >
> >> >> > 2011-02-15 22:19:04,023 DEBUG
> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
> >> >> > eviction
> >> >> > started; Attempting to free 19.88 MB of total=168.64 MB
> >> >> > 2011-02-15 22:19:04,025 DEBUG
> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
> >> >> > eviction
> >> >> > completed; freed=19.91 MB, total=148.73 MB, single=74.47 MB,
> >> >> > multi=92.37
> >> >> MB,
> >> >> > memory=166.09 KB
> >> >> > 2011-02-15 22:19:11,207 DEBUG
> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
> >> >> > eviction
> >> >> > started; Attempting to free 19.88 MB of total=168.64 MB
> >> >> > 2011-02-15 22:19:11,444 DEBUG
> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
> >> >> > eviction
> >> >> > completed; freed=19.93 MB, total=149.09 MB, single=73.91 MB,
> >> >> > multi=93.32
> >> >> MB,
> >> >> > memory=166.09 KB
> >> >> > 2011-02-15 22:19:21,494 DEBUG
> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
> >> >> > eviction
> >> >> > started; Attempting to free 19.87 MB of total=168.62 MB
> >> >> > 2011-02-15 22:19:21,760 DEBUG
> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
> >> >> > eviction
> >> >> > completed; freed=19.91 MB, total=148.84 MB, single=74.22 MB,
> >> >> > multi=92.73
> >> >> MB,
> >> >> > memory=166.09 KB
> >> >> > 2011-02-15 22:19:39,838 DEBUG
> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
> >> >> > eviction
> >> >> > started; Attempting to free 19.87 MB of total=168.62 MB
> >> >> > 2011-02-15 22:19:39,852 DEBUG
> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
> >> >> > eviction
> >> >> > completed; freed=19.91 MB, total=148.71 MB, single=75.35 MB,
> >> >> > multi=91.48
> >> >> MB,
> >> >> > memory=166.09 KB
> >> >> > 2011-02-15 22:19:49,768 DEBUG
> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
> >> >> > eviction
> >> >> > started; Attempting to free 19.87 MB of total=168.62 MB
> >> >> > 2011-02-15 22:19:49,770 DEBUG
> >> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
> >> >> > eviction
> >> >> > completed; freed=19.91 MB, total=148.71 MB, single=76.48 MB,
> >> >> > multi=90.35
> >> >> MB,
> >> >> > memory=166.09 KB
> >> >> >
> >> >> >
> >> >> > I haven't changed anything including the table definitions.
> >> >> > please let me know where to look...
> >> >> >
> >> >> > thanks,
> >> >> > Vishal Kapoor
> >> >> >
> >> >>
> >> >
> >
> >
>

Re: Hbase inserts very slow

Posted by Jean-Daniel Cryans <jd...@apache.org>.
Very often there's no need for more than 1 family; I would suggest you
explore that possibility first.

J-D

On Wed, Feb 16, 2011 at 3:13 PM, Vishal Kapoor
<vi...@gmail.com> wrote:
> does that mean I am only left with the choice of writing to the three
> families in three different map jobs?
> or can I do it any other way?
> thanks,
> Vishal
>
> On Wed, Feb 16, 2011 at 12:56 PM, Jean-Daniel Cryans <jd...@apache.org>
> wrote:
>>
>> First, loading into 3 families is currently a bad idea and is bound to
>> be inefficient, here's the reason why:
>> https://issues.apache.org/jira/browse/HBASE-3149
>>
>> Those log lines mean that your scanning of the first table is
>> generating a log of block cache churn. When setting up the Map, set
>> your scanner to setCacheBlocks(false) before passing it to
>> TableMapReduceUtil.initTableMapperJob
>>
>> Finally, you may want to give more memory to the region server.
>>
>> J-D
>>
>> On Wed, Feb 16, 2011 at 7:35 AM, Vishal Kapoor
>> <vi...@gmail.com> wrote:
>> > Lars,
>> >
>> > I am still working on pseudo distributed.
>> > hadoop-0.20.2+737/
>> > and hbase-0.90.0 with the hadoop jar from the hadoop install.
>> >
>> > I have a LIVE_RAW_TABLE table, which gets values from a live system
>> > I go through each row of that table and get the row ids of two reference
>> > tables from it.
>> > TABLE_A and TABLE_B, then I explode this to a new table LIVE_TABLE
>> > I use
>> > TableMapReduceUtil.initTableReducerJob("LIVE_TABLE", null, job);
>> >
>> >
>> > LIVE_TABLE has three families, LIVE, A, B and the row id is a composite
>> > key
>> > reverseTimeStamp/rowidA/rowIdB
>> > after that a run a bunch of map reduce to consolidate the data,
>> > to start with I have around 15000 rows in LIVE_RAW_TABLE.
>> >
>> > when I start with my job, i see it running quite well till i am almost
>> > done
>> > with 5000 rows
>> > then it starts printing the message in the logs, which I use to not see
>> > before.
>> > the job use to run for around 900 sec ( I have a lot of data parsing
>> > while
>> > exploding )
>> > 15000 rows from LIVE_RAW_TABLE explodes to around 500,000 rows in
>> > LIVE_TABLE.
>> >
>> > after those debug messages, the job runs for around 2500 sec,
>> > I have not changed anything, including the table design.
>> >
>> > here is my table description.
>> >
>> > {NAME => 'LIVE_TABLE', FAMILIES => [{NAME => 'LIVE', BLOOMFILTER =>
>> > 'NONE',
>> > REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'NONE', TTL =>
>> > '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE =>
>> > 'true'}, {NAME => 'A', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0',
>> > VERSIONS => '1', COMPRESSION => 'NONE', TTL => '2147483647', BLOCKSIZE
>> > =>
>> > '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'B',
>> > BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', VERSIONS => '1',
>> > COMPRESSION => 'NONE', TTL => '2147483647', BLOCKSIZE => '65536',
>> > IN_MEMORY
>> > => 'false', BLOCKCACHE => 'true'}]}
>> >
>> > thanks for all your help.
>> >
>> > Vishal
>> >
>> > On Wed, Feb 16, 2011 at 4:26 AM, Lars George <la...@gmail.com>
>> > wrote:
>> >
>> >> Hi Vishal,
>> >>
>> >> These are DEBUG level messages and are from the block cache, there is
>> >> nothing wrong with that. Can you explain more what you do and see?
>> >>
>> >> Lars
>> >>
>> >> On Wed, Feb 16, 2011 at 4:24 AM, Vishal Kapoor
>> >> <vi...@gmail.com> wrote:
>> >> > all was working fine and suddenly I see a lot of logs like below
>> >> >
>> >> > 2011-02-15 22:19:04,023 DEBUG
>> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
>> >> > eviction
>> >> > started; Attempting to free 19.88 MB of total=168.64 MB
>> >> > 2011-02-15 22:19:04,025 DEBUG
>> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
>> >> > eviction
>> >> > completed; freed=19.91 MB, total=148.73 MB, single=74.47 MB,
>> >> > multi=92.37
>> >> MB,
>> >> > memory=166.09 KB
>> >> > 2011-02-15 22:19:11,207 DEBUG
>> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
>> >> > eviction
>> >> > started; Attempting to free 19.88 MB of total=168.64 MB
>> >> > 2011-02-15 22:19:11,444 DEBUG
>> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
>> >> > eviction
>> >> > completed; freed=19.93 MB, total=149.09 MB, single=73.91 MB,
>> >> > multi=93.32
>> >> MB,
>> >> > memory=166.09 KB
>> >> > 2011-02-15 22:19:21,494 DEBUG
>> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
>> >> > eviction
>> >> > started; Attempting to free 19.87 MB of total=168.62 MB
>> >> > 2011-02-15 22:19:21,760 DEBUG
>> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
>> >> > eviction
>> >> > completed; freed=19.91 MB, total=148.84 MB, single=74.22 MB,
>> >> > multi=92.73
>> >> MB,
>> >> > memory=166.09 KB
>> >> > 2011-02-15 22:19:39,838 DEBUG
>> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
>> >> > eviction
>> >> > started; Attempting to free 19.87 MB of total=168.62 MB
>> >> > 2011-02-15 22:19:39,852 DEBUG
>> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
>> >> > eviction
>> >> > completed; freed=19.91 MB, total=148.71 MB, single=75.35 MB,
>> >> > multi=91.48
>> >> MB,
>> >> > memory=166.09 KB
>> >> > 2011-02-15 22:19:49,768 DEBUG
>> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
>> >> > eviction
>> >> > started; Attempting to free 19.87 MB of total=168.62 MB
>> >> > 2011-02-15 22:19:49,770 DEBUG
>> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
>> >> > eviction
>> >> > completed; freed=19.91 MB, total=148.71 MB, single=76.48 MB,
>> >> > multi=90.35
>> >> MB,
>> >> > memory=166.09 KB
>> >> >
>> >> >
>> >> > I haven't changed anything including the table definitions.
>> >> > please let me know where to look...
>> >> >
>> >> > thanks,
>> >> > Vishal Kapoor
>> >> >
>> >>
>> >
>
>

Re: Hbase inserts very slow

Posted by Vishal Kapoor <vi...@gmail.com>.
does that mean I am only left with the choice of writing to the three
families in three different map jobs?
or can I do it any other way?

thanks,
Vishal

On Wed, Feb 16, 2011 at 12:56 PM, Jean-Daniel Cryans <jd...@apache.org>wrote:

> First, loading into 3 families is currently a bad idea and is bound to
> be inefficient, here's the reason why:
> https://issues.apache.org/jira/browse/HBASE-3149
>
> Those log lines mean that your scanning of the first table is
> generating a log of block cache churn. When setting up the Map, set
> your scanner to setCacheBlocks(false) before passing it to
> TableMapReduceUtil.initTableMapperJob
>
> Finally, you may want to give more memory to the region server.
>
> J-D
>
> On Wed, Feb 16, 2011 at 7:35 AM, Vishal Kapoor
> <vi...@gmail.com> wrote:
> > Lars,
> >
> > I am still working on pseudo distributed.
> > hadoop-0.20.2+737/
> > and hbase-0.90.0 with the hadoop jar from the hadoop install.
> >
> > I have a LIVE_RAW_TABLE table, which gets values from a live system
> > I go through each row of that table and get the row ids of two reference
> > tables from it.
> > TABLE_A and TABLE_B, then I explode this to a new table LIVE_TABLE
> > I use
> > TableMapReduceUtil.initTableReducerJob("LIVE_TABLE", null, job);
> >
> >
> > LIVE_TABLE has three families, LIVE, A, B and the row id is a composite
> key
> > reverseTimeStamp/rowidA/rowIdB
> > after that a run a bunch of map reduce to consolidate the data,
> > to start with I have around 15000 rows in LIVE_RAW_TABLE.
> >
> > when I start with my job, i see it running quite well till i am almost
> done
> > with 5000 rows
> > then it starts printing the message in the logs, which I use to not see
> > before.
> > the job use to run for around 900 sec ( I have a lot of data parsing
> while
> > exploding )
> > 15000 rows from LIVE_RAW_TABLE explodes to around 500,000 rows in
> > LIVE_TABLE.
> >
> > after those debug messages, the job runs for around 2500 sec,
> > I have not changed anything, including the table design.
> >
> > here is my table description.
> >
> > {NAME => 'LIVE_TABLE', FAMILIES => [{NAME => 'LIVE', BLOOMFILTER =>
> 'NONE',
> > REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'NONE', TTL =>
> > '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE =>
> > 'true'}, {NAME => 'A', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0',
> > VERSIONS => '1', COMPRESSION => 'NONE', TTL => '2147483647', BLOCKSIZE =>
> > '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'B',
> > BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', VERSIONS => '1',
> > COMPRESSION => 'NONE', TTL => '2147483647', BLOCKSIZE => '65536',
> IN_MEMORY
> > => 'false', BLOCKCACHE => 'true'}]}
> >
> > thanks for all your help.
> >
> > Vishal
> >
> > On Wed, Feb 16, 2011 at 4:26 AM, Lars George <la...@gmail.com>
> wrote:
> >
> >> Hi Vishal,
> >>
> >> These are DEBUG level messages and are from the block cache, there is
> >> nothing wrong with that. Can you explain more what you do and see?
> >>
> >> Lars
> >>
> >> On Wed, Feb 16, 2011 at 4:24 AM, Vishal Kapoor
> >> <vi...@gmail.com> wrote:
> >> > all was working fine and suddenly I see a lot of logs like below
> >> >
> >> > 2011-02-15 22:19:04,023 DEBUG
> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
> eviction
> >> > started; Attempting to free 19.88 MB of total=168.64 MB
> >> > 2011-02-15 22:19:04,025 DEBUG
> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
> eviction
> >> > completed; freed=19.91 MB, total=148.73 MB, single=74.47 MB,
> multi=92.37
> >> MB,
> >> > memory=166.09 KB
> >> > 2011-02-15 22:19:11,207 DEBUG
> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
> eviction
> >> > started; Attempting to free 19.88 MB of total=168.64 MB
> >> > 2011-02-15 22:19:11,444 DEBUG
> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
> eviction
> >> > completed; freed=19.93 MB, total=149.09 MB, single=73.91 MB,
> multi=93.32
> >> MB,
> >> > memory=166.09 KB
> >> > 2011-02-15 22:19:21,494 DEBUG
> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
> eviction
> >> > started; Attempting to free 19.87 MB of total=168.62 MB
> >> > 2011-02-15 22:19:21,760 DEBUG
> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
> eviction
> >> > completed; freed=19.91 MB, total=148.84 MB, single=74.22 MB,
> multi=92.73
> >> MB,
> >> > memory=166.09 KB
> >> > 2011-02-15 22:19:39,838 DEBUG
> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
> eviction
> >> > started; Attempting to free 19.87 MB of total=168.62 MB
> >> > 2011-02-15 22:19:39,852 DEBUG
> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
> eviction
> >> > completed; freed=19.91 MB, total=148.71 MB, single=75.35 MB,
> multi=91.48
> >> MB,
> >> > memory=166.09 KB
> >> > 2011-02-15 22:19:49,768 DEBUG
> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
> eviction
> >> > started; Attempting to free 19.87 MB of total=168.62 MB
> >> > 2011-02-15 22:19:49,770 DEBUG
> >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU
> eviction
> >> > completed; freed=19.91 MB, total=148.71 MB, single=76.48 MB,
> multi=90.35
> >> MB,
> >> > memory=166.09 KB
> >> >
> >> >
> >> > I haven't changed anything including the table definitions.
> >> > please let me know where to look...
> >> >
> >> > thanks,
> >> > Vishal Kapoor
> >> >
> >>
> >
>

Re: Hbase inserts very slow

Posted by Jean-Daniel Cryans <jd...@apache.org>.
First, loading into 3 families is currently a bad idea and is bound to
be inefficient; here's the reason why:
https://issues.apache.org/jira/browse/HBASE-3149

Those log lines mean that your scanning of the first table is
generating a lot of block cache churn. When setting up the Map, set
your scanner to setCacheBlocks(false) before passing it to
TableMapReduceUtil.initTableMapperJob
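
For example, something along these lines (the mapper class and its output
types here are placeholders for whatever your job actually uses):

  import org.apache.hadoop.hbase.client.Put;
  import org.apache.hadoop.hbase.client.Scan;
  import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
  import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;

  Scan scan = new Scan();
  scan.setCacheBlocks(false);   // one-time MR scan, so don't pollute the block cache
  scan.setCaching(500);         // optional: more rows per RPC, tune to your row size
  TableMapReduceUtil.initTableMapperJob("LIVE_RAW_TABLE", scan,
      ExplodeMapper.class, ImmutableBytesWritable.class, Put.class, job);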

Finally, you may want to give more memory to the region server.
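
One way to do that (assuming you start HBase with the stock scripts) is to
raise the heap in conf/hbase-env.sh, e.g.:

  # conf/hbase-env.sh -- applies to all HBase daemons started from this install
  export HBASE_HEAPSIZE=2000   # in MB; size it to what the box can spare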

J-D

On Wed, Feb 16, 2011 at 7:35 AM, Vishal Kapoor
<vi...@gmail.com> wrote:
> Lars,
>
> I am still working on pseudo distributed.
> hadoop-0.20.2+737/
> and hbase-0.90.0 with the hadoop jar from the hadoop install.
>
> I have a LIVE_RAW_TABLE table, which gets values from a live system
> I go through each row of that table and get the row ids of two reference
> tables from it.
> TABLE_A and TABLE_B, then I explode this to a new table LIVE_TABLE
> I use
> TableMapReduceUtil.initTableReducerJob("LIVE_TABLE", null, job);
>
>
> LIVE_TABLE has three families, LIVE, A, B and the row id is a composite key
> reverseTimeStamp/rowidA/rowIdB
> after that a run a bunch of map reduce to consolidate the data,
> to start with I have around 15000 rows in LIVE_RAW_TABLE.
>
> when I start with my job, i see it running quite well till i am almost done
> with 5000 rows
> then it starts printing the message in the logs, which I use to not see
> before.
> the job use to run for around 900 sec ( I have a lot of data parsing while
> exploding )
> 15000 rows from LIVE_RAW_TABLE explodes to around 500,000 rows in
> LIVE_TABLE.
>
> after those debug messages, the job runs for around 2500 sec,
> I have not changed anything, including the table design.
>
> here is my table description.
>
> {NAME => 'LIVE_TABLE', FAMILIES => [{NAME => 'LIVE', BLOOMFILTER => 'NONE',
> REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'NONE', TTL =>
> '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE =>
> 'true'}, {NAME => 'A', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0',
> VERSIONS => '1', COMPRESSION => 'NONE', TTL => '2147483647', BLOCKSIZE =>
> '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'B',
> BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', VERSIONS => '1',
> COMPRESSION => 'NONE', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY
> => 'false', BLOCKCACHE => 'true'}]}
>
> thanks for all your help.
>
> Vishal
>
> On Wed, Feb 16, 2011 at 4:26 AM, Lars George <la...@gmail.com> wrote:
>
>> Hi Vishal,
>>
>> These are DEBUG level messages and are from the block cache, there is
>> nothing wrong with that. Can you explain more what you do and see?
>>
>> Lars
>>
>> On Wed, Feb 16, 2011 at 4:24 AM, Vishal Kapoor
>> <vi...@gmail.com> wrote:
>> > all was working fine and suddenly I see a lot of logs like below
>> >
>> > 2011-02-15 22:19:04,023 DEBUG
>> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
>> > started; Attempting to free 19.88 MB of total=168.64 MB
>> > 2011-02-15 22:19:04,025 DEBUG
>> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
>> > completed; freed=19.91 MB, total=148.73 MB, single=74.47 MB, multi=92.37
>> MB,
>> > memory=166.09 KB
>> > 2011-02-15 22:19:11,207 DEBUG
>> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
>> > started; Attempting to free 19.88 MB of total=168.64 MB
>> > 2011-02-15 22:19:11,444 DEBUG
>> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
>> > completed; freed=19.93 MB, total=149.09 MB, single=73.91 MB, multi=93.32
>> MB,
>> > memory=166.09 KB
>> > 2011-02-15 22:19:21,494 DEBUG
>> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
>> > started; Attempting to free 19.87 MB of total=168.62 MB
>> > 2011-02-15 22:19:21,760 DEBUG
>> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
>> > completed; freed=19.91 MB, total=148.84 MB, single=74.22 MB, multi=92.73
>> MB,
>> > memory=166.09 KB
>> > 2011-02-15 22:19:39,838 DEBUG
>> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
>> > started; Attempting to free 19.87 MB of total=168.62 MB
>> > 2011-02-15 22:19:39,852 DEBUG
>> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
>> > completed; freed=19.91 MB, total=148.71 MB, single=75.35 MB, multi=91.48
>> MB,
>> > memory=166.09 KB
>> > 2011-02-15 22:19:49,768 DEBUG
>> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
>> > started; Attempting to free 19.87 MB of total=168.62 MB
>> > 2011-02-15 22:19:49,770 DEBUG
>> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
>> > completed; freed=19.91 MB, total=148.71 MB, single=76.48 MB, multi=90.35
>> MB,
>> > memory=166.09 KB
>> >
>> >
>> > I haven't changed anything including the table definitions.
>> > please let me know where to look...
>> >
>> > thanks,
>> > Vishal Kapoor
>> >
>>
>

Re: Hbase inserts very slow

Posted by Vishal Kapoor <vi...@gmail.com>.
Lars,

I am still working on pseudo distributed.
hadoop-0.20.2+737/
and hbase-0.90.0 with the hadoop jar from the hadoop install.

I have a LIVE_RAW_TABLE table, which gets values from a live system.
I go through each row of that table and get the row ids of two reference
tables, TABLE_A and TABLE_B, from it, then I explode this into a new table,
LIVE_TABLE. I use
TableMapReduceUtil.initTableReducerJob("LIVE_TABLE", null, job);

LIVE_TABLE has three families, LIVE, A and B, and the row id is a composite
key: reverseTimeStamp/rowidA/rowIdB
After that I run a bunch of map/reduce jobs to consolidate the data.
To start with, I have around 15000 rows in LIVE_RAW_TABLE.
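
Roughly, the composite key is built like this (a simplified sketch; the
variable names are just illustrative):

  import org.apache.hadoop.hbase.client.Put;
  import org.apache.hadoop.hbase.util.Bytes;

  long reverseTs = Long.MAX_VALUE - System.currentTimeMillis();
  byte[] rowKey = Bytes.toBytes(reverseTs + "/" + rowIdA + "/" + rowIdB);
  Put put = new Put(rowKey);   // columns for the LIVE, A and B families go in here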

When I start my job, I see it running quite well until I am almost done
with 5000 rows;
then it starts printing the messages in the logs, which I did not use to see
before.
The job used to run for around 900 sec (I have a lot of data parsing while
exploding).
15000 rows from LIVE_RAW_TABLE explode to around 500,000 rows in
LIVE_TABLE.

After those debug messages appear, the job runs for around 2500 sec.
I have not changed anything, including the table design.

here is my table description.

{NAME => 'LIVE_TABLE', FAMILIES => [{NAME => 'LIVE', BLOOMFILTER => 'NONE',
REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'NONE', TTL =>
'2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE =>
'true'}, {NAME => 'A', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0',
VERSIONS => '1', COMPRESSION => 'NONE', TTL => '2147483647', BLOCKSIZE =>
'65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'B',
BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', VERSIONS => '1',
COMPRESSION => 'NONE', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY
=> 'false', BLOCKCACHE => 'true'}]}

thanks for all your help.

Vishal

On Wed, Feb 16, 2011 at 4:26 AM, Lars George <la...@gmail.com> wrote:

> Hi Vishal,
>
> These are DEBUG level messages and are from the block cache, there is
> nothing wrong with that. Can you explain more what you do and see?
>
> Lars
>
> On Wed, Feb 16, 2011 at 4:24 AM, Vishal Kapoor
> <vi...@gmail.com> wrote:
> > all was working fine and suddenly I see a lot of logs like below
> >
> > 2011-02-15 22:19:04,023 DEBUG
> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
> > started; Attempting to free 19.88 MB of total=168.64 MB
> > 2011-02-15 22:19:04,025 DEBUG
> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
> > completed; freed=19.91 MB, total=148.73 MB, single=74.47 MB, multi=92.37
> MB,
> > memory=166.09 KB
> > 2011-02-15 22:19:11,207 DEBUG
> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
> > started; Attempting to free 19.88 MB of total=168.64 MB
> > 2011-02-15 22:19:11,444 DEBUG
> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
> > completed; freed=19.93 MB, total=149.09 MB, single=73.91 MB, multi=93.32
> MB,
> > memory=166.09 KB
> > 2011-02-15 22:19:21,494 DEBUG
> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
> > started; Attempting to free 19.87 MB of total=168.62 MB
> > 2011-02-15 22:19:21,760 DEBUG
> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
> > completed; freed=19.91 MB, total=148.84 MB, single=74.22 MB, multi=92.73
> MB,
> > memory=166.09 KB
> > 2011-02-15 22:19:39,838 DEBUG
> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
> > started; Attempting to free 19.87 MB of total=168.62 MB
> > 2011-02-15 22:19:39,852 DEBUG
> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
> > completed; freed=19.91 MB, total=148.71 MB, single=75.35 MB, multi=91.48
> MB,
> > memory=166.09 KB
> > 2011-02-15 22:19:49,768 DEBUG
> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
> > started; Attempting to free 19.87 MB of total=168.62 MB
> > 2011-02-15 22:19:49,770 DEBUG
> > org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
> > completed; freed=19.91 MB, total=148.71 MB, single=76.48 MB, multi=90.35
> MB,
> > memory=166.09 KB
> >
> >
> > I haven't changed anything including the table definitions.
> > please let me know where to look...
> >
> > thanks,
> > Vishal Kapoor
> >
>

Re: Hbase inserts very slow

Posted by Lars George <la...@gmail.com>.
Hi Vishal,

These are DEBUG-level messages from the block cache; there is
nothing wrong with that. Can you explain in more detail what you do and see?

Lars

On Wed, Feb 16, 2011 at 4:24 AM, Vishal Kapoor
<vi...@gmail.com> wrote:
> all was working fine and suddenly I see a lot of logs like below
>
> 2011-02-15 22:19:04,023 DEBUG
> org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
> started; Attempting to free 19.88 MB of total=168.64 MB
> 2011-02-15 22:19:04,025 DEBUG
> org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
> completed; freed=19.91 MB, total=148.73 MB, single=74.47 MB, multi=92.37 MB,
> memory=166.09 KB
> 2011-02-15 22:19:11,207 DEBUG
> org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
> started; Attempting to free 19.88 MB of total=168.64 MB
> 2011-02-15 22:19:11,444 DEBUG
> org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
> completed; freed=19.93 MB, total=149.09 MB, single=73.91 MB, multi=93.32 MB,
> memory=166.09 KB
> 2011-02-15 22:19:21,494 DEBUG
> org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
> started; Attempting to free 19.87 MB of total=168.62 MB
> 2011-02-15 22:19:21,760 DEBUG
> org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
> completed; freed=19.91 MB, total=148.84 MB, single=74.22 MB, multi=92.73 MB,
> memory=166.09 KB
> 2011-02-15 22:19:39,838 DEBUG
> org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
> started; Attempting to free 19.87 MB of total=168.62 MB
> 2011-02-15 22:19:39,852 DEBUG
> org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
> completed; freed=19.91 MB, total=148.71 MB, single=75.35 MB, multi=91.48 MB,
> memory=166.09 KB
> 2011-02-15 22:19:49,768 DEBUG
> org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
> started; Attempting to free 19.87 MB of total=168.62 MB
> 2011-02-15 22:19:49,770 DEBUG
> org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
> completed; freed=19.91 MB, total=148.71 MB, single=76.48 MB, multi=90.35 MB,
> memory=166.09 KB
>
>
> I haven't changed anything including the table definitions.
> please let me know where to look...
>
> thanks,
> Vishal Kapoor
>