Posted to user@hbase.apache.org by "Buttler, David" <bu...@llnl.gov> on 2010/11/04 02:22:24 UTC

split not working

Hi all,
I have a small table (3M rows) that is using 4 regions.  I want to split the table so that I can take advantage of more nodes in my cluster with map/reduce tasks and the TableInputFormat.
The split button on the web page sends split messages to the master/region server, but nothing seems to happen - my regions don't split.

I am using the latest Cloudera branch: hadoop-0.20.2+737 and hbase-0.89.20100924+28.

Alternatively, is there a way that I can specify a minimum number of splits for the TableInputFormat?
Dave
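
On the question of getting more input splits than regions: one possible approach is to subdivide each region's key range before handing it to the map tasks. The sketch below shows only the key arithmetic, not an HBase API. The function name `subdivide_hex_range` and its `width` parameter are hypothetical, and it assumes row keys are fixed-width lowercase hexadecimal strings; other key layouts would need different arithmetic.

```python
def subdivide_hex_range(start_key, end_key, parts, width=40):
    """Cut the key range [start_key, end_key) into `parts` contiguous sub-ranges.

    Assumption: keys are fixed-width lowercase hex strings. An empty string
    means "open-ended", as in the first and last regions of a table.
    """
    lo = int(start_key, 16) if start_key else 0
    hi = int(end_key, 16) if end_key else 16 ** width
    if parts < 1:
        raise ValueError("parts must be at least 1")
    if hi <= lo:
        raise ValueError("end_key must sort after start_key")

    # Evenly spaced integer boundaries, rendered back to fixed-width hex.
    bounds = [lo + (hi - lo) * i // parts for i in range(parts + 1)]
    keys = [format(b, "0%dx" % width) for b in bounds]

    # Keep the caller's original outer keys so open ends stay open.
    keys[0] = start_key
    keys[-1] = end_key
    return list(zip(keys[:-1], keys[1:]))


if __name__ == "__main__":
    for sub_start, sub_end in subdivide_hex_range(
        "3fdc590551f245675682a4f1baedb59eca4c698d",
        "7fcb1391fda134a4b2a17b494b52f65db2bad69e",
        4,
    ):
        print(sub_start, sub_end)
```

Each returned pair could serve as the start and stop row of one scan. Even spacing in key space only gives even work per task if the keys are roughly uniformly distributed, which is an assumption to check against the actual data.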


RE: split not working

Posted by "Buttler, David" <bu...@llnl.gov>.
I have tried it from the shell and using the API.  The same thing happens: the log messages report that a split request was sent, but nothing further happens and the regions do not split.
Is it possible that having GZ compression enabled hampers this operation?

Log messages attached.
Dave

Master:
-------------
2010-11-04 15:48:04,028 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$TableServers: Rowscanned=4, rowsOffline=0
2010-11-04 15:48:30,757 DEBUG org.apache.hadoop.hbase.master.RegionManager: Adding operation TABLE_SPLIT from tasklist
2010-11-04 15:48:30,757 DEBUG org.apache.hadoop.hbase.master.RegionManager: Adding operation TABLE_SPLIT from tasklist
2010-11-04 15:48:30,757 DEBUG org.apache.hadoop.hbase.master.RegionManager: Adding operation TABLE_SPLIT from tasklist
2010-11-04 15:48:30,757 DEBUG org.apache.hadoop.hbase.master.RegionManager: Adding operation TABLE_SPLIT from tasklist
2010-11-04 15:48:31,135 DEBUG org.apache.hadoop.hbase.master.RegionManager: Sending MSG_REGION_SPLIT REGION => {NAME => 'dbpedia,3fdc590551f245675682a4f1baedb59eca4c698d,1288205174038.26fe0f3fc176da9403ab4cd94dc56f41.', STARTKEY => '3fdc590551f245675682a4f1baedb59eca4c698d', ENDKEY => '7fcb1391fda134a4b2a17b494b52f65db2bad69e', ENCODED => 26fe0f3fc176da9403ab4cd94dc56f41, TABLE => {{NAME => 'dbpedia', FAMILIES => [{NAME => 'annotations', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', VERSIONS => '3', COMPRESSION => 'GZ', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'meta', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', VERSIONS => '3', COMPRESSION => 'GZ', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'src', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', VERSIONS => '3', COMPRESSION => 'GZ', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'text', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', VERSIONS => '3', COMPRESSION => 'GZ', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}} to 10.220.5.20:60020
2010-11-04 15:48:31,520 DEBUG org.apache.hadoop.hbase.master.RegionManager: Sending MSG_REGION_SPLIT REGION => {NAME => 'dbpedia,7fcb1391fda134a4b2a17b494b52f65db2bad69e,1288205293784.2b6ce7d6ac234856cb808ac29b65e88a.', STARTKEY => '7fcb1391fda134a4b2a17b494b52f65db2bad69e', ENDKEY => 'bffc513271d8d61b3510cb4780efddaa212481a0', ENCODED => 2b6ce7d6ac234856cb808ac29b65e88a, TABLE => {{NAME => 'dbpedia', FAMILIES => [{NAME => 'annotations', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', VERSIONS => '3', COMPRESSION => 'GZ', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'meta', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', VERSIONS => '3', COMPRESSION => 'GZ', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'src', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', VERSIONS => '3', COMPRESSION => 'GZ', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'text', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', VERSIONS => '3', COMPRESSION => 'GZ', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}} to 10.220.5.17:60020
2010-11-04 15:48:31,521 DEBUG org.apache.hadoop.hbase.master.RegionManager: Sending MSG_REGION_SPLIT REGION => {NAME => 'dbpedia,bffc513271d8d61b3510cb4780efddaa212481a0,1288205293784.ae02b30fad134ad4e07b0bad5f3e1154.', STARTKEY => 'bffc513271d8d61b3510cb4780efddaa212481a0', ENDKEY => '', ENCODED => ae02b30fad134ad4e07b0bad5f3e1154, TABLE => {{NAME => 'dbpedia', FAMILIES => [{NAME => 'annotations', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', VERSIONS => '3', COMPRESSION => 'GZ', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'meta', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', VERSIONS => '3', COMPRESSION => 'GZ', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'src', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', VERSIONS => '3', COMPRESSION => 'GZ', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'text', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', VERSIONS => '3', COMPRESSION => 'GZ', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}} to 10.220.5.17:60020
2010-11-04 15:48:31,536 DEBUG org.apache.hadoop.hbase.master.RegionManager: Sending MSG_REGION_SPLIT REGION => {NAME => 'dbpedia,,1288205174038.5ed51d83d2db52b019c9650fc08316ec.', STARTKEY => '', ENDKEY => '3fdc590551f245675682a4f1baedb59eca4c698d', ENCODED => 5ed51d83d2db52b019c9650fc08316ec, TABLE => {{NAME => 'dbpedia', FAMILIES => [{NAME => 'annotations', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', VERSIONS => '3', COMPRESSION => 'GZ', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 
CKCACHE => 'true'}]}} to 10.220.5.38:60020
2010-11-04 15:48:47,476 INFO org.apache.hadoop.hbase.master.ServerManager: 30 region servers, 0 dead, average load 63.06666666666667


Node containing a region:
-------------
2010-11-04 15:41:59,658 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_SPLIT: dbpedia,3fdc590551f245675682a4f1baedb59eca4c698d,1288205174038.26fe0f3fc176da9403ab4cd94dc56f41.
2010-11-04 15:41:59,659 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Worker: MSG_REGION_SPLIT: dbpedia,3fdc590551f245675682a4f1baedb59eca4c698d,1288205174038.26fe0f3fc176da9403ab4cd94dc56f41.
2010-11-04 15:41:59,659 DEBUG org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction requested for region dbpedia,3fdc590551f245675682a4f1baedb59eca4c698d,1288205174038.26fe0f3fc176da9403ab4cd94dc56f41. because: MSG_REGION_SPLIT
2010-11-04 15:41:59,659 INFO org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on region dbpedia,3fdc590551f245675682a4f1baedb59eca4c698d,1288205174038.26fe0f3fc176da9403ab4cd94dc56f41.
2010-11-04 15:41:59,659 DEBUG org.apache.hadoop.hbase.regionserver.Store: annotations: no store files to compact
2010-11-04 15:41:59,659 DEBUG org.apache.hadoop.hbase.regionserver.Store: meta: no store files to compact
2010-11-04 15:41:59,663 INFO org.apache.hadoop.hbase.regionserver.HRegion: compaction completed on region dbpedia,3fdc590551f245675682a4f1baedb59eca4c698d,1288205174038.26fe0f3fc176da9403ab4cd94dc56f41. in 0sec
2010-11-04 15:46:36,345 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=59.2 MB, free=739.13 MB, max=798.34 MB, blocks=830, accesses=67548, hits=0, hitRatio=0.00%%, evictions=0, evicted=0, evictedPerRun=NaN
2010-11-04 15:48:09,624 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_SPLIT: dbpedia,3fdc590551f245675682a4f1baedb59eca4c698d,1288205174038.26fe0f3fc176da9403ab4cd94dc56f41.
2010-11-04 15:48:09,624 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Worker: MSG_REGION_SPLIT: dbpedia,3fdc590551f245675682a4f1baedb59eca4c698d,1288205174038.26fe0f3fc176da9403ab4cd94dc56f41.
2010-11-04 15:48:09,624 DEBUG org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction requested for region dbpedia,3fdc590551f245675682a4f1baedb59eca4c698d,1288205174038.26fe0f3fc176da9403ab4cd94dc56f41. because: MSG_REGION_SPLIT
2010-11-04 15:48:09,624 INFO org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on region dbpedia,3fdc590551f245675682a4f1baedb59eca4c698d,1288205174038.26fe0f3fc176da9403ab4cd94dc56f41.
2010-11-04 15:48:09,624 DEBUG org.apache.hadoop.hbase.regionserver.Store: annotations: no store files to compact
2010-11-04 15:48:09,624 DEBUG org.apache.hadoop.hbase.regionserver.Store: meta: no store files to compact
2010-11-04 15:48:09,626 INFO org.apache.hadoop.hbase.regionserver.HRegion: compaction completed on region dbpedia,3fdc590551f245675682a4f1baedb59eca4c698d,1288205174038.26fe0f3fc176da9403ab4cd94dc56f41. in 0sec
2010-11-04 15:51:36,345 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=59.2 MB, free=739.13 MB, max=798.34 MB, blocks=830, accesses=67548, hits=0, hitRatio=0.00%%, evictions=0, evicted=0, evictedPerRun=NaN
2010-11-04 15:56:36,345 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=59.2 MB, free=739.13 MB, max=798.34 MB, blocks=830, accesses=67548, hits=0, hitRatio=0.00%%, evictions=0, evicted=0, evictedPerRun=NaN
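
For longer region server logs, a small script can make this pattern easier to see. The following is a rough sketch, not an HBase tool: the function name `summarize_split_attempts` is hypothetical, and it assumes the log line layout shown above (timestamp, level, logger, message). It groups each compaction requested "because: MSG_REGION_SPLIT" with the stores that then reported "no store files to compact" and with whether a completion line followed. It only summarizes what the log says and does not diagnose why a split did not occur.

```python
import re

# Assumed layout: "<date> <time>,<ms> <LEVEL> <logger>: <message>"
LOG_LINE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) (\w+) (\S+): (.*)$"
)
REQUEST_PREFIX = "Compaction requested for region "
REQUEST_SUFFIX = "because: MSG_REGION_SPLIT"


def summarize_split_attempts(lines):
    """Return one dict per split-triggered compaction found in `lines`."""
    attempts = []
    current = None  # the attempt whose follow-up lines are being collected
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if not match:
            continue
        timestamp, _level, _logger, message = match.groups()
        if message.startswith(REQUEST_PREFIX) and message.endswith(REQUEST_SUFFIX):
            region = message[len(REQUEST_PREFIX):].split(" because:")[0].strip()
            current = {
                "requested_at": timestamp,
                "region": region,
                "empty_stores": [],
                "completed": False,
            }
            attempts.append(current)
        elif current is not None and message.endswith(": no store files to compact"):
            # The store (column family) name precedes the first colon.
            current["empty_stores"].append(message.split(":")[0])
        elif current is not None and message.startswith("compaction completed on region"):
            current["completed"] = True
            current = None
    return attempts


if __name__ == "__main__":
    import sys

    with open(sys.argv[1]) as log_file:
        for attempt in summarize_split_attempts(log_file):
            print(attempt)
```

Run it with the path to a region server log as the only argument. Stores that never appear in `empty_stores` simply were not mentioned between the request and the completion line; the script draws no conclusion from that.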

-----Original Message-----
From: saint.ack@gmail.com [mailto:saint.ack@gmail.com] On Behalf Of Stack
Sent: Wednesday, November 03, 2010 8:26 PM
To: user@hbase.apache.org
Subject: Re: split not working

Try it from the shell.  Try splitting individual regions (Scan .META.
first to see list or check your UI where it lists the regions in a
table).

St.Ack


Re: split not working

Posted by Stack <st...@duboce.net>.
Try it from the shell.  Try splitting individual regions (Scan .META.
first to see list or check your UI where it lists the regions in a
table).

St.Ack
