Posted to dev@hbase.apache.org by Ionut Ignatescu <io...@gmail.com> on 2012/07/14 16:40:19 UTC

How to merge regions?

Hi all,

 

My use case: I have several tables whose keys start with a timestamp. These
tables also have data retention set to 30 days.

Table size is around 1 TB (3 TB replicated) and data is inserted regularly
(~200 MB every 5 minutes).

File size is set to 1 GB. These tables have been in use for almost half a
year, and one table now has around 6k partitions, 40% of which are empty.

The problem: the number of regions per region server is now pretty high.

 

Questions: 

Which approach is better?

- to merge adjacent empty partitions into a bigger one?

- to merge empty partitions into non-empty partitions?

Also, I'm wondering why region merging is not part of major compactions and
why it's necessary to stop the entire fleet to solve this problem.


Re: How to merge regions?

Posted by Stack <st...@duboce.net>.
On Sat, Jul 14, 2012 at 4:40 PM, Ionut Ignatescu
<io...@gmail.com> wrote:
> - to merge adjacent empty partitions in a bigger one?
>
> - to merge empty partitions to non-empty partitions?
>
> Also, I'm wondering why regions merge is not part of major compactions and
> why it's necessary to stop the
>
> entire fleet to solve this problem.
>

It's something that should have been done a long time ago, but none of us
has taken it on properly.  J-D did an online merge hackup script,
attached to the online merge issue, that worked for our purposes and
helped out some others, but beyond that, online merge needs loving.

It should be easier in your case given 40% of the regions are empty.
Are you ok w/ a bit of scripting editing the .META. table?  Are all
40% on the end of the table (given they are aged out)?  Can you just
cut the empty tail off the table by deleting all empty regions from
.META. (and from HDFS) off the end and then just add back a single
region that has a start key of the last non-empty region and an end key
of the empty row to put back a scan stopper?

St.Ack
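
The tail-trimming idea above can be sketched as a planning step. This is only
an illustration in plain Python: it does not talk to HBase, .META., or HDFS,
and the Region class and plan_tail_trim function are hypothetical names made
up for this sketch. It assumes the region list is sorted by start key and
contiguous, and it reads "a start key of the last non-empty region" as the
point where the last non-empty region ends, i.e. the start key of the first
empty tail region. The empty byte string stands for the empty row.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Region:
    """Hypothetical stand-in for one row of region info; not an HBase class."""
    start_key: bytes  # inclusive; b"" means start of table
    end_key: bytes    # exclusive; b"" means end of table (the empty row)
    is_empty: bool


def plan_tail_trim(regions: List[Region]) -> Tuple[List[Region], Optional[Region]]:
    """Return (regions_to_delete, replacement_region) for the empty tail.

    `regions` is assumed to be sorted by start key and contiguous.
    Returns ([], None) when fewer than two empty regions sit at the end,
    since replacing a single region with a single region gains nothing.
    """
    # Walk backwards over the trailing run of empty regions.
    first_tail = len(regions)
    while first_tail > 0 and regions[first_tail - 1].is_empty:
        first_tail -= 1

    tail = regions[first_tail:]
    if len(tail) < 2:
        return [], None

    # One region covering the whole former tail: it starts where the
    # last non-empty region ends and runs to the end of the table.
    replacement = Region(start_key=tail[0].start_key, end_key=b"", is_empty=True)
    return tail, replacement


if __name__ == "__main__":
    table = [
        Region(b"", b"2012-05", False),
        Region(b"2012-05", b"2012-06", False),
        Region(b"2012-06", b"2012-07", True),
        Region(b"2012-07", b"", True),
    ]
    to_delete, replacement = plan_tail_trim(table)
    print(len(to_delete), replacement)
```

The plan only says which regions would go and what the single replacement
would look like; actually applying it to .META. and HDFS is the scripting
part mentioned above and is not shown here.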