Posted to user@hbase.apache.org by Ted Tuttle <te...@mentacapital.com> on 2014/08/28 20:19:49 UTC

state-of-the-art method for merging regions on v0.94

Hello-

We recently realized our region size is 1G and need to increase it to get our region count under control.  I've done some research on merging regions and have come away confused.

There is the ops handbook:

http://hbase.apache.org/book/ops.regionmgt.html

And then there is this horror story:

http://metabroadcast.com/blog/so-you-broke-hbase

Is there someone out there that has done a large scale (i.e. 10:1 reduction on 10k's of regions) merge successfully on HBase 0.94?  If so, how did you do it?

Thanks,
Ted


RE: state-of-the-art method for merging regions on v0.94

Posted by Ted Tuttle <te...@mentacapital.com>.
Sorry JM. We are on v0.94.16

-----Original Message-----
From: Jean-Marc Spaggiari [mailto:jean-marc@spaggiari.org] 
Sent: Thursday, August 28, 2014 11:48 AM
To: user
Subject: Re: state-of-the-art method for merging regions on v0.94

Hi Ted,

Which version of 0.94? 0.94.0? Or 0.94.22?

Here is the documentation for CopyTable. There seem to be some issues with the color, but the wording is correct...

http://hbase.apache.org/book/ops_mgt.html#copytable

JM


2014-08-28 14:43 GMT-04:00 Ted Tuttle <te...@mentacapital.com>:

> Yes, we run 0.94.
>
> Regarding your suggestion of copying data: what method/tools would you 
> suggest for this?
>
> -----Original Message-----
> From: Jean-Marc Spaggiari [mailto:jean-marc@spaggiari.org]
> Sent: Thursday, August 28, 2014 11:26 AM
> To: user
> Subject: Re: state-of-the-art method for merging regions on v0.94
>
> Yep, did it ;) (20:1 reduction, about 10K regions) You really need to 
> understand what you are doing when doing that.
>
> Offline merge will need some downtime and, depending on the version
> you use, might have some issues. You can still find workarounds, but
> again, you need to know what you are doing.
>
> Another option is to copy your data into another table where regions 
> are bigger...
>
> What HBase version do you run? 0.94.?
>
> JM
>
>
> 2014-08-28 14:19 GMT-04:00 Ted Tuttle <te...@mentacapital.com>:
>
> > Hello-
> >
> > We recently realized our region size is 1G and need to increase it 
> > to get our region count under control.  I've done some research on 
> > merging regions and have come away confused.
> >
> > There is the ops handbook:
> >
> > http://hbase.apache.org/book/ops.regionmgt.html
> >
> > And then there is this horror story:
> >
> > http://metabroadcast.com/blog/so-you-broke-hbase
> >
> > Is there someone out there that has done a large scale (i.e. 10:1 
> > reduction on 10k's of regions) merge successfully on HBase 0.94?  If 
> > so, how did you do it?
> >
> > Thanks,
> > Ted
> >
> >
>

Re: state-of-the-art method for merging regions on v0.94

Posted by Jean-Marc Spaggiari <je...@spaggiari.org>.
Hi Ted,

Which version of 0.94? 0.94.0? Or 0.94.22?

Here is the documentation for CopyTable. There seem to be some issues
with the color, but the wording is correct...

http://hbase.apache.org/book/ops_mgt.html#copytable

JM
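
A minimal sketch of how the CopyTable route suggested above might be driven, assuming the target table has already been created with a larger region size. The table names `events` and `events_bigregions` are hypothetical; `--new.name` is the CopyTable option for writing into a differently named table. The helper only builds and prints the command line, it does not launch the job.

```python
import shlex

COPYTABLE_CLASS = "org.apache.hadoop.hbase.mapreduce.CopyTable"


def copytable_command(source_table, target_table):
    """Build the argv for a same-cluster CopyTable job into a new table name."""
    if source_table == target_table:
        raise ValueError("source and target tables must differ")
    return ["hbase", COPYTABLE_CLASS, f"--new.name={target_table}", source_table]


# Hypothetical table names, for illustration only.
argv = copytable_command("events", "events_bigregions")
print(shlex.join(argv))
```

The target table has to exist before the job starts, so this is the point where the bigger region size would be set on the new table.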


2014-08-28 14:43 GMT-04:00 Ted Tuttle <te...@mentacapital.com>:

> Yes, we run 0.94.
>
> Regarding your suggestion of copying data: what method/tools would you
> suggest for this?
>
> -----Original Message-----
> From: Jean-Marc Spaggiari [mailto:jean-marc@spaggiari.org]
> Sent: Thursday, August 28, 2014 11:26 AM
> To: user
> Subject: Re: state-of-the-art method for merging regions on v0.94
>
> Yep, did it ;) (20:1 reduction, about 10K regions) You really need to
> understand what you are doing when doing that.
>
> Offline merge will need some downtime and, depending on the version you
> use, might have some issues. You can still find workarounds, but again,
> you need to know what you are doing.
>
> Another option is to copy your data into another table where regions are
> bigger...
>
> What HBase version do you run? 0.94.?
>
> JM
>
>
> 2014-08-28 14:19 GMT-04:00 Ted Tuttle <te...@mentacapital.com>:
>
> > Hello-
> >
> > We recently realized our region size is 1G and need to increase it to
> > get our region count under control.  I've done some research on
> > merging regions and have come away confused.
> >
> > There is the ops handbook:
> >
> > http://hbase.apache.org/book/ops.regionmgt.html
> >
> > And then there is this horror story:
> >
> > http://metabroadcast.com/blog/so-you-broke-hbase
> >
> > Is there someone out there that has done a large scale (i.e. 10:1
> > reduction on 10k's of regions) merge successfully on HBase 0.94?  If
> > so, how did you do it?
> >
> > Thanks,
> > Ted
> >
> >
>

RE: state-of-the-art method for merging regions on v0.94

Posted by Ted Tuttle <te...@mentacapital.com>.
Yes, we run 0.94.

Regarding your suggestion of copying data: what method/tools would you suggest for this?

-----Original Message-----
From: Jean-Marc Spaggiari [mailto:jean-marc@spaggiari.org] 
Sent: Thursday, August 28, 2014 11:26 AM
To: user
Subject: Re: state-of-the-art method for merging regions on v0.94

Yep, did it ;) (20:1 reduction, about 10K regions) You really need to understand what you are doing when doing that.

Offline merge will need some downtime and, depending on the version you use, might have some issues. You can still find workarounds, but again, you need to know what you are doing.

Another option is to copy your data into another table where regions are bigger...

What HBase version do you run? 0.94.?

JM


2014-08-28 14:19 GMT-04:00 Ted Tuttle <te...@mentacapital.com>:

> Hello-
>
> We recently realized our region size is 1G and need to increase it to 
> get our region count under control.  I've done some research on 
> merging regions and have come away confused.
>
> There is the ops handbook:
>
> http://hbase.apache.org/book/ops.regionmgt.html
>
> And then there is this horror story:
>
> http://metabroadcast.com/blog/so-you-broke-hbase
>
> Is there someone out there that has done a large scale (i.e. 10:1 
> reduction on 10k's of regions) merge successfully on HBase 0.94?  If 
> so, how did you do it?
>
> Thanks,
> Ted
>
>

Re: state-of-the-art method for merging regions on v0.94

Posted by Jean-Marc Spaggiari <je...@spaggiari.org>.
Yep, did it ;) (20:1 reduction, about 10K regions) You really need to
understand what you are doing when doing that.

Offline merge will need some downtime and, depending on the version you
use, might have some issues. You can still find workarounds, but again,
you need to know what you are doing.

Another option is to copy your data into another table where regions are
bigger...

What HBase version do you run? 0.94.?

JM


2014-08-28 14:19 GMT-04:00 Ted Tuttle <te...@mentacapital.com>:

> Hello-
>
> We recently realized our region size is 1G and need to increase it to get
> our region count under control.  I've done some research on merging regions
> and have come away confused.
>
> There is the ops handbook:
>
> http://hbase.apache.org/book/ops.regionmgt.html
>
> And then there is this horror story:
>
> http://metabroadcast.com/blog/so-you-broke-hbase
>
> Is there someone out there that has done a large scale (i.e. 10:1
> reduction on 10k's of regions) merge successfully on HBase 0.94?  If so,
> how did you do it?
>
> Thanks,
> Ted
>
>

Re: state-of-the-art method for merging regions on v0.94

Posted by Shahab Yunus <sh...@gmail.com>.
Great, thanks. I just wanted to confirm my understanding that 0.98 is
indeed OK.

Regards,
Shahab


On Thu, Aug 28, 2014 at 2:35 PM, Jean-Marc Spaggiari <
jean-marc@spaggiari.org> wrote:

> Hi Shahab,
>
> Ted talked about 0.94. In 0.98 there are no (known) issues with the online
> merge.
>
> JM
>
>
> 2014-08-28 14:33 GMT-04:00 Shahab Yunus <sh...@gmail.com>:
>
> > I have a question here. Is the merge_region command in 0.98, which can be
> > run through the HBase shell, not reliable even if we simply want to merge
> > 2 regions at a time? I thought it was the older Merge tool that was not safe.
> >
> > Thanks,
> > Shahab
> >
> >
> > On Thu, Aug 28, 2014 at 2:26 PM, Bryan Beaudreault <
> > bbeaudreault@hubspot.com
> > > wrote:
> >
> > > I've done it.  This is the code I used:
> > > https://gist.github.com/bbeaudreault/7567385
> > >
> > > It comes from the hbase source, but is modified to actually work (the
> > > class provided in hbase is private and does not work out of the box).
> > > There is a readme at the bottom of the gist with my process.  One
> > > important note though, I did this with a deep understanding (after hours
> > > of reading hbase code and doing tests on a test cluster) of how it all
> > > works.  And even then I felt nervous to do it in prod.  Hence why I went
> > > the snapshot/compact route.
> > >
> > > I would definitely test it on a test cluster and get some familiarity
> > > before getting close to a production table.  That said, I've run this on
> > > 8-10 production tables a few months ago, reducing in size from 10-20x in
> > > some cases.
> > >
> > >
> > > On Thu, Aug 28, 2014 at 2:19 PM, Ted Tuttle <te...@mentacapital.com>
> > wrote:
> > >
> > > > Hello-
> > > >
> > > > We recently realized our region size is 1G and need to increase it to
> > > > get our region count under control.  I've done some research on
> > > > merging regions and have come away confused.
> > > >
> > > > There is the ops handbook:
> > > >
> > > > http://hbase.apache.org/book/ops.regionmgt.html
> > > >
> > > > And then there is this horror story:
> > > >
> > > > http://metabroadcast.com/blog/so-you-broke-hbase
> > > >
> > > > Is there someone out there that has done a large scale (i.e. 10:1
> > > > reduction on 10k's of regions) merge successfully on HBase 0.94?  If
> > > > so, how did you do it?
> > > >
> > > > Thanks,
> > > > Ted
> > > >
> > > >
> > >
> >
>

Re: state-of-the-art method for merging regions on v0.94

Posted by Jean-Marc Spaggiari <je...@spaggiari.org>.
Hi Shahab,

Ted talked about 0.94. In 0.98 there are no (known) issues with the online
merge.

JM


2014-08-28 14:33 GMT-04:00 Shahab Yunus <sh...@gmail.com>:

> I have a question here. Is the merge_region command in 0.98, which can be
> run through the HBase shell, not reliable even if we simply want to merge
> 2 regions at a time? I thought it was the older Merge tool that was not safe.
>
> Thanks,
> Shahab
>
>
> On Thu, Aug 28, 2014 at 2:26 PM, Bryan Beaudreault <
> bbeaudreault@hubspot.com
> > wrote:
>
> > I've done it.  This is the code I used:
> > https://gist.github.com/bbeaudreault/7567385
> >
> > It comes from the hbase source, but is modified to actually work (the
> > class provided in hbase is private and does not work out of the box).
> > There is a readme at the bottom of the gist with my process.  One
> > important note though, I did this with a deep understanding (after hours
> > of reading hbase code and doing tests on a test cluster) of how it all
> > works.  And even then I felt nervous to do it in prod.  Hence why I went
> > the snapshot/compact route.
> >
> > I would definitely test it on a test cluster and get some familiarity
> > before getting close to a production table.  That said, I've run this on
> > 8-10 production tables a few months ago, reducing in size from 10-20x in
> > some cases.
> >
> >
> > On Thu, Aug 28, 2014 at 2:19 PM, Ted Tuttle <te...@mentacapital.com>
> wrote:
> >
> > > Hello-
> > >
> > > We recently realized our region size is 1G and need to increase it to
> > > get our region count under control.  I've done some research on
> > > merging regions and have come away confused.
> > >
> > > There is the ops handbook:
> > >
> > > http://hbase.apache.org/book/ops.regionmgt.html
> > >
> > > And then there is this horror story:
> > >
> > > http://metabroadcast.com/blog/so-you-broke-hbase
> > >
> > > Is there someone out there that has done a large scale (i.e. 10:1
> > > reduction on 10k's of regions) merge successfully on HBase 0.94?  If
> > > so, how did you do it?
> > >
> > > Thanks,
> > > Ted
> > >
> > >
> >
>

Re: state-of-the-art method for merging regions on v0.94

Posted by Shahab Yunus <sh...@gmail.com>.
I have a question here. Is the merge_region command in 0.98, which can be
run through the HBase shell, not reliable even if we simply want to merge
2 regions at a time? I thought it was the older Merge tool that was not safe.

Thanks,
Shahab
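
To make the "2 regions at a time" point concrete, here is a rough sketch that turns a list of regions in key order into one round of `merge_region` shell commands. The encoded region names are made up. The sketch assumes that merges pair up neighbouring regions and that a merged region gets a new encoded name, so the region list would have to be re-read before any further round.

```python
def pairwise_merge_commands(encoded_names):
    """Return HBase shell commands for one round of pairwise merges.

    encoded_names must be listed in region key order so that each pair is
    adjacent. With an odd count, the last region is left unmerged.
    """
    commands = []
    for i in range(0, len(encoded_names) - 1, 2):
        first, second = encoded_names[i], encoded_names[i + 1]
        commands.append(f"merge_region '{first}', '{second}'")
    return commands


def merges_required(current_regions, target_regions):
    """Each pairwise merge removes exactly one region from the table."""
    if not 1 <= target_regions <= current_regions:
        raise ValueError("target must be between 1 and the current count")
    return current_regions - target_regions


# Hypothetical encoded region names, in key order.
regions = ["aaa111", "bbb222", "ccc333", "ddd444", "eee555"]
for command in pairwise_merge_commands(regions):
    print(command)

# A 10:1 reduction of 10,000 regions, as in the original question.
print(merges_required(10_000, 1_000), "pairwise merges needed")
```

At that scale the commands would be generated and run by a script rather than typed by hand.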


On Thu, Aug 28, 2014 at 2:26 PM, Bryan Beaudreault <bbeaudreault@hubspot.com
> wrote:

> I've done it.  This is the code I used:
> https://gist.github.com/bbeaudreault/7567385
>
> It comes from the hbase source, but is modified to actually work (the class
> provided in hbase is private and does not work out of the box). There is a
> readme at the bottom of the gist with my process.  One important note
> though, I did this with a deep understanding (after hours of reading hbase
> code and doing tests on a test cluster) of how it all works.  And even then
> I felt nervous to do it in prod.  Hence why I went the snapshot/compact
> route.
>
> I would definitely test it on a test cluster and get some familiarity
> before getting close to a production table.  That said, I've run this on
> 8-10 production tables a few months ago, reducing in size from 10-20x in
> some cases.
>
>
> On Thu, Aug 28, 2014 at 2:19 PM, Ted Tuttle <te...@mentacapital.com> wrote:
>
> > Hello-
> >
> > We recently realized our region size is 1G and need to increase it to get
> > our region count under control.  I've done some research on merging
> > regions and have come away confused.
> >
> > There is the ops handbook:
> >
> > http://hbase.apache.org/book/ops.regionmgt.html
> >
> > And then there is this horror story:
> >
> > http://metabroadcast.com/blog/so-you-broke-hbase
> >
> > Is there someone out there that has done a large scale (i.e. 10:1
> > reduction on 10k's of regions) merge successfully on HBase 0.94?  If so,
> > how did you do it?
> >
> > Thanks,
> > Ted
> >
> >
>

Re: state-of-the-art method for merging regions on v0.94

Posted by Bryan Beaudreault <bb...@hubspot.com>.
Lars, so it worked for me, and I'm more than happy for anyone to use/adapt
it as necessary for hbase proper.  But I'm not sure it's anywhere near
production ready, and I don't have the time to work on it more right now.
 Perhaps someone with more knowledge of region internals could vet it and
add relevant tests.  We could enter a JIRA, and if I find time in the
future I can take a look.

And yes, @JM, my gist was specific to an online migration (cluster is
active, but table is disabled).  Offline did not meet our requirements at
the time, so I never tried it.



On Thu, Aug 28, 2014 at 5:04 PM, lars hofhansl <la...@apache.org> wrote:

> Agreed.
>
> Bryan, we should pull in your code if that works better.
>
> -- Lars
>
>
>
> ________________________________
>  From: Andrew Purtell <ap...@apache.org>
> To: "user@hbase.apache.org" <us...@hbase.apache.org>
> Cc: Development <De...@mentacapital.com>
> Sent: Thursday, August 28, 2014 12:12 PM
> Subject: Re: state-of-the-art method for merging regions on v0.94
>
>
> If the 0.94 merge code doesn't work out of the box we should fix that.
>
>
>
>
>
> On Thu, Aug 28, 2014 at 11:26 AM, Bryan Beaudreault <
> bbeaudreault@hubspot.com> wrote:
>
> > I've done it.  This is the code I used:
> > https://gist.github.com/bbeaudreault/7567385
> >
> > It comes from the hbase source, but is modified to actually work (the
> > class provided in hbase is private and does not work out of the box).
> > There is a readme at the bottom of the gist with my process.  One
> > important note though, I did this with a deep understanding (after hours
> > of reading hbase code and doing tests on a test cluster) of how it all
> > works.  And even then I felt nervous to do it in prod.  Hence why I went
> > the snapshot/compact route.
> >
> > I would definitely test it on a test cluster and get some familiarity
> > before getting close to a production table.  That said, I've run this on
> > 8-10 production tables a few months ago, reducing in size from 10-20x in
> > some cases.
> >
> >
> > On Thu, Aug 28, 2014 at 2:19 PM, Ted Tuttle <te...@mentacapital.com>
> wrote:
> >
> > > Hello-
> > >
> > > We recently realized our region size is 1G and need to increase it to
> > > get our region count under control.  I've done some research on
> > > merging regions and have come away confused.
> > >
> > > There is the ops handbook:
> > >
> > > http://hbase.apache.org/book/ops.regionmgt.html
> > >
> > > And then there is this horror story:
> > >
> > > http://metabroadcast.com/blog/so-you-broke-hbase
> > >
> > > Is there someone out there that has done a large scale (i.e. 10:1
> > > reduction on 10k's of regions) merge successfully on HBase 0.94?  If
> > > so, how did you do it?
> > >
> > > Thanks,
> > > Ted
> > >
> > >
> >
>
>
>
> --
> Best regards,
>
>    - Andy
>
> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> (via Tom White)
>

Re: state-of-the-art method for merging regions on v0.94

Posted by lars hofhansl <la...@apache.org>.
Agreed.

Bryan, we should pull in your code if that works better.

-- Lars



________________________________
 From: Andrew Purtell <ap...@apache.org>
To: "user@hbase.apache.org" <us...@hbase.apache.org> 
Cc: Development <De...@mentacapital.com> 
Sent: Thursday, August 28, 2014 12:12 PM
Subject: Re: state-of-the-art method for merging regions on v0.94
 

If the 0.94 merge code doesn't work out of the box we should fix that.





On Thu, Aug 28, 2014 at 11:26 AM, Bryan Beaudreault <
bbeaudreault@hubspot.com> wrote:

> I've done it.  This is the code I used:
> https://gist.github.com/bbeaudreault/7567385
>
> It comes from the hbase source, but is modified to actually work (the class
> provided in hbase is private and does not work out of the box). There is a
> readme at the bottom of the gist with my process.  One important note
> though, I did this with a deep understanding (after hours of reading hbase
> code and doing tests on a test cluster) of how it all works.  And even then
> I felt nervous to do it in prod.  Hence why I went the snapshot/compact
> route.
>
> I would definitely test it on a test cluster and get some familiarity
> before getting close to a production table.  That said, I've run this on
> 8-10 production tables a few months ago, reducing in size from 10-20x in
> some cases.
>
>
> On Thu, Aug 28, 2014 at 2:19 PM, Ted Tuttle <te...@mentacapital.com> wrote:
>
> > Hello-
> >
> > We recently realized our region size is 1G and need to increase it to get
> > our region count under control.  I've done some research on merging
> > regions and have come away confused.
> >
> > There is the ops handbook:
> >
> > http://hbase.apache.org/book/ops.regionmgt.html
> >
> > And then there is this horror story:
> >
> > http://metabroadcast.com/blog/so-you-broke-hbase
> >
> > Is there someone out there that has done a large scale (i.e. 10:1
> > reduction on 10k's of regions) merge successfully on HBase 0.94?  If so,
> > how did you do it?
> >
> > Thanks,
> > Ted
> >
> >
>



-- 
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)

Re: state-of-the-art method for merging regions on v0.94

Posted by Jean-Marc Spaggiari <je...@spaggiari.org>.
Oh, I just checked and HBASE-1212 seems to have been backported into 0.94
too... I missed that. So offline merges should be fine in that branch
too...


2014-08-28 15:12 GMT-04:00 Andrew Purtell <ap...@apache.org>:

> If the 0.94 merge code doesn't work out of the box we should fix that.
>
>
> On Thu, Aug 28, 2014 at 11:26 AM, Bryan Beaudreault <
> bbeaudreault@hubspot.com> wrote:
>
> > I've done it.  This is the code I used:
> > https://gist.github.com/bbeaudreault/7567385
> >
> > It comes from the hbase source, but is modified to actually work (the
> > class provided in hbase is private and does not work out of the box).
> > There is a readme at the bottom of the gist with my process.  One
> > important note though, I did this with a deep understanding (after hours
> > of reading hbase code and doing tests on a test cluster) of how it all
> > works.  And even then I felt nervous to do it in prod.  Hence why I went
> > the snapshot/compact route.
> >
> > I would definitely test it on a test cluster and get some familiarity
> > before getting close to a production table.  That said, I've run this on
> > 8-10 production tables a few months ago, reducing in size from 10-20x in
> > some cases.
> >
> >
> > On Thu, Aug 28, 2014 at 2:19 PM, Ted Tuttle <te...@mentacapital.com>
> wrote:
> >
> > > Hello-
> > >
> > > We recently realized our region size is 1G and need to increase it to
> > > get our region count under control.  I've done some research on
> > > merging regions and have come away confused.
> > >
> > > There is the ops handbook:
> > >
> > > http://hbase.apache.org/book/ops.regionmgt.html
> > >
> > > And then there is this horror story:
> > >
> > > http://metabroadcast.com/blog/so-you-broke-hbase
> > >
> > > Is there someone out there that has done a large scale (i.e. 10:1
> > > reduction on 10k's of regions) merge successfully on HBase 0.94?  If
> > > so, how did you do it?
> > >
> > > Thanks,
> > > Ted
> > >
> > >
> >
>
>
>
> --
> Best regards,
>
>    - Andy
>
> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> (via Tom White)
>

Re: state-of-the-art method for merging regions on v0.94

Posted by Andrew Purtell <ap...@apache.org>.
If the 0.94 merge code doesn't work out of the box we should fix that.


On Thu, Aug 28, 2014 at 11:26 AM, Bryan Beaudreault <
bbeaudreault@hubspot.com> wrote:

> I've done it.  This is the code I used:
> https://gist.github.com/bbeaudreault/7567385
>
> It comes from the hbase source, but is modified to actually work (the class
> provided in hbase is private and does not work out of the box). There is a
> readme at the bottom of the gist with my process.  One important note
> though, I did this with a deep understanding (after hours of reading hbase
> code and doing tests on a test cluster) of how it all works.  And even then
> I felt nervous to do it in prod.  Hence why I went the snapshot/compact
> route.
>
> I would definitely test it on a test cluster and get some familiarity
> before getting close to a production table.  That said, I've run this on
> 8-10 production tables a few months ago, reducing in size from 10-20x in
> some cases.
>
>
> On Thu, Aug 28, 2014 at 2:19 PM, Ted Tuttle <te...@mentacapital.com> wrote:
>
> > Hello-
> >
> > We recently realized our region size is 1G and need to increase it to get
> > our region count under control.  I've done some research on merging
> > regions and have come away confused.
> >
> > There is the ops handbook:
> >
> > http://hbase.apache.org/book/ops.regionmgt.html
> >
> > And then there is this horror story:
> >
> > http://metabroadcast.com/blog/so-you-broke-hbase
> >
> > Is there someone out there that has done a large scale (i.e. 10:1
> > reduction on 10k's of regions) merge successfully on HBase 0.94?  If so,
> > how did you do it?
> >
> > Thanks,
> > Ted
> >
> >
>



-- 
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)

Re: state-of-the-art method for merging regions on v0.94

Posted by Bryan Beaudreault <bb...@hubspot.com>.
I've done it.  This is the code I used:
https://gist.github.com/bbeaudreault/7567385

It comes from the hbase source, but is modified to actually work (the class
provided in hbase is private and does not work out of the box). There is a
readme at the bottom of the gist with my process.  One important note
though, I did this with a deep understanding (after hours of reading hbase
code and doing tests on a test cluster) of how it all works.  And even then
I felt nervous to do it in prod.  Hence why I went the snapshot/compact
route.

I would definitely test it on a test cluster and get some familiarity
before getting close to a production table.  That said, I've run this on
8-10 production tables a few months ago, reducing in size from 10-20x in
some cases.
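
For planning a reduction like the 10-20x described above, one possible approach is to group neighbouring regions into runs and merge each run down to a single region. The sketch below only produces such a plan. It is not taken from the gist, the start keys are invented, and `ratio` is a hypothetical parameter for the desired reduction factor.

```python
def plan_merge_groups(start_keys, ratio):
    """Split region start keys (bytes) into runs of up to `ratio` neighbours.

    Each returned run is a candidate set of adjacent regions to merge into one.
    """
    if ratio < 1:
        raise ValueError("ratio must be at least 1")
    ordered = sorted(start_keys)  # the first region's empty start key sorts first
    return [ordered[i:i + ratio] for i in range(0, len(ordered), ratio)]


# Hypothetical table with 25 regions; the first region has an empty start key.
keys = [b""] + [f"row{n:04d}".encode() for n in range(1, 25)]
groups = plan_merge_groups(keys, 10)
for group in groups:
    print(len(group), "regions, starting at", group[0])
```

The first key of each run (other than the empty one) could also serve as a split point when pre-creating a table with larger regions for a copy-based migration.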


On Thu, Aug 28, 2014 at 2:19 PM, Ted Tuttle <te...@mentacapital.com> wrote:

> Hello-
>
> We recently realized our region size is 1G and need to increase it to get
> our region count under control.  I've done some research on merging regions
> and have come away confused.
>
> There is the ops handbook:
>
> http://hbase.apache.org/book/ops.regionmgt.html
>
> And then there is this horror story:
>
> http://metabroadcast.com/blog/so-you-broke-hbase
>
> Is there someone out there that has done a large scale (i.e. 10:1
> reduction on 10k's of regions) merge successfully on HBase 0.94?  If so,
> how did you do it?
>
> Thanks,
> Ted
>
>

Re: state-of-the-art method for merging regions on v0.94

Posted by lars hofhansl <la...@apache.org>.
Hey Ted!


How many regions (per region server) do you have on average?
If it's not too bad you might just be able to increase hbase.hregion.max.filesize to 10 or 20g and bounce all the region servers.
Then as you write more data you will fill up the existing regions.

"Too bad" is fuzzy. If you approach hundreds of regions per region server you likely have a problem, depending on your read/write patterns.


-- Lars
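
A back-of-the-envelope sketch of the sizing reasoning above. The data volume and region server count are hypothetical, and the function assumes the idealised case in which every region is filled up to the configured maximum. Raising hbase.hregion.max.filesize does not merge existing regions by itself; as noted above, they only fill up as more data is written.

```python
import math

GIB = 1024 ** 3


def estimate_regions(total_bytes, max_filesize_bytes, region_servers):
    """Return (region count, regions per server) if all regions were full."""
    regions = max(1, math.ceil(total_bytes / max_filesize_bytes))
    return regions, regions / region_servers


# Hypothetical cluster: 20 TiB of data spread over 40 region servers.
total = 20 * 1024 * GIB
for size_gib in (1, 10, 20):
    regions, per_server = estimate_regions(total, size_gib * GIB, 40)
    print(f"{size_gib:>2} GiB regions: {regions} regions, about {per_server:.0f} per server")
```

With these made-up numbers, 1 GiB regions land in the "hundreds of regions per region server" range, while 10 or 20 GiB regions stay well below it.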



________________________________
 From: Ted Tuttle <te...@mentacapital.com>
To: "user@hbase.apache.org" <us...@hbase.apache.org> 
Cc: Development <De...@mentacapital.com> 
Sent: Thursday, August 28, 2014 11:19 AM
Subject: state-of-the-art method for merging regions on v0.94
 

Hello-

We recently realized our region size is 1G and need to increase it to get our region count under control.  I've done some research on merging regions and have come away confused.

There is the ops handbook:

http://hbase.apache.org/book/ops.regionmgt.html

And then there is this horror story:

http://metabroadcast.com/blog/so-you-broke-hbase

Is there someone out there that has done a large scale (i.e. 10:1 reduction on 10k's of regions) merge successfully on HBase 0.94?  If so, how did you do it?

Thanks,
Ted