Posted to user@hbase.apache.org by Ramon Wang <ra...@appannie.com> on 2014/02/21 10:50:11 UTC

When should we trigger a major compaction?

Hi Guys

We disabled automatic major compaction in our HBase cluster, so we want
something that can tell us when we should run a major compaction. We were
thinking of using the number of StoreFiles, but we cannot find any API or
JMX metric that exposes it. Please share some ideas on how to do this.
Thanks in advance.

Cheers
Ramon

Re: When should we trigger a major compaction?

Posted by Jeremy Carroll <ph...@gmail.com>.
When store files start to get very large, a major compaction will shrink the HFiles by removing tombstones and expired records. If request latency is within your SLA and the store file size is not gigantic, you do not need to do this.

IMHO minor compactions are the best tool for limiting the maximum number of store files read in a request.
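
The rule of thumb above can be written down as a small check. This is only a sketch: `StoreStats`, `needs_major_compaction`, and both thresholds are hypothetical names and values chosen for illustration, and the latency and file-size numbers are assumed to come from whatever monitoring you already have, not from a specific HBase API.

```python
from dataclasses import dataclass


@dataclass
class StoreStats:
    """Hypothetical snapshot taken from your own monitoring."""
    p99_latency_ms: float         # observed request latency
    largest_storefile_bytes: int  # size of the biggest store file


def needs_major_compaction(stats, sla_latency_ms, max_storefile_bytes):
    """Return True unless latency is within SLA AND files are not oversized."""
    within_sla = stats.p99_latency_ms <= sla_latency_ms
    files_reasonable = stats.largest_storefile_bytes <= max_storefile_bytes
    return not (within_sla and files_reasonable)


GIB = 2 ** 30
healthy = StoreStats(p99_latency_ms=20.0, largest_storefile_bytes=5 * GIB)
slow = StoreStats(p99_latency_ms=80.0, largest_storefile_bytes=5 * GIB)

# Assumed thresholds: 50 ms SLA, 10 GiB as the "gigantic" cut-off.
print(needs_major_compaction(healthy, 50.0, 10 * GIB))
print(needs_major_compaction(slow, 50.0, 10 * GIB))
```

Both thresholds are workload-specific, so treat the numbers as placeholders to tune rather than recommendations.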



Re: When should we trigger a major compaction?

Posted by Ramon Wang <ra...@appannie.com>.
Guys, I'm clear about what compaction does. I just want to know whether
there are any indicators or metrics that can help us decide when to start
a major compaction manually.

Since there will be a single StoreFile per Store after a major compaction,
the number of StoreFiles may be one such indicator, but I'm not sure about this.
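
As a rough sketch of that idea: if you can collect a StoreFile count for each Store, anything well above 1 has drifted away from the post-compaction state. The function name, the input dictionary, and the threshold of 3 below are all hypothetical; how the counts are collected is deliberately left out.

```python
def stores_needing_compaction(storefile_counts, max_files_per_store=3):
    """Return the names of Stores whose StoreFile count exceeds the threshold.

    storefile_counts: dict mapping a Store name to its current number of
    StoreFiles (assumed to be gathered by your own tooling).
    """
    return sorted(
        name
        for name, count in storefile_counts.items()
        if count > max_files_per_store
    )


# Hypothetical counts; a freshly major-compacted Store would report 1.
counts = {"t1/cf1": 1, "t1/cf2": 7, "t2/cf1": 3, "t2/cf2": 12}
print(stores_needing_compaction(counts))
```

A count alone says nothing about how many tombstones or expired cells the files hold, so it is probably best used together with other signals.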

Thanks
Ramon



Re: When should we trigger a major compaction?

Posted by Ted Yu <yu...@gmail.com>.
Ramon:
See http://hbase.apache.org/book.html#compaction for a description of
compaction file selection.

Cheers



Re: When should we trigger a major compaction?

Posted by yonghu <yo...@gmail.com>.
Before triggering a major compaction, let's first explain why we need
major compaction. A major compaction will
1. delete the data which is masked by tombstones;
2. delete the data whose TTL has expired;
3. compact several small HFiles into a single larger one.

I didn't quite understand what you mean by "use num of storefiles". Why
this?
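
To make the three effects concrete, here is a toy model, not HBase's actual implementation. It assumes a simplified cell of `(row, timestamp, value)` where a value of `None` stands for a tombstone that masks every version of that row at or before its timestamp; real HBase deletes are scoped to columns, families, and versions. The function name and the cell layout are made up for illustration.

```python
def major_compact(files, now, ttl):
    """Merge all files into one, dropping masked, expired, and tombstone cells."""
    cells = [cell for f in files for cell in f]

    # Newest tombstone timestamp seen for each row.
    deleted_at = {}
    for row, ts, value in cells:
        if value is None:
            deleted_at[row] = max(ts, deleted_at.get(row, ts))

    kept = []
    for row, ts, value in cells:
        if value is None:
            continue  # the tombstone itself is not carried over
        if ts <= deleted_at.get(row, float("-inf")):
            continue  # 1. masked by a tombstone
        if now - ts > ttl:
            continue  # 2. TTL has expired
        kept.append((row, ts, value))

    return sorted(kept)  # 3. a single sorted output file


files = [
    [("a", 1, "x"), ("b", 2, "y")],
    [("a", 3, None), ("c", 90, "z")],
    [("b", 95, "w")],
]
print(major_compact(files, now=100, ttl=50))
# prints [('b', 95, 'w'), ('c', 90, 'z')]
```

Three input files become one: `a@1` is masked by the tombstone at timestamp 3, `b@2` is older than the TTL, and the tombstone itself is discarded.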

