Posted to user@hbase.apache.org by yonghu <yo...@gmail.com> on 2012/01/26 14:43:59 UTC

the occasion of the major compact?

Hello,

I read this blog post: http://outerthought.org/blog/465-ot.html. It mentions
that a major compaction occurs every 24 hours. My question is whether
any other conditions can trigger a major compaction. For example, when
the size of a store file reaches a threshold (I think this causes a
minor compaction or a region split, not a major compaction, but I am
not quite sure).

Thanks!

Yong

Re: the occasion of the major compact?

Posted by yonghu <yo...@gmail.com>.
Yes.

I have already used the approach suggested by Nicolas.

As for the approach suggested by Lars, exporting the content of the table, I
am not sure it's a good idea. As I can't control the compactions,
the completeness of the data cannot be guaranteed. That is, if compactions
happen between two export operations, the deleted
data will be lost. BTW, if I understand Lars's description correctly,
the deleted data might also be removed during a minor
compaction!

Thanks

Yong


On Thu, Jan 26, 2012 at 11:52 PM, lars hofhansl <lh...@yahoo.com> wrote:
> If you are planning to use trunk (what will be 0.94) you can also enable KEEP_DELETED_CELLS for your column families.
> That will keep deleted cells around (until they get removed because of # of versions, or TTL).
>
> Also note that version # and TTL checks are also performed during minor compactions and even during memstore flushes, and hence cells might be removed on those occasions as well.
>
> If you have time and space, you can also back up your tables into text files (using export) and crunch them there (I added support for HBASE-4536 in export as well).
>
> -- Lars
>
>
> ----- Original Message -----
> From: yonghu <yo...@gmail.com>
> To: user@hbase.apache.org; lars hofhansl <lh...@yahoo.com>
> Cc:
> Sent: Thursday, January 26, 2012 1:22 PM
> Subject: Re: the occasion of the major compact?
>
> Yes, I read this blog post:
> http://hadoop-hbase.blogspot.com/2011/12/raw-scans.html. I thought
> that if I could disable major compaction, I could use the approach
> described in the blog. Otherwise, the major compaction will remove the
> deleted data.
>
> Thanks!
>
> Yong
>
> On Thu, Jan 26, 2012 at 10:11 PM, lars hofhansl <lh...@yahoo.com> wrote:
>> Unless you have HBASE-4536 (only in trunk, though) or are parsing the HFiles yourself you have no way of actually getting to the deleted data.
>>
>> -- Lars
>>
>>
>>
>> ----- Original Message -----
>> From: yonghu <yo...@gmail.com>
>> To: user@hbase.apache.org
>> Cc:
>> Sent: Thursday, January 26, 2012 1:00 PM
>> Subject: Re: the occasion of the major compact?
>>
>> Nicolas,
>>
>> In my use case, I want to extract the deleted data. Hence, if I
>> disable major compaction, I can prevent HBase from actually
>> deleting the data. After extracting the deleted data, I can issue a major
>> compaction myself.
>>
>> Regards
>>
>> Yong
>>
>> On Thu, Jan 26, 2012 at 8:02 PM, Nicolas Spiegelberg
>> <ns...@fb.com> wrote:
>>> Yong,
>>>
>>> Can you please explain why you want to disable major compactions?  What
>>> are the problems that you're currently seeing, or what are you worried will
>>> happen if a major compaction is allowed to occur?  Right now, there is
>>> only an extremely small subset of cases where you must explicitly disable
>>> compactions.  The use cases I know of are very complicated and require
>>> building StoreFile analysis tools underneath HBase, so I'm pretty sure
>>> you don't need this.
>>>
>>> Please also read my follow-up commentary explaining major compaction
>>> logic:
>>> http://search-hadoop.com/m/JR9sK1xnbj21
>>> http://search-hadoop.com/m/X7W7q1xnbj21
>>>
>>>
>>> The vast majority of users need features completely unrelated to
>>> compactions.  The compaction algorithm is an easy target to worry about.
>>>
>>>
>>> On 1/26/12 7:06 AM, "yonghu" <yo...@gmail.com> wrote:
>>>
>>>>Hello Mikael,
>>>>
>>>>I think disabling major compaction in the timed and client-issued
>>>>cases is not a problem. The problem is the size-based case. The
>>>>mailing list thread only discusses minor compaction in that situation,
>>>>not major compaction, if I understand correctly. So, I want to know if
>>>>someone can tell me how to disable major compaction in the size-based
>>>>case.
>>>>
>>>>Thanks
>>>>
>>>>Yong
>>>>I saw a description indicating that the size of a store file can also
>>>>trigger a major compaction.
>>>>
>>>>On Thu, Jan 26, 2012 at 3:54 PM, Mikael Sitruk <mi...@gmail.com>
>>>>wrote:
>>>>> Hi Yong,
>>>>>
>>>>> As far as I know, setting hbase.hregion.majorcompaction to 0 will
>>>>> disable the time-based trigger only.
>>>>> Clients are always able to invoke a major compaction, no matter what
>>>>> the value of hbase.hregion.majorcompaction is.
>>>>>
>>>>> Perhaps client invocation of compaction can be disabled with the
>>>>> security package.
>>>>>
>>>>> Anyway, I'm digging into 0.92; I hope to get those insights soon.
>>>>>
>>>>> Mikael.S
>>>>>
>>>>> On Thu, Jan 26, 2012 at 4:39 PM, yonghu <yo...@gmail.com> wrote:
>>>>>
>>>>>> Thanks for your response.
>>>>>>
>>>>>> I knew that a major compaction can be triggered by the client, by time,
>>>>>> and by size. In my situation, I have to disable major
>>>>>> compaction. So, if I set 'hbase.hregion.majorcompaction' to 0, will it
>>>>>> disable all three cases, or do I have to disable each case
>>>>>> separately? BTW, my HBase version is 0.92.
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>> Yong
>>>>>>
>>>>>> On Thu, Jan 26, 2012 at 3:09 PM, Mikael Sitruk
>>>>>><mi...@gmail.com>
>>>>>> wrote:
>>>>>> > Look at the thread http://search-hadoop.com/m/GHUWQ1xnbj21; it
>>>>>> > explains a lot about major compaction and its enhancements across versions.
>>>>>> >
>>>>>> > Mikael.S
>>>>>> >
>>>>>> >
>>>>>> > On Thu, Jan 26, 2012 at 3:51 PM, Damien Hardy <dh...@figarocms.fr>
>>>>>> wrote:
>>>>>> >
>>>>>> >> On 26/01/2012 14:43, yonghu wrote:
>>>>>> >> > Hello,
>>>>>> >> >
>>>>>> >> > I read this blog post: http://outerthought.org/blog/465-ot.html. It
>>>>>> >> > mentions that a major compaction occurs every 24 hours. My question
>>>>>> >> > is whether any other conditions can trigger a major
>>>>>> >> > compaction. For example, when the size of a store file reaches
>>>>>> >> > a threshold (I think this causes a minor compaction or a region
>>>>>> >> > split, not a major compaction, but I am not quite sure).
>>>>>> >> >
>>>>>> >> > Thanks!
>>>>>> >> >
>>>>>> >> > Yong
>>>>>> >>
>>>>>> >> Hello,
>>>>>> >> I think when there is a massive delete on the table, or a change to a
>>>>>> >> table attribute like TTL (which is liable to remove a lot of
>>>>>> >> versions/rows) or COMPRESSION, which gains a lot of disk space in each
>>>>>> >> region.
>>>>>> >>
>>>>>> >> Cheers,
>>>>>> >>
>>>>>> >> --
>>>>>> >> Damien
>>>>>> >>
>>>>>> >>
>>>>>> >
>>>>>> >
>>>>>> > --
>>>>>> > Mikael.S
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Mikael.S
>>>
>>
>

Re: the occasion of the major compact?

Posted by lars hofhansl <lh...@yahoo.com>.
If you are planning to use trunk (what will be 0.94) you can also enable KEEP_DELETED_CELLS for your column families.
That will keep deleted cells around (until they get removed because of # of versions, or TTL).

Also note that version # and TTL checks are also performed during minor compactions and even during memstore flushes, and hence cells might be removed on those occasions as well.

If you have time and space, you can also back up your tables into text files (using export) and crunch them there (I added support for HBASE-4536 in export as well).

-- Lars
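
As a rough illustration of the behaviour discussed in this message, here is a toy Python model of delete markers. It is not HBase code: `Cell`, `normal_scan`, `raw_scan` and `major_compact` are made-up names, versions and TTL are ignored, and the `keep_deleted_cells` flag only loosely mimics what KEEP_DELETED_CELLS is described as doing above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Cell:
    row: str
    ts: int
    value: Optional[str]  # None means this cell is a delete marker (toy convention)

    @property
    def is_delete(self) -> bool:
        return self.value is None


def _masked(cell: Cell, cells: List[Cell]) -> bool:
    # Toy rule: a put is masked by any delete marker on the same row
    # whose timestamp is >= the put's timestamp.
    return any(d.is_delete and d.row == cell.row and d.ts >= cell.ts for d in cells)


def normal_scan(cells: List[Cell]) -> List[Cell]:
    # An ordinary read: no delete markers, no masked puts.
    return [c for c in cells if not c.is_delete and not _masked(c, cells)]


def raw_scan(cells: List[Cell]) -> List[Cell]:
    # A raw read returns everything still physically stored.
    return list(cells)


def major_compact(cells: List[Cell], keep_deleted_cells: bool = False) -> List[Cell]:
    # Without the flag, markers and masked puts are physically dropped.
    # With the flag, this toy model simply keeps everything.
    if keep_deleted_cells:
        return list(cells)
    return normal_scan(cells)


store = [Cell("r1", 1, "a"), Cell("r1", 2, None), Cell("r2", 1, "b")]
```

In this model, `raw_scan(store)` still shows the deleted `r1` cell and its marker, while `raw_scan(major_compact(store))` no longer does, which is the loss Yong is trying to avoid.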




Re: the occasion of the major compact?

Posted by yonghu <yo...@gmail.com>.
Yes, I read this blog post:
http://hadoop-hbase.blogspot.com/2011/12/raw-scans.html. I thought
that if I could disable major compaction, I could use the approach
described in the blog. Otherwise, the major compaction will remove the
deleted data.

Thanks!

Yong


Re: the occasion of the major compact?

Posted by Nicolas Spiegelberg <ns...@fb.com>.
As an aside,

If you can't wait for HBASE-4536 and trunk to get RC'd, you can just add a
'Delete' column to logically serve as a client-side delete marker instead
of issuing the actual delete, and have an MR job both extract the deleted
data and handle the actual server-side delete.
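
A minimal sketch of that idea, using a plain Python dict in place of a table. Everything here is hypothetical: the column name `meta:deleted`, the helper names, and the single function standing in for the MR job. It only shows the shape of the approach, not an HBase implementation.

```python
from typing import Dict

Row = Dict[str, str]
Table = Dict[str, Row]

DELETE_COL = "meta:deleted"  # hypothetical name for the client-side marker column


def soft_delete(table: Table, row_key: str) -> None:
    # Instead of issuing a real delete, write a marker column on the row.
    table[row_key][DELETE_COL] = "1"


def live_rows(table: Table) -> Table:
    # Readers must filter marked rows themselves; the store does not hide them.
    return {k: v for k, v in table.items() if DELETE_COL not in v}


def extract_and_purge(table: Table) -> Table:
    # Stand-in for the batch job: collect marked rows, then really delete them.
    extracted: Table = {}
    for key in [k for k, v in table.items() if DELETE_COL in v]:
        data = table.pop(key)      # the actual delete happens only here
        data.pop(DELETE_COL)       # strip the marker from the extracted copy
        extracted[key] = data
    return extracted
```

The trade-off in this sketch is that every reader has to apply the `live_rows` filter until the purge has run.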



Re: the occasion of the major compact?

Posted by lars hofhansl <lh...@yahoo.com>.
Unless you have HBASE-4536 (only in trunk, though) or are parsing the HFiles yourself you have no way of actually getting to the deleted data.

-- Lars





Re: the occasion of the major compact?

Posted by yonghu <yo...@gmail.com>.
Nicolas,

In my use case, I want to extract the deleted data. Hence, if I
disable major compaction, I can prevent HBase from actually
deleting the data. After extracting the deleted data, I can issue a major
compaction myself.

Regards

Yong


Re: the occasion of the major compact?

Posted by Neil Yalowitz <ne...@gmail.com>.
I've had a similar question... perhaps the client promotes a compaction to
"major" if:

a) the threshold for compaction is reached
( hbase.hstore.compactionThreshold )
or (and?)
b) the maximum number of files to process in a "minor" compaction is high
enough to result in all files being processed into one (
hbase.hstore.compaction.max )


Can someone confirm whether this is correct, or whether there are other
factors causing a major compaction that I am missing?
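
To make the guess above concrete, here is a toy Python sketch of that promotion rule. It is not HBase code. It assumes a compaction counts as "major" exactly when the files selected for it are all the files in the store, and it ignores any size-based filtering the real selection may do, so real promotions are probably rarer than this suggests. The function name and the default values (3 and 10) are assumptions for illustration only.

```python
def classify_compaction(num_store_files, compaction_threshold=3, compaction_max=10):
    """Toy model: return 'none', 'minor', or 'major' for one store.

    compaction_threshold stands in for hbase.hstore.compactionThreshold and
    compaction_max for hbase.hstore.compaction.max; the defaults are assumed.
    """
    # Too few files: no compaction is queued at all.
    if num_store_files < compaction_threshold:
        return "none"
    # At most compaction_max files are picked for a single compaction.
    selected = min(num_store_files, compaction_max)
    # Assumption: selecting every file in the store makes the compaction major.
    return "major" if selected == num_store_files else "minor"


if __name__ == "__main__":
    for n in (2, 5, 12):
        print(n, classify_compaction(n))
```

Under this model, raising hbase.hstore.compaction.max makes promotion more likely, because more stores fit entirely into one selection.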


Neil Yalowitz

On Thu, Jan 26, 2012 at 10:06 AM, yonghu <yo...@gmail.com> wrote:

> Hello Mikael,
>
> I think disabling major compaction for the time-based and client-issued
> cases is not a problem. The problem is the size-based case. If I
> understand correctly, the mailing list thread only discusses minor
> compaction, not major compaction. So I would like to know how to
> disable major compaction in the size-based case.
>
> Thanks
>
> Yong
> I saw a description indicating that the size of a store file can also
> trigger a major compaction.

Re: the occasion of the major compact?

Posted by Nicolas Spiegelberg <ns...@fb.com>.
Yong,

Can you please explain why you want to disable major compactions?  What
problems are you currently seeing, or what are you worried will happen if
a major compaction is allowed to occur?  Right now, there is only an
extremely small subset of cases where you must explicitly disable
compactions.  The use cases I know of are very complicated and require
building StoreFile analysis tools underneath HBase, so I'm pretty sure
you don't need this.

Please also read my follow-up commentary explaining the major compaction
logic:
http://search-hadoop.com/m/JR9sK1xnbj21
http://search-hadoop.com/m/X7W7q1xnbj21


The vast majority of users need features completely unrelated to
compactions.  The compaction algorithm is an easy target to worry about.


On 1/26/12 7:06 AM, "yonghu" <yo...@gmail.com> wrote:

>Hello Mikael,
>
>I think disabling major compaction for the time-based and client-issued
>cases is not a problem. The problem is the size-based case. If I
>understand correctly, the mailing list thread only discusses minor
>compaction, not major compaction. So I would like to know how to
>disable major compaction in the size-based case.
>
>Thanks
>
>Yong
>I saw a description indicating that the size of a store file can also
>trigger a major compaction.


Re: the occasion of the major compact?

Posted by yonghu <yo...@gmail.com>.
Hello Mikael,

I think disabling major compaction for the time-based and client-issued
cases is not a problem. The problem is the size-based case. If I
understand correctly, the mailing list thread only discusses minor
compaction, not major compaction. So I would like to know how to
disable major compaction in the size-based case.

Thanks

Yong
I saw a description indicating that the size of a store file can also
trigger a major compaction.

On Thu, Jan 26, 2012 at 3:54 PM, Mikael Sitruk <mi...@gmail.com> wrote:
> Yong hi
>
> As far as I know, setting hbase.hregion.majorcompaction to 0 will disable
> the time-based trigger only.
> Clients are always able to invoke a major compaction, no matter what the
> value of hbase.hregion.majorcompaction is.
>
> Perhaps client invocation of compaction can be disabled with the security
> package.
>
> Anyway, I'm digging into 0.92; I hope to get those insights soon.
>
> Mikael.S

Re: the occasion of the major compact?

Posted by Mikael Sitruk <mi...@gmail.com>.
Yong hi

As far as I know, setting hbase.hregion.majorcompaction to 0 will disable
the time-based trigger only.
Clients are always able to invoke a major compaction, no matter what the
value of hbase.hregion.majorcompaction is.
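
A toy Python sketch of that distinction, under the assumption that the setting is a period in milliseconds and that only the periodic check reads it. This is not HBase code; the function names are made up for illustration, and the 24-hour default is taken from the blog post mentioned at the start of the thread.

```python
DAY_MS = 24 * 60 * 60 * 1000  # the 24-hour period mentioned at the start of the thread


def time_based_major_due(ms_since_last_major, period_ms=DAY_MS):
    """Toy model of the periodic trigger; a period of 0 switches it off."""
    if period_ms <= 0:
        return False
    return ms_since_last_major >= period_ms


def client_major_allowed(period_ms=DAY_MS):
    """Toy model of a client request: the period value is accepted but ignored."""
    return True


if __name__ == "__main__":
    two_days = 2 * DAY_MS
    print(time_based_major_due(two_days))               # periodic trigger enabled
    print(time_based_major_due(two_days, period_ms=0))  # periodic trigger disabled
    print(client_major_allowed(period_ms=0))            # client request unaffected
```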

Perhaps client invocation of compaction can be disabled with the security
package.

Anyway, I'm digging into 0.92; I hope to get those insights soon.

Mikael.S

On Thu, Jan 26, 2012 at 4:39 PM, yonghu <yo...@gmail.com> wrote:

> Thanks for your response.
>
> I know that a major compaction can be triggered by a client, by time, and
> by size. In my situation, I have to disable major compaction entirely.
> So, if I set ‘hbase.hregion.majorcompaction’ to 0, will that disable all
> three cases, or do I have to disable each case separately? BTW, my HBase
> version is 0.92.
>
> Thanks!
>
> Yong



-- 
Mikael.S

Re: the occasion of the major compact?

Posted by yonghu <yo...@gmail.com>.
Thanks for your response.

I know that a major compaction can be triggered by a client, by time, and
by size. In my situation, I have to disable major compaction entirely.
So, if I set ‘hbase.hregion.majorcompaction’ to 0, will that disable all
three cases, or do I have to disable each case separately? BTW, my HBase
version is 0.92.

Thanks!

Yong

On Thu, Jan 26, 2012 at 3:09 PM, Mikael Sitruk <mi...@gmail.com> wrote:
> Look at the thread http://search-hadoop.com/m/GHUWQ1xnbj21; it explains a
> lot about major compaction and how it was enhanced across versions.
>
> Mikael.S

Re: the occasion of the major compact?

Posted by Mikael Sitruk <mi...@gmail.com>.
Look at the thread http://search-hadoop.com/m/GHUWQ1xnbj21; it explains a
lot about major compaction and how it was enhanced across versions.

Mikael.S


On Thu, Jan 26, 2012 at 3:51 PM, Damien Hardy <dh...@figarocms.fr> wrote:

> On 26/01/2012 14:43, yonghu wrote:
> > Hello,
> >
> > I read this blog http://outerthought.org/blog/465-ot.html. It mentions
> > that every 24 hours the major compaction will occur. My question is
> > that if there are any other conditions which can trigger major
> > compaction happening? For example, when the size of store file reaches
> > the threshold (I think this will cause minor compaction or region file
> > split, not major compaction, but not quite sure).
> >
> > Thanks!
> >
> > Yong
>
> Hello,
> I think also when there is a massive delete on the table, or when a table
> attribute is changed, such as TTL (which is liable to remove a lot of
> versions/rows) or COMPRESSION, which can save a lot of disk space on each
> region.
>
> Cheers,
>
> --
> Damien
>
>


-- 
Mikael.S

Re: the occasion of the major compact?

Posted by Damien Hardy <dh...@figarocms.fr>.
On 26/01/2012 14:43, yonghu wrote:
> Hello,
>
> I read this blog http://outerthought.org/blog/465-ot.html. It mentions
> that every 24 hours the major compaction will occur. My question is
> that if there are any other conditions which can trigger major
> compaction happening? For example, when the size of store file reaches
> the threshold (I think this will cause minor compaction or region file
> split, not major compaction, but not quite sure).
>
> Thanks!
>
> Yong

Hello,
I think also when there is a massive delete on the table, or when a table
attribute is changed, such as TTL (which is liable to remove a lot of
versions/rows) or COMPRESSION, which can save a lot of disk space on each
region.
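
As a rough illustration of the TTL case, here is a toy Python sketch, not HBase code. It assumes a cell becomes eligible for removal once its age exceeds the column family's TTL; the exact boundary behaviour and the function name are assumptions.

```python
def surviving_cells(cell_timestamps_ms, now_ms, ttl_ms):
    """Toy model: keep only the cells whose age does not exceed the TTL."""
    return [ts for ts in cell_timestamps_ms if now_ms - ts <= ttl_ms]


if __name__ == "__main__":
    hour = 60 * 60 * 1000
    now = 100 * hour
    cells = [now - 1 * hour, now - 30 * hour, now - 90 * hour]
    # With a long TTL everything stays; after lowering it, only the newest cell does.
    print(len(surviving_cells(cells, now, ttl_ms=96 * hour)))
    print(len(surviving_cells(cells, now, ttl_ms=24 * hour)))
```

Lowering the TTL makes many existing cells expired at once, which is why a compaction afterwards can free a lot of space.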

Cheers,

-- 
Damien