Posted to user@hbase.apache.org by Jinsong Hu <ji...@hotmail.com> on 2010/09/15 18:54:31 UTC

hbase doesn't delete data older than TTL in old regions

I have tested the TTL in HBase and found that it relies on compaction to
remove old data. However, if a region holds data that is older
than the TTL and nothing triggers a compaction, that data will remain
there forever, wasting disk space and memory.
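The behavior described above, that expiry happens only when a compaction rewrites the store, can be sketched with a toy model (the names here, such as `compact_store`, are invented for illustration and are not HBase's actual code):

```python
import time

def compact_store(cells, ttl_seconds, now=None):
    """Rewrite a store, dropping cells older than the TTL.

    cells: list of (row_key, value, timestamp_seconds) tuples, an
    invented stand-in for HBase's store files. This models the point
    made in the email: expired cells are physically removed only when
    a compaction rewrites the store. If no compaction ever runs on a
    region, its expired cells stay on disk indefinitely.
    """
    now = time.time() if now is None else now
    return [c for c in cells if now - c[2] < ttl_seconds]

cells = [("row1", "old", 0), ("row2", "fresh", 990)]
# With now=1000 and a 600-second TTL, only the fresh cell survives.
survivors = compact_store(cells, ttl_seconds=600, now=1000)
```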

It appears that, at this stage, the only way to really remove data older
than the TTL is to issue client-side delete requests. This is a pity because
it is a more expensive way to get the job done.  Another side effect is
that, as time goes on, we end up with many small
regions if the data is saved in chronological order. It appears
that HBase currently has no mechanism to merge 2 consecutive
small regions into a bigger one.  So if data is saved in
chronological order, sooner or later we will run out of capacity, even if
the amount of data in HBase is small, because we have lots of regions each
holding little data.

A much cheaper way to remove data older than the TTL would be to record
each region's latest timestamp in the .META. table;
if that timestamp is older than the TTL, we just adjust the row in .META. and
delete the store, without doing any compaction.
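The proposal might be sketched as follows (a hypothetical model: the `RegionInfo` fields and the idea of tracking a per-region latest timestamp in .META. are assumptions taken from the email, not an existing HBase API):

```python
from dataclasses import dataclass

@dataclass
class RegionInfo:
    # Hypothetical: assumes .META. tracked each region's newest cell
    # timestamp, as the email suggests; current HBase does not do this.
    name: str
    latest_timestamp: float

def expired_regions(regions, ttl_seconds, now):
    """Find regions whose newest cell is already past the TTL.

    Every cell in such a region must be expired, so the whole store
    could be deleted and the .META. row adjusted, with no compaction.
    """
    return [r for r in regions if now - r.latest_timestamp > ttl_seconds]

regions = [RegionInfo("r1", latest_timestamp=100),
           RegionInfo("r2", latest_timestamp=950)]
# With now=1000 and a 600-second TTL, only r1 can be dropped wholesale.
stale = expired_regions(regions, ttl_seconds=600, now=1000)
```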

Can this be added to the hbase requirement for future release ?

Jimmy
 


Re: lack of region merge cause in_memory option trouble

Posted by Jinsong Hu <ji...@hotmail.com>.
Hi, Andrew:
  Thanks for the suggestion. In the use case I am considering, when the TTL
is set to 10 minutes, all the data fits in memory in 3 regions. However,
when the TTL is set to a longer time, it will not fit in memory.  Some of
our tables' TTLs may be set to 2 weeks, 1 month, or 1 year, and persistence
is still needed in case the regionserver shuts down.  We want to use HBase
for this use case because of
the large amount of data that flows through.  What I am actually testing is
what happens when there is a large turnover of data in the
regions.
  I found that if I manually run major_compact against the table, it helps a
lot: the older data gets removed. HBase is supposed to run a major
compaction every day, but searching the logs I found that is not the case. I
have seen situations where major compaction didn't happen for several days.
For the tables with a TTL, the lack of major compaction
greatly impacted read performance.  After insertions had been running
against the table for 1 day, counting the rows of the table
took more than 450 seconds; after a major_compact, the same operation took
only 65 seconds.
  In the end, I resorted to the command  echo "major_compact 'table_name'" |
hbase shell,  put it in a cron job for the tables
with high data turnover, running it hourly, and I am still testing to
see if it helps with this situation.
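A wrapper along those lines might look like the following sketch; it only constructs the shell pipeline string (the table names and paths are placeholders, and actually running the pipeline requires an HBase installation):

```python
def major_compact_command(table):
    """Build the pipeline that asks the HBase shell to major-compact a table.

    Quoting matters here: the table name must reach the HBase shell
    single-quoted inside the double-quoted echo argument.
    """
    return "echo \"major_compact '%s'\" | hbase shell" % table

# A crontab entry could then invoke a script that runs this hourly for
# each high-turnover table, e.g. (path and table names are placeholders):
#   0 * * * * /usr/local/bin/compact_hot_tables.sh
for table in ["events", "metrics"]:  # placeholder table names
    cmd = major_compact_command(table)
```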

Jimmy


--------------------------------------------------
From: "Andrew Purtell" <ap...@apache.org>
Sent: Sunday, September 19, 2010 7:37 AM
To: <us...@hbase.apache.org>
Subject: Re: lack of region merge cause in_memory option trouble

> Hi Jimmy,
>
> IN_MEMORY may not mean what you think. It does not turn off disk 
> persistence, flushing, etc. It is a suggestion to the regionserver that 
> all of the data for the region be retained in block cache.
>
> Also, as I said before, your test case is not really what the current TTL 
> implementation targets. If you want it to work better for you given such 
> short TTLs, it may make sense to modify the memstore to simply not flush 
> values with short TTLs, if they will expire in a few minutes or seconds.
>
>> The idea is that we are only interested in last 10 minute's data,
>> as data gets older, it will be purged, and the amount of memory
>> and disk usage will remain low. [...]
>
> What is the anticipated data volume within that 10 minute window? Will it 
> fit all in RAM on a single server? Or perhaps a small cluster of servers?
>
> The BigTable/HBase design targets large data scale, and the implementation 
> is optimized for that: a distributed, elastic, **persistent** sparse map 
> with multidimensional keys. What you are talking about here is way on the 
> other end of the spectrum, and persistence may not be something you want.
>
>   - Andy
>
>> From: Jinsong Hu <ji...@hotmail.com>
>> Subject: lack of region merge cause in_memory option trouble
>> To: user@hbase.apache.org
>> Date: Friday, September 17, 2010, 2:53 PM
>> Hi,
>>  I was trying to find out if HBase can be used in a
>> real-time processing scenario. In order to
>> do so, I set in_memory for a table to true, and set
>> the TTL for the table to 10 minutes.
>> The data comes in chronological order. I let the test
>> run for 1 day. The idea is that we are only
>> interested in the last 10 minutes' data: as data gets
>> older, it is purged, and the amount of memory and disk
>> usage remains low.
>>  What I found is that the number of regions continued to grow,
>> and overnight it created 46 regions. HDFS shows it used
>> 8.6G of disk space. This is one order of magnitude higher
>> than my estimate for the ideal case. The data rate that I
>> am pumping is only 3 regions/hour. I would imagine that we
>> would have fewer than 3 regions in HBase for this kind of
>> situation, and only 700M of HDFS usage, regardless of
>> how long I run the test.
>>  I understand that the region merge request is already
>> filed. Does anybody know when that will be implemented?
>>
>> Jimmy.
>>
>
>
>
>
> 

Re: lack of region merge cause in_memory option trouble

Posted by Andrew Purtell <ap...@apache.org>.
Hi Jimmy,

IN_MEMORY may not mean what you think. It does not turn off disk persistence, flushing, etc. It is a suggestion to the regionserver that all of the data for the region be retained in block cache. 

Also, as I said before, your test case is not really what the current TTL implementation targets. If you want it to work better for you given such short TTLs, it may make sense to modify the memstore to simply not flush values with short TTLs, if they will expire in a few minutes or seconds. 

> The idea is that we are only interested in last 10 minute's data,
> as data gets older, it will be purged, and the amount of memory
> and disk usage will remain low. [...]

What is the anticipated data volume within that 10 minute window? Will it fit all in RAM on a single server? Or perhaps a small cluster of servers?

The BigTable/HBase design targets large data scale, and the implementation is optimized for that: a distributed, elastic, **persistent** sparse map with multidimensional keys. What you are talking about here is way on the other end of the spectrum, and persistence may not be something you want. 

   - Andy

> From: Jinsong Hu <ji...@hotmail.com>
> Subject: lack of region merge cause in_memory option trouble
> To: user@hbase.apache.org
> Date: Friday, September 17, 2010, 2:53 PM
> Hi,
>  I was trying to find out if HBase can be used in a
> real-time processing scenario. In order to
> do so, I set in_memory for a table to true, and set
> the TTL for the table to 10 minutes.
> The data comes in chronological order. I let the test
> run for 1 day. The idea is that we are only
> interested in the last 10 minutes' data: as data gets older, it
> is purged, and the amount of memory and disk usage
> remains low.
>  What I found is that the number of regions continued to grow,
> and overnight it created 46 regions. HDFS shows it used
> 8.6G of disk space. This is one order of magnitude higher
> than my estimate for the ideal case. The data rate that I
> am pumping is only 3 regions/hour. I would imagine that we
> would have fewer than 3 regions in HBase for this kind of
> situation, and only 700M of HDFS usage, regardless of
> how long I run the test.
>  I understand that the region merge request is already
> filed. Does anybody know when that will be implemented?
> 
> Jimmy. 
> 


      


Re: lack of region merge cause in_memory option trouble

Posted by Ryan Rawson <ry...@gmail.com>.
Hi,

HBase is an open source project; while some people working on it may
be getting paid to do so, the fact remains that, like most open source
projects, we depend on the goodwill, the contributions, and the
selfless help of many, many people.  Requesting features is always
welcome; it's good to hear what people want and where they think
things should go.  But it also means the fastest way to get anything
done is to do it yourself.

At this moment, most of the core HBase contributors are busy prepping the
next major release: stabilizing 0.89, adding performance fixes, etc.

Regards,
-ryan

On Fri, Sep 17, 2010 at 2:53 PM, Jinsong Hu <ji...@hotmail.com> wrote:
> Hi,
>  I was trying to find out if HBase can be used in a real-time processing
> scenario. In order to
> do so, I set in_memory for a table to true, and set the TTL for the
> table to 10 minutes.
> The data comes in chronological order. I let the test run for 1 day. The
> idea is that we are only
> interested in the last 10 minutes' data: as data gets older, it is purged,
> and the amount of memory and disk usage remains low.
>  What I found is that the number of regions continued to grow, and overnight it
> created 46 regions. HDFS shows it used 8.6G of disk space. This is one
> order of magnitude higher than my estimate for the ideal case. The data
> rate that I am pumping is only 3 regions/hour. I would imagine that we would
> have fewer than 3 regions in HBase for this kind of situation, and only
> 700M of HDFS usage, regardless of how long I run the test.
>  I understand that the region merge request is already filed. Does anybody
> know when that will be implemented?
>
> Jimmy.
>

lack of region merge cause in_memory option trouble

Posted by Jinsong Hu <ji...@hotmail.com>.
Hi,
  I was trying to find out if HBase can be used in a real-time processing
scenario. In order to
do so, I set in_memory for a table to true, and set the TTL for the
table to 10 minutes.
The data comes in chronological order. I let the test run for 1 day.
The idea is that we are only
interested in the last 10 minutes' data: as data gets older, it is purged,
and the amount of memory and disk usage remains low.
  What I found is that the number of regions continued to grow, and overnight it
created 46 regions. HDFS shows it used 8.6G of disk space. This is one
order of magnitude higher than my estimate for the ideal case. The data
rate that I am pumping is only 3 regions/hour. I would imagine that we would
have fewer than 3 regions in HBase for this kind of situation, and only
700M of HDFS usage, regardless of how long I run the test.
  I understand that the region merge request is already filed. Does anybody
know when that will be implemented?

Jimmy. 


Re: hbase doesn't delete data older than TTL in old regions

Posted by Jinsong Hu <ji...@hotmail.com>.
I continued the test yesterday, letting the data in the table with the
10-minute TTL sit there,
and more than 24 hours have now passed. I checked the table: the data is
still there. I
checked the log: major compaction didn't happen for this table in the last
24 hours.

I realized that I have other tables that have been running for a while, and
all I needed to do was check their existing logs. So I checked for
major compactions in those logs, and I found that major compaction does
happen, but
not really every 24 hours. I have seen 2 major compactions happening within
3 hours,
and I have also seen the gap between major compactions be 1 day or 8 days
for the same table.

Just FYI.

Jimmy.

--------------------------------------------------
From: "Jinsong Hu" <ji...@hotmail.com>
Sent: Thursday, September 16, 2010 10:31 AM
To: <us...@hbase.apache.org>
Subject: Re: hbase doesn't delete data older than TTL in old regions

> I updated the ticket with our discussion, and added the following 
> comments:
>
> What I suggest is to make the sweep part of major compaction. Basically, 
> it needs to merge consecutive empty regions into the neighboring region 
> that is not empty. It needs to merge the records in the .META. table, 
> delete the empty directories in HDFS for the empty regions, and then 
> instruct the region servers to unload the original regions and load the 
> merged regions.
>
> Jimmy.
>
> --------------------------------------------------
> From: "Stack" <st...@duboce.net>
> Sent: Thursday, September 16, 2010 9:49 AM
> To: <us...@hbase.apache.org>
> Subject: Re: hbase doesn't delete data older than TTL in old regions
>
>> On Thu, Sep 16, 2010 at 9:32 AM, Jinsong Hu <ji...@hotmail.com> 
>> wrote:
>>> That means that if we run this in a production system and the key is in
>>> chronological order, we will end up
>>> having thousands of regions as time goes on, and the number of regions 
>>> never
>>> decreases,
>>> even though old data is compacted away. We don't really mind having 
>>> several
>>> empty regions, but the fact that the number of regions continues to grow
>>> without limit as time goes on is really troublesome. It wastes
>>> Hadoop namenode resources and memory on the regionservers, as 
>>> each
>>> region takes some memory to store its region info.
>>>
>>
>> Agreed.
>>
>> It'd be easy enough to write a script to do this, run out of cron, but
>> yeah, we should have a facility to sweep HBase and, in particular, if
>> regions are empty of store files, merge them to a neighbour.
>>
>> Would you mind updating HBASE-2999 to make it clear what is needed to
>> satisfy the issue?  The clearer the stipulation, the easier it is on
>> the implementor. (Patches also accepted if you'd like to have a go at
>> this yourself.)
>>
>> St.Ack
>>
> 

Re: hbase doesn't delete data older than TTL in old regions

Posted by Jinsong Hu <ji...@hotmail.com>.
I updated the ticket with our discussion, and added the following comments:

What I suggest is to make the sweep part of major compaction. Basically, it 
needs to merge consecutive empty regions into the neighboring region that is 
not empty. It needs to merge the records in the .META. table, delete the 
empty directories in HDFS for the empty regions, and then instruct 
the region servers to unload the original regions and load the merged 
regions.
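The sweep could be modeled roughly as below; this is purely illustrative (the region tuples and merge rule are simplified inventions, and the real .META. updates and regionserver coordination are glossed over):

```python
def merge_empty_neighbors(regions):
    """Collapse runs of empty regions into the next non-empty neighbor.

    Each region is (start_key, end_key, cell_count). Consecutive empty
    regions are absorbed so their key range joins the following region,
    mimicking the 'merge to neighbour' sweep discussed in the thread.
    """
    merged, pending_start = [], None
    for start, end, count in regions:
        if count == 0:
            if pending_start is None:
                pending_start = start  # remember where the empty run began
            continue
        merged.append((pending_start if pending_start is not None else start,
                       end, count))
        pending_start = None
    if pending_start is not None and merged:
        # Trailing empty run: extend the last non-empty region forward.
        s, e, c = merged[-1]
        merged[-1] = (s, regions[-1][1], c)
    return merged

regions = [("a", "b", 0), ("b", "c", 0), ("c", "d", 10), ("d", "e", 5)]
# The two empty regions are absorbed into their non-empty neighbor.
result = merge_empty_neighbors(regions)
```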

Jimmy.

--------------------------------------------------
From: "Stack" <st...@duboce.net>
Sent: Thursday, September 16, 2010 9:49 AM
To: <us...@hbase.apache.org>
Subject: Re: hbase doesn't delete data older than TTL in old regions

> On Thu, Sep 16, 2010 at 9:32 AM, Jinsong Hu <ji...@hotmail.com> 
> wrote:
>> That means that if we run this in a production system and the key is in
>> chronological order, we will end up
>> having thousands of regions as time goes on, and the number of regions 
>> never
>> decreases,
>> even though old data is compacted away. We don't really mind having 
>> several
>> empty regions, but the fact that the number of regions continues to grow
>> without limit as time goes on is really troublesome. It wastes
>> Hadoop namenode resources and memory on the regionservers, as 
>> each
>> region takes some memory to store its region info.
>>
>
> Agreed.
>
> It'd be easy enough to write a script to do this, run out of cron, but
> yeah, we should have a facility to sweep HBase and, in particular, if
> regions are empty of store files, merge them to a neighbour.
>
> Would you mind updating HBASE-2999 to make it clear what is needed to
> satisfy the issue?  The clearer the stipulation, the easier it is on
> the implementor. (Patches also accepted if you'd like to have a go at
> this yourself.)
>
> St.Ack
> 

Re: hbase doesn't delete data older than TTL in old regions

Posted by Stack <st...@duboce.net>.
On Thu, Sep 16, 2010 at 9:32 AM, Jinsong Hu <ji...@hotmail.com> wrote:
> That means that if we run this in a production system and the key is in
> chronological order, we will end up
> having thousands of regions as time goes on, and the number of regions never
> decreases,
> even though old data is compacted away. We don't really mind having several
> empty regions, but the fact that the number of regions continues to grow
> without limit as time goes on is really troublesome. It wastes
> Hadoop namenode resources and memory on the regionservers, as each
> region takes some memory to store its region info.
>

Agreed.

It'd be easy enough to write a script to do this, run out of cron, but
yeah, we should have a facility to sweep HBase and, in particular, if
regions are empty of store files, merge them to a neighbour.

Would you mind updating HBASE-2999 to make it clear what is needed to
satisfy the issue?  The clearer the stipulation, the easier it is on
the implementor. (Patches also accepted if you'd like to have a go at
this yourself.)

St.Ack

Re: hbase doesn't delete data older than TTL in old regions

Posted by Jean-Daniel Cryans <jd...@apache.org>.
Currently, merging regions can only be done while HBase is offline; a
long time ago this was opened:
https://issues.apache.org/jira/browse/HBASE-420. Some work was also done to
at least be able to merge regions in disabled tables:
https://issues.apache.org/jira/browse/HBASE-1621, but it requires a lot
more engineering.

J-D

On Thu, Sep 16, 2010 at 9:32 AM, Jinsong Hu <ji...@hotmail.com> wrote:
>
> I did the test; instead of waiting for one day, I manually ran major_compact
> and found the old data is indeed removed. For this part, it is working as
> advertised.
>
> However, I found that I end up having several regions with no data
> inside,
> and the regions are not merged even though they are empty and consecutive.
>
> That means that if we run this in a production system and the key is in
> chronological order, we will end up
> having thousands of regions as time goes on, and the number of regions never
> decreases,
> even though old data is compacted away. We don't really mind having several
> empty regions, but the fact that the number of regions continues to grow
> without limit as time goes on is really troublesome. It wastes
> Hadoop namenode resources and memory on the regionservers, as each
> region takes some memory to store its region info.
>
> Can this be added to the compaction task: merging consecutive empty regions
> into a single one
> after data is processed?
>
> Jimmy.
>
> --------------------------------------------------
> From: "Stack" <st...@duboce.net>
> Sent: Thursday, September 16, 2010 8:39 AM
> To: <us...@hbase.apache.org>
> Subject: Re: hbase doesn't delete data older than TTL in old regions
>
>> You could change hbase.hregion.majorcompaction to be less than one day
>> so you don't have to wait so long.  Make sure DEBUG is enabled (It
>> should be by default).  With DEBUG, you'll be able to see compactions
>> running.  Log will include type of compaction run.
>>
>> Thanks for testing,
>> St.Ack
>>
>> On Wed, Sep 15, 2010 at 10:43 PM, Jinsong Hu <ji...@hotmail.com>
>> wrote:
>>>
>>> Hi, Stack:
>>>  Thanks for the explanation.  I looked at the code and it seems that the
>>> old
>>> region should get compacted
>>> and data older than TTL will get removed. I will do a test with a table
>>> with
>>> 10 min TTL , and insert several
>>> regions and wait for 1 day, and see if old records will indeed get
>>> removed
>>> or not.
>>>
>>> Jimmy.
>>>
>>> --------------------------------------------------
>>> From: "Stack" <st...@duboce.net>
>>> Sent: Wednesday, September 15, 2010 9:53 PM
>>> To: <us...@hbase.apache.org>
>>> Subject: Re: hbase doesn't delete data older than TTL in old regions
>>>
>>>> On Wed, Sep 15, 2010 at 5:50 PM, Jinsong Hu <ji...@hotmail.com>
>>>> wrote:
>>>>>
>>>>> One thing I am not clear about major compaction is that for the regions
>>>>> with
>>>>> a single map file,
>>>>> will hbase actually load it and remove the records older than TTL ?
>>>>
>>>> Major compactions will run even if only one file IFF this file is not
>>>> already the product of a major compaction (files that have been major
>>>> compacted get a marker in their metadata so next time a major
>>>> compaction runs we'll skip the file) AND the time since the last major
>>>> compaction is < TTL (See
>>>>
>>>>
>>>> http://hbase.apache.org/docs/r0.89.20100726/xref/org/apache/hadoop/hbase/regionserver/Store.html#743).
>>>>
>>>> The RegionServer runs a Major Compaction checking thread... it runs on a
>>>> period.
>>>>
>>>> So, it should be doing what you want (if a little crudely given its
>>>> waiting TTL before rechecking if already major compacted.
>>>>
>>>> We could make improvement by looking at oldest timestamp every time we
>>>> run the major compaction check.
>>>>
>>>> St.Ack
>>>>
>>>
>>
>

Re: hbase doesn't delete data older than TTL in old regions

Posted by Jinsong Hu <ji...@hotmail.com>.
I did the test; instead of waiting for one day, I manually ran major_compact
and found the old data is indeed removed. For this part, it is working as 
advertised.

However, I found that I end up having several regions with no data 
inside,
and the regions are not merged even though they are empty and consecutive.

That means that if we run this in a production system and the key is in 
chronological order, we will end up
having thousands of regions as time goes on, and the number of regions never 
decreases,
even though old data is compacted away. We don't really mind having several 
empty regions, but the fact that the number of regions continues to grow 
without limit as time goes on is really troublesome. It wastes 
Hadoop namenode resources and memory on the regionservers, as each 
region takes some memory to store its region info.

Can this be added to the compaction task: merging consecutive empty regions 
into a single one
after data is processed?

Jimmy.

--------------------------------------------------
From: "Stack" <st...@duboce.net>
Sent: Thursday, September 16, 2010 8:39 AM
To: <us...@hbase.apache.org>
Subject: Re: hbase doesn't delete data older than TTL in old regions

> You could change hbase.hregion.majorcompaction to be less than one day
> so you don't have to wait so long.  Make sure DEBUG is enabled (It
> should be by default).  With DEBUG, you'll be able to see compactions
> running.  Log will include type of compaction run.
>
> Thanks for testing,
> St.Ack
>
> On Wed, Sep 15, 2010 at 10:43 PM, Jinsong Hu <ji...@hotmail.com> 
> wrote:
>> Hi, Stack:
>>  Thanks for the explanation.  I looked at the code and it seems that the 
>> old
>> region should get compacted
>> and data older than TTL will get removed. I will do a test with a table 
>> with
>> 10 min TTL , and insert several
>> regions and wait for 1 day, and see if old records will indeed get 
>> removed
>> or not.
>>
>> Jimmy.
>>
>> --------------------------------------------------
>> From: "Stack" <st...@duboce.net>
>> Sent: Wednesday, September 15, 2010 9:53 PM
>> To: <us...@hbase.apache.org>
>> Subject: Re: hbase doesn't delete data older than TTL in old regions
>>
>>> On Wed, Sep 15, 2010 at 5:50 PM, Jinsong Hu <ji...@hotmail.com>
>>> wrote:
>>>>
>>>> One thing I am not clear about major compaction is that for the regions
>>>> with
>>>> a single map file,
>>>> will hbase actually load it and remove the records older than TTL ?
>>>
>>> Major compactions will run even if only one file IFF this file is not
>>> already the product of a major compaction (files that have been major
>>> compacted get a marker in their metadata so next time a major
>>> compaction runs we'll skip the file) AND the time since the last major
>>> compaction is < TTL (See
>>>
>>> http://hbase.apache.org/docs/r0.89.20100726/xref/org/apache/hadoop/hbase/regionserver/Store.html#743).
>>>
>>> The RegionServer runs a Major Compaction checking thread... it runs on a
>>> period.
>>>
>>> So, it should be doing what you want (if a little crudely given its
>>> waiting TTL before rechecking if already major compacted.
>>>
>>> We could make improvement by looking at oldest timestamp every time we
>>> run the major compaction check.
>>>
>>> St.Ack
>>>
>>
> 

Re: hbase doesn't delete data older than TTL in old regions

Posted by Stack <st...@duboce.net>.
You could change hbase.hregion.majorcompaction to be less than one day
so you don't have to wait so long.  Make sure DEBUG is enabled (It
should be by default).  With DEBUG, you'll be able to see compactions
running.  Log will include type of compaction run.
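The interval in question is set in hbase-site.xml; for example, to run the check every 4 hours instead of the default of one day (the 4-hour value is just an example, in milliseconds):

```xml
<!-- hbase-site.xml: interval between major compaction checks.
     Default is one day (86400000 ms); 14400000 ms is 4 hours. -->
<property>
  <name>hbase.hregion.majorcompaction</name>
  <value>14400000</value>
</property>
```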

Thanks for testing,
St.Ack

On Wed, Sep 15, 2010 at 10:43 PM, Jinsong Hu <ji...@hotmail.com> wrote:
> Hi, Stack:
>  Thanks for the explanation.  I looked at the code and it seems that the old
> region should get compacted
> and data older than TTL will get removed. I will do a test with a table with
> 10 min TTL , and insert several
> regions and wait for 1 day, and see if old records will indeed get removed
> or not.
>
> Jimmy.
>
> --------------------------------------------------
> From: "Stack" <st...@duboce.net>
> Sent: Wednesday, September 15, 2010 9:53 PM
> To: <us...@hbase.apache.org>
> Subject: Re: hbase doesn't delete data older than TTL in old regions
>
>> On Wed, Sep 15, 2010 at 5:50 PM, Jinsong Hu <ji...@hotmail.com>
>> wrote:
>>>
>>> One thing I am not clear about major compaction is that for the regions
>>> with
>>> a single map file,
>>> will hbase actually load it and remove the records older than TTL ?
>>
>> A major compaction will run even if there is only one file, unless that
>> file is already the product of a major compaction (files that have been
>> major compacted get a marker in their metadata so the next major
>> compaction run can skip the file) AND the time since the last major
>> compaction is < TTL (see
>>
>> http://hbase.apache.org/docs/r0.89.20100726/xref/org/apache/hadoop/hbase/regionserver/Store.html#743).
>>
>> The RegionServer runs a Major Compaction checking thread... it runs on a
>> period.
>>
>> So, it should be doing what you want (if a little crudely, given that it
>> waits a TTL before rechecking files that are already major compacted).
>>
>> We could make an improvement by looking at the oldest timestamp every
>> time we run the major compaction check.
>>
>> St.Ack
>>
>

Re: hbase doesn't delete data older than TTL in old regions

Posted by Jinsong Hu <ji...@hotmail.com>.
Hi, Stack:
  Thanks for the explanation.  I looked at the code, and it seems that the 
old regions should get compacted
and data older than the TTL will get removed. I will do a test with a table 
with a 10-minute TTL, insert several
regions, wait for 1 day, and see whether old records indeed get removed 
or not.

Jimmy.

--------------------------------------------------
From: "Stack" <st...@duboce.net>
Sent: Wednesday, September 15, 2010 9:53 PM
To: <us...@hbase.apache.org>
Subject: Re: hbase doesn't delete data older than TTL in old regions

> On Wed, Sep 15, 2010 at 5:50 PM, Jinsong Hu <ji...@hotmail.com> 
> wrote:
>> One thing I am not clear about major compaction is that for the regions 
>> with
>> a single map file,
>> will hbase actually load it and remove the records older than TTL ?
>
> A major compaction will run even if there is only one file, unless that
> file is already the product of a major compaction (files that have been
> major compacted get a marker in their metadata so the next major
> compaction run can skip the file) AND the time since the last major
> compaction is < TTL (see
> http://hbase.apache.org/docs/r0.89.20100726/xref/org/apache/hadoop/hbase/regionserver/Store.html#743).
>
> The RegionServer runs a Major Compaction checking thread... it runs on a 
> period.
>
> So, it should be doing what you want (if a little crudely, given that it
> waits a TTL before rechecking files that are already major compacted).
>
> We could make an improvement by looking at the oldest timestamp every time
> we run the major compaction check.
>
> St.Ack
> 

Re: hbase doesn't delete data older than TTL in old regions

Posted by Stack <st...@duboce.net>.
On Wed, Sep 15, 2010 at 5:50 PM, Jinsong Hu <ji...@hotmail.com> wrote:
> One thing I am not clear about with major compaction is, for regions with
> a single map file,
> will HBase actually load the file and remove the records older than the TTL?

A major compaction will run even if there is only one file, unless that
file is already the product of a major compaction (files that have been major
compacted get a marker in their metadata so the next major
compaction run can skip the file) AND the time since the last major
compaction is < TTL (see
http://hbase.apache.org/docs/r0.89.20100726/xref/org/apache/hadoop/hbase/regionserver/Store.html#743).
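One way to read that rule, as an illustrative Python sketch (the function and parameter names are invented; the real logic lives in Store.java at the link above):

```python
def should_major_compact(num_files, already_major_compacted,
                         seconds_since_last_major, ttl_seconds):
    """Illustrative decision rule with invented names, not HBase's API.

    A single store file that is already the product of a major
    compaction is skipped only while less than one TTL has elapsed
    since that compaction; once a TTL has passed, the file is major
    compacted again so expired cells can be rewritten out.
    """
    single_fresh_file = (num_files == 1
                         and already_major_compacted
                         and seconds_since_last_major < ttl_seconds)
    return not single_fresh_file

# A lone, freshly major-compacted file is skipped; once a TTL has
# elapsed, or when there are several files, compaction runs again.
skip_case = should_major_compact(1, True, 60, ttl_seconds=600)
run_case = should_major_compact(1, True, 700, ttl_seconds=600)
```

This is also why the check is "a little crude": the store can sit untouched for a full TTL after a major compaction before expired cells are looked at again.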

The RegionServer runs a Major Compaction checking thread... it runs on a period.

So, it should be doing what you want (if a little crudely, given that it
waits a TTL before rechecking files that are already major compacted).

We could make an improvement by looking at the oldest timestamp every time
we run the major compaction check.

St.Ack

Re: hbase doesn't delete data older than TTL in old regions

Posted by Andrew Purtell <ap...@apache.org>.
> Unfortunately it confirmed my suspicion that the current TTL is
> implemented
> purely based on active compaction. And for a log
> table/history data table, the current implementation is not
> sufficient.

You continue to make that statement, but it is not accurate.

HBase respects TTL when returning answers. At no time will you see a value that has expired.

So it is not "purely based on active compaction". 
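The point that the TTL is also enforced on the read path, before compaction ever removes the data, can be sketched like this (an illustrative model with invented names, not HBase's real read path):

```python
import time

def visible_cells(cells, ttl_seconds, now=None):
    """Filter a scan/get result so expired cells never reach the client.

    cells: list of (row_key, value, timestamp_seconds). Even while a
    compaction has not yet rewritten the store files, reads apply the
    TTL, so an expired value is never returned to the client.
    """
    now = time.time() if now is None else now
    return [c for c in cells if now - c[2] < ttl_seconds]

on_disk = [("r1", "expired", 0), ("r1", "live", 900)]
# With now=1000 and a 600-second TTL, only the live cell is returned,
# even though the expired cell still physically exists on disk.
answer = visible_cells(on_disk, ttl_seconds=600, now=1000)
```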

Let us not be overly general in our language here. You are claiming a feature is broken when in fact it is not; it functions as advertised.

Best regards,

    - Andy

Why is this email five sentences or less?
http://five.sentenc.es/


--- On Wed, 9/15/10, Jinsong Hu <ji...@hotmail.com> wrote:

> From: Jinsong Hu <ji...@hotmail.com>
> Subject: Re: hbase doesn't delete data older than TTL in old regions
> To: apurtell@apache.org, user@hbase.apache.org
> Date: Wednesday, September 15, 2010, 5:50 PM
> I artificially set the TTL to 10 minutes
> so that I could get results quicker and didn't have to
> wait a day for them. The TTL was set to 600
> seconds (equal to 10 minutes) when I did the testing.
> 
> In real application, TTL will be set to several months to
> years.
> 
> One thing I am not clear about with major compaction is, for
> the regions with a single map file,
> will HBase actually load the file and remove the records older
> than the TTL? I read the documentation, and
> it doesn't seem to be the case. From an engineering point of
> view, it also doesn't make sense to run a compaction on a
> region that has only a single map file. The consequence is that,
> for those regions with a single map file and old data,
> the old data will never be dropped, even though it has well
> passed the TTL.
> 
> I designed the TTL test case to see whether it works under
> different scenarios and to figure out how it is actually done.
> Unfortunately it confirmed my suspicion that the current TTL is
> implemented
> purely based on active compaction. And for a log
> table/history data table, the current implementation is not
> sufficient.
> 
> Jimmy
> 
> --------------------------------------------------
> From: "Andrew Purtell" <ap...@apache.org>
> Sent: Wednesday, September 15, 2010 5:33 PM
> To: <us...@hbase.apache.org>
> Subject: Re: hbase doesn't delete data older than TTL in
> old regions
> 
> >>  I did a test with 2 key structure: 1. 
> time:random ,
> >> and  2. random:time.
> >> the TTL is set to 10 minutes. the time is current
> system
> >> time. the random is a random string with length
> 2-10
> >> characters long.
> > 
> > This use case doesn't make much sense the way HBase
> currently works. You can set the TTL to 10 minutes but
> default major compaction runs every 24 hours. This can be
> tuned down, I've run with it every 4 or 8 hours during
> various experiments with different operational conditions.
> However TTL is specified in seconds instead of milliseconds
> given the notion of the typical TTL being greater than the
> major compaction interval.
> > 
> > If TTL is so short, maybe it should not be flushed
> from memstore at all? Is that what you want?
> > 
> >    - Andy
> > 
> > 
> >> From: Jinsong Hu <ji...@hotmail.com>
> >> Subject: Re: hbase doesn't delete data older than
> TTL in old regions
> >> To: user@hbase.apache.org
> >> Date: Wednesday, September 15, 2010, 11:56 AM
> >> Hi, ryan:
> >>  I did a test with 2 key structure: 1. 
> time:random ,
> >> and  2. random:time.
> >> the TTL is set to 10 minutes. the time is current
> system
> >> time. the random is a random string with length
> 2-10
> >> characters long.
> >> 
> >>  I wrote a test program to continue to pump
> data into hbase
> >> table , with the time going up with time.
> >> for the second test case, the number of rows
> remains
> >> approximately constant after it reaches
> >> certain limit. I also checked a specific row, and
> wait to
> >> 20 minutes later and check it again
> >> and found it is indeed gone.
> >> 
> >> In the first key case, the number of rows continue
> to grow
> >> and the number of regions continue to grow..
> >> to some number much higher than first case, and
> doesn't
> >> stop. I checked some stores that  with data
> >> several  hours old and they still remain
> there without
> >> getting deleted.
> >> 
> >> Jimmy.
> >> 
> >>
> --------------------------------------------------
> >> From: "Ryan Rawson" <ry...@gmail.com>
> >> Sent: Wednesday, September 15, 2010 11:43 AM
> >> To: <us...@hbase.apache.org>
> >> Subject: Re: hbase doesn't delete data older than
> TTL in
> >> old regions
> >> 
> >> > I feel the need to pipe in here, since people
> are
> >> accusing hbase of
> >> > having a broken feature 'TTL' when from the
> >> description in this email
> >> > thread, and my own knowledge doesn't really
> describe a
> >> broken feature.
> >> > Non optimal maybe, but not broken.
> >> >
> >> > First off, the TTL feature works on the
> timestamp,
> >> thus rowkey
> >> > structure is not related.  This is
> because the
> >> timestamp is stored in
> >> > a different field.  If you are also
> storing the
> >> data in row key
> >> > chronological order, then you may end up with
> sparse
> >> or 'small'
> >> > regions.  But that doesn't mean the
> feature is
> >> broken - ie: it does
> >> > not remove data older than the TTL. 
> Needs tuning
> >> yes, but not broken.
> >> >
> >> > Also note that "client side deletes" work in
> the same
> >> way that TTL
> >> > does, you insert a tombstone marker, then a
> compaction
> >> actually purges
> >> > the data itself.
> >> >
> >> > -ryan
> >> >
> >> > On Wed, Sep 15, 2010 at 11:26 AM, Jinsong Hu
> <ji...@hotmail.com>
> >> wrote:
> >> >> I opened a ticket https://issues.apache.org/jira/browse/HBASE-2999 to
> >> track
> >> >> issue. dropping old store , and update
> the
> >> adjacent region's key range when
> >> >> all
> >> >> store for a region is gone is probably
> the
> >> cheapest solution, both in terms
> >> >> of coding and in terms of resource usage
> in the
> >> cluster. Do we know when
> >> >> this can be done ?
> >> >>
> >> >>
> >> >> Jimmy.
> >> >>
> >> >>
> >>
> --------------------------------------------------
> >> >> From: "Jonathan Gray" <jg...@facebook.com>
> >> >> Sent: Wednesday, September 15, 2010 11:06
> AM
> >> >> To: <us...@hbase.apache.org>
> >> >> Subject: RE: hbase doesn't delete data
> older than
> >> TTL in old regions
> >> >>
> >> >>> This sounds reasonable.
> >> >>>
> >> >>> We are tracking min/max timestamps
> in
> >> storefiles too, so it's possible
> >> >>> that we could expire some files of a
> region as
> >> well, even if the region was
> >> >>> not completely expired.
> >> >>>
> >> >>> Jinsong, mind filing a jira?
> >> >>>
> >> >>> JG
> >> >>>
> >> >>>> -----Original Message-----
> >> >>>> From: Jinsong Hu [mailto:jinsong_hu@hotmail.com]
> >> >>>> Sent: Wednesday, September 15,
> 2010 10:39
> >> AM
> >> >>>> To: user@hbase.apache.org
> >> >>>> Subject: Re: hbase doesn't delete
> data
> >> older than TTL in old regions
> >> >>>>
> >> >>>> Yes, Current TTL based on
> compaction is
> >> working as advertised if the
> >> >>>> key
> >> >>>> randomly distribute the incoming
> data
> >> >>>> among all regions.  However,
> if the
> >> key is designed in chronological
> >> >>>> order,
> >> >>>> the TTL doesn't really work,
> as  no
> >> compaction
> >> >>>> will happen for data already
> written. So
> >> we can't say  that current TTL
> >> >>>> really work as advertised, as it
> is key
> >> structure dependent.
> >> >>>>
> >> >>>> This is a pity, because a major
> use case
> >> for hbase is for people to
> >> >>>> store
> >> >>>> history or log data. normally
> people only
> >> >>>> want to retain the data for a
> fixed
> >> period. for example, US government
> >> >>>> default data retention policy is
> 7 years.
> >> Those
> >> >>>> data are saved in chronological
> order.
> >> Current TTL implementation
> >> >>>> doesn't
> >> >>>> work at all for those kind of use
> case.
> >> >>>>
> >> >>>> In order for that use case to
> really work,
> >> hbase needs to have an
> >> >>>> active
> >> >>>> thread that periodically runs and
> check if
> >> there
> >> >>>> are data older than TTL, and
> delete the
> >> data older than TTL is
> >> >>>> necessary,
> >> >>>> and compact small regions older
> than
> >> certain time period
> >> >>>> into larger ones to save system
> resource.
> >> It can optimize the deletion
> >> >>>> by
> >> >>>> delete the whole region if it
> detects that
> >> the last time
> >> >>>> stamp for the region is older
> than
> >> TTL.  There should be 2 parameters
> >> >>>> to
> >> >>>> configure for hbase:
> >> >>>>
> >> >>>> 1. whether to disable/enable the
> TTL
> >> thread.
> >> >>>> 2. the interval that TTL will
> run. maybe
> >> we can use a special value
> >> >>>> like 0
> >> >>>> to indicate that we don't run the
> TTL
> >> thread, thus saving one
> >> >>>> configuration
> >> >>>> parameter.
> >> >>>> for the default TTL, probably it
> should be
> >> set to 1 day.
> >> >>>> 3. How small will the region be
> merged. it
> >> should be a percentage of
> >> >>>> the
> >> >>>> store size. for example, if 2
> consecutive
> >> region is only 10% of the
> >> >>>> store
> >> >>>> szie ( default is 256M), we can
> initiate a
> >> region merge.  We probably
> >> >>>> need a
> >> >>>> parameter to reduce the merge
> too. for
> >> example , we only merge for
> >> >>>> regions
> >> >>>> who's largest timestamp
> >> >>>> is older than half of TTL.
> >> >>>>
> >> >>>>
> >> >>>> Jimmy
> >> >>>>
> >> >>>>
> >>
> --------------------------------------------------
> >> >>>> From: "Stack" <st...@duboce.net>
> >> >>>> Sent: Wednesday, September 15,
> 2010 10:08
> >> AM
> >> >>>> To: <us...@hbase.apache.org>
> >> >>>> Subject: Re: hbase doesn't delete
> data
> >> older than TTL in old regions
> >> >>>>
> >> >>>> > On Wed, Sep 15, 2010 at 9:54
> AM,
> >> Jinsong Hu <ji...@hotmail.com>
> >> >>>> > wrote:
> >> >>>> >> I have tested the TTL
> for hbase
> >> and found that it relies on
> >> >>>> compaction to
> >> >>>> >> remove old data .
> However, if a
> >> region has data that is older
> >> >>>> >> than TTL, and there is
> no trigger
> >> to compact it, then the data will
> >> >>>> >> remain
> >> >>>> >> there forever, wasting
> disk space
> >> and memory.
> >> >>>> >>
> >> >>>> >
> >> >>>> > So its working as advertised
> then?
> >> >>>> >
> >> >>>> > There's currently an issue
> where we
> >> can skip major compactions if
> >> >>>> your
> >> >>>> > write loading has a
> particular
> >> character: hbase-2990.
> >> >>>> >
> >> >>>> >
> >> >>>> >> It appears at this
> state, to
> >> really remove data older than TTL we
> >> >>>> need to
> >> >>>> >> start a client side
> deletion
> >> request.
> >> >>>> >
> >> >>>> > Or run a manual major
> compaction:
> >> >>>> >
> >> >>>> > $ echo "major_compact
> TABLENAME" |
> >> ./bin/hbase shell
> >> >>>> >
> >> >>>> >
> >> >>>> >
> >> >>>> > This is really a pity
> because
> >> >>>> >> it is an more expensive
> way to
> >> get the job done.  Another side
> >> >>>> effect of
> >> >>>> >> this is that as time
> goes on, we
> >> will end up with some small
> >> >>>> >> regions if the data are
> saved in
> >> chronological order in regions. It
> >> >>>> >> appears
> >> >>>> >> that hbase doesn't have
> a
> >> mechanism to merge 2 consecutive
> >> >>>> >> small regions into a
> bigger one
> >> at this time.
> >> >>>> >
> >> >>>> > $ ./bin/hbase
> >> org.apache.hadoop.hbase.util.Merge
> >> >>>> > Usage: bin/hbase merge
> >> <table-name> <region-1>
> <region-2>
> >> >>>> >
> >> >>>> > Currently only works on
> offlined
> >> table but there's a patch available
> >> >>>> > to make it run against
> onlined
> >> regions.
> >> >>>> >
> >> >>>> >
> >> >>>> > So if data is saved in
> >> >>>> >> chronological order,
> sooner or
> >> later we will run out of capacity ,
> >> >>>> even
> >> >>>> >> if
> >> >>>> >> the amount of data in
> hbase is
> >> small, because we have lots of
> >> >>>> regions
> >> >>>> >> with
> >> >>>> >> small storage space.
> >> >>>> >>
> >> >>>> >> A much cheaper way to
> remove data
> >> older than TTL would be to
> >> >>>> remember the
> >> >>>> >> latest timestamp for the
> region
> >> in the .META. table
> >> >>>> >> and if the time is older
> than
> >> TTL, we just adjust the row in .META.
> >> >>>> and
> >> >>>> >> delete the store ,
> without doing
> >> any compaction.
> >> >>>> >>
> >> >>>> >
> >> >>>> > Say more on the above. 
> It
> >> sounds promising.  Are you suggesting that
> >> >>>> > in addition to compactions
> that we
> >> also have a provision where we
> >> >>>> keep
> >> >>>> > account of a storefiles
> latest
> >> timestamp (we already do this I
> >> >>>> > believe) and that when now
> -
> >> storefile-timestamp > ttl, we just
> >> >>>> remove
> >> >>>> > the storefile
> wholesale.  That
> >> sounds like it could work, if that is
> >> >>>> > what you are
> suggesting.  Mind
> >> filing an issue w/ a detailed
> >> >>>> > description?
> >> >>>> >
> >> >>>> > Thanks,
> >> >>>> > St.Ack
> >> >>>> >
> >> >>>> >
> >> >>>> >
> >> >>>> >> Can this be added to the
> hbase
> >> requirement for future release ?
> >> >>>> >>
> >> >>>> >> Jimmy
> >> >>>> >>
> >> >>>> >>
> >> >>>> >>
> >> >>>> >
> >> >>>
> >> >>
> >> >
> >> 
> > 
> > 
> > 
> > 
> > 
> 


      


Re: hbase doesn't delete data older than TTL in old regions

Posted by Jinsong Hu <ji...@hotmail.com>.
I artificially set the TTL to 10 minutes so that I can get results quicker and don't have to wait a whole day. The TTL is set to 600 seconds (equal to 10 minutes) when I did the testing.

In a real application, the TTL will be set to several months or years.

One thing I am not clear about with major compaction is whether, for regions with a single map file, HBase will actually load the file and remove the records older than the TTL. I read the documentation and it doesn't seem to be the case. From an engineering point of view, it also doesn't make sense to run compaction on a region that has only a single map file. The consequence is that for regions with a single map file and old data, the old data will never be dropped even though it has well passed its TTL.
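
The wholesale-drop idea raised later in this thread (compare a store file's newest timestamp against the TTL and delete the whole file if it is expired) can be sketched as follows. This is a minimal illustration; the function name and parameters are hypothetical, not HBase API:

```python
import time

def expired_wholesale(max_timestamp_ms, ttl_seconds, now_ms=None):
    # True when even the NEWEST cell in the store file is past the TTL,
    # so the whole file could be deleted without a compaction rewrite.
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    return now_ms - max_timestamp_ms > ttl_seconds * 1000

now_ms = int(time.time() * 1000)
two_hours_old = now_ms - 2 * 3600 * 1000
print(expired_wholesale(two_hours_old, 600, now_ms))       # True under a 10-minute TTL
print(expired_wholesale(now_ms - 60 * 1000, 600, now_ms))  # False: only 1 minute old
```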

I designed the TTL test case to see whether it works under different scenarios and to figure out how it is actually done. Unfortunately it confirmed my suspicion that the current TTL is implemented purely through compaction. For log tables and historical data tables, the current implementation is not sufficient.

Jimmy

--------------------------------------------------
From: "Andrew Purtell" <ap...@apache.org>
Sent: Wednesday, September 15, 2010 5:33 PM
To: <us...@hbase.apache.org>
Subject: Re: hbase doesn't delete data older than TTL in old regions

>>  I did a test with 2 key structure: 1.  time:random ,
>> and  2. random:time.
>> the TTL is set to 10 minutes. the time is current system
>> time. the random is a random string with length 2-10
>> characters long.
>
> This use case doesn't make much sense the way HBase currently works. You 
> can set the TTL to 10 minutes but default major compaction runs every 24 
> hours. This can be tuned down, I've run with it every 4 or 8 hours during 
> various experiments with different operational conditions. However TTL is 
> specified in seconds instead of milliseconds given the notion of the 
> typical TTL being greater than the major compaction interval.
>
> If TTL is so short, maybe it should not be flushed from memstore at all? 
> Is that what you want?
>
>    - Andy
>
>
>> From: Jinsong Hu <ji...@hotmail.com>
>> Subject: Re: hbase doesn't delete data older than TTL in old regions
>> To: user@hbase.apache.org
>> Date: Wednesday, September 15, 2010, 11:56 AM
>> Hi, ryan:
>>  I did a test with 2 key structure: 1.  time:random ,
>> and  2. random:time.
>> the TTL is set to 10 minutes. the time is current system
>> time. the random is a random string with length 2-10
>> characters long.
>>
>>  I wrote a test program to continue to pump data into hbase
>> table , with the time going up with time.
>> for the second test case, the number of rows remains
>> approximately constant after it reaches
>> certain limit. I also checked a specific row, and wait to
>> 20 minutes later and check it again
>> and found it is indeed gone.
>>
>> In the first key case, the number of rows continue to grow
>> and the number of regions continue to grow..
>> to some number much higher than first case, and doesn't
>> stop. I checked some stores that  with data
>> several  hours old and they still remain there without
>> getting deleted.
>>
>> Jimmy.
>>
>> --------------------------------------------------
>> From: "Ryan Rawson" <ry...@gmail.com>
>> Sent: Wednesday, September 15, 2010 11:43 AM
>> To: <us...@hbase.apache.org>
>> Subject: Re: hbase doesn't delete data older than TTL in
>> old regions
>>
>> > I feel the need to pipe in here, since people are
>> accusing hbase of
>> > having a broken feature 'TTL' when from the
>> description in this email
>> > thread, and my own knowledge doesn't really describe a
>> broken feature.
>> > Non optimal maybe, but not broken.
>> >
>> > First off, the TTL feature works on the timestamp,
>> thus rowkey
>> > structure is not related.  This is because the
>> timestamp is stored in
>> > a different field.  If you are also storing the
>> data in row key
>> > chronological order, then you may end up with sparse
>> or 'small'
>> > regions.  But that doesn't mean the feature is
>> broken - ie: it does
>> > not remove data older than the TTL.  Needs tuning
>> yes, but not broken.
>> >
>> > Also note that "client side deletes" work in the same
>> way that TTL
>> > does, you insert a tombstone marker, then a compaction
>> actually purges
>> > the data itself.
>> >
>> > -ryan
>> >
>> > On Wed, Sep 15, 2010 at 11:26 AM, Jinsong Hu <ji...@hotmail.com>
>> wrote:
>> >> I opened a ticket https://issues.apache.org/jira/browse/HBASE-2999 to
>> track
>> >> issue. dropping old store , and update the
>> adjacent region's key range when
>> >> all
>> >> store for a region is gone is probably the
>> cheapest solution, both in terms
>> >> of coding and in terms of resource usage in the
>> cluster. Do we know when
>> >> this can be done ?
>> >>
>> >>
>> >> Jimmy.
>> >>
>> >>
>> --------------------------------------------------
>> >> From: "Jonathan Gray" <jg...@facebook.com>
>> >> Sent: Wednesday, September 15, 2010 11:06 AM
>> >> To: <us...@hbase.apache.org>
>> >> Subject: RE: hbase doesn't delete data older than
>> TTL in old regions
>> >>
>> >>> This sounds reasonable.
>> >>>
>> >>> We are tracking min/max timestamps in
>> storefiles too, so it's possible
>> >>> that we could expire some files of a region as
>> well, even if the region was
>> >>> not completely expired.
>> >>>
>> >>> Jinsong, mind filing a jira?
>> >>>
>> >>> JG
>> >>>
>> >>>> -----Original Message-----
>> >>>> From: Jinsong Hu [mailto:jinsong_hu@hotmail.com]
>> >>>> Sent: Wednesday, September 15, 2010 10:39
>> AM
>> >>>> To: user@hbase.apache.org
>> >>>> Subject: Re: hbase doesn't delete data
>> older than TTL in old regions
>> >>>>
>> >>>> Yes, Current TTL based on compaction is
>> working as advertised if the
>> >>>> key
>> >>>> randomly distribute the incoming data
>> >>>> among all regions.  However, if the
>> key is designed in chronological
>> >>>> order,
>> >>>> the TTL doesn't really work, as  no
>> compaction
>> >>>> will happen for data already written. So
>> we can't say  that current TTL
>> >>>> really work as advertised, as it is key
>> structure dependent.
>> >>>>
>> >>>> This is a pity, because a major use case
>> for hbase is for people to
>> >>>> store
>> >>>> history or log data. normally people only
>> >>>> want to retain the data for a fixed
>> period. for example, US government
>> >>>> default data retention policy is 7 years.
>> Those
>> >>>> data are saved in chronological order.
>> Current TTL implementation
>> >>>> doesn't
>> >>>> work at all for those kind of use case.
>> >>>>
>> >>>> In order for that use case to really work,
>> hbase needs to have an
>> >>>> active
>> >>>> thread that periodically runs and check if
>> there
>> >>>> are data older than TTL, and delete the
>> data older than TTL is
>> >>>> necessary,
>> >>>> and compact small regions older than
>> certain time period
>> >>>> into larger ones to save system resource.
>> It can optimize the deletion
>> >>>> by
>> >>>> delete the whole region if it detects that
>> the last time
>> >>>> stamp for the region is older than
>> TTL.  There should be 2 parameters
>> >>>> to
>> >>>> configure for hbase:
>> >>>>
>> >>>> 1. whether to disable/enable the TTL
>> thread.
>> >>>> 2. the interval that TTL will run. maybe
>> we can use a special value
>> >>>> like 0
>> >>>> to indicate that we don't run the TTL
>> thread, thus saving one
>> >>>> configuration
>> >>>> parameter.
>> >>>> for the default TTL, probably it should be
>> set to 1 day.
>> >>>> 3. How small will the region be merged. it
>> should be a percentage of
>> >>>> the
>> >>>> store size. for example, if 2 consecutive
>> region is only 10% of the
>> >>>> store
>> >>>> szie ( default is 256M), we can initiate a
>> region merge.  We probably
>> >>>> need a
>> >>>> parameter to reduce the merge too. for
>> example , we only merge for
>> >>>> regions
>> >>>> who's largest timestamp
>> >>>> is older than half of TTL.
>> >>>>
>> >>>>
>> >>>> Jimmy
>> >>>>
>> >>>>
>> --------------------------------------------------
>> >>>> From: "Stack" <st...@duboce.net>
>> >>>> Sent: Wednesday, September 15, 2010 10:08
>> AM
>> >>>> To: <us...@hbase.apache.org>
>> >>>> Subject: Re: hbase doesn't delete data
>> older than TTL in old regions
>> >>>>
>> >>>> > On Wed, Sep 15, 2010 at 9:54 AM,
>> Jinsong Hu <ji...@hotmail.com>
>> >>>> > wrote:
>> >>>> >> I have tested the TTL for hbase
>> and found that it relies on
>> >>>> compaction to
>> >>>> >> remove old data . However, if a
>> region has data that is older
>> >>>> >> than TTL, and there is no trigger
>> to compact it, then the data will
>> >>>> >> remain
>> >>>> >> there forever, wasting disk space
>> and memory.
>> >>>> >>
>> >>>> >
>> >>>> > So its working as advertised then?
>> >>>> >
>> >>>> > There's currently an issue where we
>> can skip major compactions if
>> >>>> your
>> >>>> > write loading has a particular
>> character: hbase-2990.
>> >>>> >
>> >>>> >
>> >>>> >> It appears at this state, to
>> really remove data older than TTL we
>> >>>> need to
>> >>>> >> start a client side deletion
>> request.
>> >>>> >
>> >>>> > Or run a manual major compaction:
>> >>>> >
>> >>>> > $ echo "major_compact TABLENAME" |
>> ./bin/hbase shell
>> >>>> >
>> >>>> >
>> >>>> >
>> >>>> > This is really a pity because
>> >>>> >> it is an more expensive way to
>> get the job done.  Another side
>> >>>> effect of
>> >>>> >> this is that as time goes on, we
>> will end up with some small
>> >>>> >> regions if the data are saved in
>> chronological order in regions. It
>> >>>> >> appears
>> >>>> >> that hbase doesn't have a
>> mechanism to merge 2 consecutive
>> >>>> >> small regions into a bigger one
>> at this time.
>> >>>> >
>> >>>> > $ ./bin/hbase
>> org.apache.hadoop.hbase.util.Merge
>> >>>> > Usage: bin/hbase merge
>> <table-name> <region-1> <region-2>
>> >>>> >
>> >>>> > Currently only works on offlined
>> table but there's a patch available
>> >>>> > to make it run against onlined
>> regions.
>> >>>> >
>> >>>> >
>> >>>> > So if data is saved in
>> >>>> >> chronological order, sooner or
>> later we will run out of capacity ,
>> >>>> even
>> >>>> >> if
>> >>>> >> the amount of data in hbase is
>> small, because we have lots of
>> >>>> regions
>> >>>> >> with
>> >>>> >> small storage space.
>> >>>> >>
>> >>>> >> A much cheaper way to remove data
>> older than TTL would be to
>> >>>> remember the
>> >>>> >> latest timestamp for the region
>> in the .META. table
>> >>>> >> and if the time is older than
>> TTL, we just adjust the row in .META.
>> >>>> and
>> >>>> >> delete the store , without doing
>> any compaction.
>> >>>> >>
>> >>>> >
>> >>>> > Say more on the above.  It
>> sounds promising.  Are you suggesting that
>> >>>> > in addition to compactions that we
>> also have a provision where we
>> >>>> keep
>> >>>> > account of a storefiles latest
>> timestamp (we already do this I
>> >>>> > believe) and that when now -
>> storefile-timestamp > ttl, we just
>> >>>> remove
>> >>>> > the storefile wholesale.  That
>> sounds like it could work, if that is
>> >>>> > what you are suggesting.  Mind
>> filing an issue w/ a detailed
>> >>>> > description?
>> >>>> >
>> >>>> > Thanks,
>> >>>> > St.Ack
>> >>>> >
>> >>>> >
>> >>>> >
>> >>>> >> Can this be added to the hbase
>> requirement for future release ?
>> >>>> >>
>> >>>> >> Jimmy
>> >>>> >>
>> >>>> >>
>> >>>> >>
>> >>>> >
>> >>>
>> >>
>> >
>>
>
>
>
>
> 

Re: hbase doesn't delete data older than TTL in old regions

Posted by Andrew Purtell <ap...@apache.org>.
>  I did a test with 2 key structure: 1.  time:random ,
> and  2. random:time.
> the TTL is set to 10 minutes. the time is current system
> time. the random is a random string with length 2-10
> characters long.

This use case doesn't make much sense the way HBase currently works. You can set the TTL to 10 minutes, but the default major compaction runs every 24 hours. This can be tuned down; I've run it every 4 or 8 hours during various experiments under different operational conditions. However, TTL is specified in seconds rather than milliseconds, reflecting the assumption that a typical TTL is greater than the major compaction interval.
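
As a rough bound (my own reading of the above, not an official figure), a cell can therefore outlive its TTL by up to roughly one major compaction interval, because expiry only takes effect when the files are rewritten:

```python
# Back-of-envelope: worst-case lifetime of a cell ~= TTL + major compaction interval.
ttl_s = 600                     # the 10-minute TTL used in the test above
major_compaction_s = 24 * 3600  # default major compaction period: 24 hours

worst_case_s = ttl_s + major_compaction_s
print(worst_case_s)  # 87000 seconds, i.e. just over 24 hours
```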

If the TTL is so short, maybe the data should not be flushed from the memstore at all? Is that what you want?

    - Andy


> From: Jinsong Hu <ji...@hotmail.com>
> Subject: Re: hbase doesn't delete data older than TTL in old regions
> To: user@hbase.apache.org
> Date: Wednesday, September 15, 2010, 11:56 AM
> Hi, ryan:
>  I did a test with 2 key structure: 1.  time:random ,
> and  2. random:time.
> the TTL is set to 10 minutes. the time is current system
> time. the random is a random string with length 2-10
> characters long.
> 
>  I wrote a test program to continue to pump data into hbase
> table , with the time going up with time.
> for the second test case, the number of rows remains
> approximately constant after it reaches
> certain limit. I also checked a specific row, and wait to
> 20 minutes later and check it again
> and found it is indeed gone.
> 
> In the first key case, the number of rows continue to grow
> and the number of regions continue to grow..
> to some number much higher than first case, and doesn't
> stop. I checked some stores that  with data
> several  hours old and they still remain there without
> getting deleted.
> 
> Jimmy.
> 
> --------------------------------------------------
> From: "Ryan Rawson" <ry...@gmail.com>
> Sent: Wednesday, September 15, 2010 11:43 AM
> To: <us...@hbase.apache.org>
> Subject: Re: hbase doesn't delete data older than TTL in
> old regions
> 
> > I feel the need to pipe in here, since people are
> accusing hbase of
> > having a broken feature 'TTL' when from the
> description in this email
> > thread, and my own knowledge doesn't really describe a
> broken feature.
> > Non optimal maybe, but not broken.
> > 
> > First off, the TTL feature works on the timestamp,
> thus rowkey
> > structure is not related.  This is because the
> timestamp is stored in
> > a different field.  If you are also storing the
> data in row key
> > chronological order, then you may end up with sparse
> or 'small'
> > regions.  But that doesn't mean the feature is
> broken - ie: it does
> > not remove data older than the TTL.  Needs tuning
> yes, but not broken.
> > 
> > Also note that "client side deletes" work in the same
> way that TTL
> > does, you insert a tombstone marker, then a compaction
> actually purges
> > the data itself.
> > 
> > -ryan
> > 
> > On Wed, Sep 15, 2010 at 11:26 AM, Jinsong Hu <ji...@hotmail.com>
> wrote:
> >> I opened a ticket https://issues.apache.org/jira/browse/HBASE-2999 to
> track
> >> issue. dropping old store , and update the
> adjacent region's key range when
> >> all
> >> store for a region is gone is probably the
> cheapest solution, both in terms
> >> of coding and in terms of resource usage in the
> cluster. Do we know when
> >> this can be done ?
> >> 
> >> 
> >> Jimmy.
> >> 
> >>
> --------------------------------------------------
> >> From: "Jonathan Gray" <jg...@facebook.com>
> >> Sent: Wednesday, September 15, 2010 11:06 AM
> >> To: <us...@hbase.apache.org>
> >> Subject: RE: hbase doesn't delete data older than
> TTL in old regions
> >> 
> >>> This sounds reasonable.
> >>> 
> >>> We are tracking min/max timestamps in
> storefiles too, so it's possible
> >>> that we could expire some files of a region as
> well, even if the region was
> >>> not completely expired.
> >>> 
> >>> Jinsong, mind filing a jira?
> >>> 
> >>> JG
> >>> 
> >>>> -----Original Message-----
> >>>> From: Jinsong Hu [mailto:jinsong_hu@hotmail.com]
> >>>> Sent: Wednesday, September 15, 2010 10:39
> AM
> >>>> To: user@hbase.apache.org
> >>>> Subject: Re: hbase doesn't delete data
> older than TTL in old regions
> >>>> 
> >>>> Yes, Current TTL based on compaction is
> working as advertised if the
> >>>> key
> >>>> randomly distribute the incoming data
> >>>> among all regions.  However, if the
> key is designed in chronological
> >>>> order,
> >>>> the TTL doesn't really work, as  no
> compaction
> >>>> will happen for data already written. So
> we can't say  that current TTL
> >>>> really work as advertised, as it is key
> structure dependent.
> >>>> 
> >>>> This is a pity, because a major use case
> for hbase is for people to
> >>>> store
> >>>> history or log data. normally people only
> >>>> want to retain the data for a fixed
> period. for example, US government
> >>>> default data retention policy is 7 years.
> Those
> >>>> data are saved in chronological order.
> Current TTL implementation
> >>>> doesn't
> >>>> work at all for those kind of use case.
> >>>> 
> >>>> In order for that use case to really work,
> hbase needs to have an
> >>>> active
> >>>> thread that periodically runs and check if
> there
> >>>> are data older than TTL, and delete the
> data older than TTL is
> >>>> necessary,
> >>>> and compact small regions older than
> certain time period
> >>>> into larger ones to save system resource.
> It can optimize the deletion
> >>>> by
> >>>> delete the whole region if it detects that
> the last time
> >>>> stamp for the region is older than
> TTL.  There should be 2 parameters
> >>>> to
> >>>> configure for hbase:
> >>>> 
> >>>> 1. whether to disable/enable the TTL
> thread.
> >>>> 2. the interval that TTL will run. maybe
> we can use a special value
> >>>> like 0
> >>>> to indicate that we don't run the TTL
> thread, thus saving one
> >>>> configuration
> >>>> parameter.
> >>>> for the default TTL, probably it should be
> set to 1 day.
> >>>> 3. How small will the region be merged. it
> should be a percentage of
> >>>> the
> >>>> store size. for example, if 2 consecutive
> region is only 10% of the
> >>>> store
> >>>> szie ( default is 256M), we can initiate a
> region merge.  We probably
> >>>> need a
> >>>> parameter to reduce the merge too. for
> example , we only merge for
> >>>> regions
> >>>> who's largest timestamp
> >>>> is older than half of TTL.
> >>>> 
> >>>> 
> >>>> Jimmy
> >>>> 
> >>>>
> --------------------------------------------------
> >>>> From: "Stack" <st...@duboce.net>
> >>>> Sent: Wednesday, September 15, 2010 10:08
> AM
> >>>> To: <us...@hbase.apache.org>
> >>>> Subject: Re: hbase doesn't delete data
> older than TTL in old regions
> >>>> 
> >>>> > On Wed, Sep 15, 2010 at 9:54 AM,
> Jinsong Hu <ji...@hotmail.com>
> >>>> > wrote:
> >>>> >> I have tested the TTL for hbase
> and found that it relies on
> >>>> compaction to
> >>>> >> remove old data . However, if a
> region has data that is older
> >>>> >> than TTL, and there is no trigger
> to compact it, then the data will
> >>>> >> remain
> >>>> >> there forever, wasting disk space
> and memory.
> >>>> >>
> >>>> >
> >>>> > So its working as advertised then?
> >>>> >
> >>>> > There's currently an issue where we
> can skip major compactions if
> >>>> your
> >>>> > write loading has a particular
> character: hbase-2990.
> >>>> >
> >>>> >
> >>>> >> It appears at this state, to
> really remove data older than TTL we
> >>>> need to
> >>>> >> start a client side deletion
> request.
> >>>> >
> >>>> > Or run a manual major compaction:
> >>>> >
> >>>> > $ echo "major_compact TABLENAME" |
> ./bin/hbase shell
> >>>> >
> >>>> >
> >>>> >
> >>>> > This is really a pity because
> >>>> >> it is an more expensive way to
> get the job done.  Another side
> >>>> effect of
> >>>> >> this is that as time goes on, we
> will end up with some small
> >>>> >> regions if the data are saved in
> chronological order in regions. It
> >>>> >> appears
> >>>> >> that hbase doesn't have a
> mechanism to merge 2 consecutive
> >>>> >> small regions into a bigger one
> at this time.
> >>>> >
> >>>> > $ ./bin/hbase
> org.apache.hadoop.hbase.util.Merge
> >>>> > Usage: bin/hbase merge
> <table-name> <region-1> <region-2>
> >>>> >
> >>>> > Currently only works on offlined
> table but there's a patch available
> >>>> > to make it run against onlined
> regions.
> >>>> >
> >>>> >
> >>>> > So if data is saved in
> >>>> >> chronological order, sooner or
> later we will run out of capacity ,
> >>>> even
> >>>> >> if
> >>>> >> the amount of data in hbase is
> small, because we have lots of
> >>>> regions
> >>>> >> with
> >>>> >> small storage space.
> >>>> >>
> >>>> >> A much cheaper way to remove data
> older than TTL would be to
> >>>> remember the
> >>>> >> latest timestamp for the region
> in the .META. table
> >>>> >> and if the time is older than
> TTL, we just adjust the row in .META.
> >>>> and
> >>>> >> delete the store , without doing
> any compaction.
> >>>> >>
> >>>> >
> >>>> > Say more on the above.  It
> sounds promising.  Are you suggesting that
> >>>> > in addition to compactions that we
> also have a provision where we
> >>>> keep
> >>>> > account of a storefiles latest
> timestamp (we already do this I
> >>>> > believe) and that when now -
> storefile-timestamp > ttl, we just
> >>>> remove
> >>>> > the storefile wholesale.  That
> sounds like it could work, if that is
> >>>> > what you are suggesting.  Mind
> filing an issue w/ a detailed
> >>>> > description?
> >>>> >
> >>>> > Thanks,
> >>>> > St.Ack
> >>>> >
> >>>> >
> >>>> >
> >>>> >> Can this be added to the hbase
> requirement for future release ?
> >>>> >>
> >>>> >> Jimmy
> >>>> >>
> >>>> >>
> >>>> >>
> >>>> >
> >>> 
> >> 
> > 
> 


      


Re: hbase doesn't delete data older than TTL in old regions

Posted by Jinsong Hu <ji...@hotmail.com>.
Hi, ryan:
  I did a test with two key structures: 1. time:random, and 2. random:time.
The TTL is set to 10 minutes. The time is the current system time, and the random part is a random string 2-10 characters long.

  I wrote a test program to continue to pump data into hbase table , with 
the time going up with time.
for the second test case, the number of rows remains approximately constant 
after it reaches
certain limit. I also checked a specific row, and wait to 20 minutes later 
and check it again
and found it is indeed gone.

In the first key case, the number of rows continue to grow and the number of 
regions continue to grow..
to some number much higher than first case, and doesn't stop. I checked some 
stores that  with data
several  hours old and they still remain there without getting deleted.
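
For reference, the two key layouts in this test can be sketched like this (a 
toy Python sketch with made-up helper names, not the actual test program):

```python
import random
import string

random.seed(42)

def random_part():
    """Random string, 2-10 characters long, as in the test above."""
    n = random.randint(2, 10)
    return "".join(random.choice(string.ascii_lowercase) for _ in range(n))

def time_first_key(ts_ms):
    """Layout 1, "time:random": keys sort chronologically, so every new
    write lands at the end of the key space, i.e. always in the newest
    region; old regions never see writes and so never compact."""
    return "%013d:%s" % (ts_ms, random_part())

def random_first_key(ts_ms):
    """Layout 2, "random:time": the random prefix scatters writes across
    all regions, so every region keeps flushing and compacting, and TTL
    expiry keeps up with the incoming data."""
    return "%s:%013d" % (random_part(), ts_ms)
```

With layout 1, only the last region gets traffic, which matches the 
observation above that old stores linger; with layout 2, compactions touch 
every region and expired rows disappear.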

Jimmy.

--------------------------------------------------
From: "Ryan Rawson" <ry...@gmail.com>
Sent: Wednesday, September 15, 2010 11:43 AM
To: <us...@hbase.apache.org>
Subject: Re: hbase doesn't delete data older than TTL in old regions

> I feel the need to pipe in here, since people are accusing hbase of
> having a broken feature 'TTL' when from the description in this email
> thread, and my own knowledge doesn't really describe a broken feature.
> Non optimal maybe, but not broken.
>
> First off, the TTL feature works on the timestamp, thus rowkey
> structure is not related.  This is because the timestamp is stored in
> a different field.  If you are also storing the data in row key
> chronological order, then you may end up with sparse or 'small'
> regions.  But that doesn't mean the feature is broken - ie: it does
> not remove data older than the TTL.  Needs tuning yes, but not broken.
>
> Also note that "client side deletes" work in the same way that TTL
> does, you insert a tombstone marker, then a compaction actually purges
> the data itself.
>
> -ryan
>
> On Wed, Sep 15, 2010 at 11:26 AM, Jinsong Hu <ji...@hotmail.com> 
> wrote:
>> I opened a ticket https://issues.apache.org/jira/browse/HBASE-2999 to 
>> track
>> issue. dropping old store , and update the adjacent region's key range 
>> when
>> all
>> store for a region is gone is probably the cheapest solution, both in 
>> terms
>> of coding and in terms of resource usage in the cluster. Do we know when
>> this can be done ?
>>
>>
>> Jimmy.
>>
>> --------------------------------------------------
>> From: "Jonathan Gray" <jg...@facebook.com>
>> Sent: Wednesday, September 15, 2010 11:06 AM
>> To: <us...@hbase.apache.org>
>> Subject: RE: hbase doesn't delete data older than TTL in old regions
>>
>>> This sounds reasonable.
>>>
>>> We are tracking min/max timestamps in storefiles too, so it's possible
>>> that we could expire some files of a region as well, even if the region 
>>> was
>>> not completely expired.
>>>
>>> Jinsong, mind filing a jira?
>>>
>>> JG
>>>
>>>> -----Original Message-----
>>>> From: Jinsong Hu [mailto:jinsong_hu@hotmail.com]
>>>> Sent: Wednesday, September 15, 2010 10:39 AM
>>>> To: user@hbase.apache.org
>>>> Subject: Re: hbase doesn't delete data older than TTL in old regions
>>>>
>>>> Yes, Current TTL based on compaction is working as advertised if the
>>>> key
>>>> randomly distribute the incoming data
>>>> among all regions.  However, if the key is designed in chronological
>>>> order,
>>>> the TTL doesn't really work, as  no compaction
>>>> will happen for data already written. So we can't say  that current TTL
>>>> really work as advertised, as it is key structure dependent.
>>>>
>>>> This is a pity, because a major use case for hbase is for people to
>>>> store
>>>> history or log data. normally people only
>>>> want to retain the data for a fixed period. for example, US government
>>>> default data retention policy is 7 years. Those
>>>> data are saved in chronological order. Current TTL implementation
>>>> doesn't
>>>> work at all for those kind of use case.
>>>>
>>>> In order for that use case to really work, hbase needs to have an
>>>> active
>>>> thread that periodically runs and check if there
>>>> are data older than TTL, and delete the data older than TTL is
>>>> necessary,
>>>> and compact small regions older than certain time period
>>>> into larger ones to save system resource. It can optimize the deletion
>>>> by
>>>> delete the whole region if it detects that the last time
>>>> stamp for the region is older than TTL.  There should be 2 parameters
>>>> to
>>>> configure for hbase:
>>>>
>>>> 1. whether to disable/enable the TTL thread.
>>>> 2. the interval that TTL will run. maybe we can use a special value
>>>> like 0
>>>> to indicate that we don't run the TTL thread, thus saving one
>>>> configuration
>>>> parameter.
>>>> for the default TTL, probably it should be set to 1 day.
>>>> 3. How small will the region be merged. it should be a percentage of
>>>> the
>>>> store size. for example, if 2 consecutive region is only 10% of the
>>>> store
>>>> size ( default is 256M), we can initiate a region merge.  We probably
>>>> need a
>>>> parameter to reduce the merge too. for example , we only merge for
>>>> regions
>>>> who's largest timestamp
>>>> is older than half of TTL.
>>>>
>>>>
>>>> Jimmy
>>>>
>>>> --------------------------------------------------
>>>> From: "Stack" <st...@duboce.net>
>>>> Sent: Wednesday, September 15, 2010 10:08 AM
>>>> To: <us...@hbase.apache.org>
>>>> Subject: Re: hbase doesn't delete data older than TTL in old regions
>>>>
>>>> > On Wed, Sep 15, 2010 at 9:54 AM, Jinsong Hu <ji...@hotmail.com>
>>>> > wrote:
>>>> >> I have tested the TTL for hbase and found that it relies on
>>>> compaction to
>>>> >> remove old data . However, if a region has data that is older
>>>> >> than TTL, and there is no trigger to compact it, then the data will
>>>> >> remain
>>>> >> there forever, wasting disk space and memory.
>>>> >>
>>>> >
>>>> > So its working as advertised then?
>>>> >
>>>> > There's currently an issue where we can skip major compactions if
>>>> your
>>>> > write loading has a particular character: hbase-2990.
>>>> >
>>>> >
>>>> >> It appears at this state, to really remove data older than TTL we
>>>> need to
>>>> >> start a client side deletion request.
>>>> >
>>>> > Or run a manual major compaction:
>>>> >
>>>> > $ echo "major_compact TABLENAME" | ./bin/hbase shell
>>>> >
>>>> >
>>>> >
>>>> > This is really a pity because
>>>> >> it is an more expensive way to get the job done.  Another side
>>>> effect of
>>>> >> this is that as time goes on, we will end up with some small
>>>> >> regions if the data are saved in chronological order in regions. It
>>>> >> appears
>>>> >> that hbase doesn't have a mechanism to merge 2 consecutive
>>>> >> small regions into a bigger one at this time.
>>>> >
>>>> > $ ./bin/hbase org.apache.hadoop.hbase.util.Merge
>>>> > Usage: bin/hbase merge <table-name> <region-1> <region-2>
>>>> >
>>>> > Currently only works on offlined table but there's a patch available
>>>> > to make it run against onlined regions.
>>>> >
>>>> >
>>>> > So if data is saved in
>>>> >> chronological order, sooner or later we will run out of capacity ,
>>>> even
>>>> >> if
>>>> >> the amount of data in hbase is small, because we have lots of
>>>> regions
>>>> >> with
>>>> >> small storage space.
>>>> >>
>>>> >> A much cheaper way to remove data older than TTL would be to
>>>> remember the
>>>> >> latest timestamp for the region in the .META. table
>>>> >> and if the time is older than TTL, we just adjust the row in .META.
>>>> and
>>>> >> delete the store , without doing any compaction.
>>>> >>
>>>> >
>>>> > Say more on the above.  It sounds promising.  Are you suggesting that
>>>> > in addition to compactions that we also have a provision where we
>>>> keep
>>>> > account of a storefiles latest timestamp (we already do this I
>>>> > believe) and that when now - storefile-timestamp > ttl, we just
>>>> remove
>>>> > the storefile wholesale.  That sounds like it could work, if that is
>>>> > what you are suggesting.  Mind filing an issue w/ a detailed
>>>> > description?
>>>> >
>>>> > Thanks,
>>>> > St.Ack
>>>> >
>>>> >
>>>> >
>>>> >> Can this be added to the hbase requirement for future release ?
>>>> >>
>>>> >> Jimmy
>>>> >>
>>>> >>
>>>> >>
>>>> >
>>>
>>
> 

Re: hbase doesn't delete data older than TTL in old regions

Posted by Andrew Purtell <ap...@apache.org>.
Yeah, indeed the TTL feature is not broken. It works as "advertised" if you understand how HBase internals work. 

But we can accommodate the expectations communicated on this thread, it sounds reasonable.

    - Andy


--- On Wed, 9/15/10, Ryan Rawson <ry...@gmail.com> wrote:

> From: Ryan Rawson <ry...@gmail.com>
> Subject: Re: hbase doesn't delete data older than TTL in old regions
> To: user@hbase.apache.org
> Date: Wednesday, September 15, 2010, 11:43 AM
> I feel the need to pipe in here,
> since people are accusing hbase of
> having a broken feature 'TTL' when from the description in
> this email
> thread, and my own knowledge doesn't really describe a
> broken feature.
>  Non optimal maybe, but not broken.
> 
> First off, the TTL feature works on the timestamp, thus
> rowkey
> structure is not related.  This is because the
> timestamp is stored in
> a different field.  If you are also storing the data
> in row key
> chronological order, then you may end up with sparse or
> 'small'
> regions.  But that doesn't mean the feature is broken
> - ie: it does
> not remove data older than the TTL.  Needs tuning yes,
> but not broken.
> 
> Also note that "client side deletes" work in the same way
> that TTL
> does, you insert a tombstone marker, then a compaction
> actually purges
> the data itself.
> 
> -ryan
> 
> On Wed, Sep 15, 2010 at 11:26 AM, Jinsong Hu <ji...@hotmail.com>
> wrote:
> > I opened a ticket https://issues.apache.org/jira/browse/HBASE-2999 to
> track
> > issue. dropping old store , and update the adjacent
> region's key range when
> > all
> > store for a region is gone is probably the cheapest
> solution, both in terms
> > of coding and in terms of resource usage in the
> cluster. Do we know when
> > this can be done ?
> >
> >
> > Jimmy.
> >
> > --------------------------------------------------
> > From: "Jonathan Gray" <jg...@facebook.com>
> > Sent: Wednesday, September 15, 2010 11:06 AM
> > To: <us...@hbase.apache.org>
> > Subject: RE: hbase doesn't delete data older than TTL
> in old regions
> >
> >> This sounds reasonable.
> >>
> >> We are tracking min/max timestamps in storefiles
> too, so it's possible
> >> that we could expire some files of a region as
> well, even if the region was
> >> not completely expired.
> >>
> >> Jinsong, mind filing a jira?
> >>
> >> JG
> >>
> >>> -----Original Message-----
> >>> From: Jinsong Hu [mailto:jinsong_hu@hotmail.com]
> >>> Sent: Wednesday, September 15, 2010 10:39 AM
> >>> To: user@hbase.apache.org
> >>> Subject: Re: hbase doesn't delete data older
> than TTL in old regions
> >>>
> >>> Yes, Current TTL based on compaction is
> working as advertised if the
> >>> key
> >>> randomly distribute the incoming data
> >>> among all regions.  However, if the key is
> designed in chronological
> >>> order,
> >>> the TTL doesn't really work, as  no
> compaction
> >>> will happen for data already written. So we
> can't say  that current TTL
> >>> really work as advertised, as it is key
> structure dependent.
> >>>
> >>> This is a pity, because a major use case for
> hbase is for people to
> >>> store
> >>> history or log data. normally people only
> >>> want to retain the data for a fixed period.
> for example, US government
> >>> default data retention policy is 7 years.
> Those
> >>> data are saved in chronological order. Current
> TTL implementation
> >>> doesn't
> >>> work at all for those kind of use case.
> >>>
> >>> In order for that use case to really work,
> hbase needs to have an
> >>> active
> >>> thread that periodically runs and check if
> there
> >>> are data older than TTL, and delete the data
> older than TTL is
> >>> necessary,
> >>> and compact small regions older than certain
> time period
> >>> into larger ones to save system resource. It
> can optimize the deletion
> >>> by
> >>> delete the whole region if it detects that the
> last time
> >>> stamp for the region is older than TTL.
>  There should be 2 parameters
> >>> to
> >>> configure for hbase:
> >>>
> >>> 1. whether to disable/enable the TTL thread.
> >>> 2. the interval that TTL will run. maybe we
> can use a special value
> >>> like 0
> >>> to indicate that we don't run the TTL thread,
> thus saving one
> >>> configuration
> >>> parameter.
> >>> for the default TTL, probably it should be set
> to 1 day.
> >>> 3. How small will the region be merged. it
> should be a percentage of
> >>> the
> >>> store size. for example, if 2 consecutive
> region is only 10% of the
> >>> store
> >>> size ( default is 256M), we can initiate a
> region merge.  We probably
> >>> need a
> >>> parameter to reduce the merge too. for example
> , we only merge for
> >>> regions
> >>> who's largest timestamp
> >>> is older than half of TTL.
> >>>
> >>>
> >>> Jimmy
> >>>
> >>>
> --------------------------------------------------
> >>> From: "Stack" <st...@duboce.net>
> >>> Sent: Wednesday, September 15, 2010 10:08 AM
> >>> To: <us...@hbase.apache.org>
> >>> Subject: Re: hbase doesn't delete data older
> than TTL in old regions
> >>>
> >>> > On Wed, Sep 15, 2010 at 9:54 AM, Jinsong
> Hu <ji...@hotmail.com>
> >>> > wrote:
> >>> >> I have tested the TTL for hbase and
> found that it relies on
> >>> compaction to
> >>> >> remove old data . However, if a
> region has data that is older
> >>> >> than TTL, and there is no trigger to
> compact it, then the data will
> >>> >> remain
> >>> >> there forever, wasting disk space and
> memory.
> >>> >>
> >>> >
> >>> > So its working as advertised then?
> >>> >
> >>> > There's currently an issue where we can
> skip major compactions if
> >>> your
> >>> > write loading has a particular character:
> hbase-2990.
> >>> >
> >>> >
> >>> >> It appears at this state, to really
> remove data older than TTL we
> >>> need to
> >>> >> start a client side deletion
> request.
> >>> >
> >>> > Or run a manual major compaction:
> >>> >
> >>> > $ echo "major_compact TABLENAME" |
> ./bin/hbase shell
> >>> >
> >>> >
> >>> >
> >>> > This is really a pity because
> >>> >> it is an more expensive way to get
> the job done.  Another side
> >>> effect of
> >>> >> this is that as time goes on, we will
> end up with some small
> >>> >> regions if the data are saved in
> chronological order in regions. It
> >>> >> appears
> >>> >> that hbase doesn't have a mechanism
> to merge 2 consecutive
> >>> >> small regions into a bigger one at
> this time.
> >>> >
> >>> > $ ./bin/hbase
> org.apache.hadoop.hbase.util.Merge
> >>> > Usage: bin/hbase merge <table-name>
> <region-1> <region-2>
> >>> >
> >>> > Currently only works on offlined table
> but there's a patch available
> >>> > to make it run against onlined regions.
> >>> >
> >>> >
> >>> > So if data is saved in
> >>> >> chronological order, sooner or later
> we will run out of capacity ,
> >>> even
> >>> >> if
> >>> >> the amount of data in hbase is small,
> because we have lots of
> >>> regions
> >>> >> with
> >>> >> small storage space.
> >>> >>
> >>> >> A much cheaper way to remove data
> older than TTL would be to
> >>> remember the
> >>> >> latest timestamp for the region in
> the .META. table
> >>> >> and if the time is older than TTL, we
> just adjust the row in .META.
> >>> and
> >>> >> delete the store , without doing any
> compaction.
> >>> >>
> >>> >
> >>> > Say more on the above.  It sounds
> promising.  Are you suggesting that
> >>> > in addition to compactions that we also
> have a provision where we
> >>> keep
> >>> > account of a storefiles latest timestamp
> (we already do this I
> >>> > believe) and that when now -
> storefile-timestamp > ttl, we just
> >>> remove
> >>> > the storefile wholesale.  That sounds
> like it could work, if that is
> >>> > what you are suggesting.  Mind filing an
> issue w/ a detailed
> >>> > description?
> >>> >
> >>> > Thanks,
> >>> > St.Ack
> >>> >
> >>> >
> >>> >
> >>> >> Can this be added to the hbase
> requirement for future release ?
> >>> >>
> >>> >> Jimmy
> >>> >>
> >>> >>
> >>> >>
> >>> >
> >>
> >
> 


Re: hbase doesn't delete data older than TTL in old regions

Posted by Ryan Rawson <ry...@gmail.com>.
I feel the need to pipe in here, since people are accusing hbase of
having a broken feature 'TTL' when neither the description in this email
thread nor my own knowledge really describes a broken feature.
Non-optimal maybe, but not broken.

First off, the TTL feature works on the cell timestamp, so rowkey
structure is not related.  This is because the timestamp is stored in
a separate field.  If you are also storing the data in row-key
chronological order, then you may end up with sparse or 'small'
regions.  But that doesn't mean the feature is broken, i.e., that it
fails to remove data older than the TTL.  Needs tuning, yes, but not broken.

Also note that "client side deletes" work the same way that TTL
does: you insert a tombstone marker, then a compaction actually purges
the data itself.
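
Ryan's point, that TTL is evaluated against each cell's timestamp during 
compaction rather than against the row key, can be modeled roughly as 
follows (a simplified Python illustration, not the actual HBase code):

```python
def is_expired(cell_ts_ms, now_ms, ttl_s):
    """A cell is droppable once its own timestamp (not anything in the
    row key) is older than the column family's TTL."""
    return now_ms - cell_ts_ms > ttl_s * 1000

def compact(cells, now_ms, ttl_s=600):
    """Rewrite a store, keeping only unexpired (row, timestamp) cells;
    the data only physically disappears when a compaction like this
    actually runs."""
    return [(row, ts) for (row, ts) in cells
            if not is_expired(ts, now_ms, ttl_s)]
```

Client-side deletes behave analogously: the tombstone only marks the cell, 
and the same compaction pass is what finally purges it.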

-ryan

On Wed, Sep 15, 2010 at 11:26 AM, Jinsong Hu <ji...@hotmail.com> wrote:
> I opened a ticket https://issues.apache.org/jira/browse/HBASE-2999 to track
> issue. dropping old store , and update the adjacent region's key range when
> all
> store for a region is gone is probably the cheapest solution, both in terms
> of coding and in terms of resource usage in the cluster. Do we know when
> this can be done ?
>
>
> Jimmy.
>
> --------------------------------------------------
> From: "Jonathan Gray" <jg...@facebook.com>
> Sent: Wednesday, September 15, 2010 11:06 AM
> To: <us...@hbase.apache.org>
> Subject: RE: hbase doesn't delete data older than TTL in old regions
>
>> This sounds reasonable.
>>
>> We are tracking min/max timestamps in storefiles too, so it's possible
>> that we could expire some files of a region as well, even if the region was
>> not completely expired.
>>
>> Jinsong, mind filing a jira?
>>
>> JG
>>
>>> -----Original Message-----
>>> From: Jinsong Hu [mailto:jinsong_hu@hotmail.com]
>>> Sent: Wednesday, September 15, 2010 10:39 AM
>>> To: user@hbase.apache.org
>>> Subject: Re: hbase doesn't delete data older than TTL in old regions
>>>
>>> Yes, Current TTL based on compaction is working as advertised if the
>>> key
>>> randomly distribute the incoming data
>>> among all regions.  However, if the key is designed in chronological
>>> order,
>>> the TTL doesn't really work, as  no compaction
>>> will happen for data already written. So we can't say  that current TTL
>>> really work as advertised, as it is key structure dependent.
>>>
>>> This is a pity, because a major use case for hbase is for people to
>>> store
>>> history or log data. normally people only
>>> want to retain the data for a fixed period. for example, US government
>>> default data retention policy is 7 years. Those
>>> data are saved in chronological order. Current TTL implementation
>>> doesn't
>>> work at all for those kind of use case.
>>>
>>> In order for that use case to really work, hbase needs to have an
>>> active
>>> thread that periodically runs and check if there
>>> are data older than TTL, and delete the data older than TTL is
>>> necessary,
>>> and compact small regions older than certain time period
>>> into larger ones to save system resource. It can optimize the deletion
>>> by
>>> delete the whole region if it detects that the last time
>>> stamp for the region is older than TTL.  There should be 2 parameters
>>> to
>>> configure for hbase:
>>>
>>> 1. whether to disable/enable the TTL thread.
>>> 2. the interval that TTL will run. maybe we can use a special value
>>> like 0
>>> to indicate that we don't run the TTL thread, thus saving one
>>> configuration
>>> parameter.
>>> for the default TTL, probably it should be set to 1 day.
>>> 3. How small will the region be merged. it should be a percentage of
>>> the
>>> store size. for example, if 2 consecutive region is only 10% of the
>>> store
>>> size ( default is 256M), we can initiate a region merge.  We probably
>>> need a
>>> parameter to reduce the merge too. for example , we only merge for
>>> regions
>>> who's largest timestamp
>>> is older than half of TTL.
>>>
>>>
>>> Jimmy
>>>
>>> --------------------------------------------------
>>> From: "Stack" <st...@duboce.net>
>>> Sent: Wednesday, September 15, 2010 10:08 AM
>>> To: <us...@hbase.apache.org>
>>> Subject: Re: hbase doesn't delete data older than TTL in old regions
>>>
>>> > On Wed, Sep 15, 2010 at 9:54 AM, Jinsong Hu <ji...@hotmail.com>
>>> > wrote:
>>> >> I have tested the TTL for hbase and found that it relies on
>>> compaction to
>>> >> remove old data . However, if a region has data that is older
>>> >> than TTL, and there is no trigger to compact it, then the data will
>>> >> remain
>>> >> there forever, wasting disk space and memory.
>>> >>
>>> >
>>> > So its working as advertised then?
>>> >
>>> > There's currently an issue where we can skip major compactions if
>>> your
>>> > write loading has a particular character: hbase-2990.
>>> >
>>> >
>>> >> It appears at this state, to really remove data older than TTL we
>>> need to
>>> >> start a client side deletion request.
>>> >
>>> > Or run a manual major compaction:
>>> >
>>> > $ echo "major_compact TABLENAME" | ./bin/hbase shell
>>> >
>>> >
>>> >
>>> > This is really a pity because
>>> >> it is an more expensive way to get the job done.  Another side
>>> effect of
>>> >> this is that as time goes on, we will end up with some small
>>> >> regions if the data are saved in chronological order in regions. It
>>> >> appears
>>> >> that hbase doesn't have a mechanism to merge 2 consecutive
>>> >> small regions into a bigger one at this time.
>>> >
>>> > $ ./bin/hbase org.apache.hadoop.hbase.util.Merge
>>> > Usage: bin/hbase merge <table-name> <region-1> <region-2>
>>> >
>>> > Currently only works on offlined table but there's a patch available
>>> > to make it run against onlined regions.
>>> >
>>> >
>>> > So if data is saved in
>>> >> chronological order, sooner or later we will run out of capacity ,
>>> even
>>> >> if
>>> >> the amount of data in hbase is small, because we have lots of
>>> regions
>>> >> with
>>> >> small storage space.
>>> >>
>>> >> A much cheaper way to remove data older than TTL would be to
>>> remember the
>>> >> latest timestamp for the region in the .META. table
>>> >> and if the time is older than TTL, we just adjust the row in .META.
>>> and
>>> >> delete the store , without doing any compaction.
>>> >>
>>> >
>>> > Say more on the above.  It sounds promising.  Are you suggesting that
>>> > in addition to compactions that we also have a provision where we
>>> keep
>>> > account of a storefiles latest timestamp (we already do this I
>>> > believe) and that when now - storefile-timestamp > ttl, we just
>>> remove
>>> > the storefile wholesale.  That sounds like it could work, if that is
>>> > what you are suggesting.  Mind filing an issue w/ a detailed
>>> > description?
>>> >
>>> > Thanks,
>>> > St.Ack
>>> >
>>> >
>>> >
>>> >> Can this be added to the hbase requirement for future release ?
>>> >>
>>> >> Jimmy
>>> >>
>>> >>
>>> >>
>>> >
>>
>

Re: hbase doesn't delete data older than TTL in old regions

Posted by Jinsong Hu <ji...@hotmail.com>.
I opened a ticket https://issues.apache.org/jira/browse/HBASE-2999 to track 
this issue. Dropping the old store, and updating the adjacent region's key 
range once all stores for a region are gone, is probably the cheapest 
solution, both in terms of coding and in terms of resource usage in the 
cluster. Do we know when this can be done?
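
Sketched concretely, the HBASE-2999 idea of dropping a fully expired region 
and folding its key range into the next region might look like this (a toy 
model of region metadata, not HBase internals):

```python
def drop_expired_regions(regions, now_ms, ttl_ms):
    """regions: ordered list of dicts with 'start', 'end', 'max_ts'.
    When every cell in a region is past TTL (its max_ts is old enough),
    delete the region's store outright and extend the next region's
    start key so the key space stays contiguous; no compaction needed."""
    kept = []
    inherited_start = None  # start key carried over from dropped neighbors
    for r in regions:
        if now_ms - r["max_ts"] > ttl_ms:
            if inherited_start is None:
                inherited_start = r["start"]
            continue  # whole region expired: drop it
        if inherited_start is not None:
            r = dict(r, start=inherited_start)
            inherited_start = None
        kept.append(r)
    return kept
```

The cheap part is that nothing is rewritten: the expired store is deleted 
and only the neighbor's .META. row would need its key range adjusted.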


Jimmy.

--------------------------------------------------
From: "Jonathan Gray" <jg...@facebook.com>
Sent: Wednesday, September 15, 2010 11:06 AM
To: <us...@hbase.apache.org>
Subject: RE: hbase doesn't delete data older than TTL in old regions

> This sounds reasonable.
>
> We are tracking min/max timestamps in storefiles too, so it's possible 
> that we could expire some files of a region as well, even if the region 
> was not completely expired.
>
> Jinsong, mind filing a jira?
>
> JG
>
>> -----Original Message-----
>> From: Jinsong Hu [mailto:jinsong_hu@hotmail.com]
>> Sent: Wednesday, September 15, 2010 10:39 AM
>> To: user@hbase.apache.org
>> Subject: Re: hbase doesn't delete data older than TTL in old regions
>>
>> Yes, Current TTL based on compaction is working as advertised if the
>> key
>> randomly distribute the incoming data
>> among all regions.  However, if the key is designed in chronological
>> order,
>> the TTL doesn't really work, as  no compaction
>> will happen for data already written. So we can't say  that current TTL
>> really work as advertised, as it is key structure dependent.
>>
>> This is a pity, because a major use case for hbase is for people to
>> store
>> history or log data. normally people only
>> want to retain the data for a fixed period. for example, US government
>> default data retention policy is 7 years. Those
>> data are saved in chronological order. Current TTL implementation
>> doesn't
>> work at all for those kind of use case.
>>
>> In order for that use case to really work, hbase needs to have an
>> active
>> thread that periodically runs and check if there
>> are data older than TTL, and delete the data older than TTL is
>> necessary,
>> and compact small regions older than certain time period
>> into larger ones to save system resource. It can optimize the deletion
>> by
>> delete the whole region if it detects that the last time
>> stamp for the region is older than TTL.  There should be 2 parameters
>> to
>> configure for hbase:
>>
>> 1. whether to disable/enable the TTL thread.
>> 2. the interval that TTL will run. maybe we can use a special value
>> like 0
>> to indicate that we don't run the TTL thread, thus saving one
>> configuration
>> parameter.
>> for the default TTL, probably it should be set to 1 day.
>> 3. How small will the region be merged. it should be a percentage of
>> the
>> store size. for example, if 2 consecutive region is only 10% of the
>> store
>> size ( default is 256M), we can initiate a region merge.  We probably
>> need a
>> parameter to reduce the merge too. for example , we only merge for
>> regions
>> who's largest timestamp
>> is older than half of TTL.
>>
>>
>> Jimmy
>>
>> --------------------------------------------------
>> From: "Stack" <st...@duboce.net>
>> Sent: Wednesday, September 15, 2010 10:08 AM
>> To: <us...@hbase.apache.org>
>> Subject: Re: hbase doesn't delete data older than TTL in old regions
>>
>> > On Wed, Sep 15, 2010 at 9:54 AM, Jinsong Hu <ji...@hotmail.com>
>> > wrote:
>> >> I have tested the TTL for hbase and found that it relies on
>> compaction to
>> >> remove old data . However, if a region has data that is older
>> >> than TTL, and there is no trigger to compact it, then the data will
>> >> remain
>> >> there forever, wasting disk space and memory.
>> >>
>> >
>> > So its working as advertised then?
>> >
>> > There's currently an issue where we can skip major compactions if
>> your
>> > write loading has a particular character: hbase-2990.
>> >
>> >
>> >> It appears at this state, to really remove data older than TTL we
>> need to
>> >> start a client side deletion request.
>> >
>> > Or run a manual major compaction:
>> >
>> > $ echo "major_compact TABLENAME" | ./bin/hbase shell
>> >
>> >
>> >
>> > This is really a pity because
>> >> it is an more expensive way to get the job done.  Another side
>> effect of
>> >> this is that as time goes on, we will end up with some small
>> >> regions if the data are saved in chronological order in regions. It
>> >> appears
>> >> that hbase doesn't have a mechanism to merge 2 consecutive
>> >> small regions into a bigger one at this time.
>> >
>> > $ ./bin/hbase org.apache.hadoop.hbase.util.Merge
>> > Usage: bin/hbase merge <table-name> <region-1> <region-2>
>> >
>> > Currently only works on offlined table but there's a patch available
>> > to make it run against onlined regions.
>> >
>> >
>> > So if data is saved in
>> >> chronological order, sooner or later we will run out of capacity ,
>> even
>> >> if
>> >> the amount of data in hbase is small, because we have lots of
>> regions
>> >> with
>> >> small storage space.
>> >>
>> >> A much cheaper way to remove data older than TTL would be to
>> remember the
>> >> latest timestamp for the region in the .META. table
>> >> and if the time is older than TTL, we just adjust the row in .META.
>> and
>> >> delete the store , without doing any compaction.
>> >>
>> >
>> > Say more on the above.  It sounds promising.  Are you suggesting that
>> > in addition to compactions that we also have a provision where we
>> keep
>> > account of a storefiles latest timestamp (we already do this I
>> > believe) and that when now - storefile-timestamp > ttl, we just
>> remove
>> > the storefile wholesale.  That sounds like it could work, if that is
>> > what you are suggesting.  Mind filing an issue w/ a detailed
>> > description?
>> >
>> > Thanks,
>> > St.Ack
>> >
>> >
>> >
>> >> Can this be added to the hbase requirement for future release ?
>> >>
>> >> Jimmy
>> >>
>> >>
>> >>
>> >
> 

RE: hbase doesn't delete data older than TTL in old regions

Posted by Jonathan Gray <jg...@facebook.com>.
This sounds reasonable.

We are tracking min/max timestamps in storefiles too, so it's possible that we could expire some files of a region as well, even if the region was not completely expired.
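
The whole-file variant mentioned above, using the tracked max timestamp to 
discard a storefile wholesale once even its newest cell is past TTL, might 
be sketched like this (assumed data layout, not the real storefile metadata 
API):

```python
def expirable_storefiles(storefiles, now_ms, ttl_ms):
    """storefiles: list of (name, min_ts, max_ts) tuples. A file whose
    newest cell is already older than TTL contains nothing worth
    keeping, so it can be deleted outright instead of being rewritten
    by a compaction."""
    return [name for (name, _min_ts, max_ts) in storefiles
            if now_ms - max_ts > ttl_ms]
```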

Jinsong, mind filing a jira?

JG

> -----Original Message-----
> From: Jinsong Hu [mailto:jinsong_hu@hotmail.com]
> Sent: Wednesday, September 15, 2010 10:39 AM
> To: user@hbase.apache.org
> Subject: Re: hbase doesn't delete data older than TTL in old regions
> 
> Yes, Current TTL based on compaction is working as advertised if the
> key
> randomly distribute the incoming data
> among all regions.  However, if the key is designed in chronological
> order,
> the TTL doesn't really work, as  no compaction
> will happen for data already written. So we can't say  that current TTL
> really work as advertised, as it is key structure dependent.
> 
> This is a pity, because a major use case for hbase is for people to
> store
> history or log data. normally people only
> want to retain the data for a fixed period. for example, US government
> default data retention policy is 7 years. Those
> data are saved in chronological order. Current TTL implementation
> doesn't
> work at all for those kind of use case.
> 
> In order for that use case to really work, hbase needs to have an
> active
> thread that periodically runs and check if there
> are data older than TTL, and delete the data older than TTL is
> necessary,
> and compact small regions older than certain time period
> into larger ones to save system resource. It can optimize the deletion
> by
> delete the whole region if it detects that the last time
> stamp for the region is older than TTL.  There should be 2 parameters
> to
> configure for hbase:
> 
> 1. whether to disable/enable the TTL thread.
> 2. the interval that TTL will run. maybe we can use a special value
> like 0
> to indicate that we don't run the TTL thread, thus saving one
> configuration
> parameter.
> for the default TTL, probably it should be set to 1 day.
> 3. How small will the region be merged. it should be a percentage of
> the
> store size. for example, if 2 consecutive region is only 10% of the
> store
> size ( default is 256M), we can initiate a region merge.  We probably
> need a
> parameter to reduce the merge too. for example , we only merge for
> regions
> who's largest timestamp
> is older than half of TTL.
> 
> 
> Jimmy
> 
> --------------------------------------------------
> From: "Stack" <st...@duboce.net>
> Sent: Wednesday, September 15, 2010 10:08 AM
> To: <us...@hbase.apache.org>
> Subject: Re: hbase doesn't delete data older than TTL in old regions
> 
> > On Wed, Sep 15, 2010 at 9:54 AM, Jinsong Hu <ji...@hotmail.com>
> > wrote:
> >> I have tested the TTL for hbase and found that it relies on
> compaction to
> >> remove old data . However, if a region has data that is older
> >> than TTL, and there is no trigger to compact it, then the data will
> >> remain
> >> there forever, wasting disk space and memory.
> >>
> >
> > So it's working as advertised then?
> >
> > There's currently an issue where we can skip major compactions if
> your
> > write loading has a particular character: hbase-2990.
> >
> >
> >> It appears at this state, to really remove data older than TTL we
> need to
> >> start a client side deletion request.
> >
> > Or run a manual major compaction:
> >
> > $ echo "major_compact TABLENAME" | ./bin/hbase shell
> >
> >
> >
> > This is really a pity because
> >> it is an more expensive way to get the job done.  Another side
> effect of
> >> this is that as time goes on, we will end up with some small
> >> regions if the data are saved in chronological order in regions. It
> >> appears
> >> that hbase doesn't have a mechanism to merge 2 consecutive
> >> small regions into a bigger one at this time.
> >
> > $ ./bin/hbase org.apache.hadoop.hbase.util.Merge
> > Usage: bin/hbase merge <table-name> <region-1> <region-2>
> >
> > Currently only works on offlined table but there's a patch available
> > to make it run against onlined regions.
> >
> >
> > So if data is saved in
> >> chronological order, sooner or later we will run out of capacity ,
> even
> >> if
> >> the amount of data in hbase is small, because we have lots of
> regions
> >> with
> >> small storage space.
> >>
> >> A much cheaper way to remove data older than TTL would be to
> remember the
> >> latest timestamp for the region in the .META. table
> >> and if the time is older than TTL, we just adjust the row in .META.
> and
> >> delete the store , without doing any compaction.
> >>
> >
> > Say more on the above.  It sounds promising.  Are you suggesting that
> > in addition to compactions that we also have a provision where we
> keep
> > account of a storefile's latest timestamp (we already do this I
> > believe) and that when now - storefile-timestamp > ttl, we just
> remove
> > the storefile wholesale.  That sounds like it could work, if that is
> > what you are suggesting.  Mind filing an issue w/ a detailed
> > description?
> >
> > Thanks,
> > St.Ack
> >
> >
> >
> >> Can this be added to the hbase requirement for future release ?
> >>
> >> Jimmy
> >>
> >>
> >>
> >

Re: hbase doesn't delete data older than TTL in old regions

Posted by Jinsong Hu <ji...@hotmail.com>.
Yes, the current TTL based on compaction works as advertised if the key 
randomly distributes the incoming data among all regions.  However, if 
the key is designed in chronological order, the TTL doesn't really work, 
as no compaction will happen for data already written. So we can't say 
that the current TTL really works as advertised, as it is key-structure 
dependent.

This is a pity, because a major use case for hbase is for people to store 
history or log data. Normally people only want to retain the data for a 
fixed period; for example, the US government default data retention 
policy is 7 years. Such data are saved in chronological order. The 
current TTL implementation doesn't work at all for that kind of use case.

In order for that use case to really work, hbase needs an active thread 
that runs periodically, checks whether there is data older than TTL, 
deletes it if necessary, and compacts small regions older than a certain 
time period into larger ones to save system resources. It can optimize 
the deletion by deleting a whole region if it detects that the latest 
timestamp for the region is older than TTL.  There should be 3 parameters 
to configure for hbase:

1. whether to enable/disable the TTL thread.
2. the interval at which the TTL thread runs. Maybe we can use a special 
value like 0 to indicate that we don't run the TTL thread, thus saving 
one configuration parameter. For the default interval, probably it 
should be set to 1 day.
3. how small a region must be before it is merged. It should be a 
percentage of the store size; for example, if 2 consecutive regions 
together are only 10% of the store size (default is 256M), we can 
initiate a region merge. We probably need a parameter to throttle the 
merges too; for example, we only merge regions whose largest timestamp 
is older than half of the TTL.
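For now, the closest workaround I can see is to force a major compaction on 
a schedule, which drops the cells older than TTL as a side effect. A rough 
sketch (the table name "event_log" and the paths are made up for 
illustration; quoting of the table name may vary by shell version):

```shell
#!/bin/bash
# Workaround sketch until an age-based cleanup thread exists: force a
# major compaction on a schedule so cells older than TTL get dropped.
# "event_log" and the paths in the comments are illustrative only.

# Build the statement we would pipe into the hbase shell.
build_compact_cmd() {
  printf "major_compact '%s'\n" "$1"
}

# Against a live cluster (HBASE_HOME is an assumed install path):
#   build_compact_cmd event_log | "$HBASE_HOME"/bin/hbase shell
# and from cron, e.g. nightly at 03:00:
#   0 3 * * *  echo "major_compact 'event_log'" | /opt/hbase/bin/hbase shell
build_compact_cmd event_log
```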


Jimmy

--------------------------------------------------
From: "Stack" <st...@duboce.net>
Sent: Wednesday, September 15, 2010 10:08 AM
To: <us...@hbase.apache.org>
Subject: Re: hbase doesn't delete data older than TTL in old regions

> On Wed, Sep 15, 2010 at 9:54 AM, Jinsong Hu <ji...@hotmail.com> 
> wrote:
>> I have tested the TTL for hbase and found that it relies on compaction to
>> remove old data . However, if a region has data that is older
>> than TTL, and there is no trigger to compact it, then the data will 
>> remain
>> there forever, wasting disk space and memory.
>>
>
> So it's working as advertised then?
>
> There's currently an issue where we can skip major compactions if your
> write loading has a particular character: hbase-2990.
>
>
>> It appears at this state, to really remove data older than TTL we need to
>> start a client side deletion request.
>
> Or run a manual major compaction:
>
> $ echo "major_compact TABLENAME" | ./bin/hbase shell
>
>
>
> This is really a pity because
>> it is an more expensive way to get the job done.  Another side effect of
>> this is that as time goes on, we will end up with some small
>> regions if the data are saved in chronological order in regions. It 
>> appears
>> that hbase doesn't have a mechanism to merge 2 consecutive
>> small regions into a bigger one at this time.
>
> $ ./bin/hbase org.apache.hadoop.hbase.util.Merge
> Usage: bin/hbase merge <table-name> <region-1> <region-2>
>
> Currently only works on offlined table but there's a patch available
> to make it run against onlined regions.
>
>
> So if data is saved in
>> chronological order, sooner or later we will run out of capacity , even 
>> if
>> the amount of data in hbase is small, because we have lots of regions 
>> with
>> small storage space.
>>
>> A much cheaper way to remove data older than TTL would be to remember the
>> latest timestamp for the region in the .META. table
>> and if the time is older than TTL, we just adjust the row in .META. and
>> delete the store , without doing any compaction.
>>
>
> Say more on the above.  It sounds promising.  Are you suggesting that
> in addition to compactions that we also have a provision where we keep
> account of a storefile's latest timestamp (we already do this I
> believe) and that when now - storefile-timestamp > ttl, we just remove
> the storefile wholesale.  That sounds like it could work, if that is
> what you are suggesting.  Mind filing an issue w/ a detailed
> description?
>
> Thanks,
> St.Ack
>
>
>
>> Can this be added to the hbase requirement for future release ?
>>
>> Jimmy
>>
>>
>>
> 

Re: hbase doesn't delete data older than TTL in old regions

Posted by Stack <st...@duboce.net>.
On Wed, Sep 15, 2010 at 9:54 AM, Jinsong Hu <ji...@hotmail.com> wrote:
> I have tested the TTL for hbase and found that it relies on compaction to
> remove old data . However, if a region has data that is older
> than TTL, and there is no trigger to compact it, then the data will remain
> there forever, wasting disk space and memory.
>

So it's working as advertised then?

There's currently an issue where we can skip major compactions if your
write loading has a particular character: hbase-2990.


> It appears at this state, to really remove data older than TTL we need to
> start a client side deletion request.

Or run a manual major compaction:

$ echo "major_compact TABLENAME" | ./bin/hbase shell



 This is really a pity because
> it is an more expensive way to get the job done.  Another side effect of
> this is that as time goes on, we will end up with some small
> regions if the data are saved in chronological order in regions. It appears
> that hbase doesn't have a mechanism to merge 2 consecutive
> small regions into a bigger one at this time.

$ ./bin/hbase org.apache.hadoop.hbase.util.Merge
Usage: bin/hbase merge <table-name> <region-1> <region-2>

Currently only works on offlined table but there's a patch available
to make it run against onlined regions.
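For the archives, a hypothetical walk-through. The region names below are 
invented; read the real ones out of the .META. table or the master UI 
first, and take the table offline before running the Merge tool:

```shell
#!/bin/bash
# Hypothetical offline-merge walk-through.  Region names are invented;
# list the real ones from .META. first.

# A region name has the form "<table>,<start key>,<region id>"; this
# helper just assembles that form for readability.
region_name() { printf '%s,%s,%s\n' "$1" "$2" "$3"; }

R1=$(region_name event_log ''      1284567890123)  # first region
R2=$(region_name event_log row5000 1284567890456)  # adjacent region

# Then, with the table offlined (names above are examples only):
#   ./bin/hbase org.apache.hadoop.hbase.util.Merge event_log "$R1" "$R2"
echo "$R1"
echo "$R2"
```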


So if data is saved in
> chronological order, sooner or later we will run out of capacity , even if
> the amount of data in hbase is small, because we have lots of regions with
> small storage space.
>
> A much cheaper way to remove data older than TTL would be to remember the
> latest timestamp for the region in the .META. table
> and if the time is older than TTL, we just adjust the row in .META. and
> delete the store , without doing any compaction.
>

Say more on the above.  It sounds promising.  Are you suggesting that
in addition to compactions that we also have a provision where we keep
account of a storefile's latest timestamp (we already do this I
believe) and that when now - storefile-timestamp > ttl, we just remove
the storefile wholesale.  That sounds like it could work, if that is
what you are suggesting.  Mind filing an issue w/ a detailed
description?
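
To make sure I'm reading you right, the check would be something like the 
following (illustrative shell only, not HBase code; the timestamps are 
made-up epoch milliseconds):

```shell
#!/bin/bash
# Illustrative only -- not HBase code.  The idea: keep the newest cell
# timestamp per storefile, and drop the whole file once
# now - latest_ts > ttl, with no compaction needed.
store_expired() {
  local latest_ts=$1 now=$2 ttl=$3   # all in epoch milliseconds
  [ $(( now - latest_ts )) -gt "$ttl" ]
}

# e.g. a storefile whose newest cell is ~8 days old, against a 7-day TTL:
if store_expired 1283900000000 1284591600000 604800000; then
  echo "drop the storefile wholesale"
fi
```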

Thanks,
St.Ack



> Can this be added to the hbase requirement for future release ?
>
> Jimmy
>
>
>