Posted to user@hbase.apache.org by Jinsong Hu <ji...@hotmail.com> on 2010/09/14 00:51:42 UTC

what is the unit of the TTL in hbase and how often hbase remove expired regions?

Hi,
  I want to find out what the unit for TTL is in HBase. I googled around and 
found that some people say it is microseconds.
I thought it was milliseconds, as that is the Java default. Then I searched 
the HBase code and saw some test code treating
the unit as seconds.
  I used TTL=600000. If the unit is milliseconds, that means 10 
minutes. However, I continued to insert records into
this table and found that the regions older than 10 minutes were not 
removed.
  My first question is: what is the unit for TTL? My second question 
is: how often does HBase check all regions and
remove expired ones?

Jimmy 


Re: what is the unit of the TTL in hbase and how often hbase remove expired regions?

Posted by Jinsong Hu <ji...@hotmail.com>.
I realized that there is a problem. I read some comments and realized that 
the TTL is applied when a compaction is executed.
That means that if a region has data older than the TTL and no 
new data is written to that region, the old data will never be removed, 
because no compaction will ever happen for that data after it is 
stored.

Unfortunately, this is a fairly common situation for log data. Once it is 
written, it is seldom touched again, hence there will be no compaction 
against it. It would be nice to use TTL to retire it, but because 
there is no compaction
for those old regions, the data will never be retired.

I wonder if the above scenario is correct and, if it is, whether there is a 
solution other than periodically issuing a client-side
delete request?

Jimmy

--------------------------------------------------
From: "Jonathan Gray" <jg...@facebook.com>
Sent: Monday, September 13, 2010 6:06 PM
To: <us...@hbase.apache.org>
Subject: RE: what is the unit of the TTL in hbase and how often hbase remove 
expired regions?

> The unit is seconds as outlined in HColumnDescriptor.  It's a little 
> confusing because server-side everything is milliseconds.  On the server, 
> it is converted from the user-configured seconds to milliseconds.
>
> Also, HBase will never expire "regions", rather it expires individual 
> versions of cells according to their timestamps.
>
> This is not enforced periodically, it is actually enforced constantly, so 
> you should *never see* an expired cell.  This does not mean it does not 
> still exist on disk, it means it will not be visible in user queries.  On 
> a major compaction (default every 24 hours) HBase will actually delete the 
> expired cells.
>
> JG

RE: what is the unit of the TTL in hbase and how often hbase remove expired regions?

Posted by Jonathan Gray <jg...@facebook.com>.
The unit is seconds as outlined in HColumnDescriptor.  It's a little confusing because server-side everything is milliseconds.  On the server, it is converted from the user-configured seconds to milliseconds.
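
To make the unit concrete, here is a minimal sketch of the arithmetic using only the Java standard library, with no HBase calls. It assumes, as stated above, that the configured value is in seconds and becomes milliseconds on the server. The class and method names (TtlUnits, ttlSecondsToMillis) are made up for illustration.

```java
import java.time.Duration;

// Hypothetical helper, not part of HBase: it only illustrates the unit arithmetic.
final class TtlUnits {

    // Assumption from the thread: TTL is configured in seconds,
    // and the server works with the equivalent number of milliseconds.
    static long ttlSecondsToMillis(int ttlSeconds) {
        return Duration.ofSeconds(ttlSeconds).toMillis();
    }

    public static void main(String[] args) {
        // The value from the original question, read as seconds.
        int configured = 600000;
        Duration actual = Duration.ofSeconds(configured);

        // 600000 s is 6 days 22 h 40 min, not 10 minutes.
        System.out.println(actual.toDays() + " days "
                + (actual.toHours() % 24) + " h "
                + (actual.toMinutes() % 60) + " min");

        // A 10-minute TTL would be configured as 600 (seconds).
        int tenMinutes = (int) Duration.ofMinutes(10).getSeconds();
        System.out.println("10 minutes = " + tenMinutes + " s = "
                + ttlSecondsToMillis(tenMinutes) + " ms");
    }
}
```

So with TTL=600000, nothing would be expected to expire for almost a week; a 10-minute TTL would be 600.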

Also, HBase will never expire "regions", rather it expires individual versions of cells according to their timestamps.

This is not enforced periodically, it is actually enforced constantly, so you should *never see* an expired cell.  This does not mean it does not still exist on disk, it means it will not be visible in user queries.  On a major compaction (default every 24 hours) HBase will actually delete the expired cells.
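
The difference between "not visible" and "deleted from disk" can be sketched with a small standalone model. This is not HBase code: the names (TtlExpiryModel, Cell, visibleCells, majorCompact) are invented for illustration, and the exact boundary comparison is an assumption. It only mirrors the behaviour described above: reads skip expired cells, and a major compaction is what drops them from storage.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of TTL handling; hypothetical names, not the HBase implementation.
final class TtlExpiryModel {

    // One stored cell version, reduced to a value and its write timestamp.
    static final class Cell {
        final String value;
        final long timestampMs;

        Cell(String value, long timestampMs) {
            this.value = value;
            this.timestampMs = timestampMs;
        }
    }

    // TTL is given in seconds and compared in milliseconds (assumed boundary: strictly older).
    static boolean isExpired(Cell cell, int ttlSeconds, long nowMs) {
        long oldestAllowedMs = nowMs - ttlSeconds * 1000L;
        return cell.timestampMs < oldestAllowedMs;
    }

    // Read path: expired cells are filtered out, but the stored list is left untouched.
    static List<Cell> visibleCells(List<Cell> stored, int ttlSeconds, long nowMs) {
        List<Cell> visible = new ArrayList<>();
        for (Cell cell : stored) {
            if (!isExpired(cell, ttlSeconds, nowMs)) {
                visible.add(cell);
            }
        }
        return visible;
    }

    // Major compaction in this model: the store is rewritten without the expired cells.
    static List<Cell> majorCompact(List<Cell> stored, int ttlSeconds, long nowMs) {
        return visibleCells(stored, ttlSeconds, nowMs);
    }

    public static void main(String[] args) {
        long now = 1_000_000_000L;
        int ttlSeconds = 600; // 10 minutes

        List<Cell> stored = new ArrayList<>();
        stored.add(new Cell("old", now - 700_000L));  // older than the TTL
        stored.add(new Cell("fresh", now - 60_000L)); // within the TTL

        System.out.println("visible to reads: " + visibleCells(stored, ttlSeconds, now).size());
        System.out.println("still stored: " + stored.size());

        stored = majorCompact(stored, ttlSeconds, now);
        System.out.println("stored after major compaction: " + stored.size());
    }
}
```

In this model the expired cell disappears from reads immediately, while the space it occupies is only reclaimed when the store is rewritten.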

JG
 