Posted to user@hbase.apache.org by Oleg Ruchovets <or...@gmail.com> on 2011/05/25 14:58:25 UTC

hbase row TTL

Hi,
    Is it possible to define a TTL for an HBase row? (I found TTL only for
column families.)
 If it is not possible, what is the best practice for implementing TTL for
HBase rows?

Thanks in advance
Oleg.

Re: hbase row TTL

Posted by Jean-Daniel Cryans <jd...@apache.org>.
You are always using versions; 1 version is still a version. All the
setting controls is how many versions are kept when you overwrite the same
cell. With 1 version, all the older versions will be cleaned up during major
compactions and won't be returned by reads even if they still exist.
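
To make the version behavior concrete, here is a minimal sketch in plain
Java. It is only a toy model: VersionedCell, put, read and majorCompact are
hypothetical names made up for illustration, not HBase classes, and the real
storage engine works differently. It assumes a single cell with VERSIONS set
to 1.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;

// Hypothetical model of one cell's version history; not an HBase class.
class VersionedCell {
    // Versions keyed by timestamp, newest first.
    private final TreeMap<Long, String> versions =
            new TreeMap<>(Comparator.reverseOrder());
    private final int maxVersions;

    VersionedCell(int maxVersions) {
        this.maxVersions = maxVersions;
    }

    // Every write adds a version; nothing is physically removed here.
    void put(long timestamp, String value) {
        versions.put(timestamp, value);
    }

    // Reads return at most maxVersions values, newest first.
    List<String> read() {
        List<String> visible = new ArrayList<>();
        for (String value : versions.values()) {
            if (visible.size() == maxVersions) {
                break;
            }
            visible.add(value);
        }
        return visible;
    }

    // Stand-in for a major compaction: drop versions beyond maxVersions.
    void majorCompact() {
        while (versions.size() > maxVersions) {
            versions.pollLastEntry(); // last entry holds the oldest timestamp
        }
    }

    int storedVersions() {
        return versions.size();
    }

    public static void main(String[] args) {
        VersionedCell cell = new VersionedCell(1);
        cell.put(100L, "old");
        cell.put(200L, "new");
        System.out.println(cell.read());           // [new]
        System.out.println(cell.storedVersions()); // 2, the old one is still stored
        cell.majorCompact();
        System.out.println(cell.storedVersions()); // 1
    }
}
```

The point of the model is only that the read path and the cleanup path are
separate: the overwritten value stops being returned right away, while its
removal waits for the compaction step.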

A TTL of 2147483647 means no TTL: it's Integer.MAX_VALUE and is
interpreted as such inside HBase. I agree it's not user friendly.
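
As a rough illustration of that sentinel, the plain Java sketch below checks
a TTL value against Integer.MAX_VALUE. TtlSentinel and neverExpires are
hypothetical names invented for this example, and the actual check inside
HBase may be written differently.

```java
// Hypothetical helper, not part of the HBase API.
class TtlSentinel {
    // The value shown for cf1 in the shell output; equal to Integer.MAX_VALUE.
    static final int FOREVER = Integer.MAX_VALUE;

    // Assumption for this sketch: a TTL equal to the sentinel means "no expiry".
    static boolean neverExpires(int ttlSeconds) {
        return ttlSeconds == FOREVER;
    }

    public static void main(String[] args) {
        System.out.println(FOREVER);                  // 2147483647
        System.out.println(neverExpires(2147483647)); // true
        // Even read literally as seconds, the value is about 68 years.
        System.out.println(FOREVER / (365L * 24 * 60 * 60)); // 68
    }
}
```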

Hope that helps,

J-D


Re: hbase row TTL

Posted by Gary Helmling <gh...@gmail.com>.
Hi Oleg,

A TTL configuration will apply whether you use only 1 version or many
versions.  If all the KeyValue timestamps in a row are older than the
configured TTL, then the row is effectively deleted at the next major
compaction.
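
The row-level effect can be sketched in plain Java as follows. This is a toy
model under stated assumptions, not HBase code: RowExpiry, cellVisible and
rowExists are hypothetical names, the row is reduced to a map from column
name to cell timestamp in epoch milliseconds, and every cell is assumed to
live in a column family with the same TTL.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of a row as column -> cell timestamp; not an HBase class.
class RowExpiry {
    // A cell is visible while its timestamp is not older than now - TTL.
    static boolean cellVisible(long timestampMs, long nowMs, int ttlSeconds) {
        return timestampMs >= nowMs - ttlSeconds * 1000L;
    }

    // A row "exists" only while at least one of its cells is still visible.
    static boolean rowExists(Map<String, Long> cellTimestampsMs,
                             long nowMs, int ttlSeconds) {
        for (long timestampMs : cellTimestampsMs.values()) {
            if (cellVisible(timestampMs, nowMs, ttlSeconds)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, Long> row = new LinkedHashMap<>();
        row.put("cf1:a", 1_000L);
        row.put("cf1:b", 5_000L);
        int ttlSeconds = 2;
        System.out.println(rowExists(row, 6_000L, ttlSeconds));  // true, cf1:b is recent enough
        System.out.println(rowExists(row, 10_000L, ttlSeconds)); // false, every cell has expired
    }
}
```

Nothing here deletes a row explicitly. The row simply stops existing once
none of its cells is visible any more, which matches the description in the
blog post quoted in this thread.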

It sounds like the TTL functionality will do exactly what you want.  If you
want the TTL to apply to the entire table, simply set it to the same value
for each column family in the table.

For a TTL of 6 months, meaning that KeyValues with a timestamp older than 6
months (ts < now - 6 months) expire, set
TTL => 6 * 30 * 24 * 60 * 60

i.e., 180 days in seconds (15552000).
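
The arithmetic and the expiry condition can be checked with a small plain
Java sketch. SixMonthTtl and isExpired are hypothetical names used only for
illustration, and the sketch assumes cell timestamps are epoch milliseconds
while the TTL is given in seconds.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of the HBase API.
class SixMonthTtl {
    // 6 * 30 * 24 * 60 * 60 = 15552000 seconds, i.e. 180 days.
    static final int TTL_SECONDS = 6 * 30 * 24 * 60 * 60;

    // Mirrors the condition above: expired when ts < now - TTL.
    // Assumes timestamps in epoch milliseconds and a TTL in seconds.
    static boolean isExpired(long cellTimestampMs, long nowMs, int ttlSeconds) {
        return cellTimestampMs < nowMs - TimeUnit.SECONDS.toMillis(ttlSeconds);
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        long sevenMonthsAgo = now - TimeUnit.DAYS.toMillis(210);
        long oneMonthAgo = now - TimeUnit.DAYS.toMillis(30);
        System.out.println(TTL_SECONDS);                                 // 15552000
        System.out.println(isExpired(sevenMonthsAgo, now, TTL_SECONDS)); // true
        System.out.println(isExpired(oneMonthAgo, now, TTL_SECONDS));    // false
    }
}
```

The value fits comfortably in an int, far below the 2147483647 default
discussed earlier in the thread.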

--gh



Re: hbase row TTL

Posted by Oleg Ruchovets <or...@gmail.com>.
Well,
   We put data into HBase on a daily basis. We don't work with versions.
All I need is for records in a specific HBase table to be deleted/expired
after some predefined time (TTL). (Of course, I understand that they will be
deleted only after a major compaction.) For example, I want to define a TTL
of 6 months, after which all data that was inserted into the table more than
6 months earlier will be deleted.

In addition, I don't quite understand what TTL means in the HBase table
creation command.
 I mean:
  {NAME => 'cf1', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', VERSIONS
=> '1', COMPRESSION => 'GZ', TTL => '2147483647'}
What does defining a TTL for column family cf1 mean in my case? Does it
mean that all cells in cf1 will expire after 2147483647 milliseconds?

I found this post, http://outerthought.org/blog/417-ot.html, which says:
you can specify a time-to-live (TTL), if versions get older than this TTL
they are deleted. The default TTL is "forever", and is configured via
HColumnDescriptor.setTimeToLive(int seconds). Again, the actual removal of
versions is done upon major compaction, but gets and scans will stop
returning versions whose TTL is passed immediately. Note that when the TTL
has passed for all cells in a row, the row ceases to exist (HBase has no
explicit create or delete of a row: it exists if there are cells with values
in them).

Does it mean that defining a TTL is relevant only if I use versions?

Thanks in advance
Oleg.






Re: hbase row TTL

Posted by Jean-Daniel Cryans <jd...@apache.org>.
As you saw, it's family-based; there are no "cross-family" schemas.

Can you tell us more about your use case?

J-D
