Posted to user@hbase.apache.org by Eric Owhadi <er...@esgyn.com> on 2016/03/12 23:32:41 UTC

deleting row for aging purpose

Hello Hbaseers,

When dealing with time series, one can imagine that deleting rows older than
a specific date threshold would be a common use case.

If I am not mistaken, the only option available today to perform this
aging delete is to first mark these rows with a delete operation that
sets a tombstone marker, and then let major compaction take care of
removing the data.

However, this tombstoning consumes valuable resources, and I was
wondering if we could instead pass a parameter to the major compact command
telling it: BTW, consider any rowkey lower than xxxxxxxx tombstoned,
so delete it.

Is that a crazy idea?

Regards,

Eric

Re: deleting row for aging purpose

Posted by Vladimir Rodionov <vl...@gmail.com>.
>> Is that a crazy idea?

kind of :)

Why not use cell timestamps and a column family TTL for that purpose?
Explicitly set the KV's timestamp for every data point you insert to be
equal to the data point's own timestamp.
Your data will then be purged automatically during compaction once the TTL
expires.

-Vlad
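
The TTL approach described above can be illustrated with a small,
self-contained Java sketch. It does not call the HBase client API; it only
models the rule being relied on: each cell carries the timestamp of its data
point, the column family has a TTL expressed in seconds, and a compaction
drops every cell older than that TTL. The names TtlAgingSketch, Cell,
isExpired and compact are hypothetical, chosen for illustration, and the
exact boundary handling (strictly older than the TTL) is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Illustrative model only: not HBase code, all names are hypothetical.
final class TtlAgingSketch {

    /** Stand-in for one stored cell: a row key plus the timestamp written with it. */
    static final class Cell {
        final String rowKey;
        final long timestampMs; // set by the writer to the data point's own time

        Cell(String rowKey, long timestampMs) {
            this.rowKey = rowKey;
            this.timestampMs = timestampMs;
        }
    }

    /** True when the cell's age at nowMs exceeds the TTL (assumed strict comparison). */
    static boolean isExpired(long cellTimestampMs, long ttlSeconds, long nowMs) {
        return nowMs - cellTimestampMs > TimeUnit.SECONDS.toMillis(ttlSeconds);
    }

    /** Models a compaction pass: only cells still within the TTL are rewritten. */
    static List<Cell> compact(List<Cell> cells, long ttlSeconds, long nowMs) {
        List<Cell> kept = new ArrayList<>();
        for (Cell cell : cells) {
            if (!isExpired(cell.timestampMs, ttlSeconds, nowMs)) {
                kept.add(cell);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        long ttlSeconds = TimeUnit.DAYS.toSeconds(30); // keep 30 days of data

        List<Cell> cells = new ArrayList<>();
        cells.add(new Cell("sensor1-old", now - TimeUnit.DAYS.toMillis(45)));
        cells.add(new Cell("sensor1-new", now - TimeUnit.DAYS.toMillis(1)));

        // Only the cell younger than 30 days survives; no delete marker was written.
        for (Cell cell : compact(cells, ttlSeconds, now)) {
            System.out.println(cell.rowKey);
        }
    }
}
```

The point of the sketch is that aging becomes a side effect of compaction:
no per-row delete is issued, so no tombstones are created. It only works if
writers set the cell timestamp to the data point's time rather than letting
it default to the write time.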
