Posted to user@hbase.apache.org by Mike Peterson <mi...@mail.ru> on 2014/04/03 17:33:04 UTC

Moving older data versions to archive

 I need data versioning but want to keep older data in a separate location (to keep the current data file denser). What would be the best way to do that?

Re: Moving older data versions to archive

Posted by yonghu <yo...@gmail.com>.
I think you can use coprocessors to do this. For example, on every Put
you can keep the number of versions you want and later write the older
versions to another table or to HDFS. Finally, either let HBase delete
the stale data or have the coprocessor do that for you. The drawback of
this approach is performance: every Put triggers the coprocessor
once.
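
To make the per-Put idea concrete, here is a minimal sketch in plain Java. It uses no HBase classes at all: `VersionedCell`, `put`, `liveValues` and `archived` are hypothetical names invented for illustration, and the in-memory `archive` list only stands in for "the other table or HDFS". It models the bookkeeping a hook would have to do on each write, not the coprocessor API itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Plain-Java model of one cell's version history (hypothetical, not an HBase class). */
class VersionedCell {
    private final int maxVersions;
    // timestamp -> value, sorted ascending, so the first entry is the oldest version
    private final TreeMap<Long, String> live = new TreeMap<>();
    // stand-in for the archive table or HDFS file
    private final List<String> archive = new ArrayList<>();

    VersionedCell(int maxVersions) {
        if (maxVersions < 1) {
            throw new IllegalArgumentException("maxVersions must be >= 1");
        }
        this.maxVersions = maxVersions;
    }

    /** Models the per-put hook: store the new version, then spill the oldest ones. */
    void put(long timestamp, String value) {
        live.put(timestamp, value);
        while (live.size() > maxVersions) {
            Map.Entry<Long, String> oldest = live.pollFirstEntry();
            archive.add(oldest.getKey() + "=" + oldest.getValue());
        }
    }

    /** Live values, newest first. */
    List<String> liveValues() {
        return new ArrayList<>(live.descendingMap().values());
    }

    /** Archived versions as "timestamp=value", in the order they were spilled. */
    List<String> archived() {
        return new ArrayList<>(archive);
    }
}
```

With `maxVersions` set to 2, three puts at timestamps 1, 2 and 3 leave the two newest values live and spill the version at timestamp 1 to the archive. The extra work runs inside every `put`, which is the performance cost mentioned above.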



Re[2]: Moving older data versions to archive

Posted by Mike Peterson <mi...@mail.ru>.
Thank you for your response. After thinking about this a little, I think the ideal solution for me would be to save the old data to a separate file during compaction. That should be much faster than using coprocessors.
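
A rough sketch of what such a split could look like, again in plain Java with no HBase classes. It assumes the cells arrive sorted by key and then newest-first, so a single pass with a per-key counter is enough to route them. `CompactionSplit`, `Cell` and `route` are hypothetical names; this is not an existing compaction hook.

```java
import java.util.List;

class CompactionSplit {
    /** Minimal stand-in for a cell: a key (row plus column) and a timestamp. */
    static final class Cell {
        final String key;
        final long timestamp;

        Cell(String key, long timestamp) {
            this.key = key;
            this.timestamp = timestamp;
        }

        @Override
        public String toString() {
            return key + "@" + timestamp;
        }
    }

    /**
     * Single pass over cells sorted by key, then by timestamp descending.
     * The newest `keep` versions of each key go to `current`, the rest to `archive`.
     */
    static void route(List<Cell> sortedCells, int keep, List<Cell> current, List<Cell> archive) {
        String lastKey = null;
        int seen = 0;
        for (Cell c : sortedCells) {
            if (!c.key.equals(lastKey)) {
                // New key: restart the version counter.
                lastKey = c.key;
                seen = 0;
            }
            seen++;
            (seen <= keep ? current : archive).add(c);
        }
    }
}
```

Because the input is already being read in order, the split adds only a counter and a comparison per cell, which is why doing it at compaction time looks cheaper than doing work on every write.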




Re: Moving older data versions to archive

Posted by Jean-Marc Spaggiari <je...@spaggiari.org>.
Hey, that's one of the reasons I opened HBASE-10115, but I never got a
chance to work on it. Basically, set up a TTL on the column and, with the
hook, move the cells somewhere else.

In the current state, the only option I see is a MapReduce job that runs
daily and moves the older versions: anything where version > 3 (as an
example), and then deletes it (or lets it expire with a TTL, etc.).
Unfortunately, I don't think there is a "nice" solution for this today.

JM
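
The selection step of such a daily job can be sketched as follows. This is plain Java over an in-memory map, not MapReduce or HBase client code; `ArchivePass` and `selectForArchive` are hypothetical names, and the map of row key to timestamps stands in for whatever a real scan would return. The job would then copy the selected versions elsewhere and delete them from the main table.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ArchivePass {
    /**
     * For each row, returns the timestamps that fall outside the newest `keep`
     * versions, newest first. Rows with nothing to archive are omitted.
     */
    static Map<String, List<Long>> selectForArchive(Map<String, List<Long>> timestampsByRow, int keep) {
        if (keep < 0) {
            throw new IllegalArgumentException("keep must be >= 0");
        }
        Map<String, List<Long>> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<Long>> e : timestampsByRow.entrySet()) {
            // Copy before sorting so the caller's list is left untouched.
            List<Long> sorted = new ArrayList<>(e.getValue());
            sorted.sort(Collections.reverseOrder()); // newest first
            if (sorted.size() > keep) {
                result.put(e.getKey(), new ArrayList<>(sorted.subList(keep, sorted.size())));
            }
        }
        return result;
    }
}
```

With `keep` set to 3, a row holding five versions yields its two oldest timestamps, while a row with two versions is skipped entirely.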

