Posted to user@hbase.apache.org by Rita <rm...@gmail.com> on 2012/08/16 13:31:51 UTC

backup strategies

I am sure this topic has been visited many times, but I thought I'd ask to see
if anything has changed.

We are using HBase with close to 40 billion rows, and backing up the data is
non-trivial. We can export a table to another Hadoop/HDFS filesystem, but
I am not aware of any guaranteed way of preserving data from one version of
HBase to another (specifically if it's very old). Is there a program that
will serialize the data into JSON/XML and dump it on a Unix filesystem?
Once I have the data, we can compress it however we like and back it up
using our internal software.




-- 
--- Get your facts first, then you can distort them as you please.--

Re: backup strategies

Posted by Rita <rm...@gmail.com>.
Let's say I have a huge table and I want to back it up onto a system with a
lot of disk space. Would this work: take all the keys and export the
table in chunks by selectively picking ranges? For instance, if the keys
run from 0-100000, I would back up keys 0-50000 into backup_dir_A and
50001-100000 into backup_dir_B. Would that be feasible?
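For what it's worth, the chunking arithmetic described above is easy to sketch. Here is a minimal illustration (Python, purely for brevity; a real export would drive HBase's bundled Export job or a client-side scan with per-chunk start/stop rows, and the chunk-to-directory mapping shown is just the example from this message):

```python
def split_key_range(start, stop, n_chunks):
    """Split the inclusive key range [start, stop] into n_chunks
    contiguous, non-overlapping (chunk_start, chunk_stop) pairs."""
    total = stop - start + 1
    size = total // n_chunks
    chunks = []
    for i in range(n_chunks):
        lo = start + i * size
        # The last chunk absorbs any remainder so nothing is dropped.
        hi = stop if i == n_chunks - 1 else lo + size - 1
        chunks.append((lo, hi))
    return chunks

# The example from the message: keys 0-100000 split across two backup dirs
for (lo, hi), target in zip(split_key_range(0, 100000, 2),
                            ["backup_dir_A", "backup_dir_B"]):
    print(f"export keys {lo}-{hi} -> {target}")
```

In HBase terms, each (lo, hi) pair would become the start row and stop row of a Scan (keeping in mind that a Scan's stop row is exclusive, so the boundaries need a one-key adjustment).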




Re: backup strategies

Posted by Rita <rm...@gmail.com>.
What is the typical conversion process? My biggest worry is going from a
higher version of HBase to a lower version, say CDH4 to CDH3U1.




Re: backup strategies

Posted by Paul Mackles <pm...@adobe.com>.
Hi Rita

By default, the Export job that ships with HBase writes KeyValue objects to a
SequenceFile. It is a very simple app, and it wouldn't be hard to roll
your own export program that writes whatever format you want. You can use
the current Export program as a basis and just change the output of the
mapper.

I will say that I spent a lot of time thinking about backups and DR, and I
didn't worry much about HBase versions. The file formats for HBase
don't change that often, and when they do, there is usually a pretty
straightforward conversion process. Also, if you are doing something like
full daily backups, I am having trouble imagining a scenario where you
would need to restore from anything but the most recent backup.

Depending on which version of HBase you are using, there are probably much
bigger issues with using Export for backups that you should worry about,
like being able to restore in a timely fashion, preserving deletes, and the
impact of the backup process on your SLA.

Paul
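To make the "change the output of the mapper" suggestion concrete: the part you would swap out is the serialization step that turns each cell into an output record. A minimal sketch of that step (Python here purely for brevity; the real Export is a Java MapReduce job, and the field names below are illustrative assumptions, not HBase's actual KeyValue API):

```python
import json

def cell_to_json(row, family, qualifier, timestamp, value):
    """Serialize one HBase cell to a JSON line -- the kind of record a
    custom export mapper could emit instead of KeyValues in a
    SequenceFile."""
    return json.dumps({
        "row": row,
        "family": family,
        "qualifier": qualifier,
        "timestamp": timestamp,
        "value": value,
    }, sort_keys=True)

# One JSON line per cell; a downstream tool can compress/archive these.
print(cell_to_json("row-00042", "cf", "name", 1345102311000, "Rita"))
```

The trade-off is the usual one: JSON lines are portable across HBase versions and easy to feed into external backup software, but considerably bulkier and slower to restore than the native SequenceFile format.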

