Posted to user@hbase.apache.org by "Ding, Hui" <hu...@sap.com> on 2008/09/22 21:20:17 UTC

RE: [LIKELY JUNK]Back Up Strategies

This should be something the operators of your data store worry about.
E.g., if HDFS keeps three replicas, one should be on the local rack,
another on a different rack (to protect against a power outage),
and a third in a remote data center...
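
For what it's worth, here is a minimal sketch of checking how many
replicas an HDFS file actually has, assuming the standard Hadoop
FileSystem API. The path is just a placeholder; in practice
dfs.replication and the rack topology script are set in hdfs-site.xml
rather than in code:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class CheckReplication {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();   // picks up hdfs-site.xml from the classpath
            FileSystem fs = FileSystem.get(conf);
            Path p = new Path(args[0]);                 // e.g. some file under /hbase (placeholder)
            // Read the replication factor recorded for this file.
            short replicas = fs.getFileStatus(p).getReplication();
            System.out.println(p + " is stored with " + replicas + " replicas");
            fs.close();
        }
    }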

If you have only a small cluster, then maybe use a UPS to guard against
power outages and watch out for storms?
After all, what are the chances that a meteorite hits your data center?

-----Original Message-----
From: Charles Mason [mailto:charlie.mas@gmail.com] 
Sent: Monday, September 22, 2008 12:13 PM
To: hbase-user@hadoop.apache.org
Subject: [LIKELY JUNK]Back Up Strategies

Hi All,

I was wondering what options there are for backing up and dumping an
HBase database. I appreciate that running it on top of an HDFS
cluster protects against individual node failure. However, that
still doesn't protect against the massive but thankfully rare
disasters that take out whole server racks: fire, floods, etc...

As far as I can tell there are two options:

1, Scan each table and dump every row to some external location, much
as mysqldump does for MySQL (see the sketch after this list). To
recover, simply put the data back. I am sure the performance of this
is going to be fairly bad.

2, Image the data stored on the HDFS cluster. Aren't there some big
issues with this not capturing a consistent image, since some updates
won't have been flushed? Is there any way to force that, or to make it
consistent somehow, perhaps via snapshotting?
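
For option 1, in case it's useful, here is a rough sketch of the kind
of scan-and-dump I mean. It assumes the newer Connection/Table style of
the HBase client API (class names differ between versions), and the
table name and output file are just placeholders:

    import java.io.PrintWriter;

    import org.apache.hadoop.hbase.Cell;
    import org.apache.hadoop.hbase.CellUtil;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public class DumpTable {
        public static void main(String[] args) throws Exception {
            // args[0] = table name, args[1] = output file (placeholders)
            try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
                 Table table = conn.getTable(TableName.valueOf(args[0]));
                 ResultScanner scanner = table.getScanner(new Scan());
                 PrintWriter out = new PrintWriter(args[1])) {
                for (Result row : scanner) {
                    for (Cell cell : row.rawCells()) {
                        // One line per cell: row key, column, timestamp, value.
                        out.println(Bytes.toStringBinary(CellUtil.cloneRow(cell)) + "\t"
                                + Bytes.toString(CellUtil.cloneFamily(cell)) + ":"
                                + Bytes.toStringBinary(CellUtil.cloneQualifier(cell)) + "\t"
                                + cell.getTimestamp() + "\t"
                                + Bytes.toStringBinary(CellUtil.cloneValue(cell)));
                    }
                }
            }
        }
    }

The slow part is presumably pulling every row through a single client,
so I'd expect this to scale badly compared to copying files on HDFS.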

Have I missed anything? Anyone got any suggestions?

Charlie M

Re: [LIKELY JUNK]Back Up Strategies

Posted by Charles Mason <ch...@gmail.com>.
On Mon, Sep 22, 2008 at 8:20 PM, Ding, Hui <hu...@sap.com> wrote:
> This should be something the operators of your data store worry about.
> E.g., if HDFS keeps three replicas, one should be on the local rack,
> another on a different rack (to protect against a power outage),
> and a third in a remote data center...
>
> If you have only a small cluster, then maybe use a UPS to guard against
> power outages and watch out for storms?
> After all, what are the chances that a meteorite hits your data center?
Well, knowing my luck :)

There are times when moving data from one cluster to another is
important. Moving data from a development cluster to a production one
is another useful case.

I know some clusters are so vast that it's not practical, or not that
important, depending on what the data represents.

Charlie M

> -----Original Message-----
> From: Charles Mason [mailto:charlie.mas@gmail.com]
> Sent: Monday, September 22, 2008 12:13 PM
> To: hbase-user@hadoop.apache.org
> Subject: [LIKELY JUNK]Back Up Strategies
>
> Hi All,
>
> I was wondering what options there are for backing up and dumping an
> HBase database. I appreciate that running it on top of an HDFS
> cluster protects against individual node failure. However, that
> still doesn't protect against the massive but thankfully rare
> disasters that take out whole server racks: fire, floods, etc...
>
> As far as I can tell there are two options:
>
> 1, Scan each table and dump every row to some external location,
> much as mysqldump does for MySQL. To recover, simply put the data
> back. I am sure the performance of this is going to be fairly
> bad.
>
> 2, Image the data stored on the HDFS cluster. Aren't there some big
> issues with this not capturing a consistent image, since some updates
> won't have been flushed? Is there any way to force that, or to make it
> consistent somehow, perhaps via snapshotting?
>
> Have I missed anything? Anyone got any suggestions?
>
> Charlie M
>