Posted to user@hbase.apache.org by R W <ai...@gmail.com> on 2014/04/02 07:56:41 UTC

What is the best practice for backup HBase data?

Hi Guys

I'm using org.apache.hadoop.hbase.mapreduce.Export and
org.apache.hadoop.hbase.mapreduce.Import to back up and restore HBase
data. It works well enough for us, but I would like to know whether there
are better solutions or practices for backing up HBase data. Any pointers
would be really helpful. Thanks.

Cheers
aij
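
For readers unfamiliar with the two tools named above, here is a minimal Python sketch of how the Export and Import jobs are typically launched from the command line. The table name and path are made-up placeholders and the helper functions are hypothetical; the sketch only builds the argument lists and does not run anything.

```python
# Sketch: build the command lines for HBase's Export and Import MapReduce jobs.
# Table names and paths passed in are hypothetical placeholders.

EXPORT_CLASS = "org.apache.hadoop.hbase.mapreduce.Export"
IMPORT_CLASS = "org.apache.hadoop.hbase.mapreduce.Import"


def export_command(table, output_dir):
    """Argument list that dumps `table` into files under `output_dir`."""
    return ["hbase", EXPORT_CLASS, table, output_dir]


def import_command(table, input_dir):
    """Argument list that loads a previous export back into `table`."""
    return ["hbase", IMPORT_CLASS, table, input_dir]


if __name__ == "__main__":
    print(" ".join(export_command("my_table", "/backup/my_table")))
    print(" ".join(import_command("my_table", "/backup/my_table")))
```

The argument lists could be passed to `subprocess.run` on a machine with the `hbase` launcher on its PATH. As far as I know, Import expects the target table to exist already, so a restore usually starts by recreating the schema.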

Re: What is the best practice for backup HBase data?

Posted by Demai Ni <ni...@gmail.com>.

We are working on a backup/restore solution in
https://issues.apache.org/jira/browse/HBASE-7912, which will use snapshots
and ExportSnapshot for full backups and WALPlayer for incremental backups.
The patches are coming.

For critical data, real-time replication is the way to go:
https://hbase.apache.org/replication.html. The drawback of replication is
that when the master cluster gets hot with a lot of put/update/delete
traffic, replication also consumes resources, whereas backup/restore can be
scheduled at a 'quiet' time. There is always a trade-off.

demai
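
The existing tools named in this message can already be driven by hand. The Python sketch below shows roughly what a full backup (snapshot plus ExportSnapshot) and an incremental step (WALPlayer) look like as command lines. It is not the HBASE-7912 design, which was still in progress at the time; the helper names, table names, and URIs are all hypothetical.

```python
# Sketch: the existing tools named above, expressed as command lines.
# This is not the HBASE-7912 implementation; all names here are hypothetical.

def snapshot_shell_line(table, snapshot):
    """HBase shell statement that takes a snapshot of `table`."""
    return "snapshot '{}', '{}'".format(table, snapshot)


def export_snapshot_command(snapshot, dest_root, mappers=4):
    """Full backup: copy a snapshot's files to another cluster or filesystem."""
    return ["hbase", "org.apache.hadoop.hbase.snapshot.ExportSnapshot",
            "-snapshot", snapshot,
            "-copy-to", dest_root,
            "-mappers", str(mappers)]


def wal_player_command(wal_dir, tables):
    """Incremental step: replay write-ahead log edits for the given tables."""
    return ["hbase", "org.apache.hadoop.hbase.mapreduce.WALPlayer",
            wal_dir, ",".join(tables)]


if __name__ == "__main__":
    print(snapshot_shell_line("my_table", "my_table_full"))
    print(" ".join(export_snapshot_command("my_table_full",
                                           "hdfs://backup-nn:8020/hbase")))
    print(" ".join(wal_player_command("/hbase/oldWALs", ["my_table"])))
```

Which WAL directory to replay, and how to bound it in time, depends on the HBase version and on how logs are archived, so treat the last call as a placeholder.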




On Wed, Apr 2, 2014 at 7:28 AM, Ted Yu <yu...@gmail.com> wrote:

> Modification to CLONE_TEST wouldn't affect original snapshot.
>
> Cheers
>
>

Re: What is the best practice for backup HBase data?

Posted by Ted Yu <yu...@gmail.com>.

Modifications to CLONE_TEST won't affect the original snapshot.

Cheers


On Wed, Apr 2, 2014 at 6:56 AM, R W <ai...@gmail.com> wrote:

> Hi Ted
>
> OK, i guess i know how it works, so when i execute the clone operation,
> data for the new table will be copied from the snapshot, so if my new table
> is called "CLONE_TEST", i think on hdfs it will have a path like this
> /hbase/CLONE_TEST which has the copied the data, then further modification
> to CLONE_TEST table has nothing to do with the original snapshot, am i
> correct? Thanks for your quick response :)
>
> Cheers
> aij
>
>

Re: What is the best practice for backup HBase data?

Posted by R W <ai...@gmail.com>.

Hi Ted

OK, I think I understand how it works. When I execute the clone operation,
data for the new table is copied from the snapshot, so if my new table is
called "CLONE_TEST", I think it will have a path on HDFS like
/hbase/CLONE_TEST that holds the copied data. Further modifications to the
CLONE_TEST table then have nothing to do with the original snapshot. Am I
correct? Thanks for your quick response :)

Cheers
aij


On Wed, Apr 2, 2014 at 9:47 PM, Ted Yu <yu...@gmail.com> wrote:

> For first question about clone from snapshot, there is no copy of snapshot
> involved.
> The clone is made from the snapshot itself.
>
> Cheers
>

Re: What is the best practice for backup HBase data?

Posted by Ted Yu <yu...@gmail.com>.

Regarding your first question about cloning from a snapshot: no copy of the
snapshot is involved. The clone is made from the snapshot itself.

Cheers
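
One way to picture this answer is a toy model in Python. It assumes, as the HBase snapshot documentation describes, that store files are immutable and that snapshots and clones hold references to those files rather than copies. The classes below are purely illustrative and are not HBase APIs.

```python
# Toy model (not HBase code): snapshots and clones share immutable files
# by reference, so cloning copies no data and later writes cannot leak back.

class Table:
    def __init__(self, name, files=()):
        self.name = name
        self.files = list(files)   # references to immutable store files

    def flush(self, file_name):
        """New writes always land in a new file; old files never change."""
        self.files.append(file_name)

    def snapshot(self):
        """A snapshot is only a frozen list of file references."""
        return tuple(self.files)


def clone_snapshot(snapshot, new_name):
    """Create a table that points at the snapshot's files; nothing is copied."""
    return Table(new_name, snapshot)


if __name__ == "__main__":
    original = Table("orders", ["hfile-1", "hfile-2"])
    snap = original.snapshot()
    clone = clone_snapshot(snap, "CLONE_TEST")
    clone.flush("hfile-3")          # modify only the clone
    print(snap)                     # ('hfile-1', 'hfile-2')
    print(original.files)           # ['hfile-1', 'hfile-2']
    print(clone.files)              # ['hfile-1', 'hfile-2', 'hfile-3']
```

Because the shared files are never rewritten in place, the clone can diverge freely while the snapshot and the original table keep seeing exactly the files they referenced before.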

On Apr 2, 2014, at 4:23 AM, R W <ai...@gmail.com> wrote:

> Hi Esteban
> 
> I checked the snapshot feature and tried myself, it's very good, one of the
> introduction
> http://blog.cloudera.com/blog/2013/03/introduction-to-apache-hbase-snapshots/ mentioned
> about:
> 
> Clone a snapshot: This operation creates a new table using the same schema
>> and with the same data present in the specified snapshot. The result of
>> this operation is a new fully functional table that can can be modified
>> with no impact on the original table or the snapshot.
> 
> 
> I think this clone operation will make a copy of the snapshot, then create
> the new table from the copy of the snapshot, am i correct? Otherwise,
> modification to the new table will change the snapshot, right?
> 
> Another question, if we want to backup hbase data somewhere else, it seems
> we cannot go with snapshot feature, we want the data to be backup even
> after the whole Hadoop cluster down, any idea?
> 
> Thanks
> aij
> 
> 

Re: What is the best practice for backup HBase data?

Posted by R W <ai...@gmail.com>.

Hi JM

I see your point. We are actually trying to back up the HBase data to AWS S3
on a daily/weekly basis, which is why in my first mail I thought
org.apache.hadoop.hbase.mapreduce.Export /
org.apache.hadoop.hbase.mapreduce.Import
might be the best choice for us. Any ideas?

Thanks
aij
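
A daily Export of this kind can be sketched in Python as follows. Export accepts optional versions, start time, and end time arguments after the output directory, with times in epoch milliseconds. The bucket URI and table name are hypothetical, and the sketch assumes the cluster's Hadoop installation has an S3 filesystem connector configured (the URI scheme depends on the Hadoop version).

```python
# Sketch: one Export run per UTC day, written under a hypothetical S3 prefix.
from datetime import datetime, timedelta, timezone

EXPORT_CLASS = "org.apache.hadoop.hbase.mapreduce.Export"


def day_window_ms(day):
    """Return (start, end) of a UTC calendar day in epoch milliseconds."""
    start = datetime(day.year, day.month, day.day, tzinfo=timezone.utc)
    end = start + timedelta(days=1)
    return int(start.timestamp() * 1000), int(end.timestamp() * 1000)


def daily_export_command(table, bucket_uri, day, versions=1):
    """Export only cells whose timestamps fall inside the given day."""
    start_ms, end_ms = day_window_ms(day)
    out_dir = "{}/{}/{}".format(bucket_uri, table, day.strftime("%Y-%m-%d"))
    return ["hbase", EXPORT_CLASS, table, out_dir,
            str(versions), str(start_ms), str(end_ms)]


if __name__ == "__main__":
    cmd = daily_export_command("my_table", "s3a://example-bucket/hbase-backups",
                               datetime(2014, 4, 2))
    print(" ".join(cmd))
```

A time-windowed export selects cells by timestamp, so it is worth checking how deletes made during the window are handled before relying on it as an incremental backup.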


On Wed, Apr 2, 2014 at 9:51 PM, Jean-Marc Spaggiari <jean-marc@spaggiari.org> wrote:

> Hi,
>
> For incremental, you might want to look at replication....
>
> JM
>
>

Re: What is the best practice for backup HBase data?

Posted by Jean-Marc Spaggiari <je...@spaggiari.org>.

Hi,

For incremental backups, you might want to look at replication.

JM


2014-04-02 9:49 GMT-04:00 R W <ai...@gmail.com>:

> Hi JM
>
> Here is the problem if we use the ExportSnapshot feature to export data to
> the dest cluster, it seems we cannot do incremental backup.
>
> Thanks
> aij
>
>

Re: How to decide the next HMaster?

Posted by Mikhail Antonov <ol...@gmail.com>.

http://zookeeper.apache.org/doc/trunk/recipes.html#sc_leaderElection

This recipe describes how simple ZK-based leader election works.

-Mikhail
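
The linked recipe can be illustrated with a small in-memory simulation in Python. It models only the idea in the recipe: each candidate creates an ephemeral sequential znode, the lowest sequence number is the leader, and every other candidate watches the znode just below its own. No real ZooKeeper is involved and the class is hypothetical. HBase's own master election may differ in detail; my understanding is that backup masters race to create a single ephemeral master znode rather than using sequence numbers.

```python
# Toy simulation of the ZooKeeper leader-election recipe (no real ZooKeeper).
# Each candidate creates an ephemeral sequential znode; the lowest number leads.

class Election:
    def __init__(self):
        self._next_seq = 0
        self._nodes = {}            # sequence number -> candidate name

    def join(self, candidate):
        """Model creating an ephemeral sequential znode for `candidate`."""
        seq = self._next_seq
        self._next_seq += 1
        self._nodes[seq] = candidate
        return seq

    def leave(self, seq):
        """Model session expiry: the candidate's ephemeral znode disappears."""
        self._nodes.pop(seq, None)

    def leader(self):
        """The candidate holding the smallest sequence number is the leader."""
        return self._nodes[min(self._nodes)] if self._nodes else None

    def watch_target(self, seq):
        """Each non-leader watches the znode just below its own number."""
        lower = [s for s in self._nodes if s < seq]
        return max(lower) if lower else None


if __name__ == "__main__":
    e = Election()
    a, b, c = e.join("master-a"), e.join("master-b"), e.join("master-c")
    print(e.leader())          # master-a
    e.leave(a)                 # the active master dies
    print(e.leader())          # master-b
    print(e.watch_target(c))   # 1
```

Watching only the immediate predecessor, rather than the leader's znode, means a leader failure wakes up a single candidate instead of all of them.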


2014-04-23 8:24 GMT-07:00 gortiz <go...@pragsis.com>:

>
> But, it's the first which responses to Zookeeper or which creates the
> znode? I don't know how it works exactly this process, where could I read
> more about it?
>


-- 
Thanks,
Michael Antonov

Re: How to decide the next HMaster?

Posted by gortiz <go...@pragsis.com>.

But is it the first one that responds to ZooKeeper, or the first one that
creates the znode? I don't know exactly how this process works. Where could
I read more about it?

On 08/04/14 18:57, Jean-Daniel Cryans wrote:
> It's a simple leader election via ZooKeeper.
>
> J-D
>
>


-- 
*Guillermo Ortiz*
/Big Data Developer/

Telf.: +34 917 680 490
Fax: +34 913 833 301
C/ Manuel Tovar, 49-53 - 28034 Madrid - Spain

_http://www.bidoop.es_

Re: How to decide the next HMaster?

Posted by Jean-Daniel Cryans <jd...@apache.org>.

It's a simple leader election via ZooKeeper.

J-D

On Tue, Apr 8, 2014 at 7:18 AM, gortiz <go...@pragsis.com> wrote:

> Could someone explain me which it's the process to select the next HMaster
> when the current one is gone down?? I've been looking for information about
> it in the documentation, but, I haven't found anything.
>
>
>

How to decide the next HMaster?

Posted by gortiz <go...@pragsis.com>.

Could someone explain the process for selecting the next HMaster when the
current one goes down? I've been looking for information about it in the
documentation, but I haven't found anything.

Re: What is the best practice for backup HBase data?

Posted by R W <ai...@gmail.com>.

Hi JM

Here is the problem: if we use the ExportSnapshot feature to export data to
the destination cluster, it seems we cannot do incremental backups.

Thanks
aij


On Wed, Apr 2, 2014 at 7:54 PM, Jean-Marc Spaggiari <jean-marc@spaggiari.org> wrote:

> Hi,
>
> You can take a look at replication. Activate replication from a date X and
> then copy table from date 0 to date x from the origin to the dest cluster.
>
> You can also export the snapshot. Take a look at 15.8.8 here:
> http://hbase.apache.org/book/ops.snapshots.html
>
> JM
>
>

Re: What is the best practice for backup HBase data?

Posted by Jean-Marc Spaggiari <je...@spaggiari.org>.

Hi,

You can take a look at replication. Activate replication from a date X,
then copy the table from date 0 to date X from the origin cluster to the
destination cluster.

You can also export the snapshot. Take a look at section 15.8.8 here:
http://hbase.apache.org/book/ops.snapshots.html

JM
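
The backfill half of this suggestion can be sketched in Python as a CopyTable command line. Enabling replication itself is done separately and is not shown. The cluster key, table name, and timestamps are hypothetical. CopyTable's usage text, as I recall it, says the end time is ignored when no start time is given, so the sketch always passes both explicitly; please verify this against your HBase version.

```python
# Sketch: backfill historical data with CopyTable once replication is on.
# Cluster key, table name, and timestamps are hypothetical placeholders.

COPY_TABLE_CLASS = "org.apache.hadoop.hbase.mapreduce.CopyTable"


def copy_window_command(table, peer_cluster_key, start_ms, end_ms):
    """Copy cells with timestamps in [start_ms, end_ms) to the peer cluster."""
    if not 0 < start_ms < end_ms:
        raise ValueError("expected 0 < start_ms < end_ms")
    return ["hbase", COPY_TABLE_CLASS,
            "--starttime={}".format(start_ms),
            "--endtime={}".format(end_ms),
            "--peer.adr={}".format(peer_cluster_key),
            table]


if __name__ == "__main__":
    # Replication was (hypothetically) enabled at this instant, "date X".
    replication_start_ms = 1396396800000
    cmd = copy_window_command("my_table", "zk1,zk2,zk3:2181:/hbase",
                              1, replication_start_ms)
    print(" ".join(cmd))
```

The peer address follows the quorum:port:znode-parent form. Choosing the end of the copy window slightly after the moment replication was enabled is a common way to avoid a gap, at the cost of copying a few cells twice.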


2014-04-02 7:23 GMT-04:00 R W <ai...@gmail.com>:

> Hi Esteban
>
> I checked the snapshot feature and tried myself, it's very good, one of the
> introduction
>
> http://blog.cloudera.com/blog/2013/03/introduction-to-apache-hbase-snapshots/ mentioned
> about:
>
> Clone a snapshot: This operation creates a new table using the same schema
> > and with the same data present in the specified snapshot. The result of
> > this operation is a new fully functional table that can can be modified
> > with no impact on the original table or the snapshot.
>
>
> I think this clone operation will make a copy of the snapshot, then create
> the new table from the copy of the snapshot, am i correct? Otherwise,
> modification to the new table will change the snapshot, right?
>
> Another question, if we want to backup hbase data somewhere else, it seems
> we cannot go with snapshot feature, we want the data to be backup even
> after the whole Hadoop cluster down, any idea?
>
> Thanks
> aij
>
>

Re: What is the best practice for backup HBase data?

Posted by R W <ai...@gmail.com>.

Hi Esteban

I checked out the snapshot feature and tried it myself; it's very good. One
of the introductions,
http://blog.cloudera.com/blog/2013/03/introduction-to-apache-hbase-snapshots/
says this about cloning:

> Clone a snapshot: This operation creates a new table using the same schema
> and with the same data present in the specified snapshot. The result of
> this operation is a new fully functional table that can be modified
> with no impact on the original table or the snapshot.

I think this clone operation makes a copy of the snapshot and then creates
the new table from that copy. Am I correct? Otherwise, modifications to the
new table would change the snapshot, right?

Another question: if we want to back up HBase data somewhere else, it seems
we cannot use the snapshot feature. We want the backup to remain available
even if the whole Hadoop cluster goes down. Any ideas?

Thanks
aij


On Wed, Apr 2, 2014 at 2:12 PM, Esteban Gutierrez <es...@cloudera.com> wrote:

> Hello Aij,
>
> Snapshots are the suggested method since HBase 0.94.6, they provide better
> consistency for backing up data in HBase. You can find more information in
> the HBase Book here:
>
> https://hbase.apache.org/book.html#ops.snapshots
>
> Depending on your use case and resources you might want to consider
> replication as well:
>
> http://hbase.apache.org/replication.html
>
> cheers,
> esteban.
>
>
> --
> Cloudera, Inc.
>
>
>

Re: What is the best practice for backup HBase data?

Posted by Esteban Gutierrez <es...@cloudera.com>.

Hello Aij,

Snapshots have been the suggested method since HBase 0.94.6; they provide
better consistency for backing up data in HBase. You can find more
information in the HBase Book here:

https://hbase.apache.org/book.html#ops.snapshots

Depending on your use case and resources you might want to consider
replication as well:

http://hbase.apache.org/replication.html

cheers,
esteban.

--
Cloudera, Inc.

On Tue, Apr 1, 2014 at 10:56 PM, R W <ai...@gmail.com> wrote:

> Hi Guys
>
> I'm using hbase org.apache.hadoop.hbase.mapreduce.Export
> / org.apache.hadoop.hbase.mapreduce.Import to backup and restore HBase
> data, at least it's good to me, i would like to know if there are any
> better solutions or practices on how to backup HBase data, that will be
> really helpful for us, thanks.
>
> Cheers
> aij
>