Posted to user@hbase.apache.org by Asaf Mesika <as...@gmail.com> on 2013/04/30 18:07:58 UTC

Re: discp versus export

The replication.html page still appears to reference a bug (2611) that
was fixed two years ago :)


On Wed, Mar 6, 2013 at 12:15 AM, Damien Hardy <dh...@viadeoteam.com> wrote:

> IMO the easiest option would be hbase export, for long-term offline backup
> (disaster recovery). The export can even be stored on a different HDFS
> storage than the one used by HBase, by giving a full hdfs:// URL as the
> destination directory.
> On 5 March 2013 at 22:52, "Leonid Fedotov" <lf...@hortonworks.com> wrote:
>
> > Rita,
> > it seems like replication will be the best option for you.
> > Take a look at this doc:
> > http://hbase.apache.org/replication.html
> >
> > Thank you!
> >
> > Sincerely,
> > Leonid Fedotov
> > On Mar 4, 2013, at 4:18 PM, Rita wrote:
> >
> > > the end goal is to have a backup of our hbase tables.
> > >
> > >
> > > On Mon, Mar 4, 2013 at 7:10 AM, Kevin O'dell <kevin.odell@cloudera.com> wrote:
> > >
> > >> DistCP is typically used for HDFS-level backup jobs.  It can be used for
> > >> HBase but can be quite tricky.  I would recommend using Export, CopyTable,
> > >> or Replication.  These are tools designed for HBase backup.  What is the
> > >> end goal?
> > >>
> > >> On Mon, Mar 4, 2013 at 7:00 AM, Manish Bhoge <manishbhoge@rocketmail.com> wrote:
> > >>
> > >>> Export and distcp have different applications. Use distcp when you need to
> > >>> move data across clusters. Do you want to export table data outside your
> > >>> cluster? If not, then exporting the table is better.
> > >>>
> > >>> Sent from HTC via Rocket! excuse typo.
> > >>>
> > >>>
> > >>
> > >>
> > >> --
> > >> Kevin O'Dell
> > >> Customer Operations Engineer, Cloudera
> > >>
> > >
> > >
> > >
> > > --
> > > --- Get your facts first, then you can distort them as you please.--
> >
> >
>

Re: discp versus export

Posted by Suraj Varma <sv...@gmail.com>.
Read this: http://blog.sematext.com/2011/03/11/hbase-backup-options/ for
the high-level difference between export and distcp.
The key factor here is the data in the memstore that has not been flushed
to disk yet, and the resulting inconsistency if you just do distcp.
--Suraj
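
For illustration, Damien's suggestion quoted in this thread (exporting to a
different HDFS by giving a full hdfs:// URL as the destination) might be
scripted roughly as below. This is only a sketch: the table name and the
NameNode host are hypothetical, the helper function is invented for this
example, and it assumes the stock Export MapReduce class is launched through
the hbase command. It only builds the command line; it does not run anything.

```python
from urllib.parse import urlparse

# Class name of the HBase Export MapReduce job, launched via the `hbase` script.
EXPORT_CLASS = "org.apache.hadoop.hbase.mapreduce.Export"


def build_export_command(table, dest_url):
    """Return the argv list for exporting `table` to `dest_url`.

    Hypothetical helper for this example. It insists on a full
    hdfs://host[:port]/path URL so the export can land on an HDFS
    other than the one HBase itself is running on.
    """
    parsed = urlparse(dest_url)
    if parsed.scheme != "hdfs" or not parsed.netloc or not parsed.path.strip("/"):
        raise ValueError("destination must be a full hdfs://host[:port]/path URL")
    return ["hbase", EXPORT_CLASS, table, dest_url]


if __name__ == "__main__":
    # Assumed names: "mytable" and backup-nn.example.com are placeholders.
    cmd = build_export_command(
        "mytable", "hdfs://backup-nn.example.com:8020/backups/mytable"
    )
    print(" ".join(cmd))
```

The resulting list could be handed to subprocess.run on a machine with the
HBase client installed. Because Export reads the table through the normal
scan path, it is not subject to the unflushed-memstore problem that a raw
distcp of the HBase files has.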

