Posted to user@hbase.apache.org by Tianying Chang <ty...@gmail.com> on 2014/03/25 22:08:24 UTC

no-flush based snapshot policy?

Hi,

I need a new snapshot policy that sits between the disabled and flushed
versions. Basically:
I cannot disable the table, but I also don't need the snapshot to be so
"consistent" that all RSs coordinate to flush the region before taking the
snapshot.

Re: no-flush based snapshot policy?

Posted by Demai Ni <ni...@gmail.com>.
Ted,

you are right. We are targeting HBASE-7912 at 1.0 and 0.98, with 1.0 as
the priority for now.  :-)

BTW, we have some code that leverages HBASE-9426 so that we can do a
distributed log roll at the RS level before taking a snapshot. I will open
a JIRA to share that code for discussion.

Demai


On Wed, Apr 2, 2014 at 6:19 PM, Ted Yu <yu...@gmail.com> wrote:

> HBASE-9426 (Make custom distributed barrier procedure pluggable) has been
> back ported to 0.98
> So porting the work from HBASE-7912 to 0.98 would be relatively easy.
>
> I am not aware of this going into 0.94
>
> Cheers
>

Re: no-flush based snapshot policy?

Posted by Ted Yu <yu...@gmail.com>.
HBASE-9426 (Make custom distributed barrier procedure pluggable) has been
backported to 0.98, so porting the work from HBASE-7912 to 0.98 would be
relatively easy.

I am not aware of this going into 0.94.

Cheers


On Wed, Apr 2, 2014 at 6:13 PM, Varun Sharma <va...@pinterest.com> wrote:

> Seems like those JIRAs are 1.0 - did not see a 0.94 version # there ?

Re: no-flush based snapshot policy?

Posted by Varun Sharma <va...@pinterest.com>.
It seems those JIRAs target 1.0; I did not see a 0.94 version number there?


On Wed, Apr 2, 2014 at 1:40 PM, Ted Yu <yu...@gmail.com> wrote:

> Tianying:
> Have you seen the design doc attached to HBASE-7912 'HBase Backup/Restore
> Based on HBase Snapshot' ?
>
> Cheers

Re: no-flush based snapshot policy?

Posted by Ted Yu <yu...@gmail.com>.
Tianying:
Have you seen the design doc attached to HBASE-7912, 'HBase Backup/Restore
Based on HBase Snapshot'?

Cheers


Re: no-flush based snapshot policy?

Posted by Tianying Chang <ty...@gmail.com>.
Cool. Thanks for the confirmation. In my case, I can just add the new
config key to hbase-site.xml, restart the RS for it to take effect, and
use it all the time. I am not sure that is good enough if people want both
flush and no-flush snapshots available without restarting the RS.

Thanks
Tian-Ying


On Tue, Mar 25, 2014 at 4:17 PM, Matteo Bertozzi <th...@gmail.com> wrote:

> There is no data corruption or other kind of problems by skipping the
> flush.
> It will only not include the memstore data in the snapshot, which is
> basically what you asking for.
> so, sounds good to me if you want add that flag.
>
> probably having a shell flag will be "harder" to implement, since you have
> to pass it to the master
> and then add it to the snapshot information in the zk procedure, and then
> read it from the RS.
> Not a big thing, but you have touch lots of different places. It is not
> like a static conf property that you read on the RS and you are done.
>
> Matteo

Re: no-flush based snapshot policy?

Posted by Matteo Bertozzi <th...@gmail.com>.
There is no data corruption or any other kind of problem caused by skipping
the flush. The snapshot simply will not include the memstore data, which is
basically what you are asking for.
So it sounds good to me if you want to add that flag.

A shell flag will probably be "harder" to implement, since you have to pass
it to the master, then add it to the snapshot information in the zk
procedure, and then read it from the RS.
Not a big thing, but you have to touch lots of different places. It is not
like a static conf property that you read on the RS and you are done.
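
The extra plumbing for a per-snapshot flag can be pictured with a small,
dependency-free sketch. This is not HBase code: the class name, the method
names, and the byte layout are all made up for illustration, and the real
format of the procedure data is not described in this thread. The only
point is that the flag has to travel with each request (master encodes, RS
decodes) instead of being read once from local configuration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical illustration only; none of these names exist in HBase.
class SnapshotRequestDataDemo {

    // "Master side": pack the snapshot name and the per-request flag into
    // the bytes that would accompany the distributed procedure.
    static byte[] encode(String snapshotName, boolean skipFlush) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(bytes)) {
                out.writeUTF(snapshotName);
                out.writeBoolean(skipFlush);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // "Region server side": read the flag back out of the procedure data.
    static boolean decodeSkipFlush(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            in.readUTF(); // snapshot name, not needed for the flush decision
            return in.readBoolean();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = encode("backup_snapshot", true);
        System.out.println("skip flush on RS: " + decodeSkipFlush(data));
    }
}
```

With a static conf property, only the decision on the RS changes; with a
per-request flag, the shell, the master, and the RS all have to agree on
how the flag is carried.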

Matteo



On Tue, Mar 25, 2014 at 2:38 PM, Tianying Chang <ty...@gmail.com> wrote:

> Hi,
>
> I need a new snapshot policy. Basically, I cannot disable the table, but I
> also don't need the snapshot to be that "consistent" where all RS
> coordinated to flush the region before taking the snapshot since it slow
> down production cluster when flush take too long. It is OK for me if the
> snapshot missed the data in memstore because I will use WALPlayer to fill
> the data gap that is not in the snapshot but has been persisted (in WAL).
> So I should have no data loss.
>
> As a quick hack way to test this in my hbase backup workflow, I just add a
> config key, and skip the flushcache() in file
> *regionserver/snapshot/FlushSnapshotSubprocedure.java*, something like
> below.  It seems works fine for me, where all data are recovered in a new
> cluster after running WALPlayer.
>
> Does anyone see any problem like data corruption, etc?
>
>
> LOG.debug("Flush Snapshotting region " + region.toString() + " started...");
> if (noFlushNeeded)
> {
>    LOG.debug("No flush before taking snapshot");
> } else
> {
>     region.flushcache();
> }
>
> If there is no data corruption issue with this policy, I can add an
> parameter from hbase shell, so that people can dynamically decide when to
> use no-flush snapshot.
>
> Thanks
> Tian-Ying

Re: no-flush based snapshot policy?

Posted by Tianying Chang <ty...@gmail.com>.
Hi,

I need a new snapshot policy. Basically, I cannot disable the table, but I
also don't need the snapshot to be so "consistent" that all RSs coordinate
to flush the region before taking the snapshot, since that slows down the
production cluster when a flush takes too long. It is OK for me if the
snapshot misses the data in the memstore, because I will use WALPlayer to
fill in the data that is not in the snapshot but has been persisted (in the
WAL). So I should have no data loss.
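
To make that reasoning concrete, here is a toy in-memory model of a single
region. It is only a sketch, not HBase code: the class, its fields, and its
methods are all invented for illustration, and a plain map stands in for
store files. It assumes every edit is appended to the WAL before it reaches
the memstore, and that the WAL entries covering the gap are still available
when they are replayed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model of one region; none of these names are HBase APIs.
class NoFlushSnapshotDemo {
    // Edits already flushed to store files (all a no-flush snapshot can see).
    final Map<String, String> storeFiles = new TreeMap<>();
    // Edits that so far live only in memory.
    final Map<String, String> memstore = new TreeMap<>();
    // Every edit is appended here first, in arrival order.
    final List<String[]> wal = new ArrayList<>();

    void put(String row, String value) {
        wal.add(new String[] {row, value});
        memstore.put(row, value);
    }

    void flush() {
        storeFiles.putAll(memstore);
        memstore.clear();
    }

    // A no-flush snapshot copies only what is already in store files.
    Map<String, String> snapshotWithoutFlush() {
        return new TreeMap<>(storeFiles);
    }

    // What a reader of the live region sees: memstore overrides store files.
    Map<String, String> liveView() {
        Map<String, String> view = new TreeMap<>(storeFiles);
        view.putAll(memstore);
        return view;
    }

    // Restore the snapshot, then replay WAL entries in order on top of it.
    static Map<String, String> restore(Map<String, String> snapshot, List<String[]> walEntries) {
        Map<String, String> restored = new TreeMap<>(snapshot);
        for (String[] entry : walEntries) {
            restored.put(entry[0], entry[1]);
        }
        return restored;
    }

    public static void main(String[] args) {
        NoFlushSnapshotDemo region = new NoFlushSnapshotDemo();
        region.put("a", "1");
        region.flush();
        region.put("b", "2"); // only in memstore and WAL
        region.put("a", "3"); // newer value, only in memstore and WAL

        Map<String, String> snapshot = region.snapshotWithoutFlush();
        System.out.println("snapshot: " + snapshot);
        System.out.println("restored: " + restore(snapshot, region.wal));
        System.out.println("live:     " + region.liveView());
    }
}
```

In this model the snapshot alone is missing the unflushed edits, and the
snapshot plus the replayed WAL matches the live region. Replaying entries
that were already flushed is harmless here because a replayed put simply
rewrites the same value.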

As a quick hack to test this in my HBase backup workflow, I just added a
config key and skipped the flushcache() call in
*regionserver/snapshot/FlushSnapshotSubprocedure.java*, something like
below.  It seems to work fine for me: all data are recovered in a new
cluster after running WALPlayer.

Does anyone see any problem with this, such as data corruption?


LOG.debug("Flush Snapshotting region " + region.toString() + " started...");
if (noFlushNeeded) {
  // Skip the flush; memstore data will not be part of the snapshot.
  LOG.debug("No flush before taking snapshot");
} else {
  region.flushcache();
}
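
The snippet above does not show where noFlushNeeded comes from. One way it
might be wired up is sketched below, with no HBase dependencies so that it
runs on its own. The key name is hypothetical (the actual key is not given
here), java.util.Properties stands in for the server configuration (in
HBase this would presumably be read with Hadoop's
Configuration.getBoolean), and Region is a one-method stand-in for the real
region class.

```java
import java.util.Properties;

// Dependency-free sketch; names other than flushcache() are invented.
class SnapshotFlushPolicyDemo {
    // Hypothetical key name, chosen only for this example.
    static final String SKIP_FLUSH_KEY = "hbase.snapshot.region.skip.flush";

    // Minimal stand-in for the region object used in the snippet above.
    interface Region {
        void flushcache();
    }

    // Defaults to false so the existing flush-before-snapshot behavior is kept.
    static boolean noFlushNeeded(Properties conf) {
        return Boolean.parseBoolean(conf.getProperty(SKIP_FLUSH_KEY, "false"));
    }

    // Same branch as the snippet above; returns true if a flush was done.
    static boolean prepareRegionForSnapshot(Region region, boolean noFlushNeeded) {
        if (noFlushNeeded) {
            return false;
        }
        region.flushcache();
        return true;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(SKIP_FLUSH_KEY, "true");
        Region region = () -> System.out.println("flushing region");
        boolean flushed = prepareRegionForSnapshot(region, noFlushNeeded(conf));
        System.out.println("flushed before snapshot: " + flushed);
    }
}
```

Reading the flag from static configuration keeps the change local to the
RS, at the cost of needing a restart to switch policies.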

If there is no data corruption issue with this policy, I can add a
parameter to the hbase shell, so that people can decide dynamically when to
use a no-flush snapshot.

Thanks
Tian-Ying

On Tue, Mar 25, 2014 at 2:08 PM, Tianying Chang <ty...@gmail.com> wrote:

> Hi,
>
> I need a new snapshot policy which sits in between the disabled and
> flushed version. So, basically:
> I cannot disable the table, but I also don't need the snapshot to be that
> "consistent" where all RS coordinated to flush the region before taking the
> snapshot.
>

Re: no-flush based snapshot policy?

Posted by Tianying Chang <ty...@gmail.com>.
Sorry, I hit send accidentally; please ignore this one. I will send again
when done.


On Tue, Mar 25, 2014 at 2:08 PM, Tianying Chang <ty...@gmail.com> wrote:

> Hi,
>
> I need a new snapshot policy which sits in between the disabled and
> flushed version. So, basically:
> I cannot disable the table, but I also don't need the snapshot to be that
> "consistent" where all RS coordinated to flush the region before taking the
> snapshot.
>