Posted to dev@hbase.apache.org by Biju N <bi...@gmail.com> on 2019/05/03 19:44:20 UTC

Re: Purposefully keeping around WAL files

Hi Sean, Is there a JIRA ticket for this to follow?

On Sat, Mar 16, 2019 at 2:00 PM Andrew Purtell <an...@gmail.com>
wrote:

> Running the whole file through a standard compressor. It makes handling more
> straightforward, e.g. copying to a local filesystem and extracting there. We
> could wait to do it until all references to the WAL file are gone, so as not
> to complicate things like replication.
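>
> For illustration, an untested sketch (the file name is a placeholder, and
> /hbase/oldWALs assumes the default root dir; adjust for your hbase.rootdir):
>
>   mkdir -p ./wal-backup
>   # pull the archived WAL out of HDFS and run it through a stock compressor
>   hdfs dfs -get /hbase/oldWALs/example-wal-file ./wal-backup/
>   gzip ./wal-backup/example-wal-file
>   # later: extract and inspect, e.g. with the WAL pretty-printer
>   gunzip ./wal-backup/example-wal-file.gz
>   hbase wal ./wal-backup/example-wal-file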
>
>
> > On Mar 16, 2019, at 10:17 AM, Sean Busbey <bu...@apache.org> wrote:
> >
> > Yeah, I like the idea of compressing them. Are you thinking of rewriting
> > them with the WAL compression feature enabled, or just something simple
> > like running the whole file through a compressor? Maybe I should poke at
> > what the difference in resultant file size looks like.
> >
> > IIRC things already get moved out to archive before being deleted.
> > There's a default TTL of something like 10 minutes before a WAL can be
> > deleted from the archive area.
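> >
> > (IIRC the knob for that is hbase.master.logcleaner.ttl, in milliseconds.
> > If you just want archived WALs to linger longer, raising it in
> > hbase-site.xml is the blunt instrument:
> >
> >   <property>
> >     <name>hbase.master.logcleaner.ttl</name>
> >     <value>600000</value> <!-- the ~10 minute default -->
> >   </property>
> >
> > though that holds everything, not just the WALs you care about.)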
> >
> > The disadvantage to always compressing archived WALs would be overhead for
> > the replication process? Anything else?
> >
> > On Sat, Mar 16, 2019 at 10:51 AM Andrew Purtell
> > <an...@gmail.com> wrote:
> >>
> >> How about an option that tells the cleaner to archive them, with
> >> compression? There’s a lot of wastage in WAL files due to repeated
> >> information, and reasons not to enable WAL compression for live files, but
> >> I think little reason not to rewrite an archived WAL file with a typical,
> >> standard archival compression format like BZIP if retaining it only for
> >> possible debugging purposes. (Or maybe a home-grown incremental backup
> >> solution built on snapshots and log replay. Or...)
> >>
> >> So, a switch that tells the cleaner to archive rather than delete, and
> >> maybe another toggle that starts a background task to find archived WALs
> >> that are uncompressed and compress them, only removing them once the
> >> compressed version is in place. Compress, optionally, in a temporary
> >> location with a final atomic rename, as with compaction.
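> >>
> >> Hand-wavy shell version of what that task would do per file (names made
> >> up, gzip standing in for whatever codec):
> >>
> >>   SRC=/hbase/oldWALs/example-wal-file
> >>   TMP=/hbase/oldWALs/.tmp/example-wal-file.gz
> >>   hdfs dfs -mkdir -p /hbase/oldWALs/.tmp
> >>   # write the compressed copy to a temporary location first
> >>   hdfs dfs -cat "$SRC" | gzip | hdfs dfs -put - "$TMP"
> >>   # atomic rename into place; drop the original only after it lands
> >>   hdfs dfs -mv "$TMP" "${SRC}.gz"
> >>   hdfs dfs -rm "$SRC"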
> >>
> >> ?
> >>
> >>
> >>> On Mar 16, 2019, at 7:01 AM, Sean Busbey <bu...@apache.org> wrote:
> >>>
> >>> Hi folks!
> >>>
> >>> Sometimes while working to diagnose an HBase failure in production
> >>> settings I need to ensure WALs stick around so that I can examine or
> >>> possibly replay them. For difficult problems on clusters with plenty of
> >>> HDFS space relative to the HBase write workload, sometimes that might
> >>> mean for days or a week.
> >>>
> >>> The way I've always done this is by setting up placeholder replication
> >>> information for a peer that's disabled. It nicely makes the cleaner chore
> >>> pass over things, doesn't require a restart of anything, and has a
> >>> relatively straightforward way to go back to normal.
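> >>>
> >>> Concretely, something like this (peer id and cluster key are arbitrary;
> >>> the key just needs to parse, since the peer never gets enabled):
> >>>
> >>>   hbase shell <<'EOF'
> >>>   add_peer 'keepwals', CLUSTER_KEY => 'placeholder-zk:2181:/hbase-nowhere'
> >>>   disable_peer 'keepwals'
> >>>   EOF
> >>>
> >>> and a remove_peer 'keepwals' once done lets the cleaner catch up.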
> >>>
> >>> Lately I've been thinking that I do this often enough that a command for
> >>> it would be better (kind of like how we can turn the balancer on and
> >>> off).
> >>>
> >>> How do other folks handle this operational need? Am I just missing an
> >>> easier way?
> >>>
> >>> If a new command is needed, what do folks think the minimally useful
> >>> version is? Keep all WALs until told otherwise? Limit to the most
> >>> recent/oldest X bytes? Limit to files that include edits to a certain
> >>> namespace/table/region?
>

Re: Purposefully keeping around WAL files

Posted by Sean Busbey <bu...@apache.org>.
Nope, didn't get far enough in specifying an approach to file a JIRA.

If you're up for making a go of it, feel free to start a new one.

On Fri, May 3, 2019, 14:44 Biju N <bi...@gmail.com> wrote:

> Hi Sean, Is there a JIRA ticket for this to follow?
>