Posted to user@hbase.apache.org by Neil Yalowitz <ne...@gmail.com> on 2012/01/12 18:56:56 UTC

heavy writing and compaction storms

Hi all,

What strategies do HBase 0.90 users employ to deal with or avoid the
so-called "compaction storm"?  I'm referring to the issue described in
section 2.8.2.7 here:

http://hbase.apache.org/book.html#important_configurations

The MR job I'm working with executes many PUTs during the Map phase with
HTable.put() in batches of 1,000.  The keys are well distributed which,
while ideal for evenly distributed PUT performance, produces a uniform
increase in StoreFiles across all regions.  When a compaction threshold is
reached for one region, it is usually reached for many regions at once...
causing many, many regions to request compaction.  It seems like a classic
"compaction storm" problem.  With a thousand regions all requesting
compaction, the compactionQueueSize for a server climbs quickly.
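
A minimal sketch of the batching pattern (table, family, and key scheme
are illustrative, not our real schema):

    import java.util.ArrayList;
    import java.util.List;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    public class BatchedPuts {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "mytable");
        List<Put> batch = new ArrayList<Put>(1000);
        for (int i = 0; i < 1000000; i++) {
          // Integer.reverse() stands in for our well-distributed key scheme
          Put put = new Put(Bytes.toBytes(Integer.toHexString(Integer.reverse(i))));
          put.add(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("v"));
          batch.add(put);
          if (batch.size() == 1000) {  // ship 1,000 puts per round trip
            table.put(batch);
            batch.clear();
          }
        }
        if (!batch.isEmpty()) table.put(batch);  // trailing partial batch
        table.close();
      }
    }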

Some options we have discussed for this problem:

1) an HBase cooldown - slowing down the writes by feeding the input files
at a slower rate

I'm not certain this will fix the problem.  It still seems likely that
evenly distributed writes will eventually trigger many regions to request
compaction.

2) an HBase cooldown with a major_compact - disabling all automatic
compaction by setting the compaction thresholds to a very high number and
then running a major_compact on the two tables our MR job writes to

I'm using the following settings to completely disable all compaction:

hbase.regionserver.thread.splitcompactcheckfrequency = Integer.MAX_VALUE
(is this setting deprecated in 0.90?  what about 0.92?)
hbase.hstore.compactionThreshold = Integer.MAX_VALUE
hbase.hstore.blockingStoreFiles = Integer.MAX_VALUE
hbase.hstore.compaction.max = Integer.MAX_VALUE
hbase.hstore.blockingWaitTime = 0

This looks ugly, but it seems to be the only way to ensure that compaction
will not occur (unless I'm missing something).  Obviously, a system that is
never compacted manually will eventually go down in flames with these
settings.
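
Spelled out programmatically (a sketch just to document the values; on the
cluster these live in hbase-site.xml on each region server, where
Integer.MAX_VALUE becomes the literal 2147483647):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;

    public class NoCompactionSettings {
      public static void main(String[] args) {
        Configuration conf = HBaseConfiguration.create();
        conf.setInt("hbase.regionserver.thread.splitcompactcheckfrequency",
            Integer.MAX_VALUE);                           // 2147483647 in XML
        conf.setInt("hbase.hstore.compactionThreshold", Integer.MAX_VALUE);
        conf.setInt("hbase.hstore.blockingStoreFiles", Integer.MAX_VALUE);
        conf.setInt("hbase.hstore.compaction.max", Integer.MAX_VALUE);
        conf.setInt("hbase.hstore.blockingWaitTime", 0);  // never stall writes on store files
      }
    }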

3) manually compact only certain regions - disabling all automatic
compaction as mentioned in #2 and running a separate job that polls the
regions and compacts individual regions as needed, rather than allowing
all regions to compact automatically (rough sketch below)
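
Something like this, maybe (a sketch; the "according to need" check is the
part I haven't worked out, so it's a placeholder here):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HRegionInfo;
    import org.apache.hadoop.hbase.client.HBaseAdmin;
    import org.apache.hadoop.hbase.client.HTable;

    public class RollingCompactor {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin admin = new HBaseAdmin(conf);
        HTable table = new HTable(conf, "mytable");  // illustrative table name
        for (HRegionInfo region : table.getRegionsInfo().keySet()) {
          if (!needsCompaction(region)) continue;
          // majorCompact() only queues the request; sleep to pace the queue
          admin.majorCompact(region.getRegionNameAsString());
          Thread.sleep(60 * 1000);
        }
      }

      private static boolean needsCompaction(HRegionInfo region) {
        return true;  // placeholder: poll per-region store-file counts here
      }
    }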


What are other people's experiences with this issue?  Performing all
compaction during a cooldown period (#2)?  Performing compaction in a
rolling fashion (#3)?  Slower writes (#1)?  Something completely different?


Thanks,

Neil Yalowitz

Re: heavy writing and compaction storms

Posted by Jean-Daniel Cryans <jd...@apache.org>.
On Thu, Jan 12, 2012 at 3:47 PM, Neil Yalowitz <ne...@gmail.com> wrote:
> Thanks for the response, J-D.  Some followups:
>
> Would love to, but I'm dealing with an "increment a counter" issue.  I
> created a separate email thread for that question since it's off-topic from
> the compaction storms.

And I replied :)

> Switching off automatic mode... does this include disabling minor
> compactions?  I can disable the scheduled major compactions like this:
>
> hbase.hregion.majorcompaction = 0

Yes.

>
> ...but this will only stop scheduled major compaction.  What about minor
> compactions that occur during a write-heavy job?  That requires something
> more radical:
>
> hbase.hstore.compactionThreshold = Integer.MAX_VALUE
>
> I think I should probably shoot myself just for even suggesting it, but
> desperation produces desperate solutions...

You could set it higher, but with bigger memstores it shouldn't be an
issue anymore.

> Gotcha.  I assume that this value is set with:
>
> hbase.hregion.memstore.flush.size
>
> ...which is a cluster-wide setting (or perhaps RS-wide).  Choosing the
> really-high-number is a bit tricky though (more about that below).

I meant setting it on the table; you can even do it through the shell.
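
For example, through the Java admin API it would be something like this
(table name and size are made up; the shell can do the same via the
MEMSTORE_FLUSHSIZE table attribute):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.client.HBaseAdmin;
    import org.apache.hadoop.hbase.util.Bytes;

    public class RaiseFlushSize {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin admin = new HBaseAdmin(conf);
        byte[] name = Bytes.toBytes("mytable");
        admin.disableTable(name);            // table must be offline to alter
        HTableDescriptor desc = admin.getTableDescriptor(name);
        desc.setMemStoreFlushSize(512 * 1024 * 1024);  // 512MB vs the 64MB default
        admin.modifyTable(name, desc);
        admin.enableTable(name);
      }
    }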

> Can you expand on this?
>
> A hypothetical:  Assume that the
> hbase.regionserver.global.memstore.upperLimit and lowerLimit for a
> regionserver allow for 10GB of heap to be available for memstores,
> and we have 10 regions per regionserver.  Should
> hbase.hregion.memstore.flush.size be set to 1GB?

OK, so if you have 10GB of heap, the default lower limit (0.35) means it
will start force flushing memstores once you hit 3.5GB of data across all
your memstores.  If you are loading those memstores equally, setting the
memstore size anywhere bigger than ~350MB will have almost no effect since
they will get force flushed.

If you aren't doing random reads, you could give more memory to the
memstores by giving less to the block cache.  hfile.block.cache.size is
25% by default; lower it and add the same amount to both the upper and
lower limits.
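
For instance (illustrative numbers, shown via the client Configuration
API; on a real cluster these are set in hbase-site.xml on the region
servers), moving 10% of the heap from the block cache to the memstores:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;

    public class MemstoreVsBlockCache {
      public static void main(String[] args) {
        Configuration conf = HBaseConfiguration.create();
        // take 10% of the heap away from the block cache...
        conf.setFloat("hfile.block.cache.size", 0.15f);
        // ...and add that same 10% to both memstore limits
        // (up from the 0.40 / 0.35 defaults)
        conf.setFloat("hbase.regionserver.global.memstore.upperLimit", 0.50f);
        conf.setFloat("hbase.regionserver.global.memstore.lowerLimit", 0.45f);
      }
    }
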

>
> Also, how does this change with a table with more than one column family?
> As I understand it, each column family has a memstore.

Your understanding is correct, and currently the region flushes based on
the combined size of all its families summed up.  This means smaller files
and more compactions.

>
>
> Thanks for your responses so far.
>

At your service,

J-D

Re: heavy writing and compaction storms

Posted by Neil Yalowitz <ne...@gmail.com>.
Thanks for the response, J-D.  Some followups:

> First you should consider using bulk import instead of a massive MR job.

Would love to, but I'm dealing with an "increment a counter" issue.  I
created a separate email thread for that question since it's off-topic from
the compaction storms.

> - make sure you pre-split:
http://hbase.apache.org/book/important_configurations.html#disable.splitting

Gotcha.  Confirms some discussions I've been having.  Thanks.
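
For the archives, here's the rough pre-split shape I have in mind (schema
and split points are illustrative):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.client.HBaseAdmin;
    import org.apache.hadoop.hbase.util.Bytes;

    public class PresplitTable {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin admin = new HBaseAdmin(conf);
        HTableDescriptor desc = new HTableDescriptor("mytable");
        desc.addFamily(new HColumnDescriptor("cf"));
        // 15 split points = 16 regions, assuming hex-prefixed row keys
        byte[][] splits = new byte[15][];
        for (int i = 1; i <= 15; i++) {
          splits[i - 1] = Bytes.toBytes(Integer.toHexString(i));
        }
        admin.createTable(desc, splits);
      }
    }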

> - regarding major compactions, usually people switch off the
> automatic mode and cron it to run like X times a week during low
> traffic (in your case, just don't use them during the import)

Switching off automatic mode... does this include disabling minor
compactions?  I can disable the scheduled major compactions like this:

hbase.hregion.majorcompaction = 0

...but this will only stop scheduled major compaction.  What about minor
compactions that occur during a write-heavy job?  That requires something
more radical:

hbase.hstore.compactionThreshold = Integer.MAX_VALUE

I think I should probably shoot myself just for even suggesting it, but
desperation produces desperate solutions...

> - set the MEMSTORE_FLUSHSIZE to a really high number during the
> import so that you flush big files and compact as little as possible.
> The default configs work best for a real time load, not an import.

Gotcha.  I assume that this value is set with:

hbase.hregion.memstore.flush.size

...which is a cluster-wide setting (or perhaps RS-wide).  Choosing the
really-high-number is a bit tricky though (more about that below).

> Regarding number of regions and memstore size, a perfect config would
> be where you can load all the memstores completely before flushing
> them.

Can you expand on this?

A hypothetical:  Assume that the
hbase.regionserver.global.memstore.upperLimit and lowerLimit for a
regionserver allow for 10GB of heap to be available for memstores,
and we have 10 regions per regionserver.  Should
hbase.hregion.memstore.flush.size be set to 1GB?

Also, how does this change with a table with more than one column family?
As I understand it, each column family has a memstore.


Thanks for your responses so far.

Neil Yalowitz

Re: heavy writing and compaction storms

Posted by Jean-Daniel Cryans <jd...@apache.org>.
Hi,

First you should consider using bulk import instead of a massive MR
job (rough sketch after the list below).  If you decide against that, then

 - make sure you pre-split:
http://hbase.apache.org/book/important_configurations.html#disable.splitting
 - regarding major compactions, usually people switch off the
automatic mode and cron it to run like X times a week during low
traffic (in your case, just don't use them during the import)
 - set the MEMSTORE_FLUSHSIZE to a really high number during the
import so that you flush big files and compact as little as possible.
The default configs work best for a real time load, not an import.
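
In case it helps, the bulk load job setup looks roughly like this (table
name, output path, and the Put-emitting mapper are placeholders; input
setup is elided):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class BulkLoadJob {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        Job job = new Job(conf, "bulk load");
        // job.setMapperClass(YourPutEmittingMapper.class);  // your mapper, elided
        job.setMapOutputKeyClass(ImmutableBytesWritable.class);
        job.setMapOutputValueClass(Put.class);
        // wires up the partitioner/reducer so the HFiles match region boundaries
        HFileOutputFormat.configureIncrementalLoad(job, new HTable(conf, "mytable"));
        FileOutputFormat.setOutputPath(job, new Path("/tmp/hfiles"));
        job.waitForCompletion(true);
        // then: hadoop jar hbase-<version>.jar completebulkload /tmp/hfiles mytable
      }
    }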

Also I guess you already know that you need a big heap, no swapping, etc.

Regarding number of regions and memstore size, a perfect config would
be where you can load all the memstores completely before flushing
them. hbase.regionserver.global.memstore.upperLimit is the percentage
your memstores can occupy in the heap and
hbase.regionserver.global.memstore.lowerLimit is the point at which it
starts force flushing regions. Take that into account too.

Hope this helps getting you started.

J-D
