Posted to user@hbase.apache.org by Wayne <wa...@gmail.com> on 2011/05/23 15:40:32 UTC

memstore flush blocking write pause

In order to reduce the total number of regions we have upped the max region
size to 5g. This has kept us below 100 regions per node, but the side effect
is pauses occurring every 1-2 min under heavy writes to a single region. We
see the "too many store files delaying flush up to 90sec" warning every
couple of minutes. We have upped the memstore flush size (256m)
as well as the blockingstorefiles limit (15), but these pauses
are now occurring more often than the writes themselves. In the end our write
throughput has degraded considerably.

Are we better off going to a 1G region size? Will this help the situation?
We were always told fewer/bigger regions were better, but this "seems" to be a
bad side effect of that. Should we instead increase the memstore flush size
even more? We are also having serious JVM problems...what is our best course
of action here?
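
For reference, the knobs in question map (as far as I can tell; property names
may differ between versions) to the configuration properties below. This is a
minimal sketch of setting them programmatically, just to spell out which
properties the numbers above refer to; normally you would put these in
hbase-site.xml:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class RegionTuningSketch {
  public static void main(String[] args) {
    Configuration conf = HBaseConfiguration.create();
    // 5g max region size before a split is triggered
    conf.setLong("hbase.hregion.max.filesize", 5L * 1024 * 1024 * 1024);
    // 256m memstore flush size
    conf.setLong("hbase.hregion.memstore.flush.size", 256L * 1024 * 1024);
    // delay flushes once a store already has this many store files
    conf.setInt("hbase.hstore.blockingStoreFiles", 15);
    // how long a flush is delayed waiting on a compaction, in ms (the "90sec")
    conf.setInt("hbase.hstore.blockingWaitTime", 90 * 1000);
    System.out.println("flush size = "
        + conf.getLong("hbase.hregion.memstore.flush.size", -1));
  }
}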

Thanks for any help/advice that can be provided.

Re: memstore flush blocking write pause

Posted by Jean-Daniel Cryans <jd...@apache.org>.
Minor compactions don't compact all files, only a few of them
depending on what the algorithm computes, so flushing before the
compaction (which might already be running!) could very well put you
in the same situation in the end.
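
To make that concrete, here is a toy file-count example (purely illustrative
numbers, not HBase's actual selection algorithm): flushing first and then
running a minor compaction over only a subset of the files can leave the
region right back near the limit.

public class FileCountToy {
  public static void main(String[] args) {
    int storeFiles = 15;     // already at the blockingStoreFiles limit
    int compactedSubset = 4; // a minor compaction merges only some of the files

    // Flush first, then minor-compact: the flush adds one file and the
    // compaction replaces `compactedSubset` files with a single file.
    int after = (storeFiles + 1) - compactedSubset + 1;
    System.out.println("store files after flush + minor compaction: " + after); // 13

    // Still close to the limit, so the next flush can trip the
    // "too many store files" gate all over again.
  }
}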

Something we know we need to do is a merging flush:
https://issues.apache.org/jira/browse/HBASE-3656

Patches are welcome 24/7 :)

J-D

On Tue, Jun 14, 2011 at 12:14 AM, Weihua JIANG <we...@gmail.com> wrote:
> I looked at the code. This warn message is printed by
> MemStoreFlusher.flushRegion(). If there are too many store files, it
> first request compaction and wait 90s, then flush mem store.
>
> My question is: why not flush mem store before compaction?  In current
> mode, the result is  a compacted store file + a new flush store file.
> This makes it easier to reach compaction criteria later. If flush
> before compaction, then the result is a compacted store file.
>
> Thanks
> Weihua
>
> 2011/6/13 Sheng Chen <ch...@gmail.com>:
>> I've met with the same problem.
>> Update operations are blocked by memstore flushing, and memstore flushing is
>> blocked by a compaction ("too many store files, delay flushing for 90s").
>>
>> Have you got any solutions?
>>
>> 2011/5/23 Wayne <wa...@gmail.com>
>>
>>> We have 4 CFs, but only 1 is ever used for a given region. What about
>>> upping
>>> the size per memstore file to 1G? We have 5x limit of 256m which results in
>>> lots of messages like "memstore size 1.3g is >= than blocking 1.2g size".
>>> Maybe given the bigger region size we need a bigger memstore size?
>>>
>>> Here is a region server log snippet for this occurring 2x in less than a 2
>>> minute period.
>>>
>>> http://pastebin.com/CxAQSXTt
>>>
>>>
>>> On Mon, May 23, 2011 at 11:33 AM, Stack <st...@duboce.net> wrote:
>>>
>>> > On Mon, May 23, 2011 at 6:40 AM, Wayne <wa...@gmail.com> wrote:
>>> > > In order to reduce the total number of regions we have up'd the max
>>> > region
>>> > > size to 5g. This has kept us below 100 regions per node but the side
>>> > affect
>>> > > is pauses occurring every 1-2 min under heavy writes to a single
>>> region.
>>> > We
>>> > > see the "too many store files delaying flush up to 90sec" warning every
>>> > > couple of minutes. We have upped the size of the memstore flush size
>>> > (256m)
>>> > > as well as upped the blockingstorefiles (15) but these pauses
>>> > > are occurring more than writes are occurring. In the end our write
>>> > > through-put has degraded considerably.
>>> > >
>>> >
>>> > How many column families?  Pastebin a regionserver log.  You could up
>>> > the number of store files before we put up the blocking writes gate
>>> > but then you might have runaway files to compact.
>>> >
>>> > St.Ack
>>> >
>>>
>>
>

Re: memstore flush blocking write pause

Posted by Weihua JIANG <we...@gmail.com>.
I looked at the code. This warning is printed by
MemStoreFlusher.flushRegion(). If there are too many store files, it
first requests a compaction and waits up to 90s, then flushes the memstore.

My question is: why not flush the memstore before the compaction? In the
current mode, the result is a compacted store file + a newly flushed store
file, which makes it easier to hit the compaction criteria again later. If we
flushed before compacting, the result would be just a single compacted store file.
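
For anyone following along, this is roughly the control flow as I read it --
a toy paraphrase, not the actual HBase source:

public class FlushGateSketch {
  static final int BLOCKING_STORE_FILES = 15;  // hbase.hstore.blockingStoreFiles
  static final long BLOCKING_WAIT_MS = 90000L; // hbase.hstore.blockingWaitTime

  interface Region {
    int storeFileCount();
    void requestCompaction();
    void flushMemstore();
  }

  // Paraphrase of MemStoreFlusher.flushRegion(): if a store already has too
  // many files, ask for a compaction and delay the flush (up to 90s) instead
  // of immediately adding yet another store file.
  static void flushRegion(Region region, long firstRequestMs) {
    boolean tooManyFiles = region.storeFileCount() > BLOCKING_STORE_FILES;
    boolean stillWithinDelay =
        System.currentTimeMillis() - firstRequestMs < BLOCKING_WAIT_MS;
    if (tooManyFiles && stillWithinDelay) {
      region.requestCompaction();
      return; // the real flusher re-queues the region and retries later
    }
    // Either the file count dropped or we've waited the full 90s: flush now.
    region.flushMemstore();
  }
}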

Thanks
Weihua

2011/6/13 Sheng Chen <ch...@gmail.com>:
> I've met with the same problem.
> Update operations are blocked by memstore flushing, and memstore flushing is
> blocked by a compaction ("too many store files, delay flushing for 90s").
>
> Have you got any solutions?
>
> 2011/5/23 Wayne <wa...@gmail.com>
>
>> We have 4 CFs, but only 1 is ever used for a given region. What about
>> upping
>> the size per memstore file to 1G? We have 5x limit of 256m which results in
>> lots of messages like "memstore size 1.3g is >= than blocking 1.2g size".
>> Maybe given the bigger region size we need a bigger memstore size?
>>
>> Here is a region server log snippet for this occurring 2x in less than a 2
>> minute period.
>>
>> http://pastebin.com/CxAQSXTt
>>
>>
>> On Mon, May 23, 2011 at 11:33 AM, Stack <st...@duboce.net> wrote:
>>
>> > On Mon, May 23, 2011 at 6:40 AM, Wayne <wa...@gmail.com> wrote:
>> > > In order to reduce the total number of regions we have up'd the max
>> > region
>> > > size to 5g. This has kept us below 100 regions per node but the side
>> > affect
>> > > is pauses occurring every 1-2 min under heavy writes to a single
>> region.
>> > We
>> > > see the "too many store files delaying flush up to 90sec" warning every
>> > > couple of minutes. We have upped the size of the memstore flush size
>> > (256m)
>> > > as well as upped the blockingstorefiles (15) but these pauses
>> > > are occurring more than writes are occurring. In the end our write
>> > > through-put has degraded considerably.
>> > >
>> >
>> > How many column families?  Pastebin a regionserver log.  You could up
>> > the number of store files before we put up the blocking writes gate
>> > but then you might have runaway files to compact.
>> >
>> > St.Ack
>> >
>>
>

Re: memstore flush blocking write pause

Posted by Jean-Daniel Cryans <jd...@apache.org>.
Wayne mentions a few of the relevant settings in his original message in this
thread; also please have a look at this chapter of the online book:
http://hbase.apache.org/book/performance.html

J-D

On Mon, Jun 13, 2011 at 11:21 PM, Sheng Chen <ch...@gmail.com> wrote:
> Thank you JD.
>
> My hbase works as a log archive storage, there is a continuing stream of
> inserting.
> So it is normally write heavy. Is there any advice for this scenario?
> Thanks.
>
> Sean
>
>
>
> 2011/6/14 Jean-Daniel Cryans <jd...@apache.org>
>
>> Unless your normal workload is very heavy on writes (which is Wayne's
>> case), you're better off using bulk loading:
>> http://hbase.apache.org/bulk-loads.html
>>
>> J-D
>>
>> On Mon, Jun 13, 2011 at 12:26 AM, Sheng Chen <ch...@gmail.com>
>> wrote:
>> > I've met with the same problem.
>> > Update operations are blocked by memstore flushing, and memstore flushing
>> is
>> > blocked by a compaction ("too many store files, delay flushing for 90s").
>> >
>> > Have you got any solutions?
>> >
>> > 2011/5/23 Wayne <wa...@gmail.com>
>> >
>> >> We have 4 CFs, but only 1 is ever used for a given region. What about
>> >> upping
>> >> the size per memstore file to 1G? We have 5x limit of 256m which results
>> in
>> >> lots of messages like "memstore size 1.3g is >= than blocking 1.2g
>> size".
>> >> Maybe given the bigger region size we need a bigger memstore size?
>> >>
>> >> Here is a region server log snippet for this occurring 2x in less than a
>> 2
>> >> minute period.
>> >>
>> >> http://pastebin.com/CxAQSXTt
>> >>
>> >>
>> >> On Mon, May 23, 2011 at 11:33 AM, Stack <st...@duboce.net> wrote:
>> >>
>> >> > On Mon, May 23, 2011 at 6:40 AM, Wayne <wa...@gmail.com> wrote:
>> >> > > In order to reduce the total number of regions we have up'd the max
>> >> > region
>> >> > > size to 5g. This has kept us below 100 regions per node but the side
>> >> > affect
>> >> > > is pauses occurring every 1-2 min under heavy writes to a single
>> >> region.
>> >> > We
>> >> > > see the "too many store files delaying flush up to 90sec" warning
>> every
>> >> > > couple of minutes. We have upped the size of the memstore flush size
>> >> > (256m)
>> >> > > as well as upped the blockingstorefiles (15) but these pauses
>> >> > > are occurring more than writes are occurring. In the end our write
>> >> > > through-put has degraded considerably.
>> >> > >
>> >> >
>> >> > How many column families?  Pastebin a regionserver log.  You could up
>> >> > the number of store files before we put up the blocking writes gate
>> >> > but then you might have runaway files to compact.
>> >> >
>> >> > St.Ack
>> >> >
>> >>
>> >
>>
>

Re: memstore flush blocking write pause

Posted by Sheng Chen <ch...@gmail.com>.
Thank you JD.

My HBase works as log archive storage; there is a continuous stream of
inserts, so it is normally write heavy. Is there any advice for this scenario?
Thanks.

Sean



2011/6/14 Jean-Daniel Cryans <jd...@apache.org>

> Unless your normal workload is very heavy on writes (which is Wayne's
> case), you're better off using bulk loading:
> http://hbase.apache.org/bulk-loads.html
>
> J-D
>
> On Mon, Jun 13, 2011 at 12:26 AM, Sheng Chen <ch...@gmail.com>
> wrote:
> > I've met with the same problem.
> > Update operations are blocked by memstore flushing, and memstore flushing
> is
> > blocked by a compaction ("too many store files, delay flushing for 90s").
> >
> > Have you got any solutions?
> >
> > 2011/5/23 Wayne <wa...@gmail.com>
> >
> >> We have 4 CFs, but only 1 is ever used for a given region. What about
> >> upping
> >> the size per memstore file to 1G? We have 5x limit of 256m which results
> in
> >> lots of messages like "memstore size 1.3g is >= than blocking 1.2g
> size".
> >> Maybe given the bigger region size we need a bigger memstore size?
> >>
> >> Here is a region server log snippet for this occurring 2x in less than a
> 2
> >> minute period.
> >>
> >> http://pastebin.com/CxAQSXTt
> >>
> >>
> >> On Mon, May 23, 2011 at 11:33 AM, Stack <st...@duboce.net> wrote:
> >>
> >> > On Mon, May 23, 2011 at 6:40 AM, Wayne <wa...@gmail.com> wrote:
> >> > > In order to reduce the total number of regions we have up'd the max
> >> > region
> >> > > size to 5g. This has kept us below 100 regions per node but the side
> >> > affect
> >> > > is pauses occurring every 1-2 min under heavy writes to a single
> >> region.
> >> > We
> >> > > see the "too many store files delaying flush up to 90sec" warning
> every
> >> > > couple of minutes. We have upped the size of the memstore flush size
> >> > (256m)
> >> > > as well as upped the blockingstorefiles (15) but these pauses
> >> > > are occurring more than writes are occurring. In the end our write
> >> > > through-put has degraded considerably.
> >> > >
> >> >
> >> > How many column families?  Pastebin a regionserver log.  You could up
> >> > the number of store files before we put up the blocking writes gate
> >> > but then you might have runaway files to compact.
> >> >
> >> > St.Ack
> >> >
> >>
> >
>

Re: memstore flush blocking write pause

Posted by Jean-Daniel Cryans <jd...@apache.org>.
Unless your normal workload is very heavy on writes (which is Wayne's
case), you're better off using bulk loading:
http://hbase.apache.org/bulk-loads.html
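
For what it's worth, the final step of the bulk load path (after a MapReduce
job has written HFiles via HFileOutputFormat.configureIncrementalLoad) boils
down to something like the driver below -- a sketch only, constructor and
method signatures vary a bit between versions:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles;

public class BulkLoadDriver {
  public static void main(String[] args) throws Exception {
    // args[0]: directory of HFiles produced by the MapReduce job
    // args[1]: destination table name
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, args[1]);
    // Moves the HFiles into the regions' directories, splitting them as
    // needed, without going through the memstore/flush/compaction path.
    new LoadIncrementalHFiles(conf).doBulkLoad(new Path(args[0]), table);
  }
}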

J-D

On Mon, Jun 13, 2011 at 12:26 AM, Sheng Chen <ch...@gmail.com> wrote:
> I've met with the same problem.
> Update operations are blocked by memstore flushing, and memstore flushing is
> blocked by a compaction ("too many store files, delay flushing for 90s").
>
> Have you got any solutions?
>
> 2011/5/23 Wayne <wa...@gmail.com>
>
>> We have 4 CFs, but only 1 is ever used for a given region. What about
>> upping
>> the size per memstore file to 1G? We have 5x limit of 256m which results in
>> lots of messages like "memstore size 1.3g is >= than blocking 1.2g size".
>> Maybe given the bigger region size we need a bigger memstore size?
>>
>> Here is a region server log snippet for this occurring 2x in less than a 2
>> minute period.
>>
>> http://pastebin.com/CxAQSXTt
>>
>>
>> On Mon, May 23, 2011 at 11:33 AM, Stack <st...@duboce.net> wrote:
>>
>> > On Mon, May 23, 2011 at 6:40 AM, Wayne <wa...@gmail.com> wrote:
>> > > In order to reduce the total number of regions we have up'd the max
>> > region
>> > > size to 5g. This has kept us below 100 regions per node but the side
>> > affect
>> > > is pauses occurring every 1-2 min under heavy writes to a single
>> region.
>> > We
>> > > see the "too many store files delaying flush up to 90sec" warning every
>> > > couple of minutes. We have upped the size of the memstore flush size
>> > (256m)
>> > > as well as upped the blockingstorefiles (15) but these pauses
>> > > are occurring more than writes are occurring. In the end our write
>> > > through-put has degraded considerably.
>> > >
>> >
>> > How many column families?  Pastebin a regionserver log.  You could up
>> > the number of store files before we put up the blocking writes gate
>> > but then you might have runaway files to compact.
>> >
>> > St.Ack
>> >
>>
>

Re: memstore flush blocking write pause

Posted by Sheng Chen <ch...@gmail.com>.
I've run into the same problem.
Update operations are blocked by memstore flushing, and memstore flushing is
blocked by a compaction ("too many store files, delay flushing for 90s").

Have you got any solutions?

2011/5/23 Wayne <wa...@gmail.com>

> We have 4 CFs, but only 1 is ever used for a given region. What about
> upping
> the size per memstore file to 1G? We have 5x limit of 256m which results in
> lots of messages like "memstore size 1.3g is >= than blocking 1.2g size".
> Maybe given the bigger region size we need a bigger memstore size?
>
> Here is a region server log snippet for this occurring 2x in less than a 2
> minute period.
>
> http://pastebin.com/CxAQSXTt
>
>
> On Mon, May 23, 2011 at 11:33 AM, Stack <st...@duboce.net> wrote:
>
> > On Mon, May 23, 2011 at 6:40 AM, Wayne <wa...@gmail.com> wrote:
> > > In order to reduce the total number of regions we have up'd the max
> > region
> > > size to 5g. This has kept us below 100 regions per node but the side
> > affect
> > > is pauses occurring every 1-2 min under heavy writes to a single
> region.
> > We
> > > see the "too many store files delaying flush up to 90sec" warning every
> > > couple of minutes. We have upped the size of the memstore flush size
> > (256m)
> > > as well as upped the blockingstorefiles (15) but these pauses
> > > are occurring more than writes are occurring. In the end our write
> > > through-put has degraded considerably.
> > >
> >
> > How many column families?  Pastebin a regionserver log.  You could up
> > the number of store files before we put up the blocking writes gate
> > but then you might have runaway files to compact.
> >
> > St.Ack
> >
>

Re: memstore flush blocking write pause

Posted by Wayne <wa...@gmail.com>.
We have 4 CFs, but only 1 is ever used for a given region. What about upping
the memstore flush size to 1G? We have a 5x limit on the 256m flush size, which
results in lots of messages like "memstore size 1.3g is >= than blocking 1.2g size".
Maybe given the bigger region size we need a bigger memstore size?
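
The numbers in that message line up with the block multiplier -- if I have the
property names right (hbase.hregion.memstore.flush.size and
hbase.hregion.memstore.block.multiplier), the blocking threshold is just flush
size times multiplier:

public class BlockingThresholdSketch {
  public static void main(String[] args) {
    long flushSize = 256L * 1024 * 1024; // hbase.hregion.memstore.flush.size
    int multiplier = 5;                  // hbase.hregion.memstore.block.multiplier
    double blockingGb = (flushSize * multiplier) / (1024.0 * 1024 * 1024);
    // 256m * 5 = 1280m ~= 1.25g, which the log rounds to the "blocking 1.2g size"
    System.out.printf("updates block once the memstore exceeds %.2fg%n", blockingGb);
  }
}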

Here is a region server log snippet showing this happening twice in under two
minutes.

http://pastebin.com/CxAQSXTt


On Mon, May 23, 2011 at 11:33 AM, Stack <st...@duboce.net> wrote:

> On Mon, May 23, 2011 at 6:40 AM, Wayne <wa...@gmail.com> wrote:
> > In order to reduce the total number of regions we have up'd the max
> region
> > size to 5g. This has kept us below 100 regions per node but the side
> affect
> > is pauses occurring every 1-2 min under heavy writes to a single region.
> We
> > see the "too many store files delaying flush up to 90sec" warning every
> > couple of minutes. We have upped the size of the memstore flush size
> (256m)
> > as well as upped the blockingstorefiles (15) but these pauses
> > are occurring more than writes are occurring. In the end our write
> > through-put has degraded considerably.
> >
>
> How many column families?  Pastebin a regionserver log.  You could up
> the number of store files before we put up the blocking writes gate
> but then you might have runaway files to compact.
>
> St.Ack
>

Re: memstore flush blocking write pause

Posted by Stack <st...@duboce.net>.
On Mon, May 23, 2011 at 6:40 AM, Wayne <wa...@gmail.com> wrote:
> In order to reduce the total number of regions we have up'd the max region
> size to 5g. This has kept us below 100 regions per node but the side affect
> is pauses occurring every 1-2 min under heavy writes to a single region. We
> see the "too many store files delaying flush up to 90sec" warning every
> couple of minutes. We have upped the size of the memstore flush size (256m)
> as well as upped the blockingstorefiles (15) but these pauses
> are occurring more than writes are occurring. In the end our write
> through-put has degraded considerably.
>

How many column families?  Pastebin a regionserver log.  You could up
the number of store files allowed before we put up the blocking-writes gate,
but then you might have runaway files to compact.

St.Ack