Posted to user@hbase.apache.org by Adrien Mogenet <ad...@gmail.com> on 2013/12/08 13:00:38 UTC

Default value for Periodic Flusher

Hi there,

I'm wondering if the Periodic Flusher should be disabled by default?

During a recent upgrade, I noticed a strange change in behavior on my
servers, and it was due to this new feature, which ships with a default
value of 1 hour. I think upgrades should keep default behavior as close as
possible to previous versions. By chance, the logs mentioned this "Periodic
Flusher", but otherwise this could have taken much longer to debug :-)

What are your thoughts, guys?
(Perhaps should be cc'ed to dev list?)
-- 
Adrien Mogenet
http://www.borntosegfault.com

Re: Default value for Periodic Flusher

Posted by Ted Yu <yu...@gmail.com>.
From hbase-default.xml:
    <property>
    <name>hbase.regionserver.optionalcacheflushinterval</name>
    <value>3600000</value>
    <description>
    Maximum amount of time an edit lives in memory before being
    automatically flushed.
    Default 1 hour. Set it to 0 to disable automatic flushing.
    </description>
    </property>

Can you adjust the above parameter to fit your workload?
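
As a rough illustration of the setting above, here is a minimal Python sketch that reads the interval from a site-level override and reports whether periodic flushing is on. It assumes the override lives in hbase-site.xml; the sample XML and the helper names flush_interval_ms and describe are hypothetical and only meant to show how the value is interpreted (milliseconds, 0 meaning disabled).

```python
import xml.etree.ElementTree as ET

PROPERTY_NAME = "hbase.regionserver.optionalcacheflushinterval"
DEFAULT_INTERVAL_MS = 3600000  # default quoted from hbase-default.xml (1 hour)

# Hypothetical hbase-site.xml content that disables the periodic flusher.
SITE_XML = """
<configuration>
  <property>
    <name>hbase.regionserver.optionalcacheflushinterval</name>
    <value>0</value>
  </property>
</configuration>
"""


def flush_interval_ms(xml_text, default_ms=DEFAULT_INTERVAL_MS):
    """Return the configured interval in ms, or the default if not overridden."""
    root = ET.fromstring(xml_text)
    for prop in root.findall("property"):
        name = (prop.findtext("name") or "").strip()
        if name == PROPERTY_NAME:
            return int((prop.findtext("value") or "").strip())
    return default_ms


def describe(interval_ms):
    """Human-readable summary; 0 means automatic flushing is disabled."""
    if interval_ms == 0:
        return "periodic flush disabled"
    return "periodic flush every %.2f hours" % (interval_ms / 3600000.0)


if __name__ == "__main__":
    print(describe(flush_interval_ms(SITE_XML)))
```

Running it against a configuration without the property falls back to the 1-hour default.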

Cheers



Re: Default value for Periodic Flusher

Posted by Adrien Mogenet <ad...@gmail.com>.
Yep, I obviously turned it off for my use case since MTTR is not a big
concern.
I just wanted to discuss the default value, not my particular case.
If you consider MTTR the main objective, then I agree 1 hour is a decent
default value :)

Vladimir: you're right, I forgot to mention those points, which make my
scenario even worse.

Thanks for sharing your opinions :)



-- 
Adrien Mogenet
http://www.borntosegfault.com

Re: Default value for Periodic Flusher

Posted by Jean-Marc Spaggiari <je...@spaggiari.org>.
There is no perfect setting. For Adrien's use case, this should be turned
off. For some other use cases, this should be turned on.

I'm -0 on turning it off by default. I would prefer that people generally
get a good MTTR rather than fewer compactions. I'm not sure a 200 GB region
size is a common use case.



Re: Default value for Periodic Flusher

Posted by lars hofhansl <la...@apache.org>.
Should we default it to 0 (off) then? (As Ted pointed out, you can turn this off).
Are you not worried about MTTR when a RegionServer dies with 3GB of logs to replay?


-- Lars




RE: Default value for Periodic Flusher

Posted by Vladimir Rodionov <vr...@carrieriq.com>.
I agree that the flush interval must be configurable (I think it already is).

> I've upgraded to 0.94.11. Here is my "worst-case scenario":
> - let's say each regionserver has a 3 GB memstore
> - let's say the compaction max file size is ~200 GB, with min. 2 files and max. 10 files
> - let's say the memstore is growing "slowly" (1 GB / hour per RS)
> Then automatically flushing every hour will produce 1 GB storefiles,
> which get compacted into storefiles of 2 GB, 3 GB, 4 GB... up to 200 GB.

Nope. The reality is even worse than one might imagine.

The memstore 'size' is the estimated Java heap usage, which includes per-object overhead (mostly KeyValue).
For small rows, the ratio of estimated heap size to serialized memstore (store file) size is close to 5x (the store file is the smaller one, of course).
If you enable compression (2-3x), your store file will be 10-15 times smaller than your memstore (not 1 GB, but 70-100 MB).
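
The arithmetic above can be sketched in a few lines of Python. This is only a back-of-the-envelope estimate using the ratios quoted in this message (about 5x heap overhead for small rows, 2-3x compression); the function name store_file_mb and its parameters are hypothetical, and real ratios depend on row size and codec.

```python
def store_file_mb(memstore_heap_mb, heap_overhead_ratio=5.0, compression_ratio=1.0):
    """Estimate the on-disk store file size for a flushed memstore.

    memstore_heap_mb:    estimated Java heap usage of the memstore, in MB
    heap_overhead_ratio: heap size divided by serialized size (~5 for small rows)
    compression_ratio:   serialized size divided by compressed size (1 = none)
    """
    return memstore_heap_mb / (heap_overhead_ratio * compression_ratio)


if __name__ == "__main__":
    one_gb = 1024.0
    for compression in (2.0, 3.0):
        size = store_file_mb(one_gb, 5.0, compression)
        print("compression %.0fx -> about %.0f MB on disk" % (compression, size))
```

With a 1 GB memstore this gives roughly 100 MB at 2x compression and a bit under 70 MB at 3x, in line with the range mentioned above.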

Best regards,
Vladimir Rodionov
Principal Platform Engineer
Carrier IQ, www.carrieriq.com
e-mail: vrodionov@carrieriq.com


Re: Default value for Periodic Flusher

Posted by Adrien Mogenet <ad...@gmail.com>.
Hi guys,

I've upgraded to 0.94.11. Here is my "worst-case scenario":

- let's say each regionserver has a 3 GB memstore
- let's say the compaction max file size is ~200 GB, with min. 2 files and max. 10 files
- let's say the memstore is growing "slowly" (1 GB / hour per RS)

Then automatically flushing every hour will produce 1 GB storefiles,
which get compacted into storefiles of 2 GB, 3 GB, 4 GB... up to 200 GB.
Sometimes my write load becomes very low and the periodic flusher will flush
perhaps 1 MB of data. That triggers a minor compaction of hundreds of
gigabytes + 1 MB, which seems like a lot of IO just to merge 1 MB of data.

Previously (i.e. without the periodic flusher) the memstore was creating 3 GB
storefiles, which became (after minor compactions) 3 GB, 6 GB, 9 GB...
up to 200 GB storefiles. And when the memstore grows slowly, it doesn't
generate small storefiles on HDFS. I think that looks like a more reasonable
IO load, doesn't it?
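
To put rough numbers on the comparison above, here is a toy Python model. It assumes, as in the description above, that every flush is merged straight into a single growing storefile; this is a deliberate simplification and not HBase's actual compaction selection. The function name total_compaction_io_gb is hypothetical.

```python
def total_compaction_io_gb(flush_gb, max_file_gb):
    """Toy model: each flush is merged into one growing storefile.

    Returns (number of flushes, final file size in GB, total GB rewritten
    by merges) until the next flush would exceed max_file_gb.
    """
    file_gb = 0
    rewritten_gb = 0
    flushes = 0
    while file_gb + flush_gb <= max_file_gb:
        flushes += 1
        if file_gb > 0:
            # Merging rewrites the existing file plus the newly flushed data.
            rewritten_gb += file_gb + flush_gb
        file_gb += flush_gb
    return flushes, file_gb, rewritten_gb


if __name__ == "__main__":
    for flush_gb in (1, 3):
        flushes, size, rewritten = total_compaction_io_gb(flush_gb, 200)
        print("%d GB flushes: %d flushes, final file %d GB, %d GB rewritten"
              % (flush_gb, flushes, size, rewritten))
```

Under these assumptions, 1 GB flushes rewrite about three times as much data as 3 GB flushes before the file reaches the ~200 GB limit.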

I fully agree that the Periodic Flusher is relevant, but I don't think it's
suitable for everyone. Do you share my opinion with respect to my workload?



-- 
Adrien Mogenet
http://www.borntosegfault.com

Re: Default value for Periodic Flusher

Posted by Ted Yu <yu...@gmail.com>.
Adrien:
This config was introduced in 0.94.8.

Which release did you upgrade to?

As Jean-Marc said, describing the issue (along with a log snippet) would help.

Cheers



Re: Default value for Periodic Flusher

Posted by Jean-Marc Spaggiari <je...@spaggiari.org>.
Hi Adrien,

What was the bad behavior you faced because of that? Maybe that is what needs
to be fixed, rather than the periodic flusher? Or should we use a bigger
default value?

JM

