Posted to user@hbase.apache.org by Ivan Tretyakov <it...@griddynamics.com> on 2013/11/28 10:46:23 UTC

Region server block cache and memstore size

Hi!

We are using HBase 0.92.1-cdh4.1.1. Bulk load is the only way we import
data, and our common access pattern is sequential scans over different
parts of the tables.

Because of that, we are considering disabling the block cache by setting
hbase.block.cache.size to zero.
But we found the following in the HBase book (
http://hbase.apache.org/book/important_configurations.html):

"Do not turn off block cache (You'd do it by setting hbase.block.cache.size
to zero). Currently we do not do well if you do this because the
regionserver will spend all its time loading hfile indices over and over
again. If your working set is such that block cache does you no good, at
least size the block cache such that hfile indices will stay up in the
cache (you can get a rough idea on the size you need by surveying
regionserver UIs; you'll see index block size accounted near the top of the
webpage)."
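
The sizing advice in the quoted passage can be turned into a rough
calculation. The following is only a sketch: the function name, the 2x
headroom factor, and the 300 MB index size are hypothetical illustrations,
not values from the HBase book or from this cluster. Read the real index
size from the regionserver UI as the passage suggests.

```python
import math


def min_block_cache_fraction(index_size_mb, heap_mb, headroom=2.0):
    """Return the smallest heap fraction, rounded up to 0.01, whose block
    cache can hold `index_size_mb` of index blocks with `headroom` slack.

    `headroom` is an assumed safety multiplier, not an HBase setting.
    """
    if heap_mb <= 0:
        raise ValueError("heap_mb must be positive")
    needed_mb = index_size_mb * headroom
    fraction = needed_mb / heap_mb
    # Round before ceil so float noise does not push an exact value up a step.
    return math.ceil(round(fraction * 100, 6)) / 100


# Hypothetical numbers: 300 MB of index blocks on a 16 GB heap.
fraction = min_block_cache_fraction(index_size_mb=300, heap_mb=16 * 1024)
print(f"block cache fraction of at least {fraction:.2f}")
```

The result is a lower bound only; it says nothing about how much cache the
data blocks themselves would benefit from.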

We are also considering reducing the memstore size by tuning the following
options: hbase.regionserver.global.memstore.upperLimit and
hbase.regionserver.global.memstore.lowerLimit.

So, my questions are:

Does it make sense to touch these options in our case?
Is this memory reserved, or can other processes inside the regionserver use it?

Thanks in advance!

-- 
Best Regards
Ivan Tretyakov

Deployment Engineer
Grid Dynamics
+7 812 640 38 76
Skype: ivan.v.tretyakov
www.griddynamics.com
itretyakov@griddynamics.com

Re: Region server block cache and memstore size

Posted by Kevin O'Dell <ke...@cloudera.com>.
I agree with what Anoop said here. Just because your reads are scans, it
doesn't make a lot of sense to turn off your block cache.  Are you trying to
save memory?  As for the memstore global limits, you will want to set those
to something like

upper .11
lower .10

  You have to leave it at a minimum of .10, as a safety value of .09 has been
hardcoded.  On a related topic, does anyone know why we have that safety
value?  I would recommend bumping your block cache to .65; on a 16GB heap
that leaves you with 10.4GB of block cache per node.  This may help
speed up some of your scans.
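
The arithmetic behind these numbers can be checked with a short sketch. The
fractions (.65, .11, .10) and the 16GB heap come from the message above; the
function name and the "remaining" figure are illustrative additions, and the
calculation assumes each setting is a plain fraction of the regionserver
heap.

```python
def heap_budget_gb(heap_gb, block_cache, memstore_upper, memstore_lower):
    """Split a regionserver heap into the shares implied by each fraction."""
    if memstore_lower > memstore_upper:
        raise ValueError("lower limit must not exceed upper limit")
    if block_cache + memstore_upper > 1.0:
        raise ValueError("block cache plus memstore exceeds the heap")
    return {
        "block_cache_gb": heap_gb * block_cache,
        "memstore_upper_gb": heap_gb * memstore_upper,
        "memstore_lower_gb": heap_gb * memstore_lower,
        # Heap left over for everything else in the regionserver.
        "remaining_gb": heap_gb * (1.0 - block_cache - memstore_upper),
    }


budget = heap_budget_gb(16, block_cache=0.65,
                        memstore_upper=0.11, memstore_lower=0.10)
for name, gigabytes in budget.items():
    print(f"{name}: {gigabytes:.2f}")
```

With these inputs the block cache share works out to the 10.4GB quoted
above, and roughly a quarter of the heap stays outside both the block cache
and the memstore upper limit.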


On Thu, Nov 28, 2013 at 4:50 AM, Anoop John <an...@gmail.com> wrote:

> So you use bulk load with HFileOutputFormat for writing data?  Then you can
> reduce hbase.regionserver.global.memstore.upperLimit and
> hbase.regionserver.global.memstore.lowerLimit and give more heap % to the
> block cache.  I don't see why you would reduce the block cache as well.
>
> -Anoop-

-- 
Kevin O'Dell
Systems Engineer, Cloudera

Re: Region server block cache and memstore size

Posted by Anoop John <an...@gmail.com>.
So you use bulk load with HFileOutputFormat for writing data?  Then you can
reduce hbase.regionserver.global.memstore.upperLimit and
hbase.regionserver.global.memstore.lowerLimit and give more heap % to the
block cache.  I don't see why you would reduce the block cache as well.
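
As a sketch of what that change could look like in hbase-site.xml, the
snippet below renders property entries from a dictionary. The values are the
ones suggested elsewhere in this thread (.65, .11, .10), not tested
recommendations, and the helper name is hypothetical. Note one assumption
worth verifying against the hbase-default.xml shipped with your version: the
block cache key is written here as hfile.block.cache.size, whereas the
thread and the quoted book passage call it hbase.block.cache.size.

```python
from xml.sax.saxutils import escape


def render_properties(settings):
    """Render name/value pairs as hbase-site.xml <property> elements."""
    lines = []
    for name, value in settings.items():
        lines += [
            "<property>",
            f"  <name>{escape(name)}</name>",
            f"  <value>{escape(str(value))}</value>",
            "</property>",
        ]
    return "\n".join(lines)


# Values taken from the thread; the block cache key name is an assumption.
settings = {
    "hfile.block.cache.size": 0.65,
    "hbase.regionserver.global.memstore.upperLimit": 0.11,
    "hbase.regionserver.global.memstore.lowerLimit": 0.10,
}
print(render_properties(settings))
```

The printed elements would go inside the <configuration> element of
hbase-site.xml on each regionserver, followed by a restart.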

-Anoop-

