Posted to user@hbase.apache.org by Henry Hung <YT...@winbond.com> on 2014/04/25 08:07:13 UTC

suggestion for how to eliminate memory problem in heavy-write hbase region server

Dear All,

My current HBase environment is a heavy-write cluster with a constant 2,000+ rows inserted per second, spread across 10 region servers.
Each day I also need to do data deletion, and that adds a lot of IO to the cluster.

The problem is that sometimes, after about a week, one of the region servers will crash because of a GC pause like this:
2014-04-10T10:17:47.200+0800: 1281486.956: [GC 1281486.956: [ParNew (promotion failed): 235959K->235959K(235968K), 0.0836790 secs]1281487.040: [CMS2014-04-10T10:21:14.957+0800: 1281694.712: [CMS-concurrent-sweep: 267.111/279.155 secs] [Times: user=334.79 sys=14.38, real=279.11 secs]
(concurrent mode failure): 13961950K->6802914K(16515072K), 209.9436660 secs] 14186496K->6802914K(16751040K), [CMS Perm : 42864K->42859K(71816K)], 210.0274680 secs] [Times: user=210.18 sys=0.01, real=209.99 secs]

When I look into the GC log, I usually find CMS concurrent sweeps that took a very long time to complete, such as:
2014-04-10T10:15:56.929+0800: 1281376.684: [CMS-concurrent-sweep: 48.834/58.027 secs] [Times: user=101.52 sys=11.82, real=58.02 secs]
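
For reference, timestamped output like the above comes from standard HotSpot GC logging; a minimal sketch of the flags that produce it (the log path is only an example) would be:

-verbose:gc
-XX:+PrintGCDetails
-XX:+PrintGCDateStamps
-Xloggc:/var/log/hbase/gc-regionserver.log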

I have done a lot of googling and have already read Todd Lipcon's posts on avoiding full GCs, as well as other blogs that suggest setting JVM flags such as these:
-XX:+UseParNewGC
-XX:CMSInitiatingOccupancyFraction=70
-Xmn256m
-Xmx16384m
-XX:+DisableExplicitGC
-XX:+UseCompressedOops
-XX:PermSize=160m
-XX:MaxPermSize=160m
-XX:GCTimeRatio=19
-XX:SoftRefLRUPolicyMSPerMB=0
-XX:SurvivorRatio=2
-XX:MaxTenuringThreshold=1
-XX:+UseFastAccessorMethods
-XX:+UseParNewGC
-XX:+UseConcMarkSweepGC
-XX:+CMSParallelRemarkEnabled
-XX:+UseCMSCompactAtFullCollection
-XX:CMSFullGCsBeforeCompaction=0
-XX:+CMSClassUnloadingEnabled
-XX:CMSMaxAbortablePrecleanTime=300
-XX:+CMSScavengeBeforeRemark
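
In case it is relevant, flags like these are normally passed to the region server through hbase-env.sh; a minimal sketch of how that looks (assuming the stock HBASE_REGIONSERVER_OPTS hook, and showing only a few of the flags above for brevity):

# appended to whatever is already in HBASE_REGIONSERVER_OPTS
export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -Xmx16384m -Xmn256m -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70"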

But alas, the problem still exists.

I also know that Java 1.7 has the new G1 GC, which could probably be used to fix this problem, but I don't know whether HBase 0.96 is ready to use it.

I would really appreciate it if someone out there could share one or two things about JVM configuration to achieve a more stable region server.

Best regards,
Henry


RE: suggestion for how to eliminate memory problem in heavy-write hbase region server

Posted by Henry Hung <YT...@winbond.com>.
Hi everyone, I just want to give an update regarding my HBase memory problem.

I have been doing some thinking and recently came up with the idea that the root cause is probably just CPU saturation.

Each slave node runs these processes (to benefit from data locality):
DataNode
NodeManager
RegionServer

Each data node has only a single quad-core i7 CPU with 32 GB of RAM, and the number of NodeManager containers is calculated based only on the available RAM, which comes out to 8.
With 8 containers, the CPU is oversubscribed, and that probably has an impact on the RegionServer GC threads.

So, this week I will change the NodeManager configuration so that each slave node only has 2 to 3 containers.
If the RegionServers do not crash for 3 weeks, then I will be fairly sure that CPU saturation is the culprit.
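
For anyone who wants to try the same change, a minimal yarn-site.xml sketch of that kind of cap (the property names are stock YARN in Hadoop 2.2; the values are only an assumed starting point for a 32 GB, quad-core node):

<!-- advertise less memory/CPU per NodeManager so the scheduler hands out fewer containers -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>12288</value>
</property>
<property>
  <name>yarn.nodemanager.resource.cpu-vcores</name>
  <value>3</value>
</property>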

Best regards,
Henry

-----Original Message-----
From: lars hofhansl [mailto:larsh@apache.org]
Sent: Tuesday, April 29, 2014 7:36 AM
To: user@hbase.apache.org
Subject: Re: suggestion for how eliminate memory problem in heavy-write hbase region server

Please let us know how it is going, Henry.

I would at least try setting:
-XX:+UseCMSInitiatingOccupancyOnly
As well as increasing the new size (despite what I have said in other threads, it might be needed here). Maybe start with 512mb instead of 256:
-Xmn512m

I would remove these as it looks like your perm gen might be too small:
-XX:PermSize=160m
-XX:MaxPermSize=160m

Personally I would not recommend G1 just yet. AFAIK it has a nasty fall back to single threaded full GC.

-- Lars


----- Original Message -----
From: Henry Hung <YT...@winbond.com>
To: "user@hbase.apache.org" <us...@hbase.apache.org>
Cc:
Sent: Sunday, April 27, 2014 6:44 PM
Subject: RE: suggestion for how eliminate memory problem in heavy-write hbase region server

@Bryan,

Do I have to recompile the Hadoop-2.2.0 and hbase-0.96 using java 1.7 or just change the JAVA_HOME and restart it?

I would probably use Oracle Java 1.7 u55 to test it.

Best regards,
Henry

-----Original Message-----
From: Bryan Beaudreault [mailto:bbeaudreault@hubspot.com]
Sent: Monday, April 28, 2014 8:45 AM
To: user@hbase.apache.org
Subject: Re: suggestion for how eliminate memory problem in heavy-write hbase region server

Keep in mind the math for heap.  Memstore is 40%, and that is divided across all regions for a RS.  With 2GB, that's under 30mb per region assuming well distributed writes.

Are your regionservers starved for CPU?  Either way, I'd try the java7 G1 GC on one regionserver and report back.  We run with 25GB heaps and never have long pauses, so 16GB should be fine with enough CPU.


On Sun, Apr 27, 2014 at 8:27 PM, Henry Hung <YT...@winbond.com> wrote:

> @Vladimir:
>
> My total region count is 330 right now, but I expect the number will
> surpass 1000 at the end of this year.
> Although current system is heavy-write, but I expect it to be also a
> read intense system, because I want to do a lot of data analysis in the future.
>
> About memory allocation, I assign 16 GB heap memory just because I
> have so many RAM space left over to allocate (each node has 32GB RAM installed).
> I never really thought that 2GB heap will be enough, but will give it
> a shot.
>
> Best regards,
> Henry
>
> -----Original Message-----
> From: Vladimir Rodionov [mailto:vrodionov@carrieriq.com]
> Sent: Saturday, April 26, 2014 1:48 AM
> To: user@hbase.apache.org
> Subject: RE: suggestion for how eliminate memory problem in
> heavy-write hbase region server
>
> I am just wondering why do you need large heaps on write - heavy cluster?
> How many regions per RS do you have? Usually large heaps need for read
> intensive applications to keep blocks in a cache, but now we have
> option to keep these blocks off heap (at least since 0.96) With
> MemSLAB for MemStore enabled you need 2MB (by default) per region,
> this is what you need to take into account first. Even 1000 regions won't eat more than 2GB of heap.
>
> Best regards,
> Vladimir Rodionov
> Principal Platform Engineer
> Carrier IQ, www.carrieriq.com
> e-mail: vrodionov@carrieriq.com
>
> ________________________________________
> From: Ted Yu [yuzhihong@gmail.com]
> Sent: Friday, April 25, 2014 9:47 AM
> To: user@hbase.apache.org
> Subject: Re: suggestion for how eliminate memory problem in
> heavy-write hbase region server
>
> Henry:
> Please also take a look at the following thread:
>
>
> http://search-hadoop.com/m/51M4jeDMyy1/GC+recommendations+for+large+Re
> gion+Server+heaps&subj=RE+GC+recommendations+for+large+Region+Server+h
> eaps
>
>
> On Thu, Apr 24, 2014 at 11:17 PM, Mikhail Antonov
> <olorinbant@gmail.com
> >wrote:
>
> > Henry,
> >
> > http://blog.ragozin.info/2011/10/java-cg-hotspots-cms-and-heap.html
> > - that may give some insights.
> >
> > -Mikhail
> >
> >
> > 2014-04-24 23:07 GMT-07:00 Henry Hung <YT...@winbond.com>:
> >
> > > Dear All,
> > >
> > > My current hbase environment is heavy write cluster with constant
> > > 2000+ insert rows / second spread to 10 region servers.
> > > Each day I also need to do data deletion, and that will add a lot
> > > of IO
> > to
> > > the cluster.
> > >
> > > The problem is sometimes after a week, one of the region server
> > > will
> > crash
> > > because
> > > 2014-04-10T10:17:47.200+0800: 1281486.956: [GC 1281486.956:
> > > [ParNew (promotion failed): 235959K->235959K(235968K), 0.0836790
> > secs]1281487.040:
> > > [CMS2014-04-10T10:21:14.957+0800: 1281694.712: [CMS-concurrent-sweep:
> > > 267.111/279.155 secs] [Times: user=334.79 sys=14.38, real=279.11
> > > secs] (concurrent mode failure): 13961950K->6802914K(16515072K),
> > > 209.9436660 secs] 14186496K->6802914K(16751040K), [CMS Perm :
> > 42864K->42859K(71816K)],
> > > 210.0274680 secs] [Times: user=210.18 sys=0.01, real=209.99 secs]
> > >
> > > I look into the gc log and usually find some information about CMS
> > > concurrent sweep that took a very long time to complete, such as:
> > > 2014-04-10T10:15:56.929+0800: 1281376.684: [CMS-concurrent-sweep:
> > > 48.834/58.027 secs] [Times: user=101.52 sys=11.82, real=58.02
> > > secs]
> > >
> > > I do a lot of google-ing and already read the Todd Lipcon avoiding
> > > full GC, or other blogs that sometimes tells me how to set jvm
> > > flags such as
> > > this:
> > > -XX:+UseParNewGC
> > > -XX:CMSInitiatingOccupancyFraction=70
> > > -Xmn256m
> > > -Xmx16384m
> > > -XX:+DisableExplicitGC
> > > -XX:+UseCompressedOops
> > > -XX:PermSize=160m
> > > -XX:MaxPermSize=160m
> > > -XX:GCTimeRatio=19
> > > -XX:SoftRefLRUPolicyMSPerMB=0
> > > -XX:SurvivorRatio=2
> > > -XX:MaxTenuringThreshold=1
> > > -XX:+UseFastAccessorMethods
> > > -XX:+UseParNewGC
> > > -XX:+UseConcMarkSweepGC
> > > -XX:+CMSParallelRemarkEnabled
> > > -XX:+UseCMSCompactAtFullCollection
> > > -XX:CMSFullGCsBeforeCompaction=0
> > > -XX:+CMSClassUnloadingEnabled
> > > -XX:CMSMaxAbortablePrecleanTime=300
> > > -XX:+CMSScavengeBeforeRemark
> > >
> > > But alas, the problem still exist.
> > >
> > > I also know that java 1.7 has a new G1GC that probably can be used
> > > to fix this problem, but I don't know if hbase 0.96 is ready to use it?
> > >
> > > I would really appreciate it if someone out there can share one or
> > > two things about jvm configuration to achieve a more stable region
> server.
> > >
> > > Best regards,
> > > Henry
> > >
> >
> >
> >
> > --
> > Thanks,
> > Michael Antonov
> >
>

Re: suggestion for how to eliminate memory problem in heavy-write hbase region server

Posted by lars hofhansl <la...@apache.org>.
Please let us know how it is going, Henry.

I would at least try setting:
-XX:+UseCMSInitiatingOccupancyOnly
I would also try increasing the new size (despite what I have said in other threads, it might be needed here). Maybe start with 512 MB instead of 256:
-Xmn512m

I would remove these as it looks like your perm gen might be too small:
-XX:PermSize=160m
-XX:MaxPermSize=160m
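
Putting the two suggestions together, the CMS-related part of the flag set would look roughly like this (a sketch only, keeping the rest of the flags from the original post as they are):

-Xmx16384m
-Xmn512m
-XX:+UseParNewGC
-XX:+UseConcMarkSweepGC
-XX:CMSInitiatingOccupancyFraction=70
-XX:+UseCMSInitiatingOccupancyOnly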

Personally, I would not recommend G1 just yet. AFAIK it has a nasty fallback to a single-threaded full GC.

-- Lars


----- Original Message -----
From: Henry Hung <YT...@winbond.com>
To: "user@hbase.apache.org" <us...@hbase.apache.org>
Cc: 
Sent: Sunday, April 27, 2014 6:44 PM
Subject: RE: suggestion for how eliminate memory problem in heavy-write hbase region server

@Bryan,

Do I have to recompile the Hadoop-2.2.0 and hbase-0.96 using java 1.7 or just change the JAVA_HOME and restart it?

I would probably use Oracle Java 1.7 u55 to test it.

Best regards,
Henry

-----Original Message-----
From: Bryan Beaudreault [mailto:bbeaudreault@hubspot.com]
Sent: Monday, April 28, 2014 8:45 AM
To: user@hbase.apache.org
Subject: Re: suggestion for how eliminate memory problem in heavy-write hbase region server

Keep in mind the math for heap.  Memstore is 40%, and that is divided across all regions for a RS.  With 2GB, that's under 30mb per region assuming well distributed writes.

Are your regionservers starved for CPU?  Either way, I'd try the java7 G1 GC on one regionserver and report back.  We run with 25GB heaps and never have long pauses, so 16GB should be fine with enough CPU.


On Sun, Apr 27, 2014 at 8:27 PM, Henry Hung <YT...@winbond.com> wrote:

> @Vladimir:
>
> My total region count is 330 right now, but I expect the number will
> surpass 1000 at the end of this year.
> Although current system is heavy-write, but I expect it to be also a
> read intense system, because I want to do a lot of data analysis in the future.
>
> About memory allocation, I assign 16 GB heap memory just because I
> have so many RAM space left over to allocate (each node has 32GB RAM installed).
> I never really thought that 2GB heap will be enough, but will give it
> a shot.
>
> Best regards,
> Henry
>
> -----Original Message-----
> From: Vladimir Rodionov [mailto:vrodionov@carrieriq.com]
> Sent: Saturday, April 26, 2014 1:48 AM
> To: user@hbase.apache.org
> Subject: RE: suggestion for how eliminate memory problem in
> heavy-write hbase region server
>
> I am just wondering why do you need large heaps on write - heavy cluster?
> How many regions per RS do you have? Usually large heaps need for read
> intensive applications to keep blocks in a cache, but now we have
> option to keep these blocks off heap (at least since 0.96) With
> MemSLAB for MemStore enabled you need 2MB (by default) per region,
> this is what you need to take into account first. Even 1000 regions won't eat more than 2GB of heap.
>
> Best regards,
> Vladimir Rodionov
> Principal Platform Engineer
> Carrier IQ, www.carrieriq.com
> e-mail: vrodionov@carrieriq.com
>
> ________________________________________
> From: Ted Yu [yuzhihong@gmail.com]
> Sent: Friday, April 25, 2014 9:47 AM
> To: user@hbase.apache.org
> Subject: Re: suggestion for how eliminate memory problem in
> heavy-write hbase region server
>
> Henry:
> Please also take a look at the following thread:
>
>
> http://search-hadoop.com/m/51M4jeDMyy1/GC+recommendations+for+large+Re
> gion+Server+heaps&subj=RE+GC+recommendations+for+large+Region+Server+h
> eaps
>
>
> On Thu, Apr 24, 2014 at 11:17 PM, Mikhail Antonov
> <olorinbant@gmail.com
> >wrote:
>
> > Henry,
> >
> > http://blog.ragozin.info/2011/10/java-cg-hotspots-cms-and-heap.html
> > - that may give some insights.
> >
> > -Mikhail
> >
> >
> > 2014-04-24 23:07 GMT-07:00 Henry Hung <YT...@winbond.com>:
> >
> > > Dear All,
> > >
> > > My current hbase environment is heavy write cluster with constant
> > > 2000+ insert rows / second spread to 10 region servers.
> > > Each day I also need to do data deletion, and that will add a lot
> > > of IO
> > to
> > > the cluster.
> > >
> > > The problem is sometimes after a week, one of the region server
> > > will
> > crash
> > > because
> > > 2014-04-10T10:17:47.200+0800: 1281486.956: [GC 1281486.956:
> > > [ParNew (promotion failed): 235959K->235959K(235968K), 0.0836790
> > secs]1281487.040:
> > > [CMS2014-04-10T10:21:14.957+0800: 1281694.712: [CMS-concurrent-sweep:
> > > 267.111/279.155 secs] [Times: user=334.79 sys=14.38, real=279.11
> > > secs] (concurrent mode failure): 13961950K->6802914K(16515072K),
> > > 209.9436660 secs] 14186496K->6802914K(16751040K), [CMS Perm :
> > 42864K->42859K(71816K)],
> > > 210.0274680 secs] [Times: user=210.18 sys=0.01, real=209.99 secs]
> > >
> > > I look into the gc log and usually find some information about CMS
> > > concurrent sweep that took a very long time to complete, such as:
> > > 2014-04-10T10:15:56.929+0800: 1281376.684: [CMS-concurrent-sweep:
> > > 48.834/58.027 secs] [Times: user=101.52 sys=11.82, real=58.02
> > > secs]
> > >
> > > I do a lot of google-ing and already read the Todd Lipcon avoiding
> > > full GC, or other blogs that sometimes tells me how to set jvm
> > > flags such as
> > > this:
> > > -XX:+UseParNewGC
> > > -XX:CMSInitiatingOccupancyFraction=70
> > > -Xmn256m
> > > -Xmx16384m
> > > -XX:+DisableExplicitGC
> > > -XX:+UseCompressedOops
> > > -XX:PermSize=160m
> > > -XX:MaxPermSize=160m
> > > -XX:GCTimeRatio=19
> > > -XX:SoftRefLRUPolicyMSPerMB=0
> > > -XX:SurvivorRatio=2
> > > -XX:MaxTenuringThreshold=1
> > > -XX:+UseFastAccessorMethods
> > > -XX:+UseParNewGC
> > > -XX:+UseConcMarkSweepGC
> > > -XX:+CMSParallelRemarkEnabled
> > > -XX:+UseCMSCompactAtFullCollection
> > > -XX:CMSFullGCsBeforeCompaction=0
> > > -XX:+CMSClassUnloadingEnabled
> > > -XX:CMSMaxAbortablePrecleanTime=300
> > > -XX:+CMSScavengeBeforeRemark
> > >
> > > But alas, the problem still exist.
> > >
> > > I also know that java 1.7 has a new G1GC that probably can be used
> > > to fix this problem, but I don't know if hbase 0.96 is ready to use it?
> > >
> > > I would really appreciate it if someone out there can share one or
> > > two things about jvm configuration to achieve a more stable region
> server.
> > >
> > > Best regards,
> > > Henry
> > >
> >
> >
> >
> > --
> > Thanks,
> > Michael Antonov
> >
>

Re: suggestion for how to eliminate memory problem in heavy-write hbase region server

Posted by Bryan Beaudreault <bb...@hubspot.com>.
Just changing JAVA_HOME worked for us.
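
In other words, point the daemons at the new JDK in hbase-env.sh (and hadoop-env.sh) and restart; a minimal sketch, where the install path is only an example:

export JAVA_HOME=/usr/java/jdk1.7.0_55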


On Sun, Apr 27, 2014 at 9:44 PM, Henry Hung <YT...@winbond.com> wrote:

> @Bryan,
>
> Do I have to recompile the Hadoop-2.2.0 and hbase-0.96 using java 1.7 or
> just change the JAVA_HOME and restart it?
>
> I would probably use Oracle Java 1.7 u55 to test it.
>
> Best regards,
> Henry
>
> -----Original Message-----
> From: Bryan Beaudreault [mailto:bbeaudreault@hubspot.com]
> Sent: Monday, April 28, 2014 8:45 AM
> To: user@hbase.apache.org
> Subject: Re: suggestion for how eliminate memory problem in heavy-write
> hbase region server
>
> Keep in mind the math for heap.  Memstore is 40%, and that is divided
> across all regions for a RS.  With 2GB, that's under 30mb per region
> assuming well distributed writes.
>
> Are your regionservers starved for CPU?  Either way, I'd try the java7 G1
> GC on one regionserver and report back.  We run with 25GB heaps and never
> have long pauses, so 16GB should be fine with enough CPU.
>
>
> On Sun, Apr 27, 2014 at 8:27 PM, Henry Hung <YT...@winbond.com> wrote:
>
> > @Vladimir:
> >
> > My total region count is 330 right now, but I expect the number will
> > surpass 1000 at the end of this year.
> > Although current system is heavy-write, but I expect it to be also a
> > read intense system, because I want to do a lot of data analysis in the
> future.
> >
> > About memory allocation, I assign 16 GB heap memory just because I
> > have so many RAM space left over to allocate (each node has 32GB RAM
> installed).
> > I never really thought that 2GB heap will be enough, but will give it
> > a shot.
> >
> > Best regards,
> > Henry
> >
> > -----Original Message-----
> > From: Vladimir Rodionov [mailto:vrodionov@carrieriq.com]
> > Sent: Saturday, April 26, 2014 1:48 AM
> > To: user@hbase.apache.org
> > Subject: RE: suggestion for how eliminate memory problem in
> > heavy-write hbase region server
> >
> > I am just wondering why do you need large heaps on write - heavy cluster?
> > How many regions per RS do you have? Usually large heaps need for read
> > intensive applications to keep blocks in a cache, but now we have
> > option to keep these blocks off heap (at least since 0.96) With
> > MemSLAB for MemStore enabled you need 2MB (by default) per region,
> > this is what you need to take into account first. Even 1000 regions
> won't eat more than 2GB of heap.
> >
> > Best regards,
> > Vladimir Rodionov
> > Principal Platform Engineer
> > Carrier IQ, www.carrieriq.com
> > e-mail: vrodionov@carrieriq.com
> >
> > ________________________________________
> > From: Ted Yu [yuzhihong@gmail.com]
> > Sent: Friday, April 25, 2014 9:47 AM
> > To: user@hbase.apache.org
> > Subject: Re: suggestion for how eliminate memory problem in
> > heavy-write hbase region server
> >
> > Henry:
> > Please also take a look at the following thread:
> >
> >
> > http://search-hadoop.com/m/51M4jeDMyy1/GC+recommendations+for+large+Re
> > gion+Server+heaps&subj=RE+GC+recommendations+for+large+Region+Server+h
> > eaps
> >
> >
> > On Thu, Apr 24, 2014 at 11:17 PM, Mikhail Antonov
> > <olorinbant@gmail.com
> > >wrote:
> >
> > > Henry,
> > >
> > > http://blog.ragozin.info/2011/10/java-cg-hotspots-cms-and-heap.html
> > > - that may give some insights.
> > >
> > > -Mikhail
> > >
> > >
> > > 2014-04-24 23:07 GMT-07:00 Henry Hung <YT...@winbond.com>:
> > >
> > > > Dear All,
> > > >
> > > > My current hbase environment is heavy write cluster with constant
> > > > 2000+ insert rows / second spread to 10 region servers.
> > > > Each day I also need to do data deletion, and that will add a lot
> > > > of IO
> > > to
> > > > the cluster.
> > > >
> > > > The problem is sometimes after a week, one of the region server
> > > > will
> > > crash
> > > > because
> > > > 2014-04-10T10:17:47.200+0800: 1281486.956: [GC 1281486.956:
> > > > [ParNew (promotion failed): 235959K->235959K(235968K), 0.0836790
> > > secs]1281487.040:
> > > > [CMS2014-04-10T10:21:14.957+0800: 1281694.712: [CMS-concurrent-sweep:
> > > > 267.111/279.155 secs] [Times: user=334.79 sys=14.38, real=279.11
> > > > secs] (concurrent mode failure): 13961950K->6802914K(16515072K),
> > > > 209.9436660 secs] 14186496K->6802914K(16751040K), [CMS Perm :
> > > 42864K->42859K(71816K)],
> > > > 210.0274680 secs] [Times: user=210.18 sys=0.01, real=209.99 secs]
> > > >
> > > > I look into the gc log and usually find some information about CMS
> > > > concurrent sweep that took a very long time to complete, such as:
> > > > 2014-04-10T10:15:56.929+0800: 1281376.684: [CMS-concurrent-sweep:
> > > > 48.834/58.027 secs] [Times: user=101.52 sys=11.82, real=58.02
> > > > secs]
> > > >
> > > > I do a lot of google-ing and already read the Todd Lipcon avoiding
> > > > full GC, or other blogs that sometimes tells me how to set jvm
> > > > flags such as
> > > > this:
> > > > -XX:+UseParNewGC
> > > > -XX:CMSInitiatingOccupancyFraction=70
> > > > -Xmn256m
> > > > -Xmx16384m
> > > > -XX:+DisableExplicitGC
> > > > -XX:+UseCompressedOops
> > > > -XX:PermSize=160m
> > > > -XX:MaxPermSize=160m
> > > > -XX:GCTimeRatio=19
> > > > -XX:SoftRefLRUPolicyMSPerMB=0
> > > > -XX:SurvivorRatio=2
> > > > -XX:MaxTenuringThreshold=1
> > > > -XX:+UseFastAccessorMethods
> > > > -XX:+UseParNewGC
> > > > -XX:+UseConcMarkSweepGC
> > > > -XX:+CMSParallelRemarkEnabled
> > > > -XX:+UseCMSCompactAtFullCollection
> > > > -XX:CMSFullGCsBeforeCompaction=0
> > > > -XX:+CMSClassUnloadingEnabled
> > > > -XX:CMSMaxAbortablePrecleanTime=300
> > > > -XX:+CMSScavengeBeforeRemark
> > > >
> > > > But alas, the problem still exist.
> > > >
> > > > I also know that java 1.7 has a new G1GC that probably can be used
> > > > to fix this problem, but I don't know if hbase 0.96 is ready to use
> it?
> > > >
> > > > I would really appreciate it if someone out there can share one or
> > > > two things about jvm configuration to achieve a more stable region
> > server.
> > > >
> > > > Best regards,
> > > > Henry
> > > >
> > >
> > >
> > >
> > > --
> > > Thanks,
> > > Michael Antonov
> > >
> >

RE: suggestion for how to eliminate memory problem in heavy-write hbase region server

Posted by Henry Hung <YT...@winbond.com>.
@Bryan,

Do I have to recompile Hadoop 2.2.0 and HBase 0.96 with Java 1.7, or can I just change JAVA_HOME and restart?

I would probably use Oracle Java 1.7 u55 to test it.

Best regards,
Henry

-----Original Message-----
From: Bryan Beaudreault [mailto:bbeaudreault@hubspot.com]
Sent: Monday, April 28, 2014 8:45 AM
To: user@hbase.apache.org
Subject: Re: suggestion for how eliminate memory problem in heavy-write hbase region server

Keep in mind the math for heap.  Memstore is 40%, and that is divided across all regions for a RS.  With 2GB, that's under 30mb per region assuming well distributed writes.

Are your regionservers starved for CPU?  Either way, I'd try the java7 G1 GC on one regionserver and report back.  We run with 25GB heaps and never have long pauses, so 16GB should be fine with enough CPU.


On Sun, Apr 27, 2014 at 8:27 PM, Henry Hung <YT...@winbond.com> wrote:

> @Vladimir:
>
> My total region count is 330 right now, but I expect the number will
> surpass 1000 at the end of this year.
> Although current system is heavy-write, but I expect it to be also a
> read intense system, because I want to do a lot of data analysis in the future.
>
> About memory allocation, I assign 16 GB heap memory just because I
> have so many RAM space left over to allocate (each node has 32GB RAM installed).
> I never really thought that 2GB heap will be enough, but will give it
> a shot.
>
> Best regards,
> Henry
>
> -----Original Message-----
> From: Vladimir Rodionov [mailto:vrodionov@carrieriq.com]
> Sent: Saturday, April 26, 2014 1:48 AM
> To: user@hbase.apache.org
> Subject: RE: suggestion for how eliminate memory problem in
> heavy-write hbase region server
>
> I am just wondering why do you need large heaps on write - heavy cluster?
> How many regions per RS do you have? Usually large heaps need for read
> intensive applications to keep blocks in a cache, but now we have
> option to keep these blocks off heap (at least since 0.96) With
> MemSLAB for MemStore enabled you need 2MB (by default) per region,
> this is what you need to take into account first. Even 1000 regions won't eat more than 2GB of heap.
>
> Best regards,
> Vladimir Rodionov
> Principal Platform Engineer
> Carrier IQ, www.carrieriq.com
> e-mail: vrodionov@carrieriq.com
>
> ________________________________________
> From: Ted Yu [yuzhihong@gmail.com]
> Sent: Friday, April 25, 2014 9:47 AM
> To: user@hbase.apache.org
> Subject: Re: suggestion for how eliminate memory problem in
> heavy-write hbase region server
>
> Henry:
> Please also take a look at the following thread:
>
>
> http://search-hadoop.com/m/51M4jeDMyy1/GC+recommendations+for+large+Re
> gion+Server+heaps&subj=RE+GC+recommendations+for+large+Region+Server+h
> eaps
>
>
> On Thu, Apr 24, 2014 at 11:17 PM, Mikhail Antonov
> <olorinbant@gmail.com
> >wrote:
>
> > Henry,
> >
> > http://blog.ragozin.info/2011/10/java-cg-hotspots-cms-and-heap.html
> > - that may give some insights.
> >
> > -Mikhail
> >
> >
> > 2014-04-24 23:07 GMT-07:00 Henry Hung <YT...@winbond.com>:
> >
> > > Dear All,
> > >
> > > My current hbase environment is heavy write cluster with constant
> > > 2000+ insert rows / second spread to 10 region servers.
> > > Each day I also need to do data deletion, and that will add a lot
> > > of IO
> > to
> > > the cluster.
> > >
> > > The problem is sometimes after a week, one of the region server
> > > will
> > crash
> > > because
> > > 2014-04-10T10:17:47.200+0800: 1281486.956: [GC 1281486.956:
> > > [ParNew (promotion failed): 235959K->235959K(235968K), 0.0836790
> > secs]1281487.040:
> > > [CMS2014-04-10T10:21:14.957+0800: 1281694.712: [CMS-concurrent-sweep:
> > > 267.111/279.155 secs] [Times: user=334.79 sys=14.38, real=279.11
> > > secs] (concurrent mode failure): 13961950K->6802914K(16515072K),
> > > 209.9436660 secs] 14186496K->6802914K(16751040K), [CMS Perm :
> > 42864K->42859K(71816K)],
> > > 210.0274680 secs] [Times: user=210.18 sys=0.01, real=209.99 secs]
> > >
> > > I look into the gc log and usually find some information about CMS
> > > concurrent sweep that took a very long time to complete, such as:
> > > 2014-04-10T10:15:56.929+0800: 1281376.684: [CMS-concurrent-sweep:
> > > 48.834/58.027 secs] [Times: user=101.52 sys=11.82, real=58.02
> > > secs]
> > >
> > > I do a lot of google-ing and already read the Todd Lipcon avoiding
> > > full GC, or other blogs that sometimes tells me how to set jvm
> > > flags such as
> > > this:
> > > -XX:+UseParNewGC
> > > -XX:CMSInitiatingOccupancyFraction=70
> > > -Xmn256m
> > > -Xmx16384m
> > > -XX:+DisableExplicitGC
> > > -XX:+UseCompressedOops
> > > -XX:PermSize=160m
> > > -XX:MaxPermSize=160m
> > > -XX:GCTimeRatio=19
> > > -XX:SoftRefLRUPolicyMSPerMB=0
> > > -XX:SurvivorRatio=2
> > > -XX:MaxTenuringThreshold=1
> > > -XX:+UseFastAccessorMethods
> > > -XX:+UseParNewGC
> > > -XX:+UseConcMarkSweepGC
> > > -XX:+CMSParallelRemarkEnabled
> > > -XX:+UseCMSCompactAtFullCollection
> > > -XX:CMSFullGCsBeforeCompaction=0
> > > -XX:+CMSClassUnloadingEnabled
> > > -XX:CMSMaxAbortablePrecleanTime=300
> > > -XX:+CMSScavengeBeforeRemark
> > >
> > > But alas, the problem still exist.
> > >
> > > I also know that java 1.7 has a new G1GC that probably can be used
> > > to fix this problem, but I don't know if hbase 0.96 is ready to use it?
> > >
> > > I would really appreciate it if someone out there can share one or
> > > two things about jvm configuration to achieve a more stable region
> server.
> > >
> > > Best regards,
> > > Henry
> > >
> >
> >
> >
> > --
> > Thanks,
> > Michael Antonov
> >
>

Re: suggestion for how to eliminate memory problem in heavy-write hbase region server

Posted by Bryan Beaudreault <bb...@hubspot.com>.
Keep in mind the math for the heap. The memstore share is 40%, and that is divided across all regions on a RS. With 2 GB, that's under 30 MB per region, assuming well-distributed writes.
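
To make that concrete with Henry's numbers (330 regions; the 40% is the default hbase.regionserver.global.memstore.upperLimit in 0.96), a rough back-of-the-envelope:

   2 GB heap:  0.4 * 2048 MB  = ~820 MB of memstore / 330 regions = ~2.5 MB per region
  16 GB heap:  0.4 * 16384 MB = ~6.5 GB of memstore / 330 regions = ~20 MB per region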

Are your regionservers starved for CPU? Either way, I'd try the Java 7 G1 GC on one regionserver and report back. We run with 25 GB heaps and never have long pauses, so 16 GB should be fine with enough CPU.
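
A minimal sketch of what that single-regionserver experiment might look like in hbase-env.sh (the pause target is only an assumed starting value, and the CMS/ParNew flags would be dropped in favour of these):

export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -Xmx16g -XX:+UseG1GC -XX:MaxGCPauseMillis=100"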


On Sun, Apr 27, 2014 at 8:27 PM, Henry Hung <YT...@winbond.com> wrote:

> @Vladimir:
>
> My total region count is 330 right now, but I expect the number will
> surpass 1000 at the end of this year.
> Although current system is heavy-write, but I expect it to be also a read
> intense system, because I want to do a lot of data analysis in the future.
>
> About memory allocation, I assign 16 GB heap memory just because I have so
> many RAM space left over to allocate (each node has 32GB RAM installed).
> I never really thought that 2GB heap will be enough, but will give it a
> shot.
>
> Best regards,
> Henry
>
> -----Original Message-----
> From: Vladimir Rodionov [mailto:vrodionov@carrieriq.com]
> Sent: Saturday, April 26, 2014 1:48 AM
> To: user@hbase.apache.org
> Subject: RE: suggestion for how eliminate memory problem in heavy-write
> hbase region server
>
> I am just wondering why do you need large heaps on write - heavy cluster?
> How many regions per RS do you have? Usually large heaps need for read
> intensive applications to keep blocks in a cache, but now we have option to
> keep these blocks off heap (at least since 0.96) With MemSLAB for MemStore
> enabled you need 2MB (by default) per region, this is what you need to take
> into account first. Even 1000 regions won't eat more than 2GB of heap.
>
> Best regards,
> Vladimir Rodionov
> Principal Platform Engineer
> Carrier IQ, www.carrieriq.com
> e-mail: vrodionov@carrieriq.com
>
> ________________________________________
> From: Ted Yu [yuzhihong@gmail.com]
> Sent: Friday, April 25, 2014 9:47 AM
> To: user@hbase.apache.org
> Subject: Re: suggestion for how eliminate memory problem in heavy-write
> hbase region server
>
> Henry:
> Please also take a look at the following thread:
>
>
> http://search-hadoop.com/m/51M4jeDMyy1/GC+recommendations+for+large+Region+Server+heaps&subj=RE+GC+recommendations+for+large+Region+Server+heaps
>
>
> On Thu, Apr 24, 2014 at 11:17 PM, Mikhail Antonov <olorinbant@gmail.com
> >wrote:
>
> > Henry,
> >
> > http://blog.ragozin.info/2011/10/java-cg-hotspots-cms-and-heap.html -
> > that may give some insights.
> >
> > -Mikhail
> >
> >
> > 2014-04-24 23:07 GMT-07:00 Henry Hung <YT...@winbond.com>:
> >
> > > Dear All,
> > >
> > > My current hbase environment is heavy write cluster with constant
> > > 2000+ insert rows / second spread to 10 region servers.
> > > Each day I also need to do data deletion, and that will add a lot of
> > > IO
> > to
> > > the cluster.
> > >
> > > The problem is sometimes after a week, one of the region server will
> > crash
> > > because
> > > 2014-04-10T10:17:47.200+0800: 1281486.956: [GC 1281486.956: [ParNew
> > > (promotion failed): 235959K->235959K(235968K), 0.0836790
> > secs]1281487.040:
> > > [CMS2014-04-10T10:21:14.957+0800: 1281694.712: [CMS-concurrent-sweep:
> > > 267.111/279.155 secs] [Times: user=334.79 sys=14.38, real=279.11
> > > secs] (concurrent mode failure): 13961950K->6802914K(16515072K),
> > > 209.9436660 secs] 14186496K->6802914K(16751040K), [CMS Perm :
> > 42864K->42859K(71816K)],
> > > 210.0274680 secs] [Times: user=210.18 sys=0.01, real=209.99 secs]
> > >
> > > I look into the gc log and usually find some information about CMS
> > > concurrent sweep that took a very long time to complete, such as:
> > > 2014-04-10T10:15:56.929+0800: 1281376.684: [CMS-concurrent-sweep:
> > > 48.834/58.027 secs] [Times: user=101.52 sys=11.82, real=58.02 secs]
> > >
> > > I do a lot of google-ing and already read the Todd Lipcon avoiding
> > > full GC, or other blogs that sometimes tells me how to set jvm flags
> > > such as
> > > this:
> > > -XX:+UseParNewGC
> > > -XX:CMSInitiatingOccupancyFraction=70
> > > -Xmn256m
> > > -Xmx16384m
> > > -XX:+DisableExplicitGC
> > > -XX:+UseCompressedOops
> > > -XX:PermSize=160m
> > > -XX:MaxPermSize=160m
> > > -XX:GCTimeRatio=19
> > > -XX:SoftRefLRUPolicyMSPerMB=0
> > > -XX:SurvivorRatio=2
> > > -XX:MaxTenuringThreshold=1
> > > -XX:+UseFastAccessorMethods
> > > -XX:+UseParNewGC
> > > -XX:+UseConcMarkSweepGC
> > > -XX:+CMSParallelRemarkEnabled
> > > -XX:+UseCMSCompactAtFullCollection
> > > -XX:CMSFullGCsBeforeCompaction=0
> > > -XX:+CMSClassUnloadingEnabled
> > > -XX:CMSMaxAbortablePrecleanTime=300
> > > -XX:+CMSScavengeBeforeRemark
> > >
> > > But alas, the problem still exist.
> > >
> > > I also know that java 1.7 has a new G1GC that probably can be used
> > > to fix this problem, but I don't know if hbase 0.96 is ready to use it?
> > >
> > > I would really appreciate it if someone out there can share one or
> > > two things about jvm configuration to achieve a more stable region
> server.
> > >
> > > Best regards,
> > > Henry
> > >
> >
> >
> >
> > --
> > Thanks,
> > Michael Antonov
> >
>

RE: suggestion for how to eliminate memory problem in heavy-write hbase region server

Posted by Henry Hung <YT...@winbond.com>.
@Vladimir:

My total region count is 330 right now, but I expect the number will surpass 1,000 by the end of this year.
Although the current system is heavy-write, I expect it to also become a read-intensive system, because I want to do a lot of data analysis in the future.

About memory allocation, I assigned a 16 GB heap just because I have so much RAM left over to allocate (each node has 32 GB of RAM installed).
I never really thought that a 2 GB heap would be enough, but I will give it a shot.
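
For anyone following along, the heap is set in hbase-env.sh; a minimal sketch of dropping it to 2 GB for the experiment (HBASE_HEAPSIZE is in MB here; equivalently, lower the -Xmx in the existing flag list):

export HBASE_HEAPSIZE=2048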

Best regards,
Henry

-----Original Message-----
From: Vladimir Rodionov [mailto:vrodionov@carrieriq.com]
Sent: Saturday, April 26, 2014 1:48 AM
To: user@hbase.apache.org
Subject: RE: suggestion for how eliminate memory problem in heavy-write hbase region server

I am just wondering why do you need large heaps on write - heavy cluster?
How many regions per RS do you have? Usually large heaps need for read intensive applications to keep blocks in a cache, but now we have option to keep these blocks off heap (at least since 0.96) With MemSLAB for MemStore enabled you need 2MB (by default) per region, this is what you need to take into account first. Even 1000 regions won't eat more than 2GB of heap.

Best regards,
Vladimir Rodionov
Principal Platform Engineer
Carrier IQ, www.carrieriq.com
e-mail: vrodionov@carrieriq.com

________________________________________
From: Ted Yu [yuzhihong@gmail.com]
Sent: Friday, April 25, 2014 9:47 AM
To: user@hbase.apache.org
Subject: Re: suggestion for how eliminate memory problem in heavy-write hbase region server

Henry:
Please also take a look at the following thread:

http://search-hadoop.com/m/51M4jeDMyy1/GC+recommendations+for+large+Region+Server+heaps&subj=RE+GC+recommendations+for+large+Region+Server+heaps


On Thu, Apr 24, 2014 at 11:17 PM, Mikhail Antonov <ol...@gmail.com>wrote:

> Henry,
>
> http://blog.ragozin.info/2011/10/java-cg-hotspots-cms-and-heap.html -
> that may give some insights.
>
> -Mikhail
>
>
> 2014-04-24 23:07 GMT-07:00 Henry Hung <YT...@winbond.com>:
>
> > Dear All,
> >
> > My current hbase environment is heavy write cluster with constant
> > 2000+ insert rows / second spread to 10 region servers.
> > Each day I also need to do data deletion, and that will add a lot of
> > IO
> to
> > the cluster.
> >
> > The problem is sometimes after a week, one of the region server will
> crash
> > because
> > 2014-04-10T10:17:47.200+0800: 1281486.956: [GC 1281486.956: [ParNew
> > (promotion failed): 235959K->235959K(235968K), 0.0836790
> secs]1281487.040:
> > [CMS2014-04-10T10:21:14.957+0800: 1281694.712: [CMS-concurrent-sweep:
> > 267.111/279.155 secs] [Times: user=334.79 sys=14.38, real=279.11
> > secs] (concurrent mode failure): 13961950K->6802914K(16515072K),
> > 209.9436660 secs] 14186496K->6802914K(16751040K), [CMS Perm :
> 42864K->42859K(71816K)],
> > 210.0274680 secs] [Times: user=210.18 sys=0.01, real=209.99 secs]
> >
> > I look into the gc log and usually find some information about CMS
> > concurrent sweep that took a very long time to complete, such as:
> > 2014-04-10T10:15:56.929+0800: 1281376.684: [CMS-concurrent-sweep:
> > 48.834/58.027 secs] [Times: user=101.52 sys=11.82, real=58.02 secs]
> >
> > I do a lot of google-ing and already read the Todd Lipcon avoiding
> > full GC, or other blogs that sometimes tells me how to set jvm flags
> > such as
> > this:
> > -XX:+UseParNewGC
> > -XX:CMSInitiatingOccupancyFraction=70
> > -Xmn256m
> > -Xmx16384m
> > -XX:+DisableExplicitGC
> > -XX:+UseCompressedOops
> > -XX:PermSize=160m
> > -XX:MaxPermSize=160m
> > -XX:GCTimeRatio=19
> > -XX:SoftRefLRUPolicyMSPerMB=0
> > -XX:SurvivorRatio=2
> > -XX:MaxTenuringThreshold=1
> > -XX:+UseFastAccessorMethods
> > -XX:+UseParNewGC
> > -XX:+UseConcMarkSweepGC
> > -XX:+CMSParallelRemarkEnabled
> > -XX:+UseCMSCompactAtFullCollection
> > -XX:CMSFullGCsBeforeCompaction=0
> > -XX:+CMSClassUnloadingEnabled
> > -XX:CMSMaxAbortablePrecleanTime=300
> > -XX:+CMSScavengeBeforeRemark
> >
> > But alas, the problem still exist.
> >
> > I also know that java 1.7 has a new G1GC that probably can be used
> > to fix this problem, but I don't know if hbase 0.96 is ready to use it?
> >
> > I would really appreciate it if someone out there can share one or
> > two things about jvm configuration to achieve a more stable region server.
> >
> > Best regards,
> > Henry
> >
>
>
>
> --
> Thanks,
> Michael Antonov
>


RE: suggestion for how to eliminate memory problem in heavy-write hbase region server

Posted by Vladimir Rodionov <vr...@carrieriq.com>.
I am just wondering why you need large heaps on a write-heavy cluster.
How many regions per RS do you have? Usually large heaps are needed for read-intensive applications
to keep blocks in the cache, but now we have the option to keep those blocks off-heap (at least since 0.96).
With MemSLAB for MemStore enabled you need 2 MB (by default) per region; this is what you
need to take into account first. Even 1,000 regions won't eat more than 2 GB of heap.
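
A quick sketch of that arithmetic, together with the hbase-site.xml knobs it rests on (the 2 MB figure is the default MSLAB chunk size):

  1000 regions * 2 MB per MSLAB chunk = ~2 GB of heap

  hbase.hregion.memstore.mslab.enabled   (default: true)
  hbase.hregion.memstore.mslab.chunksize (default: 2 MB)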

Best regards,
Vladimir Rodionov
Principal Platform Engineer
Carrier IQ, www.carrieriq.com
e-mail: vrodionov@carrieriq.com


Re: suggestion for how eliminate memory problem in heavy-write hbase region server

Posted by Bryan Beaudreault <bb...@hubspot.com>.
We've been using Java 7 with HBase for a few months now. Completely stable,
and it helped our GC issues quite a bit.
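
If you do move to Java 7 and want to try G1, a commonly cited starting point for hbase-env.sh looks roughly like the flags below. These values are illustrative assumptions rather than a tested recommendation, so tune them against your own GC logs, and drop the CMS-specific flags and the small -Xmn when switching:

export HBASE_REGIONSERVER_OPTS="-Xms16g -Xmx16g \
  -XX:+UseG1GC \
  -XX:MaxGCPauseMillis=100 \
  -XX:InitiatingHeapOccupancyPercent=65 \
  -XX:G1HeapRegionSize=32m \
  -XX:+ParallelRefProcEnabled \
  -XX:ParallelGCThreads=8 \
  -XX:ConcGCThreads=2 \
  -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps"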



Re: suggestion for how eliminate memory problem in heavy-write hbase region server

Posted by Ted Yu <yu...@gmail.com>.
Henry:
Please also take a look at the following thread:

http://search-hadoop.com/m/51M4jeDMyy1/GC+recommendations+for+large+Region+Server+heaps&subj=RE+GC+recommendations+for+large+Region+Server+heaps



Re: suggestion for how eliminate memory problem in heavy-write hbase region server

Posted by Mikhail Antonov <ol...@gmail.com>.
Henry,

http://blog.ragozin.info/2011/10/java-cg-hotspots-cms-and-heap.html - that
may give some insights.

-Mikhail





-- 
Thanks,
Michael Antonov