Posted to solr-user@lucene.apache.org by Annette Newton <an...@servicetick.com> on 2012/11/28 17:06:00 UTC
Permanently Full Old Generation...
Hi,
I'm hoping someone can help me with an issue we are encountering with Solr Cloud.
We are seeing strange GC behaviour after running Solr Cloud under quite
heavy insert load for a period of time. The old generation becomes full, and
no amount of garbage collection will free up the memory. I have attached a
memory profile; as you can see, it gets progressively worse as the day goes
on, to the point where we are constantly doing full garbage collections. The
only way I have found to resolve this issue is to reload the core, after
which subsequent garbage collections reclaim the used space; that's what
happened at 3pm on the memory profile. All the nodes eventually display the
same behaviour.
We have multiple threads running, adding batches of up to 100 documents at a
time. I have also attached our schema and config.
We are running 4 shards, each with a single replica, have a 3-node ZooKeeper
setup, and the 8 Solr instances are AWS High-Memory Double Extra Large boxes
with 34.2 GB memory and 4 virtual cores.
Thanks in advance
Annette Newton
Re: Permanently Full Old Generation...
Posted by Walter Underwood <wu...@wunderwood.org>.
We are running 1.6 update 37. That was released on the same day as your version, so it should have the same bug fixes. We use these options in production; they are very stable:
export CATALINA_OPTS="$CATALINA_OPTS -d64"
export CATALINA_OPTS="$CATALINA_OPTS -Xms4096m -Xmx6144m"
export CATALINA_OPTS="$CATALINA_OPTS -XX:MaxPermSize=256m"
export CATALINA_OPTS="$CATALINA_OPTS -XX:NewSize=2048m"
export CATALINA_OPTS="$CATALINA_OPTS -XX:+UseConcMarkSweepGC -XX:+DoEscapeAnalysis -XX:+UseCompressedOops"
export CATALINA_OPTS="$CATALINA_OPTS -XX:+UseParNewGC -XX:+CMSParallelRemarkEnabled"
export CATALINA_OPTS="$CATALINA_OPTS -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps"
export CATALINA_OPTS="$CATALINA_OPTS -XX:-TraceClassUnloading"
We do indexing and searching on separate machines. At Netflix, I found that the indexing load had a big effect on search speed, so I've separated the functions since then.
wunder
Search Guy, Chegg.com
On Nov 30, 2012, at 8:43 AM, Andy Kershaw wrote:
> We are currently operating at reduced load which is why the ParNew
> collections are not a problem. I don't know how long they were taking
> before though. Thanks for the warning about index formats.
>
> Our JVM is:
>
> Java(TM) SE Runtime Environment (build 1.7.0_09-b05)
> Java HotSpot(TM) 64-Bit Server VM (build 23.5-b02, mixed mode)
>
> We are currently running more tests but it takes a while before the issues
> become apparent.
>
> Andy Kershaw
>
> On 29 November 2012 18:31, Walter Underwood <wu...@wunderwood.org> wrote:
>
>> Several suggestions.
>>
>> 1. Adjust the traffic load for about 75% CPU. When you hit 100%, you are
>> already in an overload state and the variance of the response times goes
>> way up. You'll have very noisy benchmark data.
>>
>> 2. Do not force manual GCs during a benchmark.
>>
>> 3. Do not force merge (optimise). That is a very expensive operation and
>> will cause slowdowns.
>>
>> 4. Make eden big enough to hold all data allocated during a request for
>> all simultaneous requests. All that stuff is garbage after the end of the
>> request. If eden fills up, it will be allocated from the tenured space and
>> cause that to grow unnecessarily. We use an 8GB heap and 2GB eden. I like
>> setting the size better than setting ratios.
>>
>> 5. What version of the JVM are you using?
>>
>> wunder
>>
>> On Nov 29, 2012, at 10:15 AM, Shawn Heisey wrote:
>>
>>> On 11/29/2012 10:44 AM, Andy Kershaw wrote:
>>>> Annette is away until Monday so I am looking into this in the meantime.
>>>> Looking at the times of the Full GC entries at the end of the log, I
>> think
>>>> they are collections we started manually through jconsole to try and
>> reduce
>>>> the size of the old generation. This only seemed to have an effect when
>> we
>>>> reloaded the core first though.
>>>>
>>>> It is my understanding that the eden size is deliberately smaller to
>> keep
>>>> the ParNew collection time down. If it takes too long then the node is
>>>> flagged as down.
>>>
>>> Your ParNew collections are taking less than 1 second (some WAY less
>> than one second) to complete and the CMS collections are taking far longer
>> -- 6 seconds seems to be a common number in the GC log. GC is unavoidable
>> with Java, so if there has to be a collection, you definitely want it to be
>> on the young generation (ParNew).
>>>
>>> Controversial idea coming up, nothing concrete to back it up. This
>> means that you might want to wait for a committer to weigh in: I have seen
>> a lot of recent development work relating to SolrCloud and shard stability.
>> You may want to check out branch_4x from SVN and build that, rather than
>> use 4.0. I don't have any idea what the timeline for 4.1 is, but based on
>> what I saw for 3.x releases, it should be released relatively soon.
>>>
>>> The above advice is a bad idea if you have to be able to upgrade from
>> one 4.1 snapshot to a later one without reindexing. There is a possibility
>> that the 4.1 index format will change before release and require a reindex,
>> it has happened at least twice already.
>>>
>>> Thanks,
>>> Shawn
>>>
>>
>> --
>> Walter Underwood
>> wunder@wunderwood.org
>>
>>
>>
>>
>
>
> --
> Andy Kershaw
>
> Technical Developer
>
> ServiceTick Ltd
>
>
>
> T: +44(0)1603 618326
>
> M: +44 (0)7876 556833
>
>
>
> Seebohm House, 2-4 Queen Street, Norwich, England, NR2 4SQ
>
> www.ServiceTick.com <http://www.servicetick.com/>
>
> www.SessionCam.com <http://www.sessioncam.com/>
>
>
>
> *This message is confidential and is intended to be read solely by the
> addressee. If you have received this message by mistake, please delete it
> and do not copy it to anyone else. Internet communications are not secure
> and may be intercepted or changed after they are sent. ServiceTick Ltd does
> not accept liability for any such changes.*
--
Walter Underwood
wunder@wunderwood.org
Re: Permanently Full Old Generation...
Posted by Andy Kershaw <an...@servicetick.com>.
We are currently operating at reduced load which is why the ParNew
collections are not a problem. I don't know how long they were taking
before though. Thanks for the warning about index formats.
Our JVM is:
Java(TM) SE Runtime Environment (build 1.7.0_09-b05)
Java HotSpot(TM) 64-Bit Server VM (build 23.5-b02, mixed mode)
We are currently running more tests but it takes a while before the issues
become apparent.
Andy Kershaw
On 29 November 2012 18:31, Walter Underwood <wu...@wunderwood.org> wrote:
> Several suggestions.
>
> 1. Adjust the traffic load for about 75% CPU. When you hit 100%, you are
> already in an overload state and the variance of the response times goes
> way up. You'll have very noisy benchmark data.
>
> 2. Do not force manual GCs during a benchmark.
>
> 3. Do not force merge (optimise). That is a very expensive operation and
> will cause slowdowns.
>
> 4. Make eden big enough to hold all data allocated during a request for
> all simultaneous requests. All that stuff is garbage after the end of the
> request. If eden fills up, it will be allocated from the tenured space and
> cause that to grow unnecessarily. We use an 8GB heap and 2GB eden. I like
> setting the size better than setting ratios.
>
> 5. What version of the JVM are you using?
>
> wunder
>
> On Nov 29, 2012, at 10:15 AM, Shawn Heisey wrote:
>
> > On 11/29/2012 10:44 AM, Andy Kershaw wrote:
> >> Annette is away until Monday so I am looking into this in the meantime.
> >> Looking at the times of the Full GC entries at the end of the log, I
> think
> >> they are collections we started manually through jconsole to try and
> reduce
> >> the size of the old generation. This only seemed to have an effect when
> we
> >> reloaded the core first though.
> >>
> >> It is my understanding that the eden size is deliberately smaller to
> keep
> >> the ParNew collection time down. If it takes too long then the node is
> >> flagged as down.
> >
> > Your ParNew collections are taking less than 1 second (some WAY less
> than one second) to complete and the CMS collections are taking far longer
> -- 6 seconds seems to be a common number in the GC log. GC is unavoidable
> with Java, so if there has to be a collection, you definitely want it to be
> on the young generation (ParNew).
> >
> > Controversial idea coming up, nothing concrete to back it up. This
> means that you might want to wait for a committer to weigh in: I have seen
> a lot of recent development work relating to SolrCloud and shard stability.
> You may want to check out branch_4x from SVN and build that, rather than
> use 4.0. I don't have any idea what the timeline for 4.1 is, but based on
> what I saw for 3.x releases, it should be released relatively soon.
> >
> > The above advice is a bad idea if you have to be able to upgrade from
> one 4.1 snapshot to a later one without reindexing. There is a possibility
> that the 4.1 index format will change before release and require a reindex,
> it has happened at least twice already.
> >
> > Thanks,
> > Shawn
> >
>
> --
> Walter Underwood
> wunder@wunderwood.org
>
>
>
>
--
Andy Kershaw
Technical Developer
ServiceTick Ltd
T: +44(0)1603 618326
M: +44 (0)7876 556833
Seebohm House, 2-4 Queen Street, Norwich, England, NR2 4SQ
www.ServiceTick.com <http://www.servicetick.com/>
www.SessionCam.com <http://www.sessioncam.com/>
Re: Permanently Full Old Generation...
Posted by Walter Underwood <wu...@wunderwood.org>.
Several suggestions.
1. Adjust the traffic load for about 75% CPU. When you hit 100%, you are already in an overload state and the variance of the response times goes way up. You'll have very noisy benchmark data.
2. Do not force manual GCs during a benchmark.
3. Do not force merge (optimise). That is a very expensive operation and will cause slowdowns.
4. Make eden big enough to hold all data allocated during a request for all simultaneous requests. All that stuff is garbage after the end of the request. If eden fills up, it will be allocated from the tenured space and cause that to grow unnecessarily. We use an 8GB heap and 2GB eden. I like setting the size better than setting ratios.
5. What version of the JVM are you using?
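For point 4, setting the young generation size explicitly (rather than via -XX:NewRatio) is done with -Xmn or the NewSize pair of flags; a sketch matching the 8GB heap / 2GB eden figures above (values are illustrative, not a recommendation for this cluster):

```shell
# Fixed 8GB heap with an explicit 2GB young generation.
JVM_MEM_OPTS="-Xms8192m -Xmx8192m -Xmn2048m"
# Equivalent long form:
# JVM_MEM_OPTS="-Xms8192m -Xmx8192m -XX:NewSize=2048m -XX:MaxNewSize=2048m"
export CATALINA_OPTS="$CATALINA_OPTS $JVM_MEM_OPTS"
```

Strictly, -Xmn sizes the whole young generation (eden plus the survivor spaces), so eden itself ends up slightly smaller than the value given.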
wunder
On Nov 29, 2012, at 10:15 AM, Shawn Heisey wrote:
> On 11/29/2012 10:44 AM, Andy Kershaw wrote:
>> Annette is away until Monday so I am looking into this in the meantime.
>> Looking at the times of the Full GC entries at the end of the log, I think
>> they are collections we started manually through jconsole to try and reduce
>> the size of the old generation. This only seemed to have an effect when we
>> reloaded the core first though.
>>
>> It is my understanding that the eden size is deliberately smaller to keep
>> the ParNew collection time down. If it takes too long then the node is
>> flagged as down.
>
> Your ParNew collections are taking less than 1 second (some WAY less than one second) to complete and the CMS collections are taking far longer -- 6 seconds seems to be a common number in the GC log. GC is unavoidable with Java, so if there has to be a collection, you definitely want it to be on the young generation (ParNew).
>
> Controversial idea coming up, nothing concrete to back it up. This means that you might want to wait for a committer to weigh in: I have seen a lot of recent development work relating to SolrCloud and shard stability. You may want to check out branch_4x from SVN and build that, rather than use 4.0. I don't have any idea what the timeline for 4.1 is, but based on what I saw for 3.x releases, it should be released relatively soon.
>
> The above advice is a bad idea if you have to be able to upgrade from one 4.1 snapshot to a later one without reindexing. There is a possibility that the 4.1 index format will change before release and require a reindex, it has happened at least twice already.
>
> Thanks,
> Shawn
>
--
Walter Underwood
wunder@wunderwood.org
Re: Permanently Full Old Generation...
Posted by Shawn Heisey <so...@elyograg.org>.
On 11/29/2012 10:44 AM, Andy Kershaw wrote:
> Annette is away until Monday so I am looking into this in the meantime.
> Looking at the times of the Full GC entries at the end of the log, I think
> they are collections we started manually through jconsole to try and reduce
> the size of the old generation. This only seemed to have an effect when we
> reloaded the core first though.
>
> It is my understanding that the eden size is deliberately smaller to keep
> the ParNew collection time down. If it takes too long then the node is
> flagged as down.
Your ParNew collections are taking less than 1 second (some WAY less
than one second) to complete and the CMS collections are taking far
longer -- 6 seconds seems to be a common number in the GC log. GC is
unavoidable with Java, so if there has to be a collection, you
definitely want it to be on the young generation (ParNew).
Controversial idea coming up, nothing concrete to back it up. This
means that you might want to wait for a committer to weigh in: I have
seen a lot of recent development work relating to SolrCloud and shard
stability. You may want to check out branch_4x from SVN and build that,
rather than use 4.0. I don't have any idea what the timeline for 4.1
is, but based on what I saw for 3.x releases, it should be released
relatively soon.
The above advice is a bad idea if you have to be able to upgrade from
one 4.1 snapshot to a later one without reindexing. There is a
possibility that the 4.1 index format will change before release and
require a reindex, it has happened at least twice already.
Thanks,
Shawn
Re: Permanently Full Old Generation...
Posted by Andy Kershaw <an...@servicetick.com>.
Thanks for responding Shawn.
Annette is away until Monday so I am looking into this in the meantime.
Looking at the times of the Full GC entries at the end of the log, I think
they are collections we started manually through jconsole to try and reduce
the size of the old generation. This only seemed to have an effect when we
reloaded the core first though.
It is my understanding that the eden size is deliberately smaller to keep
the ParNew collection time down. If it takes too long then the node is
flagged as down.
On 29 November 2012 15:28, Shawn Heisey <so...@elyograg.org> wrote:
> > My jvm settings:
> >
> >
> > -Xmx8192M -Xms8192M -XX:+CMSScavengeBeforeRemark -XX:NewRatio=2
> > -XX:+CMSParallelRemarkEnabled -XX:+UseParNewGC -XX:+UseConcMarkSweepGC
> > -XX:+AggressiveOpts -XX:CMSInitiatingOccupancyFraction=70
> > -XX:+UseCMSInitiatingOccupancyOnly -XX:-CMSIncrementalPacing
> > -XX:CMSIncrementalDutyCycle=75
> >
> > I turned off IncrementalPacing, and enabled
> > CMSInitiatingOccupancyFraction,
> > after issues with nodes being reported as down due to large Garbage
> > collection pauses. The problem with the memory profile was visible
> before
> > the drop down to 1.2GB (this was when I reloaded the core), my concern
> was
> > that the collection of the old generation didn't seem to free any of the
> > heap, and we went from occasionally collecting to always collecting the
> > old
> > gen.
> >
> > Please see the attached gc log.
>
> I am on the train for my morning commute, so I have some time, but no
> access to the log or graph.
>
> Confession time: GC logs make me go glassy eyed and babble incoherently,
> but I did take a look at it. I saw 18 CMS collections and three entries
> near the end that said Full GC. It looks like these collections take 6 to
> 8 seconds. That is pretty nasty, but probably unavoidable, so the goal is
> to make them happen extremely infrequently - do young generation
> collections instead.
>
> The thing that seems to make GC less of a problem for solr is maximizing
> the young generation memory pool. Based on the available info, I would
> start with making NewRatio 1 instead of 2. This will increase the eden
> size and decrease the old gen size. You may even want to use an explicit
> -Xmn of 6144. If that doesn't help, you might actually need 6GB or so of
> old gen heap, so try increasing the overall heap size to 9 or 10 GB and
> going back to a NewRatio of 2.
>
> Thanks,
> Shawn
>
RE: Permanently Full Old Generation...
Posted by Shawn Heisey <so...@elyograg.org>.
> My jvm settings:
>
>
> -Xmx8192M -Xms8192M -XX:+CMSScavengeBeforeRemark -XX:NewRatio=2
> -XX:+CMSParallelRemarkEnabled -XX:+UseParNewGC -XX:+UseConcMarkSweepGC
> -XX:+AggressiveOpts -XX:CMSInitiatingOccupancyFraction=70
> -XX:+UseCMSInitiatingOccupancyOnly -XX:-CMSIncrementalPacing
> -XX:CMSIncrementalDutyCycle=75
>
> I turned off IncrementalPacing, and enabled
> CMSInitiatingOccupancyFraction,
> after issues with nodes being reported as down due to large Garbage
> collection pauses. The problem with the memory profile was visible before
> the drop down to 1.2GB (this was when I reloaded the core), my concern was
> that the collection of the old generation didn't seem to free any of the
> heap, and we went from occasionally collecting to always collecting the
> old
> gen.
>
> Please see the attached gc log.
I am on the train for my morning commute, so I have some time, but no
access to the log or graph.
Confession time: GC logs make me go glassy eyed and babble incoherently,
but I did take a look at it. I saw 18 CMS collections and three entries
near the end that said Full GC. It looks like these collections take 6 to
8 seconds. That is pretty nasty, but probably unavoidable, so the goal is
to make them happen extremely infrequently - do young generation
collections instead.
The thing that seems to make GC less of a problem for solr is maximizing
the young generation memory pool. Based on the available info, I would
start with making NewRatio 1 instead of 2. This will increase the eden
size and decrease the old gen size. You may even want to use an explicit
-Xmn of 6144. If that doesn't help, you might actually need 6GB or so of
old gen heap, so try increasing the overall heap size to 9 or 10 GB and
going back to a NewRatio of 2.
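HotSpot sizes the young generation as roughly heap / (NewRatio + 1), so the effect of dropping NewRatio from 2 to 1 on this 8GB heap can be sketched with a little shell arithmetic:

```shell
heap_mb=8192

# young = heap / (NewRatio + 1); the old generation gets the remainder.
for ratio in 2 1; do
  young=$((heap_mb / (ratio + 1)))
  old=$((heap_mb - young))
  echo "NewRatio=$ratio -> young=${young}MB, old=${old}MB"
done
```

With NewRatio=2 the old generation works out to roughly the 5.5GB visible in jconsole; with NewRatio=1 the heap is split evenly between the generations.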
Thanks,
Shawn
RE: Permanently Full Old Generation...
Posted by Annette Newton <an...@servicetick.com>.
My jvm settings:
-Xmx8192M -Xms8192M -XX:+CMSScavengeBeforeRemark -XX:NewRatio=2
-XX:+CMSParallelRemarkEnabled -XX:+UseParNewGC -XX:+UseConcMarkSweepGC
-XX:+AggressiveOpts -XX:CMSInitiatingOccupancyFraction=70
-XX:+UseCMSInitiatingOccupancyOnly -XX:-CMSIncrementalPacing
-XX:CMSIncrementalDutyCycle=75
I turned off incremental pacing and enabled CMSInitiatingOccupancyFraction
after issues with nodes being reported as down due to long garbage
collection pauses. The problem with the memory profile was visible before
the drop down to 1.2GB (which is when I reloaded the core); my concern was
that collecting the old generation didn't seem to free any of the heap, and
we went from occasionally collecting the old gen to collecting it
constantly.
Please see the attached gc log.
-----Original Message-----
From: Shawn Heisey [mailto:solr@elyograg.org]
Sent: 28 November 2012 16:48
To: solr-user@lucene.apache.org
Subject: Re: Permanently Full Old Generation...
On 11/28/2012 9:06 AM, Annette Newton wrote:
> We are seeing strange gc behaviour after running solr cloud under
> quite heavy insert load for a period of time. The old generation
> becomes full and no amount of garbage collection will free up the
> memory. I have attached a memory profile, as you can see it gets
> progressively worse as the day goes on to the point where we are
> always doing full garbage collections all the time. The only way I
> have found to resolve this issue is to reload the core, then
> subsequent garbage collections reclaim the used space, that's what
> happened at 3pm on the Memory profile. All the nodes eventually
> display the same behaviour.
>
> We have multiple threads running adding batches of up to 100 documents
> at a time. I have also attached our Schema and Config.
>
> We are running 4 shards each with a single replica, have a 3 node
> zookeeper setup and the 8 solr boxes instances are aws High-Memory
> Double Extra Large with 34.2 GB Memory, 4 Virtual cores.
>
Looking at that jconsole graph, I am not seeing a problem. It looks fairly
normal to me, especially for heavy indexing.
Solr has been up for at least 29 hours, just based on the graph I can see,
which may not reflect the full JVM uptime - click on VM Summary to see that.
In that time, you've only spent a total of two minutes in garbage
collection, which does not seem problematic to me. Also, your Old Gen is not
full - it's only using 1.2GB out of 5.5GB available. A full GC
(ConcurrentMarkSweep) would only be automatically triggered when the Old Gen
reaches that 5.5GB mark.
Are you seeing actual performance problems, or are you just concerned about
what you see when you watch the memory?
General note about memory and Solr: I have had very good luck with the
following java memory options. Each machine handles three 22GB index shards
on one JVM.
-Xms4096M
-Xmx8192M
-XX:NewRatio=1
-XX:+UseParNewGC
-XX:+UseConcMarkSweepGC
-XX:+CMSParallelRemarkEnabled
Thanks,
Shawn
Re: Permanently Full Old Generation...
Posted by Shawn Heisey <so...@elyograg.org>.
On 11/28/2012 9:06 AM, Annette Newton wrote:
> We are seeing strange gc behaviour after running solr cloud under
> quite heavy insert load for a period of time. The old generation
> becomes full and no amount of garbage collection will free up the
> memory. I have attached a memory profile, as you can see it gets
> progressively worse as the day goes on to the point where we are
> always doing full garbage collections all the time. The only way I
> have found to resolve this issue is to reload the core, then
> subsequent garbage collections reclaim the used space, that’s what
> happened at 3pm on the Memory profile. All the nodes eventually
> display the same behaviour.
>
> We have multiple threads running adding batches of up to 100 documents
> at a time. I have also attached our Schema and Config.
>
> We are running 4 shards each with a single replica, have a 3 node
> zookeeper setup and the 8 solr boxes instances are aws High-Memory
> Double Extra Large with 34.2 GB Memory, 4 Virtual cores.
>
Looking at that jconsole graph, I am not seeing a problem. It looks
fairly normal to me, especially for heavy indexing.
Solr has been up for at least 29 hours, just based on the graph I can
see, which may not reflect the full JVM uptime - click on VM Summary to
see that. In that time, you've only spent a total of two minutes in
garbage collection, which does not seem problematic to me. Also, your
Old Gen is not full - it's only using 1.2GB out of 5.5GB available. A
full GC (ConcurrentMarkSweep) would only be automatically triggered when
the Old Gen reaches that 5.5GB mark.
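One nuance: with -XX:CMSInitiatingOccupancyFraction=70 and -XX:+UseCMSInitiatingOccupancyOnly (the flags posted elsewhere in this thread), CMS would actually start a concurrent cycle well before the old generation is completely full. A back-of-envelope check, taking the ~5.5GB old-gen capacity from the graph as the input:

```shell
# Old generation capacity and the CMS initiating threshold, in MB.
old_gen_mb=5632                 # ~5.5GB, as seen in jconsole
fraction=70                     # -XX:CMSInitiatingOccupancyFraction=70
threshold_mb=$((old_gen_mb * fraction / 100))
echo "CMS cycle starts near ${threshold_mb}MB of old-gen occupancy"
```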
Are you seeing actual performance problems, or are you just concerned
about what you see when you watch the memory?
General note about memory and Solr: I have had very good luck with the
following java memory options. Each machine handles three 22GB index
shards on one JVM.
-Xms4096M
-Xmx8192M
-XX:NewRatio=1
-XX:+UseParNewGC
-XX:+UseConcMarkSweepGC
-XX:+CMSParallelRemarkEnabled
Thanks,
Shawn
RE: Permanently Full Old Generation...
Posted by Annette Newton <an...@servicetick.com>.
Hi,
Class                                                                            Instance Count   Total Size
class [C                                                                                3410607    699552656
class [Lorg.apache.lucene.util.fst.FST$Arc;                                              319813    332605520
class [Ljava.lang.Object;                                                                898433    170462152
class [Ljava.util.HashMap$Entry;                                                         856551    149091216
class java.util.HashMap$Entry                                                           2802560    100892160
class java.lang.String                                                                  3295405     52726480
class java.util.HashMap                                                                  750750     42042000
class org.apache.lucene.util.fst.FST                                                     319896     39027312
class org.apache.lucene.index.FieldInfo                                                  748801     35942448
class [B                                                                                1032862     28932353
class java.util.LinkedHashMap$Entry                                                      516680     26867360
class org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader                          319831     24307156
class java.util.Collections$UnmodifiableMap                                              749522     23984704
class java.util.HashMap$FrontCache                                                       863640     17272800
class org.apache.lucene.util.BytesRef                                                    709550     11352800
class java.util.LinkedHashMap                                                            109810      7137650
class [I                                                                                 320987      5268068
class org.apache.lucene.analysis.core.WhitespaceTokenizer                                 59540      5239520
class java.lang.Integer                                                                 1149625      4598500
class org.apache.lucene.util.AttributeSource$State                                       169168      2706688
class java.util.TreeMap$Node                                                              19876      1371444
class [Lorg.apache.lucene.util.AttributeSource$State;                                     56394      1353456
class org.apache.lucene.analysis.tokenattributes.CharTermAttributeImpl                    56394      1127880
class org.apache.lucene.analysis.util.CharacterUtils$CharacterBuffer                      52888       951984
class org.apache.lucene.analysis.Analyzer$TokenStreamComponents                           56404       902464
class java.lang.Class                                                                      5706       867312
class org.apache.lucene.util.fst.FST$Arc                                                  14017       686833
class java.util.HashMap$Values                                                            56409       451272
class org.apache.lucene.analysis.tokenattributes.OffsetAttributeImpl                      56380       451040
class org.apache.lucene.analysis.tokenattributes.PositionIncrementAttributeImpl           56396       225584
class [Ljava.lang.Integer;                                                                    1       161048
class java.util.concurrent.ConcurrentHashMap$HashEntry                                     3456        96768
class [[C                                                                           (truncated)
This is a sample from the heap dump I took when we encountered the problem previously.
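For anyone wanting to reproduce a histogram like the one above, the standard JDK tools can capture it from a running Solr JVM (jmap and jhat ship with JDK 6/7; the pgrep pattern below is an assumption about how Solr was started):

```shell
# Find the Solr JVM's pid (pattern is an assumption; adjust to your launcher).
SOLR_PID=$(pgrep -f start.jar)

# Quick class histogram of live objects, largest consumers first.
jmap -histo:live "$SOLR_PID" | head -n 40

# Or take a full binary heap dump and browse it with jhat's built-in
# web server (http://localhost:7000 by default).
jmap -dump:live,format=b,file=solr-heap.hprof "$SOLR_PID"
jhat solr-heap.hprof
```

Both commands need a live JVM to attach to, so this is a sketch of the procedure rather than something runnable standalone.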
-----Original Message-----
From: Jack Krupansky [mailto:jack@basetechnology.com]
Sent: 28 November 2012 16:23
To: solr-user@lucene.apache.org
Subject: Re: Permanently Full Old Generation...
Have you done a Java heap dump to see what the most common objects are?
-- Jack Krupansky
From: Annette Newton
Sent: Wednesday, November 28, 2012 11:06 AM
To: solr-user@lucene.apache.org
Cc: Andy Kershaw
Subject: Permanently Full Old Generation...
Hi,
I’m hoping someone can help me with an issue we are encountering with solr cloud..
We are seeing strange gc behaviour after running solr cloud under quite heavy insert load for a period of time. The old generation becomes full and no amount of garbage collection will free up the memory. I have attached a memory profile, as you can see it gets progressively worse as the day goes on to the point where we are always doing full garbage collections all the time. The only way I have found to resolve this issue is to reload the core, then subsequent garbage collections reclaim the used space, that’s what happened at 3pm on the Memory profile. All the nodes eventually display the same behaviour.
We have multiple threads running adding batches of up to 100 documents at a time. I have also attached our Schema and Config.
We are running 4 shards each with a single replica, have a 3 node zookeeper setup and the 8 solr boxes instances are aws High-Memory Double Extra Large with 34.2 GB Memory, 4 Virtual cores.
Thanks in advance
Annette Newton
Re: Permanently Full Old Generation...
Posted by Jack Krupansky <ja...@basetechnology.com>.
Have you done a Java heap dump to see what the most common objects are?
-- Jack Krupansky
From: Annette Newton
Sent: Wednesday, November 28, 2012 11:06 AM
To: solr-user@lucene.apache.org
Cc: Andy Kershaw
Subject: Permanently Full Old Generation...
Hi,
I’m hoping someone can help me with an issue we are encountering with solr cloud..
We are seeing strange gc behaviour after running solr cloud under quite heavy insert load for a period of time. The old generation becomes full and no amount of garbage collection will free up the memory. I have attached a memory profile, as you can see it gets progressively worse as the day goes on to the point where we are always doing full garbage collections all the time. The only way I have found to resolve this issue is to reload the core, then subsequent garbage collections reclaim the used space, that’s what happened at 3pm on the Memory profile. All the nodes eventually display the same behaviour.
We have multiple threads running adding batches of up to 100 documents at a time. I have also attached our Schema and Config.
We are running 4 shards each with a single replica, have a 3 node zookeeper setup and the 8 solr boxes instances are aws High-Memory Double Extra Large with 34.2 GB Memory, 4 Virtual cores.
Thanks in advance
Annette Newton