Posted to solr-user@lucene.apache.org by siping liu <si...@hotmail.com> on 2009/10/02 16:04:59 UTC

RE: Solr and Garbage Collection

Hi,

I read pretty much all posts on this thread (before and after this one). Looks like the main suggestion from you and others is to keep the max heap size (-Xmx) as small as possible (as long as you don't see an OOM exception). This brings more questions than answers (for me at least; I'm new to Solr).

 

First, our environment and the problem encountered: Solr 1.4 (nightly build, downloaded about 2 months ago), Sun JDK 1.6, Tomcat 5.5, running on Solaris (multi-CPU/core). The cache settings are from the default solrconfig.xml (they look very small). At first we used minimal JAVA_OPTS and quickly ran into a problem similar to the one the original poster reported -- long pauses (seconds to minutes) under load test. jconsole showed that it pauses on GC. So more JAVA_OPTS got added: "-XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:ParallelGCThreads=8 -XX:SurvivorRatio=2 -XX:NewSize=128m -XX:MaxNewSize=512m -XX:MaxGCPauseMillis=200", the thinking being that with multiple CPUs/cores we can get GC over with as quickly as possible. With the new setup, it works fine until Tomcat reaches the heap size limit, then it blocks and takes minutes on a "full GC" to get more space from the "tenured generation". We tried different Xmx values (from very small to large) with no difference in the long GC times. We never ran into OOM.
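One way to see exactly where pauses like these come from is to turn on HotSpot's GC logging next to the collector flags. This is just a sketch: the heap sizes and the log-file path below are made-up placeholders, and the CMS flags simply mirror the ones quoted above.

```shell
# Hypothetical JAVA_OPTS for Tomcat's setenv.sh / catalina.sh.
# The -Xms/-Xmx values and the gc.log path are placeholders, not recommendations.
JAVA_OPTS="-Xms512m -Xmx512m \
  -XX:+UseConcMarkSweepGC -XX:+UseParNewGC \
  -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps \
  -Xloggc:/tmp/gc.log"
export JAVA_OPTS
echo "$JAVA_OPTS"
```

With these switches, every collection (minor and full) is timestamped in the log, so long "full GC" stalls and their tenured-generation sizes can be read directly instead of being inferred from jconsole.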

 

Questions:

* In general, various caches are good for performance. We have more RAM available and want to use more caching to boost performance; isn't your suggestion (of lowering the heap limit) going against that?

* Looks like Solr's caching made its way into the tenured generation on the heap; that's good. But why does it get GC'ed eventually? I did a quick check of the Solr code (Solr 1.3, not 1.4) and see a single instance of using WeakReference. Is that what is causing all this? This seems to suggest a design flaw in Solr's memory management strategy (or just my ignorance about Solr?). I mean, wouldn't this be the "right" way of doing it -- you allow the user to specify the cache size in solrconfig.xml, then the user can set the heap limit in JAVA_OPTS accordingly, and there is no need to use WeakReference (BTW, why not SoftReference)?

* Right now I have a single Tomcat hosting Solr and other applications. I guess it's better to have Solr in its own Tomcat, given that it's tricky to adjust the Java options.
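The WeakReference vs. SoftReference question above can be made concrete with a small, Solr-independent sketch. The class name is just for illustration, and the behavior after System.gc() is JVM-dependent; treat the post-GC result as typical HotSpot behavior, not a guarantee.

```java
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

public class RefDemo {
    public static void main(String[] args) {
        // Hold a strong reference so the weak referent is definitely alive at first.
        byte[] payload = new byte[1024];
        WeakReference<byte[]> weak = new WeakReference<>(payload);
        SoftReference<byte[]> soft = new SoftReference<>(new byte[1024]);
        System.out.println("before GC: weak=" + (weak.get() != null)
                + " soft=" + (soft.get() != null));

        // Drop the strong reference. A weakly reachable object is eligible for
        // collection on the very next GC; a softly reachable one is kept until
        // the JVM is under memory pressure, so it behaves like a cache entry.
        payload = null;
        System.gc();
        System.out.println("after GC:  weak=" + (weak.get() != null)
                + " soft=" + (soft.get() != null));
    }
}
```

On HotSpot the weak reference is typically cleared by the explicit GC while the soft one survives, which is the distinction the question is getting at: a SoftReference-based cache degrades only under memory pressure rather than on every collection.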

 

thanks.


 
> From: wunder@wunderwood.org
> To: solr-user@lucene.apache.org
> Subject: RE: Solr and Garbage Collection
> Date: Fri, 25 Sep 2009 09:51:29 -0700
> 
> 30ms is not better or worse than 1s until you look at the service
> requirements. For many applications, it is worth dedicating 10% of your
> processing time to GC if that makes the worst-case pause short.
> 
> On the other hand, my experience with the IBM JVM was that the maximum query
> rate was 2-3X better with the concurrent generational GC compared to any of
> their other GC algorithms, so we got the best throughput along with the
> shortest pauses.
> 
> Solr garbage generation (for queries) seems to have two major components:
> per-request garbage and cache evictions. With a generational collector,
> these two are handled by separate parts of the collector. Per-request
> garbage should completely fit in the short-term heap (nursery), so that it
> can be collected rapidly and returned to use for further requests. If the
> nursery is too small, the per-request allocations will be made in tenured
> space and sit there until the next major GC. Cache evictions are almost
> always in long-term storage (tenured space) because an LRU algorithm
> guarantees that the garbage will be old.
> 
> Check the growth rate of tenured space (under constant load, of course)
> while increasing the size of the nursery. That rate should drop when the
> nursery gets big enough, then not drop much further as it is increased more.
> 
> After that, reduce the size of tenured space until major GCs start happening
> "too often" (a judgment call). A bigger tenured space means longer major GCs
> and thus longer pauses, so you don't want it oversized by too much.
> 
> Also check the hit rates of your caches. If the hit rate is low, say 20% or
> less, make that cache much bigger or set it to zero. Either one will reduce
> the number of cache evictions. If you have an HTTP cache in front of Solr,
> zero may be the right choice, since the HTTP cache is cherry-picking the
> easily cacheable requests.
> 
> Note that a commit nearly doubles the memory required, because you have two
> live Searcher objects with all their caches. Make sure you have headroom for
> a commit.
> 
> If you want to test the tenured space usage, you must test with real-world
> queries. They are the only way to get accurate cache eviction rates.
> 
> wunder
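Walter's point that LRU evictions are almost always old objects (which a generational collector has already promoted to tenured space) can be seen in a minimal access-ordered LRU cache. This LinkedHashMap sketch is an illustrative toy, not Solr's actual LRUCache implementation:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Minimal access-ordered LRU cache: the evicted entry is always the one
// touched least recently, i.e. the oldest live object - exactly the kind
// of object a generational collector has already tenured.
public class LruSketch<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    public LruSketch(int capacity) {
        super(16, 0.75f, true); // true = iterate in access order, not insertion order
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity; // evict once we exceed capacity
    }

    public static void main(String[] args) {
        LruSketch<String, String> cache = new LruSketch<>(2);
        cache.put("a", "1");
        cache.put("b", "2");
        cache.get("a");      // touch "a" so "b" becomes the eldest entry
        cache.put("c", "3"); // evicts "b", the least recently used entry
        System.out.println(cache.keySet()); // prints [a, c]
    }
}
```

The evicted entry "b" had been alive since the cache filled up, so in a real workload it would long since have been promoted out of the nursery; its garbage lands in tenured space, just as described above.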
 		 	   		  

RE: Solr and Garbage Collection

Posted by Fuad Efendi <fu...@efendi.ca>.
Master-Slave replication: new caches will be warmed and prepopulated _before_ the
new IndexReader is made available for _new_ requests and _before_ the old one is
discarded - it means that the theoretical sizing for FieldCache (which is defined
by the number of docs in an index and the cardinality of a field) should be
doubled... of course we need to play with GC options too for performance tuning
(mostly)


> > I read pretty much all posts on this thread (before and after this one).
> > Looks like the main suggestion from you and others is to keep the max heap
> > size (-Xmx) as small as possible (as long as you don't see an OOM exception).
> 
> 
> I suggested the absolute opposite; please note also that "as small as
> possible" does not have any meaning in the multiuser environment of Tomcat.
> It depends on query types (10 documents per request? Or maybe 10,000?), AND
> it depends on average server load (one concurrent request? Or maybe 200
> threads trying to deal with 2000 concurrent requests?), AND it depends on
> whether it is a Master (used for updates - parsing tons of docs in a single
> file?), and it depends on unpredictable memory fragmentation - it all
> depends on the use case too(!!!), in addition to schema / index size.
> 
> 
> Please note also that such stuff depends on the JVM vendor too: what if it
> precompiles everything into CPU-native code (including memory deallocation
> after each call)? Some do!
> 
> -Fuad
> http://www.linkedin.com/in/liferay
> 
> 
> ...but 'core' constantly disagrees with me :)



RE: Solr and Garbage Collection

Posted by Fuad Efendi <fu...@efendi.ca>.
> I read pretty much all posts on this thread (before and after this one).
> Looks like the main suggestion from you and others is to keep the max heap
> size (-Xmx) as small as possible (as long as you don't see an OOM exception).


I suggested the absolute opposite; please note also that "as small as
possible" does not have any meaning in the multiuser environment of Tomcat.
It depends on query types (10 documents per request? Or maybe 10,000?), AND
it depends on average server load (one concurrent request? Or maybe 200
threads trying to deal with 2000 concurrent requests?), AND it depends on
whether it is a Master (used for updates - parsing tons of docs in a single
file?), and it depends on unpredictable memory fragmentation - it all
depends on the use case too(!!!), in addition to schema / index size.


Please note also that such stuff depends on the JVM vendor too: what if it
precompiles everything into CPU-native code (including memory deallocation
after each call)? Some do!

-Fuad
http://www.linkedin.com/in/liferay


...but 'core' constantly disagrees with me :)





Re: Solr and Garbage Collection

Posted by Mark Miller <ma...@gmail.com>.
Just went back and looked - it's even worse than that - it wasn't just
slated for OpenJDK; it was already in OpenJDK at the time, released as GPL :)

Granted, Sun's mistake was worse than mine (they used "permitted"), but
the post that seemed to cause Sun to "reword" their release was all
worried that, because Oracle bought Sun, they were now going to screw
OpenJDK and charge for the collector that was supposedly going to
replace all collectors - but they just didn't do any research - it had
long been in OpenJDK 7 at that time, fully GPL.

Mark Miller wrote:
> Actually, now as I am remembering, I think the main giveaway, as
> someone mentioned back when Slashdot had that misleading post, was that
> it was slated for OpenJDK - which is open source :) Typical Slashdot though.
>
> Mark Miller wrote:
>> Yup - I know - I remember the Slashdot discussion on it well - I didn't
>> mean it that way myself. It caused quite a stir, but most people figured
>> out what they meant before they released any further info, from what I
>> could tell. I just made the same mistake they did :)
>>
>> Bill Au wrote:
>>> SUN's initial release notes actually pretty much said that it was
>>> "unsupported unless you pay". They have since revised the release notes
>>> to clear up the confusion.
>>> Bill
>>>
>>> On Sat, Oct 3, 2009 at 2:51 PM, Mark Miller <ma...@gmail.com> wrote:
>>>> Ah, yes - thanks for the clarification. Didn't pay attention to how
>>>> ambiguously I was using "supported" there :)
>>>>
>>>> Bill Au wrote:
>>>>> SUN has recently clarified the issue regarding "unsupported unless
>>>>> you pay" for the G1 garbage collector. Here are the updated release
>>>>> notes for Java 6 update 14:
>>>>> http://java.sun.com/javase/6/webnotes/6u14.html
>>>>>
>>>>> G1 will be part of Java 7, fully supported without pay. The version
>>>>> included in Java 6 update 14 is a beta release. Since it is beta, SUN
>>>>> does not recommend using it unless you have a support contract,
>>>>> because as with any beta software there will be bugs. Non-paying
>>>>> customers may very well have to wait for the official version in
>>>>> Java 7 for bug fixes.
>>>>>
>>>>> Here is more info on the G1 garbage collector:
>>>>> http://java.sun.com/javase/technologies/hotspot/gc/g1_intro.jsp
>>>>>
>>>>> Bill
>>>>>
>>>>> On Sat, Oct 3, 2009 at 1:28 PM, Mark Miller <ma...@gmail.com> wrote:
>>>>>> Another option of course, if you're using a recent version of Java 6:
>>>>>>
>>>>>> try out the beta-ish, unsupported-unless-you-pay G1 garbage
>>>>>> collector. I've only recently started playing with it, but it's
>>>>>> supposed to be much better than CMS. It supposedly has much better
>>>>>> throughput, it's much better at dealing with fragmentation issues
>>>>>> (CMS is actually pretty bad with fragmentation, come to find out),
>>>>>> and overall it's just supposed to be a very nice leap ahead in GC.
>>>>>> Haven't had a chance to play with it much myself, but it's supposed
>>>>>> to be fantastic. A whole new approach to generational collection
>>>>>> for Sun, and much closer to the "real time" GCs available from some
>>>>>> other vendors.
>>>>>>
>>>>>> Mark Miller wrote:
>>>>>>> siping liu wrote:
>>>>>>>> [siping liu's original message quoted here - snipped]
>>>>>>>
>>>>>>> MaxGCPauseMillis doesn't work with UseConcMarkSweepGC - it's for
>>>>>>> use with the Parallel collector. That also doesn't look like a
>>>>>>> good survivor ratio.
>>>>>>>
>>>>>>> Leaving RAM for the filesystem cache is also very important. But
>>>>>>> you should also have enough RAM for your Solr caches, of course.
>>>>>>>
>>>>>>> Do you see concurrent mode failure when looking at your GC logs? E.g.:
>>>>>>>
>>>>>>> 174.445: [GC 174.446: [ParNew: 66408K->66408K(66416K), 0.0000618
>>>>>>> secs]174.446: [CMS (concurrent mode failure):
>>>>>>> 161928K->162118K(175104K), 4.0975124 secs] 228336K->162118K(241520K)
>>>>>>>
>>>>>>> That means you are still getting major collections with CMS, and
>>>>>>> you don't want that. You might try kicking GC off earlier with
>>>>>>> something like: -XX:CMSInitiatingOccupancyFraction=50
>>>>>>>
>>>>>>> [siping liu's remaining questions and Walter Underwood's reply,
>>>>>>> quoted here in full - snipped]
>>>>>>>
>>>>>>> --
>>>>>>> - Mark
>>>>>>>
>>>>>>> http://www.lucidimagination.com


-- 
- Mark

http://www.lucidimagination.com




Re: Solr and Garbage Collection

Posted by Mark Miller <ma...@gmail.com>.
Actually, now as I am remembering, I think the main giveaway, as
someone mentioned back when Slashdot had that misleading post, was that
it was slated for OpenJDK - which is open source :) Typical Slashdot though.

Mark Miller wrote:
> Yup - I know - I remember the Slashdot discussion on it well - I didn't
> mean it that way myself. It caused quite a stir, but most people figured
> out what they meant before they released any further info, from what I
> could tell. I just made the same mistake they did :)
>
> [remainder of the quoted thread - snipped]
>>>>>>>>               
>>>>>>>>                 
>>> collector,
>>>     
>>>       
>>>>>>>> these two are handled by separate parts of the collector. Per-request
>>>>>>>> garbage should completely fit in the short-term heap (nursery), so
>>>>>>>>               
>>>>>>>>                 
>>> that
>>>     
>>>       
>>>>> it
>>>>>
>>>>>         
>>>>>           
>>>>>>>> can be collected rapidly and returned to use for further requests. If
>>>>>>>>
>>>>>>>>               
>>>>>>>>                 
>>>>> the
>>>>>
>>>>>         
>>>>>           
>>>>>>>> nursery is too small, the per-request allocations will be made in
>>>>>>>>
>>>>>>>>               
>>>>>>>>                 
>>>>> tenured
>>>>>
>>>>>         
>>>>>           
>>>>>>>> space and sit there until the next major GC. Cache evictions are
>>>>>>>>               
>>>>>>>>                 
>>> almost
>>>     
>>>       
>>>>>>>> always in long-term storage (tenured space) because an LRU algorithm
>>>>>>>> guarantees that the garbage will be old.
>>>>>>>>
>>>>>>>> Check the growth rate of tenured space (under constant load, of
>>>>>>>>               
>>>>>>>>                 
>>> course)
>>>     
>>>       
>>>>>>>> while increasing the size of the nursery. That rate should drop when
>>>>>>>>
>>>>>>>>               
>>>>>>>>                 
>>>>> the
>>>>>
>>>>>         
>>>>>           
>>>>>>>> nursery gets big enough, then not drop much further as it is
>>>>>>>>               
>>>>>>>>                 
>>> increased
>>>     
>>>       
>>>>> more.
>>>>>
>>>>>         
>>>>>           
>>>>>>>> After that, reduce the size of tenured space until major GCs start
>>>>>>>>
>>>>>>>>               
>>>>>>>>                 
>>>>> happening
>>>>>
>>>>>         
>>>>>           
>>>>>>>> "too often" (a judgment call). A bigger tenured space means longer
>>>>>>>>
>>>>>>>>               
>>>>>>>>                 
>>>>> major GCs
>>>>>
>>>>>         
>>>>>           
>>>>>>>> and thus longer pauses, so you don't want it oversized by too much.
>>>>>>>>
>>>>>>>> Also check the hit rates of your caches. If the hit rate is low, say
>>>>>>>>
>>>>>>>>               
>>>>>>>>                 
>>>>> 20% or
>>>>>
>>>>>         
>>>>>           
>>>>>>>> less, make that cache much bigger or set it to zero. Either one will
>>>>>>>>
>>>>>>>>               
>>>>>>>>                 
>>>>> reduce
>>>>>
>>>>>         
>>>>>           
>>>>>>>> the number of cache evictions. If you have an HTTP cache in front of
>>>>>>>>
>>>>>>>>               
>>>>>>>>                 
>>>>> Solr,
>>>>>
>>>>>         
>>>>>           
>>>>>>>> zero may be the right choice, since the HTTP cache is cherry-picking
>>>>>>>>
>>>>>>>>               
>>>>>>>>                 
>>>>> the
>>>>>
>>>>>         
>>>>>           
>>>>>>>> easily cacheable requests.
>>>>>>>>
>>>>>>>> Note that a commit nearly doubles the memory required, because you
>>>>>>>>               
>>>>>>>>                 
>>> have
>>>     
>>>       
>>>>> two
>>>>>
>>>>>         
>>>>>           
>>>>>>>> live Searcher objects with all their caches. Make sure you have
>>>>>>>>
>>>>>>>>               
>>>>>>>>                 
>>>>> headroom for
>>>>>
>>>>>         
>>>>>           
>>>>>>>> a commit.
>>>>>>>>
>>>>>>>> If you want to test the tenured space usage, you must test with real
>>>>>>>>
>>>>>>>>               
>>>>>>>>                 
>>>>> world
>>>>>
>>>>>         
>>>>>           
>>>>>>>> queries. Those are the only way to get accurate cache eviction rates.
>>>>>>>>
>>>>>>>> wunder
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>               
>>>>>>>>                 
>>>>>>> _________________________________________________________________
>>>>>>> Bing™  brings you maps, menus, and reviews organized in one place.
>>>>>>>             
>>>>>>>               
>>> Try
>>>     
>>>       
>>>>> it now.
>>>>>
>>>>>
>>>>>         
>>>>>           
>>> http://www.bing.com/search?q=restaurants&form=MLOGEN&publ=WLHMTAG&crea=TEXT_MLOGEN_Core_tagline_local_1x1
>>>     
>>>       
>>>>>>           
>>>>>>             
>>>>> --
>>>>> - Mark
>>>>>
>>>>> http://www.lucidimagination.com
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>         
>>>>>           
>>>>       
>>>>         
>>> --
>>> - Mark
>>>
>>> http://www.lucidimagination.com
>>>
>>>
>>>
>>>
>>>     
>>>       
>>   
>>     
>
>
>   
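Mark's advice above, to look for "concurrent mode failure" in the GC logs, can be turned into a quick scan. The sketch below is illustrative only (the class and method names are made up, not Solr code); it matches lines in the HotSpot CMS log format quoted in the thread:

```java
import java.util.List;

// Count "concurrent mode failure" events in CMS GC log lines. Each such
// event means the concurrent collector could not finish before the tenured
// generation filled up, forcing a stop-the-world full collection -- the
// multi-second pauses described in this thread.
public class GcLogCheck {
    public static long concurrentModeFailures(List<String> logLines) {
        return logLines.stream()
                .filter(line -> line.contains("concurrent mode failure"))
                .count();
    }

    public static void main(String[] args) {
        // The sample log line quoted earlier in the thread:
        String sample = "174.445: [GC 174.446: [ParNew: 66408K->66408K(66416K), "
                + "0.0000618 secs]174.446: [CMS (concurrent mode failure): "
                + "161928K->162118K(175104K), 4.0975124 secs] "
                + "228336K->162118K(241520K)";
        System.out.println(concurrentModeFailures(List.of(sample))); // prints 1
    }
}
```

A count that keeps growing under steady load means CMS is starting too late; lowering -XX:CMSInitiatingOccupancyFraction, as suggested above, starts the concurrent cycle earlier.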


-- 
- Mark

http://www.lucidimagination.com
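On the WeakReference vs. SoftReference question raised above: a cache whose values are held through SoftReference is the usual answer when you want entries to survive ordinary GC cycles but still be reclaimable before an OutOfMemoryError, since the JVM clears soft references only when the heap is nearly exhausted. The sketch below is only an illustration of that trade-off (the class is hypothetical, not Solr's cache implementation):

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;

// Sketch of a memory-sensitive cache using SoftReference values. Softly
// referenced values are cleared only under memory pressure, whereas weakly
// referenced values may be reclaimed at any collection.
public class SoftCache<K, V> {
    private final Map<K, SoftReference<V>> map = new HashMap<>();

    public void put(K key, V value) {
        map.put(key, new SoftReference<>(value));
    }

    public V get(K key) {
        SoftReference<V> ref = map.get(key);
        if (ref == null) {
            return null;           // never cached
        }
        V value = ref.get();
        if (value == null) {
            map.remove(key);       // referent reclaimed under memory pressure
        }
        return value;
    }
}
```

The trade-off is predictability: softly referenced entries can still disappear under memory pressure, so a fixed-size cache configured in solrconfig.xml, with the heap sized to match in JAVA_OPTS, remains the more predictable design the poster describes.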




Re: Solr and Garbage Collection

Posted by Mark Miller <ma...@gmail.com>.
Yup - I know - I remember the Slashdot discussion on it well - I didn't
mean it that way myself. It caused quite a stir, but most people figured
out what they meant before they released any further info from what I
could tell. I just made the same mistake they did :)

Bill Au wrote:
> SUN's initial release notes actually pretty much said that it was
> > "unsupported unless you pay".  They have since revised the release notes to
> clear up the confusion.
> Bill
>
> On Sat, Oct 3, 2009 at 2:51 PM, Mark Miller <ma...@gmail.com> wrote:
>
>   
>> Ah, yes - thanks for the clarification. Didn't pay attention to how
>> ambiguously I was using "supported" there :)
>>


-- 
- Mark

http://www.lucidimagination.com




Re: Solr and Garbage Collection

Posted by Bill Au <bi...@gmail.com>.
SUN's initial release notes actually pretty much said that it was
"unsupported unless you pay".  They have since revised the release notes to
clear up the confusion.
Bill

On Sat, Oct 3, 2009 at 2:51 PM, Mark Miller <ma...@gmail.com> wrote:

> Ah, yes - thanks for the clarification. Didn't pay attention to how
> ambiguously I was using "supported" there :)
>

Re: Solr and Garbage Collection

Posted by Mark Miller <ma...@gmail.com>.
Ah, yes - thanks for the clarification. Didn't pay attention to how
ambiguously I was using "supported" there :)

Bill Au wrote:
> SUN has recently clarified the issue regarding "unsupported unless you pay"
> for the G1 garbage collector. Here is the updated release of Java 6 update
> 14:
> http://java.sun.com/javase/6/webnotes/6u14.html
>
>
> G1 will be part of Java 7, fully supported without pay.  The version
> included in Java 6 update 14 is a beta release.  Since it is beta, SUN does
> not recommend using it unless you have a support contract because as with
> any beta software there will be bugs.  Non paying customers may very well
> have to wait for the official version in Java 7 for bug fixes.
>
> Here is more info on the G1 garbage collector:
>
> http://java.sun.com/javase/technologies/hotspot/gc/g1_intro.jsp
>
>
> Bill
>
> On Sat, Oct 3, 2009 at 1:28 PM, Mark Miller <ma...@gmail.com> wrote:
>
>   
>> Another option of course, if you're using a recent version of Java 6:
>>
>> try out the beta-ish, unsupported unless you pay, G1 garbage collector.
>> I've only recently started playing with it, but its supposed to be much
>> better than CMS. Its supposedly got much better throughput, its much
>> better at dealing with fragmentation issues (CMS is actually pretty bad
>> with fragmentation come to find out), and overall its just supposed to
>> be a very nice leap ahead in GC. Haven't had a chance to play with it
>> much myself, but its supposed to be fantastic. A whole new approach to
>> generational collection for Sun, and much closer to the "real time" GC's
>> available from some other vendors.
>>
>>>>>           
>> the
>>     
>>>>> easily cacheable requests.
>>>>>
>>>>> Note that a commit nearly doubles the memory required, because you have
>>>>>           
>> two
>>     
>>>>> live Searcher objects with all their caches. Make sure you have
>>>>>           
>> headroom for
>>     
>>>>> a commit.
>>>>>
>>>>> If you want to test the tenured space usage, you must test with real
>>>>>           
>> world
>>     
>>>>> queries. Those are the only way to get accurate cache eviction rates.
>>>>>
>>>>> wunder
>>>>>
>>>>>
>>>>>           
>>>> _________________________________________________________________
>>>> Bing™  brings you maps, menus, and reviews organized in one place.   Try
>>>>         
>> it now.
>>     
>> http://www.bing.com/search?q=restaurants&form=MLOGEN&publ=WLHMTAG&crea=TEXT_MLOGEN_Core_tagline_local_1x1
>>     
>>>>         
>>>
>>>       
>> --
>> - Mark
>>
>> http://www.lucidimagination.com
>>
>>
>>
>>
>>
>>     
>
>   


-- 
- Mark

http://www.lucidimagination.com




Re: Solr and Garbage Collection

Posted by Bill Au <bi...@gmail.com>.
Sun has recently clarified the issue regarding "unsupported unless you pay"
for the G1 garbage collector. Here are the updated release notes for Java 6
update 14:
http://java.sun.com/javase/6/webnotes/6u14.html


G1 will be part of Java 7, fully supported at no extra cost.  The version
included in Java 6 update 14 is a beta release.  Since it is beta, Sun does
not recommend using it unless you have a support contract, because as with
any beta software there will be bugs.  Non-paying customers may very well
have to wait for the official version in Java 7 for bug fixes.
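
If you do want to experiment with the beta on 6u14 anyway, it is gated behind
the experimental-options switch, so both flags are needed. A sketch (the heap
size and pause goal are placeholder values to adjust for your own setup, not
recommendations):

```shell
# G1 in Java 6u14 is experimental: it must be unlocked before it can be enabled.
JAVA_OPTS="-Xms1024m -Xmx1024m \
  -XX:+UnlockExperimentalVMOptions -XX:+UseG1GC \
  -XX:MaxGCPauseMillis=200 \
  -verbose:gc -Xloggc:gc-g1.log"
export JAVA_OPTS
echo "$JAVA_OPTS"
```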

Here is more info on the G1 garbage collector:

http://java.sun.com/javase/technologies/hotspot/gc/g1_intro.jsp


Bill

On Sat, Oct 3, 2009 at 1:28 PM, Mark Miller <ma...@gmail.com> wrote:

> Another option of course, if you're using a recent version of Java 6:
>
> try out the beta-ish, unsupported unless you pay, G1 garbage collector.
> I've only recently started playing with it, but its supposed to be much
> better than CMS. Its supposedly got much better throughput, its much
> better at dealing with fragmentation issues (CMS is actually pretty bad
> with fragmentation come to find out), and overall its just supposed to
> be a very nice leap ahead in GC. Havn't had a chance to play with it
> much myself, but its supposed to be fantastic. A whole new approach to
> generational collection for Sun, and much closer to the "real time" GC's
> available from some other vendors.
>
> --
> - Mark
>
> http://www.lucidimagination.com
>
>
>
>
>

Re: Solr and Garbage Collection

Posted by Mark Miller <ma...@gmail.com>.
Another option of course, if you're using a recent version of Java 6:

try out the beta-ish, unsupported-unless-you-pay G1 garbage collector.
I've only recently started playing with it, but it's supposed to be much
better than CMS. It's supposedly got much better throughput, and it's much
better at dealing with fragmentation issues (CMS is actually pretty bad
with fragmentation, come to find out); overall it's just supposed to
be a very nice leap ahead in GC. Haven't had a chance to play with it
much myself, but it's supposed to be fantastic. A whole new approach to
generational collection for Sun, and much closer to the "real time" GCs
available from some other vendors.



-- 
- Mark

http://www.lucidimagination.com





Re: Solr and Garbage Collection

Posted by Mark Miller <ma...@gmail.com>.
siping liu wrote:
> Hi,
>
> I read pretty much all posts on this thread (before and after this one). Looks like the main suggestion from you and others is to keep max heap size (-Xmx) as small as possible (as long as you don't see OOM exception). This brings more questions than answers (for me at least. I'm new to Solr).
>
>  
>
> First, our environment and problem encountered: Solr1.4 (nightly build, downloaded about 2 months ago), Sun JDK1.6, Tomcat 5.5, running on Solaris (multi-CPU/cores). The cache settings are from the default solrconfig.xml (look very small). At first we used minimum JAVA_OPTS and quickly ran into a problem similar to the one the original poster reported -- long pauses (seconds to minutes) under load test. jconsole showed that it pauses on GC. So more JAVA_OPTS got added: "-XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:ParallelGCThreads=8 -XX:SurvivorRatio=2 -XX:NewSize=128m -XX:MaxNewSize=512m -XX:MaxGCPauseMillis=200"; the thinking is that with multiple CPUs/cores we can get GC over with as quickly as possible. With the new setup, it works fine until Tomcat reaches heap size, then it blocks and takes minutes on "full GC" to get more space from the tenured generation. We tried different Xmx (from very small to large); no difference in long GC time. We never ran into OOM.
>   
MaxGCPauseMillis doesn't work with UseConcMarkSweepGC - it's for use with
the Parallel collector. That also doesn't look like a good SurvivorRatio.
>  
>
> Questions:
>
> * In general various caches are good for performance; we have more RAM to use and want to use more caching to boost performance. Isn't your suggestion (of lowering the heap limit) going against that?
>   
Leaving RAM for the FileSystem cache is also very important. But you
should also have enough RAM for your Solr caches of course.
> * Looks like Solr caching made its way into tenure-generation on heap, that's good. But why they get GC'ed eventually?? I did a quick check of Solr code (Solr 1.3, not 1.4), and see a single instance of using WeakReference. Is that what is causing all this? This seems to suggest a design flaw in Solr's memory management strategy (or just my ignorance about Solr?). I mean, wouldn't this be the "right" way of doing it -- you allow user to specify the cache size in solrconfig.xml, then user can set up heap limit in JAVA_OPTS accordingly, and no need to use WeakReference (BTW, why not SoftReference)??
>   
Do you see concurrent mode failure when looking at your gc logs? ie:

174.445: [GC 174.446: [ParNew: 66408K->66408K(66416K), 0.0000618
secs]174.446: [CMS (concurrent mode failure): 161928K->162118K(175104K),
4.0975124 secs] 228336K->162118K(241520K)

That means you are still getting major collections with CMS, and you
don't want that. You might try kicking GC off earlier with something
like: -XX:CMSInitiatingOccupancyFraction=50
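
In catalina.sh terms that might look something like this (a sketch only --
the heap and new-generation sizes are placeholders, not recommendations, and
MaxGCPauseMillis is dropped since CMS ignores it):

```shell
# CMS tuning sketch for a Solr Tomcat. UseCMSInitiatingOccupancyOnly keeps the
# JVM from second-guessing the occupancy fraction we set explicitly.
JAVA_OPTS="-Xms1024m -Xmx1024m \
  -XX:NewSize=256m -XX:MaxNewSize=256m \
  -XX:+UseConcMarkSweepGC -XX:+UseParNewGC \
  -XX:CMSInitiatingOccupancyFraction=50 \
  -XX:+UseCMSInitiatingOccupancyOnly \
  -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:gc.log"
export JAVA_OPTS
echo "$JAVA_OPTS"
```

With -Xloggc in place, grepping gc.log for "concurrent mode failure" after a
load test is a quick way to confirm whether the earlier kickoff helped.
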
> * Right now I have a single Tomcat hosting Solr and other applications. I guess now it's better to have Solr on its own Tomcat, given that it's tricky to adjust the java options.
>
>  
>
> thanks.
>
>
>  
>   
>> From: wunder@wunderwood.org
>> To: solr-user@lucene.apache.org
>> Subject: RE: Solr and Garbage Collection
>> Date: Fri, 25 Sep 2009 09:51:29 -0700
>>
>> 30ms is not better or worse than 1s until you look at the service
>> requirements. For many applications, it is worth dedicating 10% of your
>> processing time to GC if that makes the worst-case pause short.
>>
>> On the other hand, my experience with the IBM JVM was that the maximum query
>> rate was 2-3X better with the concurrent generational GC compared to any of
>> their other GC algorithms, so we got the best throughput along with the
>> shortest pauses.
>>
>> Solr garbage generation (for queries) seems to have two major components:
>> per-request garbage and cache evictions. With a generational collector,
>> these two are handled by separate parts of the collector. Per-request
>> garbage should completely fit in the short-term heap (nursery), so that it
>> can be collected rapidly and returned to use for further requests. If the
>> nursery is too small, the per-request allocations will be made in tenured
>> space and sit there until the next major GC. Cache evictions are almost
>> always in long-term storage (tenured space) because an LRU algorithm
>> guarantees that the garbage will be old.
>>
>> Check the growth rate of tenured space (under constant load, of course)
>> while increasing the size of the nursery. That rate should drop when the
>> nursery gets big enough, then not drop much further as it is increased more.
>>
>> After that, reduce the size of tenured space until major GCs start happening
>> "too often" (a judgment call). A bigger tenured space means longer major GCs
>> and thus longer pauses, so you don't want it oversized by too much.
>>
>> Also check the hit rates of your caches. If the hit rate is low, say 20% or
>> less, make that cache much bigger or set it to zero. Either one will reduce
>> the number of cache evictions. If you have an HTTP cache in front of Solr,
>> zero may be the right choice, since the HTTP cache is cherry-picking the
>> easily cacheable requests.
>>
>> Note that a commit nearly doubles the memory required, because you have two
>> live Searcher objects with all their caches. Make sure you have headroom for
>> a commit.
>>
>> If you want to test the tenured space usage, you must test with real world
>> queries. Those are the only way to get accurate cache eviction rates.
>>
>> wunder
>>     
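
To put a number on the hit rates Walter mentions: hit rate is just
hits/lookups from the per-cache counters on Solr's admin stats page. A quick
back-of-the-envelope check (the counters below are made-up examples, not real
stats):

```shell
# Hypothetical queryResultCache counters as read off the admin stats page.
lookups=10000
hits=1800
# Hit rate as a percentage; at or below ~20%, per the advice above, the cache
# is mostly churning and should be grown substantially or disabled.
rate=$(awk -v h="$hits" -v l="$lookups" 'BEGIN { printf "%.0f", 100 * h / l }')
echo "hit rate: ${rate}%"
```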


-- 
- Mark

http://www.lucidimagination.com