Posted to solr-user@lucene.apache.org by Martin Grotzke <ma...@googlemail.com> on 2011/01/25 11:54:55 UTC

Recommendation on RAM-/Cache configuration

Hi,

recently we've been experiencing OOMEs (GC overhead limit exceeded) in our
searches. Therefore I want to get some clarification on heap and cache
configuration.

This is the situation:
- Solr 1.4.1 running on tomcat 6, Sun JVM 1.6.0_13 64bit
- JVM Heap Params: -Xmx8G -XX:MaxPermSize=256m -XX:NewSize=2G
-XX:MaxNewSize=2G -XX:SurvivorRatio=6 -XX:+UseParallelOldGC
-XX:+UseParallelGC
- The machine has 32 GB RAM
- Currently there are 4 processors/cores in the machine; this will be
changed to 2 cores in the future.
- The index size in the filesystem is ~9.5 GB
- The index contains ~ 5.500.000 documents
- 1.500.000 of those docs are available for searches/queries, the rest are
inactive docs that are excluded from searches (via a flag/field), but
they're still stored in the index as they need to be available by id (solr is the
main document store in this app)
- Caches are configured with a big size (the idea was to prevent filesystem
access / disk i/o as much as possible; see the solrconfig.xml sketch after
this list):
  - filterCache (solr.LRUCache): size=200000, initialSize=30000,
autowarmCount=1000, actual size =~ 60.000, hitratio =~ 0.99
  - documentCache (solr.LRUCache): size=200000, initialSize=100000,
autowarmCount=0, actual size =~ 160.000 - 190.000, hitratio =~ 0.74
  - queryResultCache (solr.LRUCache): size=200000, initialSize=30000,
autowarmCount=10000, actual size =~ 10.000 - 60.000, hitratio =~ 0.71
- Searches are performed against a catchall text field using the standard
request handler; all fields are fetched (no fl specified)
- Normally ~ 5 concurrent requests, peaks up to 30 or 40 (mostly during GC)
- Recently we also added a feature that adds weighted search for special
fields, so that the query might become something like this
  q=(some query) OR name_weighted:(some query)^2.0 OR brand_weighted:(some
query)^4.0 OR longDescription_weighted:(some query)^0.5
  (it seemed as if this was the cause of the OOMEs, but IMHO it only
increased RAM usage to the point where GC could no longer free enough memory)
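
For reference, the cache settings above map to solrconfig.xml sections
roughly like this (our current values):

  <filterCache class="solr.LRUCache" size="200000"
               initialSize="30000" autowarmCount="1000"/>
  <documentCache class="solr.LRUCache" size="200000"
                 initialSize="100000" autowarmCount="0"/>
  <queryResultCache class="solr.LRUCache" size="200000"
                    initialSize="30000" autowarmCount="10000"/>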

The OOMEs that we get are of type "GC overhead limit exceeded"; one of them
was thrown during auto-warming.

I checked two different heapdumps, the first one autogenerated
(by -XX:+HeapDumpOnOutOfMemoryError), the second one generated manually via
jmap.
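(For reference, the manual dump was taken with a command along the lines of

  jmap -dump:format=b,file=heap.hprof <pid>

where <pid> is the tomcat process id.)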
These show the following distribution of used memory - the autogenerated
dump:
- documentCache: 56% (size ~ 195.000)
- filterCache: 15% (size ~ 60.000)
- queryResultCache: 8% (size ~ 61.000)
- fieldCache: 6% (fieldCache referenced by the WebappClassLoader)
- SolrIndexSearcher: 2%

The manually generated dump:
- documentCache: 48% (size ~ 195.000)
- filterCache: 20% (size ~ 60.000)
- fieldCache: 11% (fieldCache referenced by the WebappClassLoader)
- queryResultCache: 7% (size ~ 61.000)
- fieldValueCache: 3%

We are also running two search engines with a 17GB heap; these don't run into
OOMEs. However, with the bigger heap sizes the longest requests take even
longer due to longer stop-the-world gc cycles.
Therefore my goal is to run with a smaller heap; IMHO even smaller than 8GB
would be good to reduce the time needed for full gc.

So what's the right path to follow now? What would you recommend changing
in the configuration (solr/jvm)?

Would you say it is ok to reduce the cache sizes? Would this increase disk
i/o, or would the index be held in the OS's disk cache?

Do you have other recommendations / questions?

Thanx && cheers,
Martin

Re: Recommendation on RAM-/Cache configuration

Posted by Martin Grotzke <ma...@googlemail.com>.
On Tue, Jan 25, 2011 at 2:06 PM, Markus Jelsma
<ma...@openindex.io> wrote:

> On Tuesday 25 January 2011 11:54:55 Martin Grotzke wrote:
> > Hi,
> >
> > recently we've been experiencing OOMEs (GC overhead limit exceeded) in our
> > searches. Therefore I want to get some clarification on heap and cache
> > configuration.
> >
> > This is the situation:
> > - Solr 1.4.1 running on tomcat 6, Sun JVM 1.6.0_13 64bit
> > - JVM Heap Params: -Xmx8G -XX:MaxPermSize=256m -XX:NewSize=2G
> > -XX:MaxNewSize=2G -XX:SurvivorRatio=6 -XX:+UseParallelOldGC
> > -XX:+UseParallelGC
>
> Consider switching to the HotSpot server VM; use -server as the first switch.

The jvm options I mentioned were not the complete list; we're running the
jvm with -server (of course).


>
> > - The machine has 32 GB RAM
> > - Currently there are 4 processors/cores in the machine; this will be
> > changed to 2 cores in the future.
> > - The index size in the filesystem is ~9.5 GB
> > - The index contains ~ 5.500.000 documents
> > - 1.500.000 of those docs are available for searches/queries, the rest
> are
> > inactive docs that are excluded from searches (via a flag/field), but
> > they're still stored in the index as they need to be available by id (solr is
> > the main document store in this app)
>
> How do you exclude them? This should be done with filter queries.

The docs are indexed with a field "findable" on which we do a filter query.
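
So a typical request looks something like this (the exact value being
whatever the flag holds):

  q=(some query)&fq=findable:true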


> I also remember (but I
> just cannot find it again, so please correct me if I'm wrong) that in 1.4.x
> sorting is done before filtering. It should be an improvement if filtering
> is done before sorting.
>
Hmm, I cannot imagine a case where it makes sense to sort before filtering.
I can't believe that solr does it like this.
Can anyone shed some light on this?


> If you use sorting, it takes up a huge amount of RAM if filtering is not
> done
> first.
>
> > - Caches are configured with a big size (the idea was to prevent
> filesystem
> > access / disk i/o as much as possible):
>
> There is only disk I/O if the kernel can't keep the index (or parts of it) in its
> page cache.
>
Yes, I'll keep an eye on disk I/O.
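
(E.g. on Linux with something like:

  iostat -x 5

to watch utilization and await times.)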



> >   - filterCache (solr.LRUCache): size=200000, initialSize=30000,
> > autowarmCount=1000, actual size =~ 60.000, hitratio =~ 0.99
> >   - documentCache (solr.LRUCache): size=200000, initialSize=100000,
> > autowarmCount=0, actual size =~ 160.000 - 190.000, hitratio =~ 0.74
> >   - queryResultCache (solr.LRUCache): size=200000, initialSize=30000,
> > autowarmCount=10000, actual size =~ 10.000 - 60.000, hitratio =~ 0.71
>
> You should decrease the initialSize values. But your hit ratios seem very
> good.
>
Does the initialSize have a real impact? According to
http://wiki.apache.org/solr/SolrCaching#initialSize it's the initial size of
the HashMap backing the cache.
What would you say are reasonable values for size/initialSize/autowarmCount?
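
From the wiki description, initialSize should only set the initial capacity
of the backing map, i.e. it affects how often the map rehashes while growing,
not how much memory the cache retains. Conceptually (a sketch, not the
actual Solr code):

  // access-ordered LinkedHashMap: capacity = initialSize, LRU eviction order
  Map<Object,Object> map =
      new LinkedHashMap<Object,Object>(initialSize, 0.75f, true);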

Cheers,
Martin

Re: Recommendation on RAM-/Cache configuration

Posted by Markus Jelsma <ma...@openindex.io>.
On Tuesday 25 January 2011 11:54:55 Martin Grotzke wrote:
> Hi,
> 
> recently we've been experiencing OOMEs (GC overhead limit exceeded) in our
> searches. Therefore I want to get some clarification on heap and cache
> configuration.
> 
> This is the situation:
> - Solr 1.4.1 running on tomcat 6, Sun JVM 1.6.0_13 64bit
> - JVM Heap Params: -Xmx8G -XX:MaxPermSize=256m -XX:NewSize=2G
> -XX:MaxNewSize=2G -XX:SurvivorRatio=6 -XX:+UseParallelOldGC
> -XX:+UseParallelGC

Consider switching to the HotSpot server VM; use -server as the first switch.

> - The machine has 32 GB RAM
> - Currently there are 4 processors/cores in the machine; this will be
> changed to 2 cores in the future.
> - The index size in the filesystem is ~9.5 GB
> - The index contains ~ 5.500.000 documents
> - 1.500.000 of those docs are available for searches/queries, the rest are
> inactive docs that are excluded from searches (via a flag/field), but
> they're still stored in the index as they need to be available by id (solr is
> the main document store in this app)

How do you exclude them? This should be done with filter queries. I also
remember (but I just cannot find it again, so please correct me if I'm wrong)
that in 1.4.x sorting is done before filtering. It should be an improvement
if filtering is done before sorting.
If you use sorting, it takes up a huge amount of RAM if filtering is not done 
first.

> - Caches are configured with a big size (the idea was to prevent filesystem
> access / disk i/o as much as possible):

There is only disk I/O if the kernel can't keep the index (or parts of it) in its
page cache.

>   - filterCache (solr.LRUCache): size=200000, initialSize=30000,
> autowarmCount=1000, actual size =~ 60.000, hitratio =~ 0.99
>   - documentCache (solr.LRUCache): size=200000, initialSize=100000,
> autowarmCount=0, actual size =~ 160.000 - 190.000, hitratio =~ 0.74
>   - queryResultCache (solr.LRUCache): size=200000, initialSize=30000,
> autowarmCount=10000, actual size =~ 10.000 - 60.000, hitratio =~ 0.71

You should decrease the initialSize values. But your hit ratios seem very
good.
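
E.g. something like this (values only illustrative):

  <filterCache class="solr.LRUCache" size="200000"
               initialSize="1024" autowarmCount="1000"/>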

> - Searches are performed against a catchall text field using the standard
> request handler; all fields are fetched (no fl specified)
> - Normally ~ 5 concurrent requests, peaks up to 30 or 40 (mostly during GC)
> - Recently we also added a feature that adds weighted search for special
> fields, so that the query might become something like this
>   q=(some query) OR name_weighted:(some query)^2.0 OR brand_weighted:(some
> query)^4.0 OR longDescription_weighted:(some query)^0.5
>   (it seemed as if this was the cause of the OOMEs, but IMHO it only
> increased RAM usage to the point where GC could no longer free enough memory)
> 
> The OOMEs that we get are of type "GC overhead limit exceeded"; one of
> them was thrown during auto-warming.

Warming takes additional RAM: the current searcher still has its caches full
while the new searcher's caches are being filled up. Decreasing the sizes
might help.
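
If the overlap during warming is the main spike, you could also cap the
number of searchers that may warm concurrently in solrconfig.xml, e.g.:

  <maxWarmingSearchers>1</maxWarmingSearchers>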

> 
> I checked two different heapdumps, the first one autogenerated
> (by -XX:+HeapDumpOnOutOfMemoryError), the second one generated manually via
> jmap.
> These show the following distribution of used memory - the autogenerated
> dump:
> - documentCache: 56% (size ~ 195.000)
> - filterCache: 15% (size ~ 60.000)
> - queryResultCache: 8% (size ~ 61.000)
> - fieldCache: 6% (fieldCache referenced by the WebappClassLoader)
> - SolrIndexSearcher: 2%
> 
> The manually generated dump:
> - documentCache: 48% (size ~ 195.000)
> - filterCache: 20% (size ~ 60.000)
> - fieldCache: 11% (fieldCache referenced by the WebappClassLoader)
> - queryResultCache: 7% (size ~ 61.000)
> - fieldValueCache: 3%
> 
> We are also running two search engines with a 17GB heap; these don't run
> into OOMEs. However, with the bigger heap sizes the longest requests take
> even longer due to longer stop-the-world gc cycles.
> Therefore my goal is to run with a smaller heap; IMHO even smaller than 8GB
> would be good to reduce the time needed for full gc.
> 
> So what's the right path to follow now? What would you recommend changing
> in the configuration (solr/jvm)?

Try tuning the GC:
http://java.sun.com/performance/reference/whitepapers/tuning.html
http://www.oracle.com/technetwork/java/gc-tuning-5-138395.html
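
As an illustration only (the exact values depend on your workload), a
smaller heap with the concurrent collector and GC logging to measure the
actual pause times could look like:

  -Xmx4G -XX:NewSize=1G -XX:MaxNewSize=1G -XX:+UseConcMarkSweepGC
  -verbose:gc -XX:+PrintGCDetails -Xloggc:gc.log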

> 
> Would you say it is ok to reduce the cache sizes? Would this increase disk
> i/o, or would the index be held in the OS's disk cache?

Yes! If you also allocate less RAM to the JVM, there is more left for the
OS to cache the index.
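
E.g. on your 32 GB box an 8 GB heap leaves roughly 24 GB (minus general OS
overhead) for the page cache, well above the ~9.5 GB index; a smaller heap
widens that margin further.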

> 
> Do you have other recommendations / questions?
> 
> Thanx && cheers,
> Martin

-- 
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350