Posted to solr-user@lucene.apache.org by "Matthew A. Wagner" <ma...@ttwagner.com> on 2009/02/02 23:12:30 UTC

Understanding Solr memory usage

I apologize in advance for what's probably a foolish question, but I'm
trying to get a feel for how much memory a properly-configured Solr
instance should be using.

I have an index with 2.5 million documents. The documents aren't all that
large. Our index is 25GB, and we optimize it fairly often.

We're consistently running out of memory. Sometimes it's a heap space
error, and other times the machine will run into swap. (The latter may not
be directly related to Solr, but nothing else is running on the box.)

We have four dedicated servers for this, each a quad Xeon with 16GB RAM. We
have one master that receives all updates, and three slaves that handle
queries. The three slaves have Tomcat configured for a 14GB heap. There
really isn't a lot of disk activity.

The machines seem underloaded to me, receiving less than one query per
second on average. Requests are served in about 300 ms on average, so it's
not as if we have many concurrent queries backing up.

We do use multi-field faceting in some searches. I'm having a hard time
figuring out how big of an impact this may have.

None of our caches (filter, auto-warming, etc.) are set to hold more than
512 entries.

Obviously, memory usage is going to be highly variable, but what I'm
wondering is:
a.) Does this sound like a sane configuration, or is something seriously
wrong? It seems that many people are able to run considerably larger
indexes with considerably fewer resources.
b.) Is there any documentation on how the memory is being used? Is Solr
attempting to cram as much of the 25GB index into memory as possible? Maybe
I just overlooked something, but I don't know how to begin calculating
Solr's memory requirements.
c.) Does anything in the description of my Solr setup jump out at you as a
potential source of memory problems? We've increased the heap space
considerably, up to the current 14GB, and we're still running out of heap
space periodically.

Thanks in advance for any help!
-- Matt Wagner


Re: Understanding Solr memory usage

Posted by Lance Norskog <go...@gmail.com>.
How many total values are in the faceted fields? Not just in the faceted
query, but across the entire index? A facet query builds a counter array
for the entire space of field values, which can take much more RAM than
normal queries. Sorting is also a memory-eater.
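
As a rough illustration of that point, here is a sketch of the arithmetic
(this is not Solr's actual code, and the cardinalities below are made-up
numbers):

    // Illustrative only: field faceting keeps one counter per unique term
    // in the field, across the whole index, so memory scales with the
    // field's cardinality, not with the number of documents that match.
    public class FacetCounterSketch {
        public static void main(String[] args) {
            long uniqueTerms = 20_000_000L; // hypothetical cardinality of one faceted field
            long facetedFields = 3;         // hypothetical number of faceted fields
            long bytesPerCounter = 4;       // one int counter per unique term
            long heapBytes = uniqueTerms * facetedFields * bytesPerCounter;
            // Prints "~228 MB of heap just for facet counters" for these numbers
            System.out.printf("~%d MB of heap just for facet counters%n",
                    heapBytes / (1024 * 1024));
        }
    }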



-- 
Lance Norskog
goksron@gmail.com
650-922-8831 (US)

Re: Understanding Solr memory usage

Posted by Mark Miller <ma...@gmail.com>.
You shouldn't need, and don't want, to give Tomcat anywhere near 14 GB of
RAM. You also certainly should not be running out of memory with that much
RAM and that few documents. Not even close.

You want to leave plenty of RAM for the filesystem cache, so that a lot
of that 25 gig can be cached in RAM - especially with indexes that large
(25 gig is somewhat large by index size; 2.5 million documents is not).
You are likely starving the filesystem cache and the OS of RAM, and
running into swap simply because you have given the JVM so much of it.

You probably do want to tune your cache sizes, but that's not your
problem here.

Try giving Tomcat a few gig rather than 14 - the rest won't go to waste.

- Mark
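
To make that advice concrete, here is a back-of-the-envelope sketch of the
RAM budget on one 16 GB slave; the heap and overhead figures are assumed
values for illustration, not tuning recommendations:

    // Illustrative arithmetic only: splitting 16 GB between the JVM heap
    // and the OS filesystem cache, per the advice above.
    public class RamBudgetSketch {
        public static void main(String[] args) {
            double totalGb = 16.0;    // physical RAM per slave
            double heapGb = 3.0;      // "a few gig" for Tomcat/Solr (assumed)
            double overheadGb = 1.0;  // OS + JVM non-heap allowance (assumed)
            double pageCacheGb = totalGb - heapGb - overheadGb;
            // ~12 GB left to cache the 25 GB index, versus roughly 1 GB
            // when the heap is set to 14 GB.
            System.out.printf("~%.0f GB left for the filesystem cache%n",
                    pageCacheGb);
        }
    }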
