Posted to solr-user@lucene.apache.org by Cameron Hurst <ca...@gmail.com> on 2010/12/14 05:46:34 UTC

RAM usage issues

Hello all,

I am a new user to Solr and I am having a few issues with the setup, and I
was wondering if anyone had some suggestions. I am currently running this as
just a test environment before I go into production. I am using Tomcat 6 as
my servlet container and Solr 1.4.1 as the Solr build. I set it up following
the guide here.
http://wiki.apache.org/solr/SolrTomcat    The issue that I am having is
that the memory usage seems high for the settings I have.

When I start the server I am using about 90MB of RAM, which is fine and,
from the Google searches I did, is normal. The issue comes when I start
indexing data. In my solrconfig.xml file my maximum RAM buffer is 32MB. In
my mind that means that the maximum RAM used by the servlet should be
122MB, though growing to 150MB wouldn't surprise me. When I start indexing
data and running searches, my memory usage slowly keeps increasing. The odd
thing about it is that when I reindex the exact same data set, the memory
usage increases every time, even though no new data has been indexed. I
stopped once it went over 350MB of RAM.

So my question in all of this is: is this normal, and why isn't the RAM
buffer limit being observed? Are my expectations unreasonable and flawed?
Or could there be something else in my settings that is causing the memory
usage to increase like this?

Thanks for the help,

Cameron

Re: RAM usage issues

Posted by Erick Erickson <er...@gmail.com>.
Several observations:
1> If by RAM buffer size you're referring to the value in solrconfig.xml,
   <ramBufferSizeMB>, that is a limit on the size of the internal buffer used
   while indexing. When that limit is reached the data is flushed to disk. It
   is irrelevant to searching.
2> When you run searches, various internal caches are populated. If you wish
   to limit these, see solrconfig.xml. Look for the word "cache". These are
   search-time caches.
3> When you reindex, if you do NOT have a <uniqueKey> defined (schema.xml),
   then you'll have multiple copies of the same document, which could account
   for your index size increase.
4> Even if you do have <uniqueKey> defined, the underlying operation is that
   the document is marked for deletion; it is NOT physically removed. In
   particular, the terms associated with the deleted document are still kept
   around until you do an optimize. See the admin page (stats, as I recall)
   and check whether there's a difference between numDocs and maxDocs to see
   if this is the case.
5> What are you using to look at memory consumption? You could just be
   seeing memory that hasn't been garbage collected yet.
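For reference, the settings behind points 1> through 3> live in solrconfig.xml
and schema.xml. This is a sketch only: the values are illustrative, and the
attribute names follow the example configs shipped with Solr 1.4.

```xml
<!-- solrconfig.xml: indexing-time buffer; flushed to disk when full,
     so it bounds indexing memory, not search memory -->
<ramBufferSizeMB>32</ramBufferSizeMB>

<!-- solrconfig.xml: one of the search-time caches; shrink size and
     initialSize to bound search-time memory -->
<filterCache class="solr.LRUCache"
             size="512"
             initialSize="512"
             autowarmCount="128"/>

<!-- schema.xml: without this, reindexing the same data set adds
     duplicate documents instead of replacing them -->
<uniqueKey>id</uniqueKey>
```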

You should expect memory to plateau once GC kicks in. jConsole may help you
if you're not using it already.
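On point 5>, a quick way to see how much of the reported memory is just
uncollected garbage is to ask the JVM itself. A minimal sketch using only the
standard JDK; note that System.gc() is a hint the JVM is free to ignore.

```java
public class HeapCheck {
    // Heap currently in use. This number includes garbage that has not
    // been collected yet, which is why OS-level tools overstate live data.
    static long usedMb() {
        Runtime rt = Runtime.getRuntime();
        return (rt.totalMemory() - rt.freeMemory()) / (1024 * 1024);
    }

    public static void main(String[] args) {
        // Create some short-lived garbage, as indexing and searching would.
        for (int i = 0; i < 1_000; i++) {
            byte[] scratch = new byte[64 * 1024];
        }
        long before = usedMb();
        System.gc(); // a hint, not a command
        long after = usedMb();
        System.out.println("used: " + before + " MB before GC, "
                + after + " MB after");
    }
}
```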

Best
Erick

On Mon, Dec 13, 2010 at 11:46 PM, Cameron Hurst
<ca...@gmail.com>wrote:

> Hello all,
>
> I am a new user to Solr and I am having a few issues with the setup, and I
> was wondering if anyone had some suggestions. I am currently running this
> as just a test environment before I go into production. I am using Tomcat 6
> as my servlet container and Solr 1.4.1 as the Solr build. I set it up
> following the guide here.
> http://wiki.apache.org/solr/SolrTomcat    The issue that I am having is
> that the memory usage seems high for the settings I have.
>
> When I start the server I am using about 90MB of RAM, which is fine and,
> from the Google searches I did, is normal. The issue comes when I start
> indexing data. In my solrconfig.xml file my maximum RAM buffer is 32MB. In
> my mind that means that the maximum RAM used by the servlet should be
> 122MB, though growing to 150MB wouldn't surprise me. When I start indexing
> data and running searches, my memory usage slowly keeps increasing. The odd
> thing about it is that when I reindex the exact same data set, the memory
> usage increases every time, even though no new data has been indexed. I
> stopped once it went over 350MB of RAM.
>
> So my question in all of this is: is this normal, and why isn't the RAM
> buffer limit being observed? Are my expectations unreasonable and flawed?
> Or could there be something else in my settings that is causing the memory
> usage to increase like this?
>
> Thanks for the help,
>
> Cameron
>

Re: RAM usage issues

Posted by Shawn Heisey <so...@elyograg.org>.
On 12/13/2010 9:46 PM, Cameron Hurst wrote:
> When I start the server I am using about 90MB of RAM, which is fine and,
> from the Google searches I did, is normal. The issue comes when I start
> indexing data. In my solrconfig.xml file my maximum RAM buffer is 32MB. In
> my mind that means that the maximum RAM used by the servlet should be
> 122MB, though growing to 150MB wouldn't surprise me. When I start indexing
> data and running searches, my memory usage slowly keeps increasing. The odd
> thing about it is that when I reindex the exact same data set, the memory
> usage increases every time, even though no new data has been indexed. I
> stopped once it went over 350MB of RAM.

There could be large gaps in my understanding here, but one thing I have 
noticed about Java is that a program's memory usage will increase until it 
nearly fills the max heap size it has been allocated.  To improve 
performance, garbage collection seems to be rather lazy until a large 
percentage of the max heap is in use.  I've got a 2GB max heap size passed 
to Jetty when I start Solr.  Memory usage hovers around 1.4GB, and it 
doesn't take very long for it to get there.

Solr's search functionality, especially if you give it a sort parameter, is 
memory hungry.  For each field you sort on, Lucene creates a large field 
cache entry.  The other caches are also filled quickly.  If you are storing 
a large amount of data in Solr for each document, the documentCache in 
particular will get quite large.  Every time you do a reindex, you are 
creating a new searcher with new caches.  The old one is eventually removed, 
but I'm pretty sure that until garbage collection runs, the memory is not 
actually reclaimed.
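The per-sort-field cost can be estimated with back-of-envelope arithmetic.
A sketch assuming an int sort field: the field cache holds one value per
document in the index, and the 4-bytes-per-int figure is the only assumption
here.

```java
public class SortCacheEstimate {
    public static void main(String[] args) {
        // One cached entry per document in the index,
        // whether or not the document matches the query.
        long maxDocs = 1_000_000L;
        long bytesPerValue = 4; // int sort field: one 4-byte int per doc
        long totalMb = maxDocs * bytesPerValue / (1024 * 1024);
        System.out.println("int sort field over 1M docs: ~" + totalMb + " MB");
        // String sort fields also keep the term bytes around,
        // so they cost considerably more than this.
    }
}
```

Multiply by the number of distinct sort fields in use, and add the search-time
caches on top, and a few hundred MB of heap is easy to reach.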

I don't know what your heap size is set to, but I'd be surprised if it's 
less than 1GB.  Java is not going to be concerned about memory usage 
when it's only using 350MB of that, so I don't think it'll even try to 
run garbage collection.

Shawn