Posted to solr-user@lucene.apache.org by "Rogowski, Britta" <BR...@wolterskluwer.de> on 2013/05/06 09:32:38 UTC

Memory problems with HttpSolrServer

Hi!

When I write from our database to an HttpSolrServer (using a LinkedBlockingQueue to write just one document at a time), I run into memory problems. (Due to various constraints, I have to remain on a 32-bit system, so I can use at most 2 GB of RAM.)

If I use an EmbeddedSolrServer (to write locally), I have no such problems. Just now, I tried out ConcurrentUpdateSolrServer (with a queue size of 1, but 3 threads to be safe), and that worked out fine too. I played around with various GC options and monitored memory with jconsole and jmap, but only found out that there are lots of byte arrays, SolrInputFields, and Strings hanging around.

Since ConcurrentUpdateSolrServer works, I'm happy, but I was wondering if people were aware of the memory issue around HttpSolrServer.
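For reference, a minimal sketch of the ConcurrentUpdateSolrServer setup described above, assuming the 4.x SolrJ API (the URL, core name, and field names are placeholders, not taken from the original post):

```java
import org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class FeedSolr {
    public static void main(String[] args) throws Exception {
        // Queue size 1 and 3 background sender threads, as described above.
        ConcurrentUpdateSolrServer server = new ConcurrentUpdateSolrServer(
                "http://localhost:8983/solr/collection1", 1, 3);

        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "doc-1");
        doc.addField("title_t", "example title");
        server.add(doc);              // queued; a background thread sends it

        server.blockUntilFinished();  // drain the queue before committing
        server.commit();
        server.shutdown();
    }
}
```

Because documents are handed off to background threads, the caller never holds more than the queued documents in memory at once.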

Regards,

Britta Rogowski



__________________________________

Britta Rogowski
Senior Developer

Wolters Kluwer Deutschland
Online Product Development
Feldstiege 100
48161 Münster

Tel +49 (2533) 9300-251
Fax
BRogowski@wolterskluwer.de

Wolters Kluwer Deutschland GmbH | Feldstiege 100 | D-48161 Münster |
HRB 58843 Amtsgericht Köln | Geschäftsführer: Dr. Ulrich Hermann (Vorsitz), Michael Gloss, Christian Lindemann, Frank Schellmann | USt.-ID.Nr. DE188836808


Re: Memory problems with HttpSolrServer

Posted by Shawn Heisey <so...@elyograg.org>.
On 5/6/2013 1:32 AM, Rogowski, Britta wrote:
> Hi!
> 
> When I write from our database to an HttpSolrServer (using a LinkedBlockingQueue to write just one document at a time), I run into memory problems. (Due to various constraints, I have to remain on a 32-bit system, so I can use at most 2 GB of RAM.)
> 
> If I use an EmbeddedSolrServer (to write locally), I have no such problems. Just now, I tried out ConcurrentUpdateSolrServer (with a queue size of 1, but 3 threads to be safe), and that worked out fine too. I played around with various GC options and monitored memory with jconsole and jmap, but only found out that there are lots of byte arrays, SolrInputFields, and Strings hanging around.
> 
> Since ConcurrentUpdateSolrServer works, I'm happy, but I was wondering if people were aware of the memory issue around HttpSolrServer.

Is it memory usage within the JVM, or OS allocation for the java process
that you are looking at?

There are no known memory problems with current versions of SolrJ, and
none that I know about with older versions.  At the time you wrote this,
4.2.1 was the latest version, but now several hours later, 4.3.0 has
been released.

I have a SolrJ app that I've been using since 3.5.0, currently using
4.2.1.  It creates 32 separate HttpSolrServer instances, to keep all my
shards up to date.  It runs for weeks or months at a time and is
currently using about 25MB of RAM within the JVM.  When special reindex
requests happen, memory usage may briefly go up to a few hundred MB.  It
will typically allocate the entire 1GB heap at the OS level, but I could
run it with a smaller heap and have no trouble.

After I gathered those numbers, I restarted the application.  Memory
usage is still low, and the OS shows only 106MB in use.

I suspect that your Java code may have a memory leak.  I'm not sure why
the leak isn't happening with the concurrent object; that's very
strange, because ConcurrentUpdateSolrServer uses HttpSolrServer
internally.  When you use HttpSolrServer, are you reusing one object or
creating a new one for every request?  You should create one
HttpSolrServer object for every separate Solr core and then use that
object for the life of your application.  It is completely thread safe.
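A minimal sketch of that reuse pattern (the URL and field names here are placeholders):

```java
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class SolrFeeder {
    // One instance per Solr core, created once and shared by all threads
    // for the life of the application. HttpSolrServer is thread safe.
    private static final HttpSolrServer SERVER =
            new HttpSolrServer("http://localhost:8983/solr/collection1");

    public static void index(String id, String title) throws Exception {
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", id);
        doc.addField("title_t", title);
        SERVER.add(doc);  // reuses pooled connections; no per-request object
    }
}
```

Creating a new HttpSolrServer per request leaks connection-pool and client state, which matches the symptoms described (byte arrays and Strings accumulating).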

There is a large caveat with ConcurrentUpdateSolrServer.  If you are
using try/catch blocks to trap request errors and take action, you
should be aware that this object will never throw an error.  Even if a
request fails or your Solr server is down, your application will never know.
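One way to get failures surfaced anyway, sketched against the 4.x API, is to subclass it and override handleError, which the default implementation uses only to log the problem:

```java
import java.util.concurrent.atomic.AtomicInteger;
import org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer;

public class TrackingUpdateServer extends ConcurrentUpdateSolrServer {
    private final AtomicInteger errorCount = new AtomicInteger();

    public TrackingUpdateServer(String url, int queueSize, int threadCount) {
        super(url, queueSize, threadCount);
    }

    @Override
    public void handleError(Throwable ex) {
        errorCount.incrementAndGet();  // remember that a request failed
        super.handleError(ex);         // default implementation just logs
    }

    /** Check this after blockUntilFinished() to see if anything failed. */
    public int getErrorCount() {
        return errorCount.get();
    }
}
```

The class name and counter are illustrative; the point is only that handleError is the single hook where the concurrent client reports request failures.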

Why do I need 32 HttpSolrServer objects? I have 2 index chains, 7 shards
per chain, with a live core and a build core per shard.  That is 28
separate cores.  There are four Solr servers, so I need four additional
objects for CoreAdmin requests.

Thanks,
Shawn


Re: Memory problems with HttpSolrServer

Posted by Andre Bois-Crettez <an...@kelkoo.com>.
On 05/06/2013 09:32 AM, Rogowski, Britta wrote:
> Hi!
>
> When I write from our database to an HttpSolrServer (using a LinkedBlockingQueue to write just one document at a time), I run into memory problems. (Due to various constraints, I have to remain on a 32-bit system, so I can use at most 2 GB of RAM.)
>
> If I use an EmbeddedSolrServer (to write locally), I have no such problems. Just now, I tried out ConcurrentUpdateSolrServer (with a queue size of 1, but 3 threads to be safe), and that worked out fine too. I played around with various GC options and monitored memory with jconsole and jmap, but only found out that there are lots of byte arrays, SolrInputFields, and Strings hanging around.
>
> Since ConcurrentUpdateSolrServer works, I'm happy, but I was wondering if people were aware of the memory issue around HttpSolrServer.
>
> Regards,
>
> Britta Rogowski
We are not memory constrained, so we cannot confirm the problem with
HttpSolrServer, but how often do you commit?
Having autocommit set to a few minutes may help reduce memory usage
during indexing.
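For instance, a hard autocommit every few minutes can be set in solrconfig.xml (the exact numbers here are illustrative):

```xml
<!-- solrconfig.xml: hard-commit every 5 minutes or 10,000 documents,
     whichever comes first, without opening a new searcher -->
<autoCommit>
  <maxTime>300000</maxTime>
  <maxDocs>10000</maxDocs>
  <openSearcher>false</openSearcher>
</autoCommit>
```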

Is the memory usage on the Solr server side, or in your feeder code?

--
André Bois-Crettez

Search technology, Kelkoo
http://www.kelkoo.com/


Kelkoo SAS
Société par Actions Simplifiée (simplified joint-stock company)
Share capital: €4,168,964.30
Registered office: 8, rue du Sentier, 75002 Paris
425 093 069 RCS Paris

This message and its attachments are confidential and intended solely for their addressees. If you are not the intended recipient, please delete this message and notify the sender.