Posted to solr-user@lucene.apache.org by Maximilian Hütter <mh...@blue-elephant-systems.com> on 2007/03/12 09:34:39 UTC

Commit after how many updates?

Hi,

I have a question regarding Solr's behaviour in the standard
installation. When I use start.jar with a rather complex schema and
do about 1000 updates and then try to commit, I get this:

<result status="1">java.lang.OutOfMemoryError: Java heap space
</result>

I know I can fix it by giving the VM a larger heap size, but still I
wonder what a good number of updates would be?

What are your experiences?

-- 
Maximilian Hütter
blue elephant systems GmbH
Wollgrasweg 49
D-70599 Stuttgart

Tel               :  (+49) 0711 - 45 10 17 578
Fax               :  (+49) 0711 - 45 10 17 573
e-mail            :  max.huetter@blue-elephant-systems.com
Registered office :  Stuttgart, Amtsgericht Stuttgart, HRB 24106
Managing directors:  Joachim Hörnle, Thomas Gentsch, Holger Dietrich

Re: Commit after how many updates?

Posted by Mike Klaas <mi...@gmail.com>.
On 3/16/07, Chris Hostetter <ho...@fucit.org> wrote:
>
> : I thought so, but hoped there would be some experiences with heap space
> : settings for Solr. But I guess I have to try for myself.
>
> there's lots of experience, but it's hard to translate to generic rules
> ... there's so many variables involved that it's hard to even recognize
> what the equation is.
>
> My advice: throw as much RAM as you've got at it, slam it with realistic
> load, watch your GC logs/graphs and dial it back as much as you can
> without hurting things.

I'd temper this by suggesting that you always leave a healthy amount
for the OS disk cache as well--you definitely don't want Solr
occupying _all_ the memory on a machine.
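
As a rough illustration of leaving room for the OS disk cache, the arithmetic might look like the sketch below. The function name `heap_budget_mb` and the default reserve of 50% are assumptions made up for this example, not a recommendation from anyone in the thread; the right split depends on index size and load.

```python
def heap_budget_mb(total_ram_mb: int, os_reserve_fraction: float = 0.5) -> int:
    """Return a candidate JVM heap size in MB, leaving a share of RAM to the OS.

    os_reserve_fraction is a hypothetical knob: the share of physical memory
    deliberately left outside the JVM so the OS can cache index files.
    """
    if total_ram_mb <= 0:
        raise ValueError("total_ram_mb must be positive")
    if not 0.0 < os_reserve_fraction < 1.0:
        raise ValueError("os_reserve_fraction must be between 0 and 1")
    return int(total_ram_mb * (1.0 - os_reserve_fraction))
```

The result is only a starting point for the "dial it back under realistic load" approach quoted above.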

-Mike

Re: Commit after how many updates?

Posted by Chris Hostetter <ho...@fucit.org>.
: I thought so, but hoped there would be some experiences with heap space
: settings for Solr. But I guess I have to try for myself.

there's lots of experience, but it's hard to translate to generic rules
... there's so many variables involved that it's hard to even recognize
what the equation is.

My advice: throw as much RAM as you've got at it, slam it with realistic
load, watch your GC logs/graphs and dial it back as much as you can
without hurting things.
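
A minimal sketch of what "more heap plus GC logging" could look like when launching the example start.jar. The helper name `solr_start_command` is hypothetical; `-Xmx` and `-verbose:gc` are standard JVM options, and the heap value is whatever you are currently experimenting with.

```python
def solr_start_command(max_heap_mb: int, gc_logging: bool = True) -> list:
    """Build the argument list for starting the example server via start.jar."""
    if max_heap_mb <= 0:
        raise ValueError("max_heap_mb must be positive")
    cmd = ["java", "-Xmx%dm" % max_heap_mb]  # maximum heap size
    if gc_logging:
        cmd.append("-verbose:gc")  # print a line per garbage collection
    cmd += ["-jar", "start.jar"]
    return cmd
```

From the directory that contains start.jar, the returned list could be passed to `subprocess.run`, and the heap value lowered step by step while watching the GC output.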



-Hoss


Re: Commit after how many updates?

Posted by Maximilian Hütter <mh...@blue-elephant-systems.com>.
Chris Hostetter wrote:
> : It is the default heap size for the Sun JVM, so I guess 64MB max. The
> : documents are rather large, but if you manage to index 100,000 docs,
> : there seems to be some problem with Solr.
> 
> i think you mean "there DOES NOT seem to be some problem with Solr."
> right ... why would Mike being able to commit only every 100,000 docs
> indicate a problem with Solr?

You're right, what I meant was: there is a problem with my Solr setup.

> : What would be the recommended heap size for Solr?
> 
> there isn't one ... it's entirely dependent on how big your documents are,
> how many fields in your schema have norms enabled, what types of queries
> you process, how big you configure the various solr caches, etc....
> 
> -Hoss
> 

I thought so, but hoped there would be some experiences with heap space
settings for Solr. But I guess I have to try for myself.



Re: Commit after how many updates?

Posted by Chris Hostetter <ho...@fucit.org>.
: It is the default heap size for the Sun JVM, so I guess 64MB max. The
: documents are rather large, but if you manage to index 100,000 docs,
: there seems to be some problem with Solr.

i think you mean "there DOES NOT seem to be some problem with Solr."
right ... why would Mike being able to commit only every 100,000 docs
indicate a problem with Solr?

: What would be the recommended heap size for Solr?

there isn't one ... it's entirely dependent on how big your documents are,
how many fields in your schema have norms enabled, what types of queries
you process, how big you configure the various solr caches, etc....

-Hoss


Re: Commit after how many updates?

Posted by Mike Klaas <mi...@gmail.com>.
On 3/14/07, Maximilian Hütter <mh...@blue-elephant-systems.com> wrote:

> It is the default heap size for the Sun JVM, so I guess 64MB max. The
> documents are rather large, but if you manage to index 100,000 docs,
> there seems to be some problem with Solr.

The documents are not held in memory until a commit occurs (just some
tracking info), so I'm not sure that that is the appropriate
conclusion.  Lucene keeps a few documents in memory
(maxBufferedDocs--you could lower this setting), and if your documents
are large, this could use a higher maximum amount of memory than in my
case.  Solr does keep all uniqueIds in memory until commit.
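
To see why large documents could matter with a small heap, a crude back-of-envelope estimate may help. This is only a sketch: it assumes the memory held for buffered documents scales with raw document size, which ignores how Lucene actually represents them internally, and the function name and the numbers in the example are hypothetical.

```python
def buffered_docs_estimate_mb(max_buffered_docs: int, avg_doc_kb: float) -> float:
    """Very rough size, in MB, of the documents buffered before a flush.

    Assumption: buffered memory is about (number of docs) x (average doc size).
    Real usage differs; treat this as an order-of-magnitude guide only.
    """
    if max_buffered_docs < 0 or avg_doc_kb < 0:
        raise ValueError("arguments must be non-negative")
    return max_buffered_docs * avg_doc_kb / 1024.0


# Hypothetical example: 1000 buffered docs of about 50 KB each.
estimate = buffered_docs_estimate_mb(1000, 50)
```

With those made-up numbers the estimate comes to roughly 49 MB, which would already crowd a 64 MB heap before the uniqueIds and caches are counted. Lowering maxBufferedDocs shrinks the first factor.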

> What would be the recommended heap size for Solr?

That is difficult to answer, since it depends on so many factors.  64
megs seems to be on the rather low end.  Remember that you aren't just
trying to avoid OOM errors--more memory means bigger caches and
increased query performance.

-Mike

Re: Commit after how many updates?

Posted by Maximilian Hütter <mh...@blue-elephant-systems.com>.
Mike Klaas wrote:
> On 3/12/07, Maximilian Hütter <mh...@blue-elephant-systems.com> wrote:
>> Hi,
>>
>> I have a question regarding Solr's behaviour in the standard
>> installation. When I use start.jar with a rather complex schema and
>> do about 1000 updates and then try to commit, I get this:
>>
>> <result status="1">java.lang.OutOfMemoryError: Java heap space
>> </result>
>>
>> I know I can fix it by giving the VM a larger heap size, but still I
>> wonder what a good number of updates would be?
>>
>> What are your experiences?
> 
> That seems awfully few docs to cause OOM--I'm using autocommit @
> 100,000 docs without issues (then again, I give my instances at least a
> gig of heap).
> 
> What is your current heap size?
> 
> -Mike
> 
It is the default heap size for the Sun JVM, so I guess 64MB max. The
documents are rather large, but if you manage to index 100,000 docs,
there seems to be some problem with Solr.

What would be the recommended heap size for Solr?


Re: Commit after how many updates?

Posted by Mike Klaas <mi...@gmail.com>.
On 3/12/07, Maximilian Hütter <mh...@blue-elephant-systems.com> wrote:
> Hi,
>
> I have a question regarding Solr's behaviour in the standard
> installation. When I use start.jar with a rather complex schema and
> do about 1000 updates and then try to commit, I get this:
>
> <result status="1">java.lang.OutOfMemoryError: Java heap space
> </result>
>
> I know I can fix it by giving the VM a larger heap size, but still I
> wonder what a good number of updates would be?
>
> What are your experiences?

That seems awfully few docs to cause OOM--I'm using autocommit @
100,000 docs without issues (then again, I give my instances at least a
gig of heap).
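
For anyone who prefers to control the commit interval from the client instead of relying on Solr's autocommit setting, the idea can be sketched as below. This is an illustration under assumptions: `index_in_batches`, `doc_to_xml`, and the `post` callback are hypothetical names, and `post` stands in for whatever actually sends an XML message to the Solr update handler.

```python
from xml.sax.saxutils import escape, quoteattr


def doc_to_xml(fields: dict) -> str:
    """Render one document as a Solr XML <add> message."""
    body = "".join(
        "<field name=%s>%s</field>" % (quoteattr(str(name)), escape(str(value)))
        for name, value in fields.items()
    )
    return "<add><doc>%s</doc></add>" % body


def index_in_batches(docs, post, commit_every: int = 1000) -> int:
    """Send each doc through post() and issue <commit/> every commit_every docs.

    Returns the number of commits sent, including a final commit for any
    documents left over after the last full batch.
    """
    if commit_every <= 0:
        raise ValueError("commit_every must be positive")
    commits = 0
    pending = 0
    for fields in docs:
        post(doc_to_xml(fields))
        pending += 1
        if pending >= commit_every:
            post("<commit/>")
            commits += 1
            pending = 0
    if pending:
        post("<commit/>")
        commits += 1
    return commits
```

Passing `list.append` as `post` is enough for a dry run that shows where the commits would fall. The interval itself still has to be found by testing against the heap you actually have.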

What is your current heap size?

-Mike