Posted to solr-user@lucene.apache.org by wojtekpia <wo...@hotmail.com> on 2009/01/21 18:49:18 UTC

Performance "dead-zone" due to garbage collection

I'm intermittently experiencing severe performance drops due to Java garbage
collection. I'm allocating a lot of RAM to my Java process (27GB of the 32GB
physically available). Under heavy load, the performance drops approximately
every 10 minutes, and the drop lasts for 30-40 seconds. This coincides with
the size of the old generation heap dropping from ~27GB to ~6GB. 

Is there a way to reduce the impact of garbage collection? A couple ideas
we've come up with (but haven't tried yet) are: increasing the minimum heap
size, more frequent (but hopefully less costly) garbage collection.

Thanks,

Wojtek

-- 
View this message in context: http://www.nabble.com/Performance-%22dead-zone%22-due-to-garbage-collection-tp21588427p21588427.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: Performance "dead-zone" due to garbage collection

Posted by "Feak, Todd" <To...@smss.sony.com>.
The large drop in old generation from 27GB->6GB indicates that things
are getting into your old generation prematurely. They really don't need
to get there at all, and should be collected sooner (more frequently).

Look into increasing young generation sizes via JVM parameters. Also
look into concurrent collection.

You could even consider decreasing your JVM max memory. Since you
obviously aren't using it all, decreasing it will force the JVM to do
more frequent (and therefore smaller) collections. Your average
collection time may go up, but the performance drops will be smaller.

Great details on memory tuning on Sun JDKs here:

http://java.sun.com/docs/hotspot/gc5.0/gc_tuning_5.html

There are other articles for 1.6 and 1.4 as well.

-Todd

-----Original Message-----
From: wojtekpia [mailto:wojtek_p@hotmail.com] 
Sent: Wednesday, January 21, 2009 9:49 AM
To: solr-user@lucene.apache.org
Subject: Performance "dead-zone" due to garbage collection


I'm intermittently experiencing severe performance drops due to Java
garbage collection. I'm allocating a lot of RAM to my Java process (27GB
of the 32GB physically available). Under heavy load, the performance
drops approximately every 10 minutes, and the drop lasts for 30-40
seconds. This coincides with the size of the old generation heap
dropping from ~27GB to ~6GB.

Is there a way to reduce the impact of garbage collection? A couple
ideas we've come up with (but haven't tried yet) are: increasing the
minimum heap size, more frequent (but hopefully less costly) garbage
collection.

Thanks,

Wojtek

-- 
View this message in context:
http://www.nabble.com/Performance-%22dead-zone%22-due-to-garbage-collection-tp21588427p21588427.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Performance "dead-zone" due to garbage collection

Posted by Alexander Ramos Jardim <al...@gmail.com>.
I would say that adding more Solr instances, each with its own data
directory, could help if you can partition your docs so that "A" type
docs go in index "A", "B" type docs in index "B", and so on.

2009/1/21 wojtekpia <wo...@hotmail.com>

>
> I'm using a recent version of Sun's JVM (6 update 7) and am using the
> concurrent generational collector. I've tried several other collectors,
> none seemed to help the situation.
>
> I've tried reducing my heap allocation. The search performance got worse as
> I reduced the heap. I didn't monitor the garbage collector in those tests,
> but I imagine that it would've gotten better. (As a side note, I do lots of
> faceting and sorting, I have 10M records in this index, with an approximate
> index file size of 10GB).
>
> This index is on a single machine, in a single Solr core. Would splitting it
> across multiple Solr cores on a single machine help? I'd like to find the
> limit of this machine before spreading the data to more machines.
>
> Thanks,
>
> Wojtek
> --
> View this message in context:
> http://www.nabble.com/Performance-%22dead-zone%22-due-to-garbage-collection-tp21588427p21590150.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>


-- 
Alexander Ramos Jardim

RE: Performance "dead-zone" due to garbage collection

Posted by "Feak, Todd" <To...@smss.sony.com>.
A ballpark calculation would be:

Collected Amount (from GC logging) / # of Requests

The GC logging can tell you how much it collected each time; no need to
snapshot heap sizes before and after. However (big caveat here),
this is a ballpark figure. The garbage collector is not guaranteed to
collect everything, every time. It can stop collecting depending on how
much time it spent. It may only collect from certain sections within
memory (Eden, survivor, tenured), etc.

This may still be enough to make broad comparisons to see if you've
decreased the overall garbage/request (via cache changes), but it will
be quite a rough estimate.
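As a rough illustration of that ballpark calculation, you can sum the heap
reclaimed per collection from a -verbose:gc log. A sketch, assuming the
simple "[GC before->after(total)]" log format (the sample lines are made up):

```shell
# Sketch: total up the heap freed across GCs from a -verbose:gc log.
# Log lines look like: [GC 325407K->83000K(776768K), 0.2300771 secs]
cat > gc.log <<'EOF'
[GC 325407K->83000K(776768K), 0.2300771 secs]
[GC 400000K->100000K(776768K), 0.1900000 secs]
EOF

awk '/->/ {
  if (match($0, /[0-9]+K->[0-9]+K/)) {
    pair = substr($0, RSTART, RLENGTH)      # e.g. "325407K->83000K"
    split(pair, kb, /K->|K/)                # kb[1]=before, kb[2]=after
    collected += kb[1] - kb[2]              # KB reclaimed this GC
  }
} END { printf "collected %d KB\n", collected }' gc.log
```

Dividing that total by the number of requests served over the same window
gives the rough garbage-per-request figure described above.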

-Todd

-----Original Message-----
From: wojtekpia [mailto:wojtek_p@hotmail.com] 
Sent: Wednesday, January 21, 2009 3:08 PM
To: solr-user@lucene.apache.org
Subject: Re: Performance "dead-zone" due to garbage collection


(Thanks for the responses)

My filterCache hit rate is ~60% (so I'll try making it bigger), and I am
CPU bound.

How do I measure the size of my per-request garbage? Is it (total heap
size before collection - total heap size after collection) / # of
requests to cause a collection?

I'll try your suggestions and post back any useful results.

-- 
View this message in context:
http://www.nabble.com/Performance-%22dead-zone%22-due-to-garbage-collection-tp21588427p21593661.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Performance "dead-zone" due to garbage collection

Posted by wojtekpia <wo...@hotmail.com>.
(Thanks for the responses)

My filterCache hit rate is ~60% (so I'll try making it bigger), and I am CPU
bound. 

How do I measure the size of my per-request garbage? Is it (total heap size
before collection - total heap size after collection) / # of requests to
cause a collection?

I'll try your suggestions and post back any useful results.

-- 
View this message in context: http://www.nabble.com/Performance-%22dead-zone%22-due-to-garbage-collection-tp21588427p21593661.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Performance "dead-zone" due to garbage collection

Posted by Walter Underwood <wu...@netflix.com>.
Have you tried different sizes for the nursery? It should be several
times larger than the per-request garbage.

Also, check your cache sizes. Objects evicted from the cache are
almost always tenured, so those will add to the time needed for
a full GC.

Guess who was tuning GC for a week or two in December ...

wunder

On 1/21/09 12:15 PM, "Feak, Todd" <To...@smss.sony.com> wrote:

> From a high level view, there is a certain amount of garbage collection
> that must occur. That garbage is generated per request, through a
> variety of means (buffers, request, response, cache expulsion). The only
> thing that JVM parameters can address is *when* that collection occurs.
> 
> It can occur often in small chunks, or rarely in large chunks (or
> anywhere in between). If you are CPU bound (which it sounds like you may
> be), then you really have a decision to make. Do you want an overall
> drop in performance, as more time is spent garbage collecting, OR do you
> want spikes in garbage collection that are more rare, but have a
> stronger impact. Realistically it becomes a question of one or the
> other. You *must* pay the cost of garbage collection at some point in
> time.
> 
> It is possible that increasing cache size will decrease overall garbage
> collection, as the churn caused by cache misses creates additional
> garbage. Decreasing the churn could decrease garbage. BUT,
> this really depends on your cache hit rates. If they are pretty high
> (>90%) then it's probably not much of a factor. However, if you are in
> the 50%-60% range, larger caches may help you in a number of ways.
> 
> -Todd Feak
> 
> -----Original Message-----
> From: wojtekpia [mailto:wojtek_p@hotmail.com]
> Sent: Wednesday, January 21, 2009 11:14 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Performance "dead-zone" due to garbage collection
> 
> 
> I'm using a recent version of Sun's JVM (6 update 7) and am using the
> concurrent generational collector. I've tried several other collectors,
> none seemed to help the situation.
> 
> I've tried reducing my heap allocation. The search performance got worse
> as I reduced the heap. I didn't monitor the garbage collector in those
> tests, but I imagine that it would've gotten better. (As a side note, I
> do lots of faceting and sorting, I have 10M records in this index, with
> an approximate index file size of 10GB).
> 
> This index is on a single machine, in a single Solr core. Would
> splitting it across multiple Solr cores on a single machine help? I'd
> like to find the limit of this machine before spreading the data to
> more machines.
> 
> Thanks,
> 
> Wojtek


RE: Performance "dead-zone" due to garbage collection

Posted by "Feak, Todd" <To...@smss.sony.com>.
From a high level view, there is a certain amount of garbage collection
that must occur. That garbage is generated per request, through a
variety of means (buffers, request, response, cache expulsion). The only
thing that JVM parameters can address is *when* that collection occurs. 

It can occur often in small chunks, or rarely in large chunks (or
anywhere in between). If you are CPU bound (which it sounds like you may
be), then you really have a decision to make. Do you want an overall
drop in performance, as more time is spent garbage collecting, OR do you
want spikes in garbage collection that are more rare, but have a
stronger impact. Realistically it becomes a question of one or the
other. You *must* pay the cost of garbage collection at some point in
time.

It is possible that increasing cache size will decrease overall garbage
collection, as the churn caused by cache misses creates additional
garbage. Decreasing the churn could decrease garbage. BUT,
this really depends on your cache hit rates. If they are pretty high
(>90%) then it's probably not much of a factor. However, if you are in
the 50%-60% range, larger caches may help you in a number of ways.
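For reference, those cache sizes live in solrconfig.xml. A sketch with
illustrative values only (the sizes and autowarm counts below are
placeholders, not recommendations):

```xml
<!-- Sketch: Solr 1.x cache sizing in solrconfig.xml.
     size / initialSize / autowarmCount values are illustrative only. -->
<filterCache      class="solr.LRUCache" size="16384" initialSize="4096" autowarmCount="4096"/>
<queryResultCache class="solr.LRUCache" size="16384" initialSize="4096" autowarmCount="1024"/>
<documentCache    class="solr.LRUCache" size="16384" initialSize="4096"/>
```

Note that autowarmed entries survive commits, so large autowarmCount values
trade GC churn for longer warmup times.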

-Todd Feak

-----Original Message-----
From: wojtekpia [mailto:wojtek_p@hotmail.com] 
Sent: Wednesday, January 21, 2009 11:14 AM
To: solr-user@lucene.apache.org
Subject: Re: Performance "dead-zone" due to garbage collection


I'm using a recent version of Sun's JVM (6 update 7) and am using the
concurrent generational collector. I've tried several other collectors,
none seemed to help the situation.

I've tried reducing my heap allocation. The search performance got worse
as I reduced the heap. I didn't monitor the garbage collector in those
tests, but I imagine that it would've gotten better. (As a side note, I
do lots of faceting and sorting, I have 10M records in this index, with
an approximate index file size of 10GB).

This index is on a single machine, in a single Solr core. Would
splitting it across multiple Solr cores on a single machine help? I'd
like to find the limit of this machine before spreading the data to
more machines.

Thanks,

Wojtek
-- 
View this message in context:
http://www.nabble.com/Performance-%22dead-zone%22-due-to-garbage-collection-tp21588427p21590150.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Performance "dead-zone" due to garbage collection

Posted by wojtekpia <wo...@hotmail.com>.
I'm using a recent version of Sun's JVM (6 update 7) and am using the
concurrent generational collector. I've tried several other collectors, none
seemed to help the situation.

I've tried reducing my heap allocation. The search performance got worse as
I reduced the heap. I didn't monitor the garbage collector in those tests,
but I imagine that it would've gotten better. (As a side note, I do lots of
faceting and sorting, I have 10M records in this index, with an approximate
index file size of 10GB).

This index is on a single machine, in a single Solr core. Would splitting it
across multiple Solr cores on a single machine help? I'd like to find the
limit of this machine before spreading the data to more machines.

Thanks,

Wojtek
-- 
View this message in context: http://www.nabble.com/Performance-%22dead-zone%22-due-to-garbage-collection-tp21588427p21590150.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Performance "dead-zone" due to garbage collection

Posted by Alexander Ramos Jardim <al...@gmail.com>.
How many boxes are running your index? If it is just one, maybe
distributing your index across machines will get you better performance
during garbage collection.

2009/1/21 wojtekpia <wo...@hotmail.com>

>
> I'm intermittently experiencing severe performance drops due to Java
> garbage collection. I'm allocating a lot of RAM to my Java process
> (27GB of the 32GB physically available). Under heavy load, the
> performance drops approximately every 10 minutes, and the drop lasts
> for 30-40 seconds. This coincides with the size of the old generation
> heap dropping from ~27GB to ~6GB.
>
> Is there a way to reduce the impact of garbage collection? A couple
> ideas we've come up with (but haven't tried yet) are: increasing the
> minimum heap size, more frequent (but hopefully less costly) garbage
> collection.
>
> Thanks,
>
> Wojtek
>
> --
> View this message in context:
> http://www.nabble.com/Performance-%22dead-zone%22-due-to-garbage-collection-tp21588427p21588427.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>


-- 
Alexander Ramos Jardim

Re: Performance "dead-zone" due to garbage collection

Posted by wojtekpia <wo...@hotmail.com>.
I tried sorting using a function query instead of the Lucene sort and found
no change in performance. I wonder if Lance's results are related to
something specific to his deployment?
-- 
View this message in context: http://www.nabble.com/Performance-%22dead-zone%22-due-to-garbage-collection-tp21588427p21922851.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Performance "dead-zone" due to garbage collection

Posted by wojtekpia <wo...@hotmail.com>.
I've been able to reduce these GC outages by:

1) Optimizing my schema. This reduced my index size by more than 50%.
2) Shrinking my cache sizes. I started with filterCache, documentCache &
queryCache sizes of ~10,000; they're now at ~500.
3) Reducing my heap allocation. I started at 27 GB; now I'm 'only'
allocating 8 GB.
4) Updating to trunk (was using Dec 2/08 code, now using Jan 26/09).

I still see outages due to garbage collection every ~10 minutes, but they
last ~2 seconds (instead of 20+ seconds). Note that my throughput dropped
from ~30 hits/second to ~23 hits/second. Luckily, I'm still hitting my
performance requirements, so I'm able to accept that.

Thanks for the tips!


Wojtek



yonik wrote:
> 
> On Tue, Feb 3, 2009 at 11:58 AM, wojtekpia <wo...@hotmail.com> wrote:
>> I noticed your wiki post about sorting with a function query instead of
>> the
>> Lucene sort mechanism. Did you see a significantly reduced memory
>> footprint
>> by doing this?
> 
> FunctionQuery derives field values from the FieldCache... so it would
> use the same amount of memory as sorting.
> 
> -Yonik
> 
> 

-- 
View this message in context: http://www.nabble.com/Performance-%22dead-zone%22-due-to-garbage-collection-tp21588427p21922773.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Performance "dead-zone" due to garbage collection

Posted by Yonik Seeley <ys...@gmail.com>.
On Tue, Feb 3, 2009 at 11:58 AM, wojtekpia <wo...@hotmail.com> wrote:
> I noticed your wiki post about sorting with a function query instead of the
> Lucene sort mechanism. Did you see a significantly reduced memory footprint
> by doing this?

FunctionQuery derives field values from the FieldCache... so it would
use the same amount of memory as sorting.

-Yonik

Re: Performance "dead-zone" due to garbage collection

Posted by wojtekpia <wo...@hotmail.com>.
I noticed your wiki post about sorting with a function query instead of the
Lucene sort mechanism. Did you see a significantly reduced memory footprint
by doing this? Did you reduce the number of fields you allowed users to sort
by?


Lance Norskog-2 wrote:
> 
> Sorting creates a large array with "roughly" an entry for every document
> in
> the index. If it is not on an 'integer' field it takes even more memory.
> If
> you do a sorted request and then don't sort for a while, that will drop
> the
> sort structures and trigger a giant GC.
> 
> We went through some serious craziness with sorting.
> 

-- 
View this message in context: http://www.nabble.com/Performance-%22dead-zone%22-due-to-garbage-collection-tp21588427p21814038.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Performance "dead-zone" due to garbage collection

Posted by Lance Norskog <go...@gmail.com>.
Sorting creates a large array with "roughly" an entry for every document in
the index. If it is not on an 'integer' field it takes even more memory. If
you do a sorted request and then don't sort for a while, that will drop the
sort structures and trigger a giant GC.

We went through some serious craziness with sorting.

On Fri, Jan 30, 2009 at 3:54 PM, wojtekpia <wo...@hotmail.com> wrote:

>
> I profiled our application, and GC is definitely the problem. The IBM JVM
> didn't change much. I'm currently looking into ways of reducing my memory
> footprint.
>
> --
> View this message in context:
> http://www.nabble.com/Performance-%22dead-zone%22-due-to-garbage-collection-tp21588427p21758001.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>


-- 
Lance Norskog
goksron@gmail.com
650-922-8831 (US)

RE: Performance "dead-zone" due to garbage collection

Posted by wojtekpia <wo...@hotmail.com>.
I profiled our application, and GC is definitely the problem. The IBM JVM
didn't change much. I'm currently looking into ways of reducing my memory
footprint. 

-- 
View this message in context: http://www.nabble.com/Performance-%22dead-zone%22-due-to-garbage-collection-tp21588427p21758001.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: Performance "dead-zone" due to garbage collection

Posted by Renaud Waldura <re...@library.ucsf.edu>.
I'm coming in late on this thread, but I want to recommend the YourKit
Profiler product. It helped me track a performance problem similar to what
you describe. I had been futzing with GC logging etc. for days before
YourKit pinpointed the issue within minutes.

http://www.yourkit.com/

(My problem turned out to be silly. Straight Lucene, not Solr; the index was
opened and closed on every request. It worked OK for a few hours, then a
giant full GC kicked in, which froze the VM for minutes. Doh!)

Anyway, it may help you identify how much memory is used per request, etc.
and tune GC accordingly. Good luck!

--Renaud


-----Original Message-----
From: Feak, Todd [mailto:Todd.Feak@smss.sony.com] 
Sent: Friday, January 23, 2009 8:13 AM
To: solr-user@lucene.apache.org
Subject: RE: Performance "dead-zone" due to garbage collection

Can you share your experience with the IBM JDK once you've evaluated it?
You are working with a heavy load, I think many would benefit from the
feedback.

-Todd Feak

-----Original Message-----
From: wojtekpia [mailto:wojtek_p@hotmail.com]
Sent: Thursday, January 22, 2009 3:46 PM
To: solr-user@lucene.apache.org
Subject: Re: Performance "dead-zone" due to garbage collection


I'm not sure if you suggested it, but I'd like to try the IBM JVM. Aside
from setting my JRE paths, is there anything else I need to do to run
inside the IBM JVM? (e.g. re-compiling?)


Walter Underwood wrote:
> 
> What JVM and garbage collector setting? We are using the IBM JVM with
> their concurrent generational collector. I would strongly recommend
> trying a similar collector on your JVM. Hint: how much memory is in
> use after a full GC? That is a good approximation to the working set.
> 
> 

-- 
View this message in context:
http://www.nabble.com/Performance-%22dead-zone%22-due-to-garbage-collection-tp21588427p21616078.html
Sent from the Solr - User mailing list archive at Nabble.com.





RE: Performance "dead-zone" due to garbage collection

Posted by "Feak, Todd" <To...@smss.sony.com>.
Can you share your experience with the IBM JDK once you've evaluated it?
You are working with a heavy load, I think many would benefit from the
feedback.

-Todd Feak

-----Original Message-----
From: wojtekpia [mailto:wojtek_p@hotmail.com] 
Sent: Thursday, January 22, 2009 3:46 PM
To: solr-user@lucene.apache.org
Subject: Re: Performance "dead-zone" due to garbage collection


I'm not sure if you suggested it, but I'd like to try the IBM JVM. Aside
from setting my JRE paths, is there anything else I need to do to run
inside the IBM JVM? (e.g. re-compiling?)


Walter Underwood wrote:
> 
> What JVM and garbage collector setting? We are using the IBM JVM with
> their concurrent generational collector. I would strongly recommend
> trying a similar collector on your JVM. Hint: how much memory is in
> use after a full GC? That is a good approximation to the working set.
> 
> 

-- 
View this message in context:
http://www.nabble.com/Performance-%22dead-zone%22-due-to-garbage-collection-tp21588427p21616078.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Performance "dead-zone" due to garbage collection

Posted by Walter Underwood <wu...@netflix.com>.
No need to recompile. Install it and change your JAVA_HOME
and things should work. The options are different than for
the Sun JVM. --wunder
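A minimal sketch of that switch; the install path below is an assumption,
substitute wherever the IBM JDK actually lands:

```shell
# Sketch: switching the JVM by repointing JAVA_HOME. The path is an
# assumption; use the real IBM JDK install directory on your box.
export JAVA_HOME=/opt/ibm/java-x86_64-60
export PATH="$JAVA_HOME/bin:$PATH"
# "$JAVA_HOME/bin/java" -version   # should now report the IBM J9 VM
```

Remember that the GC option names differ between vendors, so any Sun
-XX flags in your startup scripts need translating too.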

On 1/22/09 3:46 PM, "wojtekpia" <wo...@hotmail.com> wrote:

> 
> I'm not sure if you suggested it, but I'd like to try the IBM JVM. Aside from
> setting my JRE paths, is there anything else I need to do to run inside the IBM
> JVM? (e.g. re-compiling?)
> 
> 
> Walter Underwood wrote:
>> 
>> What JVM and garbage collector setting? We are using the IBM JVM with
>> their concurrent generational collector. I would strongly recommend
>> trying a similar collector on your JVM. Hint: how much memory is in
>> use after a full GC? That is a good approximation to the working set.



Re: Performance "dead-zone" due to garbage collection

Posted by wojtekpia <wo...@hotmail.com>.
I'm not sure if you suggested it, but I'd like to try the IBM JVM. Aside from
setting my JRE paths, is there anything else I need to do to run inside the
IBM JVM? (e.g. re-compiling?)


Walter Underwood wrote:
> 
> What JVM and garbage collector setting? We are using the IBM JVM with
> their concurrent generational collector. I would strongly recommend
> trying a similar collector on your JVM. Hint: how much memory is in
> use after a full GC? That is a good approximation to the working set.
> 
> 

-- 
View this message in context: http://www.nabble.com/Performance-%22dead-zone%22-due-to-garbage-collection-tp21588427p21616078.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Performance "dead-zone" due to garbage collection

Posted by Walter Underwood <wu...@netflix.com>.
What JVM and garbage collector setting? We are using the IBM JVM with
their concurrent generational collector. I would strongly recommend
trying a similar collector on your JVM. Hint: how much memory is in
use after a full GC? That is a good approximation to the working set.

27GB is a very, very large heap. Is that really being used or is it
just filling up with garbage which makes the collections really long?

We run with a 4GB heap and really only need that to handle indexing
or starting new searchers. Searching only needs a 2GB heap for us.
Our full GC pauses for under a half second. Way longer than I'd like,
but that's Java (I still miss Python sometimes).

wunder

On 1/21/09 9:49 AM, "wojtekpia" <wo...@hotmail.com> wrote:

> 
> I'm intermittently experiencing severe performance drops due to Java garbage
> collection. I'm allocating a lot of RAM to my Java process (27GB of the 32GB
> physically available). Under heavy load, the performance drops approximately
> every 10 minutes, and the drop lasts for 30-40 seconds. This coincides with
> the size of the old generation heap dropping from ~27GB to ~6GB.
> 
> Is there a way to reduce the impact of garbage collection? A couple ideas
> we've come up with (but haven't tried yet) are: increasing the minimum heap
> size, more frequent (but hopefully less costly) garbage collection.
> 
> Thanks,
> 
> Wojtek