Posted to solr-user@lucene.apache.org by wojtekpia <wo...@hotmail.com> on 2009/02/04 20:37:46 UTC

Queued Requests during GC

During full garbage collection, Solr doesn't acknowledge incoming requests.
Any requests that were received during the GC are timestamped the moment GC
finishes (at least that's what my logs show). Is there a limit to how many
requests can queue up during a full GC? This doesn't seem like a Solr
setting, but rather a container/OS setting (I'm using Tomcat on Linux).

Thanks.

Wojtek
-- 
View this message in context: http://www.nabble.com/Queued-Requests-during-GC-tp21837898p21837898.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Queued Requests during GC

Posted by Sridhar Basam <ba...@stream.aol.net>.
That is the expected behaviour: all application threads are paused 
during a full GC (the CMS collector is an exception; its pauses are 
shorter and the application threads mostly keep running). The number of 
connections that end up queued depends on the acceptCount setting in 
your server.xml, on the inbound request rate, and on how long the GC 
takes to complete.

The OS will queue up to acceptCount requests before it begins to ignore 
incoming TCP connection requests. So if your inbound request rate is 2 
per second and a full GC takes 6 seconds to complete, you should have 12 
(2 x 6) new requests waiting for you when the GC completes.
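
For example, a connector along these lines (the values here are only 
illustrative, not a recommendation) would let the OS hold up to 100 
pending connections for you while the JVM is paused:

  <!-- server.xml: acceptCount becomes the listen() backlog used for
       connections that arrive while no thread can accept them -->
  <Connector port="8080" protocol="HTTP/1.1"
             maxThreads="150"
             acceptCount="100"
             connectionTimeout="20000" />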

  Sridhar


wojtekpia wrote:
> During full garbage collection, Solr doesn't acknowledge incoming requests.
> Any requests that were received during the GC are timestamped the moment GC
> finishes (at least that's what my logs show). Is there a limit to how many
> requests can queue up during a full GC? This doesn't seem like a Solr
> setting, but rather a container/OS setting (I'm using Tomcat on Linux).
>
> Thanks.
>
> Wojtek
>   


Re: Queued Requests during GC

Posted by Walter Underwood <wu...@netflix.com>.
On 2/4/09 3:44 PM, "Chris Hostetter" <ho...@fucit.org> wrote:

> I don't think the Query class implementations themselves changed in
> any way that would have made them larger -- but if you switched from the
> standard parser to the dismax parser, or started using lots of boost
> queries, or started using prefix or wildcard queries, then yes: the Query
> objects used would have gotten bigger.

Could have been caused by fuzzy search, since we did that around
the same time. Lucene changed from 1.9 to 2.4, so I thought there
might have been some changes there.

wunder



Re: Queued Requests during GC

Posted by Chris Hostetter <ho...@fucit.org>.
: >> Aha! I bet that the full Query object became a lot more complicated
: >> between Solr 1.1 and 1.3. That would explain why we did 4X as much GC
: >> after the upgrade.

I don't think the Query class implementations themselves changed in 
any way that would have made them larger -- but if you switched from the 
standard parser to the dismax parser, or started using lots of boost 
queries, or started using prefix or wildcard queries, then yes: the Query 
objects used would have gotten bigger.

: Another approach is to get fancy with the load balancing and always
: send the same query back to the same server. That increases the
: effective cache size by the number of servers, but it forces a
: simplistic round-robin load balancing and you have to be careful
: with down servers to avoid blowing all the caches simultaneously.

at a certain point, if you have enough machines, a two-tiered LB situation 
starts to be worth consideration.  tier#1 can use hashing on the 
querystring to pick which tier#2 cluster to send the query to.  each 
tier#2 cluster can be fronted by a load balancer that picks the server to 
use based on whatever "workload" metric you want.  a small percentage of 
machines in any given cluster (or in every cluster) can be down w/o 
worrying about screwing up the caches or adversely affecting traffic -- you 
just can't let an entire cluster be down at once.
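
a rough sketch of the tier#1 hashing step (the cluster names are made 
up, and a real LB would express this in its own config language -- the 
point is just a stable hash of the raw query string):

  // tier#1 routing sketch: hash the query string so the same query
  // always lands on the same tier#2 cluster (and hits its caches)
  public class QueryHashRouter {
      private static final String[] CLUSTERS = {"solr-a", "solr-b", "solr-c"};

      public static String pickCluster(String queryString) {
          // mask off the sign bit so the bucket index is never negative
          int bucket = (queryString.hashCode() & 0x7fffffff) % CLUSTERS.length;
          return CLUSTERS[bucket];
      }

      public static void main(String[] args) {
          System.out.println(pickCluster("q=title:foo&rows=10"));
      }
  }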



-Hoss


Re: Queued Requests during GC

Posted by Walter Underwood <wu...@netflix.com>.
On 2/4/09 3:17 PM, "Mark Miller" <ma...@gmail.com> wrote:

> Walter Underwood wrote:
>> Aha! I bet that the full Query object became a lot more complicated
>> between Solr 1.1 and 1.3. That would explain why we did 4X as much GC
>> after the upgrade.
>> 
>> Items evicted from cache are tenured, so they contribute to the full GC.
>> With an HTTP cache in front, there is hardly anything left to be
>> cached, so there are lots of evictions. We get a query result cache
>> hit rate around 0.12.
>> 
>> wunder
>>   
> At 10%, have you considered just not using the cache? Is that worth all
> the extra work? Or are you not paying as much as you're losing in GC/cache
> time?

I was going to verify the source of the tenured garbage before starting
another round of trial-and-error tuning. Now that I have a good hunch,
I might spend some time on that after the Oscars (our peak day for the
year at Netflix).

Another approach is to get fancy with the load balancing and always
send the same query back to the same server. That increases the
effective cache size by the number of servers, but it forces a
simplistic round-robin load balancing and you have to be careful
with down servers to avoid blowing all the caches simultaneously.

At Infoseek, we learned that blowing all the caches when one server
goes down is a very bad idea.

wunder



Re: Queued Requests during GC

Posted by Mark Miller <ma...@gmail.com>.
Walter Underwood wrote:
> Aha! I bet that the full Query object became a lot more complicated
> between Solr 1.1 and 1.3. That would explain why we did 4X as much GC
> after the upgrade.
>
> Items evicted from cache are tenured, so they contribute to the full GC.
> With an HTTP cache in front, there is hardly anything left to be
> cached, so there are lots of evictions. We get a query result cache
> hit rate around 0.12.
>
> wunder
>   
At 10%, have you considered just not using the cache? Is that worth all 
the extra work? Or are you not paying as much as you're losing in GC/cache 
time?
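
(For reference, this is roughly the solrconfig.xml entry involved -- the 
sizes below are just the stock example values, so treat them as 
illustrative. Shrinking it, or leaving the element out entirely, is what 
I have in mind by not using it.)

  <!-- solrconfig.xml: the query result cache -->
  <queryResultCache
      class="solr.LRUCache"
      size="512"
      initialSize="512"
      autowarmCount="256"/>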

- Mark


Re: Queued Requests during GC

Posted by Walter Underwood <wu...@netflix.com>.
Aha! I bet that the full Query object became a lot more complicated
between Solr 1.1 and 1.3. That would explain why we did 4X as much GC
after the upgrade.

Items evicted from cache are tenured, so they contribute to the full GC.
With an HTTP cache in front, there is hardly anything left to be
cached, so there are lots of evictions. We get a query result cache
hit rate around 0.12.

wunder

On 2/4/09 3:01 PM, "Yonik Seeley" <ys...@gmail.com> wrote:

> On Wed, Feb 4, 2009 at 5:52 PM, Walter Underwood <wu...@netflix.com>
> wrote:
>> I have not had the time to pin it down, but I suspect that items
>> evicted from the query result cache contain a lot of objects.
>> Are the keys a full parse tree? That could be big.
> 
> Yes, keys are full Query objects.
> It would be non-trivial to switch to String given all of the things
> that can affect how a Query object is built.
> 
> -Yonik


Re: Queued Requests during GC

Posted by Yonik Seeley <ys...@gmail.com>.
On Wed, Feb 4, 2009 at 5:52 PM, Walter Underwood <wu...@netflix.com> wrote:
> I have not had the time to pin it down, but I suspect that items
> evicted from the query result cache contain a lot of objects.
> Are the keys a full parse tree? That could be big.

Yes, keys are full Query objects.
It would be non-trivial to switch to String given all of the things
that can affect how a Query object is built.

-Yonik

Re: Queued Requests during GC

Posted by Walter Underwood <wu...@netflix.com>.
On 2/4/09 2:48 PM, "Mark Miller" <ma...@gmail.com> wrote:

> If there are spots in Lucene/Solr that are producing so much garbage
> that we can't keep up, perhaps work can be done to address this once
> the issues are pinpointed.
> 
> - Mark

I have not had the time to pin it down, but I suspect that items
evicted from the query result cache contain a lot of objects.
Are the keys a full parse tree? That could be big.

wunder


Re: Queued Requests during GC

Posted by Mark Miller <ma...@gmail.com>.
Walter Underwood wrote:
> Also, only use as much heap as you really need. A larger heap
> means longer GCs.
>   
Right. Ideally you want to figure out how to get the longer pauses down. 
There is a lot of fiddling you can do to improve GC times.

On a multiprocessor machine you can parallelize collection of both the 
new and tenured spaces for a nice boost. You can resize the spaces within 
the heap as well. There is also a low-pause incremental collector you 
can try. A lot of this type of tuning takes trial and error and 
experience, though. A really helpful tool is visualgc, which lets you 
watch garbage collection for your app in real time. You can also use 
jconsole and other tools like that, but visualgc actually renders a view 
of the heap and it's easier to watch and get a feel for how garbage 
collection is working. If it's hard to get a GUI up, all of those tools 
work remotely as well.

You can find a lot of good info on things to try here:

http://java.sun.com/docs/hotspot/gc5.0/gc_tuning_5.html
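
As a rough illustration of the sorts of options being described (these 
particular flags and values are just examples to show the shape of it, 
not a recommendation -- what works depends on your heap and hardware):

  # set in CATALINA_OPTS / JAVA_OPTS; pick one collector family

  # parallel collection of both the new and the tenured generation:
  -XX:+UseParallelGC -XX:+UseParallelOldGC

  # or the low-pause, mostly-concurrent (CMS) collector:
  -XX:+UseConcMarkSweepGC

  # resize the young generation relative to the tenured generation:
  -XX:NewRatio=2

  # log what the collector is doing so you can see the effect:
  -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps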

If there are spots in Lucene/Solr that are producing so much garbage 
that we can't keep up, perhaps work can be done to address this once 
the issues are pinpointed.

- Mark

Re: Queued Requests during GC

Posted by Walter Underwood <wu...@netflix.com>.
This is when a load balancer helps. The requests sent around the
time that the GC starts will be stuck on that server, but later
ones can be sent to other servers.

We use a "least connections" load balancing strategy. Each connection
represents a request in progress, so this is the same as equalizing
the queue of requests for each server.

Also, only use as much heap as you really need. A larger heap
means longer GCs.

wunder

On 2/4/09 1:59 PM, "Yonik Seeley" <ys...@gmail.com> wrote:

> On Wed, Feb 4, 2009 at 3:12 PM, Otis Gospodnetic
> <ot...@yahoo.com> wrote:
>> I'd be curious if you could reproduce this in Jetty....
> 
> All application threads are blocked... it's going to be the same in
> Jetty or Tomcat or any other container that's pure Java.  There is an
> OS-level listen queue that has a certain depth (configurable in
> both Tomcat and Jetty and passed down to the OS when listen() for the
> socket is called).  If too many connections are initiated without
> being accepted, they will start being rejected.
> 
> See UNIX man pages for listen() and connect() for more details.
> 
> For Tomcat, the config param you want is "acceptCount"
> http://tomcat.apache.org/tomcat-6.0-doc/config/http.html
> 
> Increasing this will ensure that connections don't get rejected while
> a long GC is going on.
> 
> -Yonik


Re: Queued Requests during GC

Posted by Yonik Seeley <ys...@gmail.com>.
On Wed, Feb 4, 2009 at 3:12 PM, Otis Gospodnetic
<ot...@yahoo.com> wrote:
> I'd be curious if you could reproduce this in Jetty....

All application threads are blocked... it's going to be the same in
Jetty or Tomcat or any other container that's pure Java.  There is an
OS-level listen queue that has a certain depth (configurable in 
both Tomcat and Jetty and passed down to the OS when listen() for the 
socket is called).  If too many connections are initiated without
being accepted, they will start being rejected.

See UNIX man pages for listen() and connect() for more details.

For Tomcat, the config param you want is "acceptCount"
http://tomcat.apache.org/tomcat-6.0-doc/config/http.html

Increasing this will ensure that connections don't get rejected while
a long GC is going on.
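
(Just to illustrate where that backlog ends up -- this is plain java.net, 
not Tomcat's actual connector code:)

  import java.io.IOException;
  import java.net.ServerSocket;
  import java.net.Socket;

  public class BacklogDemo {
      public static void main(String[] args) throws IOException {
          // the second argument becomes the listen() backlog: how many
          // completed connections the OS will hold while nothing is
          // accept()ing, e.g. during a full GC. Tomcat's acceptCount is
          // handed down to the socket the same way.
          ServerSocket ss = new ServerSocket(8080, 100);
          Socket s = ss.accept(); // queued connections are handed over here
          s.close();
          ss.close();
      }
  }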

-Yonik

Re: Queued Requests during GC

Posted by Otis Gospodnetic <ot...@yahoo.com>.
Wojtek,

I'm not familiar with the details of Tomcat configuration, but this definitely sounds like a container issue, closely related to the JVM.

Doing a thread dump for the Java process (the JVM your Tomcat runs in) while the GC is running will show you which threads are blocked, and in turn that should point you in the right direction as far as Tomcat settings are concerned.  Sorry for not being able to give you a more specific answer.
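
For example, assuming a reasonably recent Sun JDK with the standard tools 
on your path, something like:

  jps -l               # list the running JVMs and their pids
  jstack <pid>         # print a thread dump to stdout
  # or: kill -3 <pid>  # dump goes to the JVM's stdout (catalina.out for Tomcat)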

Is this happening with the latest JVM from Sun?

I'd be curious if you could reproduce this in Jetty....

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



----- Original Message ----
> From: wojtekpia <wo...@hotmail.com>
> To: solr-user@lucene.apache.org
> Sent: Wednesday, February 4, 2009 2:37:46 PM
> Subject: Queued Requests during GC
> 
> 
> During full garbage collection, Solr doesn't acknowledge incoming requests.
> Any requests that were received during the GC are timestamped the moment GC
> finishes (at least that's what my logs show). Is there a limit to how many
> requests can queue up during a full GC? This doesn't seem like a Solr
> setting, but rather a container/OS setting (I'm using Tomcat on Linux).
> 
> Thanks.
> 
> Wojtek
> -- 
> View this message in context: 
> http://www.nabble.com/Queued-Requests-during-GC-tp21837898p21837898.html
> Sent from the Solr - User mailing list archive at Nabble.com.