Posted to solr-user@lucene.apache.org by Michael <so...@gmail.com> on 2009/09/22 22:03:58 UTC

Parallel requests to Tomcat

Hi,
I have a Solr+Tomcat installation on an 8 CPU Linux box, and I just tried
sending parallel requests to it and measuring response time.  I would expect
that it could handle up to 8 parallel requests without significant slowdown
of any individual request.

Instead, I found that Tomcat is serializing the requests.

For example, the response time for each of 2 parallel requests is nearly 2
times that for a single request, and the time for each of 8 parallel
requests is about 4 times that of a single request.

I am pretty sure this is a Tomcat issue, for when I started 8 identical
instances of Solr+Tomcat on the machine (on 8 different ports), I could send
one request to each in parallel with only a 20% slowdown (compared to 300%
in a single Tomcat.)

I'm using the stock Tomcat download with minimal configuration changes,
except that I disabled all logging (in case the logger was blocking for each
request, serializing them.)  I'm giving 2G RAM to each JVM.

Does anyone more familiar with Tomcat know what's wrong?  I can't imagine
that Tomcat really can't handle parallel requests.

RE: Parallel requests to Tomcat

Posted by Fuad Efendi <fu...@efendi.ca>.
> 8 threads sharing something may have *some* overhead versus 8 processes, but
> as you say, 410ms overhead points to a different problem.


- You have a baseline (a single-threaded load-stress script sending requests to
SOLR: 1 request in parallel, 8 requests to 8 Tomcats); 200ms looks
extremely high, unless you are GETting more than the top 10000 docs in a
single query (instead of the default top 10).



Re: Parallel requests to Tomcat

Posted by Michael <so...@gmail.com>.
Hi Fuad,

On Wed, Sep 23, 2009 at 11:37 AM, Fuad Efendi <fu...@efendi.ca> wrote:

> > 8 queries against 1 Tomcat average 600ms per query, while 8 queries against
> > 8 Tomcats average 190ms per query (on a dedicated 8 CPU server w 32G RAM).
> > I don't see how to interpret these numbers except that Tomcat is not
> > multithreading as well as it should :)
>
> Hi Michael, I think it is very natural; 8 single processes not sharing
> anything are faster than 8 threads sharing something.
>

8 threads sharing something may have *some* overhead versus 8 processes, but
as you say, 410ms overhead points to a different problem.


> However, 600ms is too high.
>
> >My index is on a
> >ramfs which I've shown makes the QR and doc caches unnecessary;
>
> However, SOLR is faster than pure Lucene, try SOLR caches!
>

I have.  In a separate test, I verified that the caches that save disk I/O
(QR and doc) make no difference to query time, because my index is on a
ramfs.  The caches that save CPU cycles (filter and fieldvalue, because I'm
doing heavy faceting) DO help and I do have them turned on.

Michael

RE: Parallel requests to Tomcat

Posted by Fuad Efendi <fu...@efendi.ca>.
> 8 queries against 1 Tomcat average 600ms per query, while 8 queries against
> 8 Tomcats average 190ms per query (on a dedicated 8 CPU server w 32G RAM).
> I don't see how to interpret these numbers except that Tomcat is not
> multithreading as well as it should :)


Hi Michael, I think it is very natural; 8 single processes not sharing
anything are faster than 8 threads sharing something.


However, 600ms is too high.

>My index is on a
>ramfs which I've shown makes the QR and doc caches unnecessary;


However, SOLR is faster than pure Lucene, try SOLR caches! 

(I am not sure about the current version of SOLR (Lucene), but Lucene always
used a synchronized isDeleted() method, which causes 'serialization'.)



Re: Parallel requests to Tomcat

Posted by Michael <so...@gmail.com>.
Hi Fuad, thanks for the reply.
My queries are heavy enough that the difference in performance is obvious.
 I am using a home-grown load testing script that sends 1000 realistic
queries to the server and takes the average response time.  My index is on a
ramfs which I've shown makes the QR and doc caches unnecessary; I am warming
up the filter and fieldvalue caches before beginning the test.  There's no
appreciable difference between query times at the beginning, middle, or end
of the test, so I can't blame the hotspot or the Tomcat thread pool for not
being warmed up.

The queries I'm using are complex enough that they take a long time to run.
 8 queries against 1 Tomcat average 600ms per query, while 8 queries against
8 Tomcats average 190ms per query (on a dedicated 8 CPU server w 32G RAM).
 I don't see how to interpret these numbers except that Tomcat is not
multithreading as well as it should :)

Your thoughts?
Michael
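
To put the two averages above in throughput terms, here is a rough
back-of-the-envelope sketch. It assumes the 8 requests are kept in flight
continuously, so throughput is approximately concurrency divided by mean
latency; the function name is only illustrative and is not part of the
load-testing script mentioned above.

```python
def throughput_qps(concurrency, mean_latency_ms):
    """Approximate queries per second with `concurrency` requests always in flight."""
    return concurrency / (mean_latency_ms / 1000.0)


# Figures quoted in the message: 8 parallel queries in both setups.
one_tomcat = throughput_qps(8, 600)     # 8 threads in 1 Tomcat, 600ms each
eight_tomcats = throughput_qps(8, 190)  # 1 thread in each of 8 Tomcats, 190ms each

print(f"1 Tomcat:  {one_tomcat:.1f} qps")
print(f"8 Tomcats: {eight_tomcats:.1f} qps")
print(f"ratio:     {eight_tomcats / one_tomcat:.2f}x")
```

Under that assumption the single Tomcat serves roughly 13 queries per second
and the 8 Tomcats roughly 42, i.e. about a 3x gap on the same hardware.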


On Wed, Sep 23, 2009 at 10:48 AM, Fuad Efendi <fu...@efendi.ca> wrote:

> For 8-CPU load-stress testing of Tomcat you are probably making mistake:
> - you should execute load-stress software and wait 5-30 minutes (depends on
> index size) BEFORE taking measurements.
>
> 1. JVM HotSpot need to compile everything into native code
> 2. Tomcat Thread Pool needs warm up
> 3. SOLR caches need warm up(!)
> And etc.
>
> 8 parallel requests are too small for default Tomcat; it uses 150 threads
> (default for old versions), and new Concurrent package from Java 5
>
> You should not test manually; use software such as The Grinder etc., also
> note please: there is difference between mean time and response time,
> between average (successful) requests per second and average response
> time...
>
> > Tomcat is serializing the requests
> - doesn't mean anything for performance... yes, it has dedicated Listener
> on
> dedicated port dispatching requests to worker threads... and LAN NIC card
> serializes everything too...
>
>
>
> Fuad Efendi
> http://www.linkedin.com/in/liferay
>
>
>
> > -----Original Message-----
> > From: Michael [mailto:solrcoder@gmail.com]
> > Sent: September-22-09 4:04 PM
> > To: solr-user@lucene.apache.org
> > Subject: Parallel requests to Tomcat
> >
> > Hi,
> > I have a Solr+Tomcat installation on an 8 CPU Linux box, and I just tried
> > sending parallel requests to it and measuring response time.  I would
> expect
> > that it could handle up to 8 parallel requests without significant
> slowdown
> > of any individual request.
> >
> > Instead, I found that Tomcat is serializing the requests.
> >
> > For example, the response time for each of 2 parallel requests is nearly
> 2
> > times that for a single request, and the time for each of 8 parallel
> > requests is about 4 times that of a single request.
> >
> > I am pretty sure this is a Tomcat issue, for when I started 8 identical
> > instances of Solr+Tomcat on the machine (on 8 different ports), I could
> send
> > one request to each in parallel with only a 20% slowdown (compared to
> 300%
> > in a single Tomcat.)
> >
> > I'm using the stock Tomcat download with minimal configuration changes,
> > except that I disabled all logging (in case the logger was blocking for
> each
> > request, serializing them.)  I'm giving 2G RAM to each JVM.
> >
> > Does anyone more familiar with Tomcat know what's wrong?  I can't imagine
> > that Tomcat really can't handle parallel requests.
>
>
>

RE: Parallel requests to Tomcat

Posted by Fuad Efendi <fu...@efendi.ca>.
For 8-CPU load-stress testing of Tomcat you are probably making a mistake:
- you should execute the load-stress software and wait 5-30 minutes (depending
on index size) BEFORE taking measurements.

1. JVM HotSpot needs to compile everything into native code
2. The Tomcat thread pool needs to warm up
3. SOLR caches need to warm up(!)
etc.

8 parallel requests are too few for a default Tomcat; it uses 150 threads
(the default for old versions) and the new Concurrent package from Java 5.

You should not test manually; use software such as The Grinder. Also
please note: there is a difference between mean time and response time,
and between average (successful) requests per second and average response
time...
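
As an illustration of why a single average can mislead, here is a minimal
sketch that summarizes a list of response times with the mean, the median and a
nearest-rank 95th percentile. The sample values are made up for the example,
and `summarize` is a hypothetical helper, not something The Grinder provides.

```python
import math
import statistics


def summarize(latencies_ms):
    """Return mean, median, nearest-rank p95 and max of the given latencies."""
    ordered = sorted(latencies_ms)
    # Nearest-rank percentile: the smallest sample with at least 95% of
    # all samples at or below it.
    rank = max(1, math.ceil(0.95 * len(ordered)))
    return {
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "p95": ordered[rank - 1],
        "max": ordered[-1],
    }


# Made-up response times in ms: nine ordinary requests and one slow outlier.
samples = [120, 125, 130, 128, 122, 127, 131, 124, 126, 900]
print(summarize(samples))
```

With these invented numbers the mean is pulled up to about 203ms by the one
slow request, while the median stays at 126.5ms, so the two tell different
stories about the same run.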

> Tomcat is serializing the requests
- doesn't mean anything for performance... yes, it has a dedicated Listener on
a dedicated port dispatching requests to worker threads... and the LAN NIC card
serializes everything too...



Fuad Efendi
http://www.linkedin.com/in/liferay



> -----Original Message-----
> From: Michael [mailto:solrcoder@gmail.com]
> Sent: September-22-09 4:04 PM
> To: solr-user@lucene.apache.org
> Subject: Parallel requests to Tomcat
> 
> Hi,
> I have a Solr+Tomcat installation on an 8 CPU Linux box, and I just tried
> sending parallel requests to it and measuring response time.  I would
expect
> that it could handle up to 8 parallel requests without significant
slowdown
> of any individual request.
> 
> Instead, I found that Tomcat is serializing the requests.
> 
> For example, the response time for each of 2 parallel requests is nearly 2
> times that for a single request, and the time for each of 8 parallel
> requests is about 4 times that of a single request.
> 
> I am pretty sure this is a Tomcat issue, for when I started 8 identical
> instances of Solr+Tomcat on the machine (on 8 different ports), I could
send
> one request to each in parallel with only a 20% slowdown (compared to 300%
> in a single Tomcat.)
> 
> I'm using the stock Tomcat download with minimal configuration changes,
> except that I disabled all logging (in case the logger was blocking for
each
> request, serializing them.)  I'm giving 2G RAM to each JVM.
> 
> Does anyone more familiar with Tomcat know what's wrong?  I can't imagine
> that Tomcat really can't handle parallel requests.



Re: Parallel requests to Tomcat

Posted by Michael <so...@gmail.com>.
Thanks for the suggestion, Walter!  I've been using Gaze 1.0 for a while
now, but when I moved to a multicore approach (which was the impetus behind
all of this testing) Gaze failed to start and I had to comment it out of
solrconfig.xml to get Solr to start.  Are you aware whether Gaze is able to
work in a multicore environment?
Michael

On Wed, Sep 23, 2009 at 11:55 AM, Walter Underwood <wu...@wunderwood.org>wrote:

> This sure seems like a good time to try LucidGaze for Solr. That would give
> some Solr-specific profiling data.
>
> http://www.lucidimagination.com/Downloads/LucidGaze-for-Solr
>
> wunder
>
>
> On Sep 23, 2009, at 8:47 AM, Michael wrote:
>
>  Hi Yonik,
>>
>> On Wed, Sep 23, 2009 at 11:42 AM, Yonik Seeley
>> <yo...@lucidimagination.com>wrote:
>>
>>
>>> This could well be IO bound - lots of seeks and reads.
>>>
>>>
>> If this were IO bound, wouldn't I see the same results when sending my 8
>> requests to 8 Tomcats?  There's only one "disk" (well, RAM) whether I'm
>> querying 8 processes or 8 threads in 1 process, right?
>>
>> Michael
>>
>
>

Re: Parallel requests to Tomcat

Posted by Walter Underwood <wu...@wunderwood.org>.
This sure seems like a good time to try LucidGaze for Solr. That would  
give some Solr-specific profiling data.

http://www.lucidimagination.com/Downloads/LucidGaze-for-Solr

wunder

On Sep 23, 2009, at 8:47 AM, Michael wrote:

> Hi Yonik,
>
> On Wed, Sep 23, 2009 at 11:42 AM, Yonik Seeley
> <yo...@lucidimagination.com>wrote:
>
>>
>> This could well be IO bound - lots of seeks and reads.
>>
>
> If this were IO bound, wouldn't I see the same results when sending  
> my 8
> requests to 8 Tomcats?  There's only one "disk" (well, RAM) whether  
> I'm
> querying 8 processes or 8 threads in 1 process, right?
>
> Michael


Re: Parallel requests to Tomcat

Posted by Michael <so...@gmail.com>.
Great news for Solr -- a third party library that I'm calling is serialized.
Silly me, I made a mistake when ruling out that library as the culprit
earlier.  Solr itself scales just great as I add threads.  JProfiler helped me
find the problem.
Sorry for the false alarm, and thanks for the suggestions!
Michael
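
For anyone who finds this thread later, the effect of a serialized call can be
reproduced with a small sketch. Everything here is an assumption made for
illustration: the "library call" is simulated with a sleep, and the global lock
stands in for whatever synchronization the real library uses. It is not the
library or the test script from this thread.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

WORK_SECONDS = 0.05              # stand-in for time spent inside the library call
library_lock = threading.Lock()  # hypothetical global lock inside the library


def unsynchronized_call():
    time.sleep(WORK_SECONDS)


def synchronized_call():
    # Only one thread at a time may do the work, so callers queue up.
    with library_lock:
        time.sleep(WORK_SECONDS)


def mean_latency(call, threads):
    """Run `call` once on each of `threads` threads; return the mean latency."""
    def timed(_):
        start = time.perf_counter()
        call()
        return time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=threads) as pool:
        latencies = list(pool.map(timed, range(threads)))
    return sum(latencies) / len(latencies)


for n in (1, 2, 8):
    free = mean_latency(unsynchronized_call, n)
    locked = mean_latency(synchronized_call, n)
    print(f"{n} threads: unsynchronized {free * 1000:.0f}ms, "
          f"synchronized {locked * 1000:.0f}ms")
```

When the call is fully serialized, N simultaneous callers wait in a queue, so
the mean latency grows to roughly (N+1)/2 times the single-call time, while the
unsynchronized version stays close to flat.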

On Fri, Sep 25, 2009 at 10:02 AM, Michael <so...@gmail.com> wrote:

> Thank you Grant and Lance for your comments -- I've run into a separate
> snag which puts this on hold for a bit, but I'll return to finish digging
> into this and post my results. - Michael
>
> On Thu, Sep 24, 2009 at 9:23 PM, Lance Norskog <go...@gmail.com> wrote:
>
>> Are you on Java 5, 6 or 7? Each release sees some tweaking of the Java
>> multithreading model as well as performance improvements (and bug
>> fixes) in the Sun HotSpot runtime.
>>
>> You may be tripping over the TCP/IP multithreaded connection manager.
>> You might wish to create each client thread with a separate socket.
>>
>> Also, here is a standard bit of benchmarking advice: include "think
>> time". This means that instead of sending requests constantly, each
>> thread should time out for a few seconds before sending the next
>> request. This simulates a user "stopping and thinking" before clicking
>> the mouse again. This helps simulate the quantity of threads, etc.
>> which are stopped and waiting at each stage of the request pipeline.
>> As it is, you are trying to simulate the throughput behaviour without
>> simulating the horizontal volume. (Benchmarking is much harder than it
>> looks.)
>>
>> On Wed, Sep 23, 2009 at 9:43 AM, Grant Ingersoll <gs...@apache.org>
>> wrote:
>> >
>> > On Sep 23, 2009, at 12:09 PM, Michael wrote:
>> >
>> >> On Wed, Sep 23, 2009 at 12:05 PM, Yonik Seeley
>> >> <yo...@lucidimagination.com>wrote:
>> >>
>> >>> On Wed, Sep 23, 2009 at 11:47 AM, Michael <so...@gmail.com>
>> wrote:
>> >>>>
>> >>>> If this were IO bound, wouldn't I see the same results when sending
>> my 8
>> >>>> requests to 8 Tomcats?  There's only one "disk" (well, RAM) whether
>> I'm
>> >>>> querying 8 processes or 8 threads in 1 process, right?
>> >>>
>> >>> Right - I was thinking IO bound at the Lucene Directory level - which
>> >>> synchronized in the past and led to poor concurrency.  But your Solr
>> >>> version is recent enough to use the newer unsynchronized method by
>> >>> default (on non-windows)
>> >>>
>> >>
>> >> Ah, OK.  So it looks like comparing to Jetty is my only next step.
>> >>  Although
>> >> I'm not sure what I'm going to do based on the result of that test --
>> if
>> >> Jetty behaves differently, then I still don't know why the heck Tomcat
>> is
>> >> behaving badly! :)
>> >
>> >
>> > Have you done any profiling to see where hotspots are?  Have you looked
>> at
>> > garbage collection?  Do you have any full collections occurring?  What
>> > garbage collector are you using?  How often are you updating/committing,
>> > etc?
>> >
>> >
>> > --------------------------
>> > Grant Ingersoll
>> > http://www.lucidimagination.com/
>> >
>> > Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using
>> > Solr/Lucene:
>> > http://www.lucidimagination.com/search
>> >
>> >
>>
>>
>>
>> --
>> Lance Norskog
>> goksron@gmail.com
>>
>
>

Re: Parallel requests to Tomcat

Posted by Michael <so...@gmail.com>.
Thank you Grant and Lance for your comments -- I've run into a separate snag
which puts this on hold for a bit, but I'll return to finish digging into
this and post my results. - Michael
On Thu, Sep 24, 2009 at 9:23 PM, Lance Norskog <go...@gmail.com> wrote:

> Are you on Java 5, 6 or 7? Each release sees some tweaking of the Java
> multithreading model as well as performance improvements (and bug
> fixes) in the Sun HotSpot runtime.
>
> You may be tripping over the TCP/IP multithreaded connection manager.
> You might wish to create each client thread with a separate socket.
>
> Also, here is a standard bit of benchmarking advice: include "think
> time". This means that instead of sending requests constantly, each
> thread should time out for a few seconds before sending the next
> request. This simulates a user "stopping and thinking" before clicking
> the mouse again. This helps simulate the quantity of threads, etc.
> which are stopped and waiting at each stage of the request pipeline.
> As it is, you are trying to simulate the throughput behaviour without
> simulating the horizontal volume. (Benchmarking is much harder than it
> looks.)
>
> On Wed, Sep 23, 2009 at 9:43 AM, Grant Ingersoll <gs...@apache.org>
> wrote:
> >
> > On Sep 23, 2009, at 12:09 PM, Michael wrote:
> >
> >> On Wed, Sep 23, 2009 at 12:05 PM, Yonik Seeley
> >> <yo...@lucidimagination.com>wrote:
> >>
> >>> On Wed, Sep 23, 2009 at 11:47 AM, Michael <so...@gmail.com> wrote:
> >>>>
> >>>> If this were IO bound, wouldn't I see the same results when sending my
> 8
> >>>> requests to 8 Tomcats?  There's only one "disk" (well, RAM) whether
> I'm
> >>>> querying 8 processes or 8 threads in 1 process, right?
> >>>
> >>> Right - I was thinking IO bound at the Lucene Directory level - which
> >>> synchronized in the past and led to poor concurrency.  But your Solr
> >>> version is recent enough to use the newer unsynchronized method by
> >>> default (on non-windows)
> >>>
> >>
> >> Ah, OK.  So it looks like comparing to Jetty is my only next step.
> >>  Although
> >> I'm not sure what I'm going to do based on the result of that test -- if
> >> Jetty behaves differently, then I still don't know why the heck Tomcat
> is
> >> behaving badly! :)
> >
> >
> > Have you done any profiling to see where hotspots are?  Have you looked
> at
> > garbage collection?  Do you have any full collections occurring?  What
> > garbage collector are you using?  How often are you updating/committing,
> > etc?
> >
> >
> > --------------------------
> > Grant Ingersoll
> > http://www.lucidimagination.com/
> >
> > Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using
> > Solr/Lucene:
> > http://www.lucidimagination.com/search
> >
> >
>
>
>
> --
> Lance Norskog
> goksron@gmail.com
>

Re: Parallel requests to Tomcat

Posted by Lance Norskog <go...@gmail.com>.
Are you on Java 5, 6 or 7? Each release sees some tweaking of the Java
multithreading model as well as performance improvements (and bug
fixes) in the Sun HotSpot runtime.

You may be tripping over the TCP/IP multithreaded connection manager.
You might wish to create each client thread with a separate socket.

Also, here is a standard bit of benchmarking advice: include "think
time". This means that instead of sending requests constantly, each
thread should time out for a few seconds before sending the next
request. This simulates a user "stopping and thinking" before clicking
the mouse again. This helps simulate the quantity of threads, etc.
which are stopped and waiting at each stage of the request pipeline.
As it is, you are trying to simulate the throughput behaviour without
simulating the horizontal volume. (Benchmarking is much harder than it
looks.)
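
A minimal sketch of a client loop with think time, assuming a caller-supplied
`send_request` function; that name, `run_client` and the 1-3 second default
range are all hypothetical choices for illustration. The `sleep` parameter
exists only so the pause can be replaced when trying the loop out.

```python
import random
import time


def run_client(send_request, requests, think_seconds=(1.0, 3.0), sleep=time.sleep):
    """Send `requests` requests, pausing a random think time between them."""
    latencies = []
    for i in range(requests):
        start = time.perf_counter()
        send_request()
        latencies.append(time.perf_counter() - start)
        if i < requests - 1:
            # Simulate the user stopping to think before the next click.
            sleep(random.uniform(*think_seconds))
    return latencies


# Example run with a fake 10ms request and a very short think time.
measured = run_client(lambda: time.sleep(0.01), 3, think_seconds=(0.01, 0.02))
print([f"{latency * 1000:.0f}ms" for latency in measured])
```

Each simulated user would run this loop on its own thread, so the number of
threads, rather than the request rate of any single thread, sets the load.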

On Wed, Sep 23, 2009 at 9:43 AM, Grant Ingersoll <gs...@apache.org> wrote:
>
> On Sep 23, 2009, at 12:09 PM, Michael wrote:
>
>> On Wed, Sep 23, 2009 at 12:05 PM, Yonik Seeley
>> <yo...@lucidimagination.com>wrote:
>>
>>> On Wed, Sep 23, 2009 at 11:47 AM, Michael <so...@gmail.com> wrote:
>>>>
>>>> If this were IO bound, wouldn't I see the same results when sending my 8
>>>> requests to 8 Tomcats?  There's only one "disk" (well, RAM) whether I'm
>>>> querying 8 processes or 8 threads in 1 process, right?
>>>
>>> Right - I was thinking IO bound at the Lucene Directory level - which
>>> synchronized in the past and led to poor concurrency.  But your Solr
>>> version is recent enough to use the newer unsynchronized method by
>>> default (on non-windows)
>>>
>>
>> Ah, OK.  So it looks like comparing to Jetty is my only next step.
>>  Although
>> I'm not sure what I'm going to do based on the result of that test -- if
>> Jetty behaves differently, then I still don't know why the heck Tomcat is
>> behaving badly! :)
>
>
> Have you done any profiling to see where hotspots are?  Have you looked at
> garbage collection?  Do you have any full collections occurring?  What
> garbage collector are you using?  How often are you updating/committing,
> etc?
>
>
> --------------------------
> Grant Ingersoll
> http://www.lucidimagination.com/
>
> Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using
> Solr/Lucene:
> http://www.lucidimagination.com/search
>
>



-- 
Lance Norskog
goksron@gmail.com

Re: Parallel requests to Tomcat

Posted by Grant Ingersoll <gs...@apache.org>.
On Sep 23, 2009, at 12:09 PM, Michael wrote:

> On Wed, Sep 23, 2009 at 12:05 PM, Yonik Seeley
> <yo...@lucidimagination.com>wrote:
>
>> On Wed, Sep 23, 2009 at 11:47 AM, Michael <so...@gmail.com>  
>> wrote:
>>> If this were IO bound, wouldn't I see the same results when  
>>> sending my 8
>>> requests to 8 Tomcats?  There's only one "disk" (well, RAM)  
>>> whether I'm
>>> querying 8 processes or 8 threads in 1 process, right?
>>
>> Right - I was thinking IO bound at the Lucene Directory level - which
>> synchronized in the past and led to poor concurrency.  But your Solr
>> version is recent enough to use the newer unsynchronized method by
>> default (on non-windows)
>>
>
> Ah, OK.  So it looks like comparing to Jetty is my only next step.   
> Although
> I'm not sure what I'm going to do based on the result of that test  
> -- if
> Jetty behaves differently, then I still don't know why the heck  
> Tomcat is
> behaving badly! :)


Have you done any profiling to see where hotspots are?  Have you  
looked at garbage collection?  Do you have any full collections  
occurring?  What garbage collector are you using?  How often are you  
updating/committing, etc?


--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
using Solr/Lucene:
http://www.lucidimagination.com/search


Re: Parallel requests to Tomcat

Posted by Michael <so...@gmail.com>.
On Wed, Sep 23, 2009 at 12:05 PM, Yonik Seeley
<yo...@lucidimagination.com>wrote:

> On Wed, Sep 23, 2009 at 11:47 AM, Michael <so...@gmail.com> wrote:
> > If this were IO bound, wouldn't I see the same results when sending my 8
> > requests to 8 Tomcats?  There's only one "disk" (well, RAM) whether I'm
> > querying 8 processes or 8 threads in 1 process, right?
>
> Right - I was thinking IO bound at the Lucene Directory level - which
> synchronized in the past and led to poor concurrency.  But your Solr
> version is recent enough to use the newer unsynchronized method by
> default (on non-windows)
>

Ah, OK.  So it looks like comparing to Jetty is my only next step.  Although
I'm not sure what I'm going to do based on the result of that test -- if
Jetty behaves differently, then I still don't know why the heck Tomcat is
behaving badly! :)

Michael

Re: Parallel requests to Tomcat

Posted by Yonik Seeley <yo...@lucidimagination.com>.
On Wed, Sep 23, 2009 at 11:47 AM, Michael <so...@gmail.com> wrote:
> Hi Yonik,
>
> On Wed, Sep 23, 2009 at 11:42 AM, Yonik Seeley <yo...@lucidimagination.com>
> wrote:
>>
>> This could well be IO bound - lots of seeks and reads.
>
> If this were IO bound, wouldn't I see the same results when sending my 8
> requests to 8 Tomcats?  There's only one "disk" (well, RAM) whether I'm
> querying 8 processes or 8 threads in 1 process, right?

Right - I was thinking IO bound at the Lucene Directory level - which
synchronized in the past and led to poor concurrency.  But your Solr
version is recent enough to use the newer unsynchronized method by
default (on non-windows)

-Yonik
http://www.lucidimagination.com

Re: Parallel requests to Tomcat

Posted by Michael <so...@gmail.com>.
Hi Yonik,

On Wed, Sep 23, 2009 at 11:42 AM, Yonik Seeley
<yo...@lucidimagination.com>wrote:

>
> This could well be IO bound - lots of seeks and reads.
>

If this were IO bound, wouldn't I see the same results when sending my 8
requests to 8 Tomcats?  There's only one "disk" (well, RAM) whether I'm
querying 8 processes or 8 threads in 1 process, right?

Michael

Re: Parallel requests to Tomcat

Posted by Yonik Seeley <yo...@lucidimagination.com>.
On Wed, Sep 23, 2009 at 11:17 AM, Michael <so...@gmail.com> wrote:
> I'm using a Solr 1.4 nightly from around July.  Is that recent enough to
> have the improved reader implementation?
> I'm not sure whether you'd call my operations IO heavy -- each query has so
> many terms (~50) that even against a 45K document index a query takes 130ms,
> but the entire index is in a ramfs.

This could well be IO bound - lots of seeks and reads.
Perhaps try on a normal filesystem and see if you still see the
serialization - someone recently saw some funny results with tmpfs and
lucene, so it would be good to rule that out.

If you want to try and rule out tomcat, throw the webapp in jetty.

-Yonik
http://www.lucidimagination.com

Re: Parallel requests to Tomcat

Posted by Michael <so...@gmail.com>.
On Wed, Sep 23, 2009 at 11:26 AM, Fuad Efendi <fu...@efendi.ca> wrote:
>
> - something obviously wrong in your case, 130ms is too high. Is it
> dedicated
> server? Disk swapping? Etc.
>

It's that my queries are ridiculously complex.  My users are very familiar
with boolean searching, and I'm doing a lot of processing outside of Solr
that increases the query size by something like 50x.

I'm OK with the individual query time -- I can always shave terms off if I
must.  It's the difference between 1 Tomcat and 8 Tomcats that is the
problem: I'd like to be able to harness all 8 CPUs!  While my test corpus is
45K docs, my actual corpus will be 30MM, and so I'd like to get all the
performance I can out of my box.

Michael

RE: Parallel requests to Tomcat

Posted by Fuad Efendi <fu...@efendi.ca>.
Correction: 0 - 150ms (depends on size of query results; 150ms for
non-cached (new) queries returning more than 50K docs).


> -----Original Message-----
> From: Fuad Efendi [mailto:fuad@efendi.ca]
> Sent: September-23-09 11:26 AM
> To: solr-user@lucene.apache.org
> Subject: RE: Parallel requests to Tomcat
> 
> 
> I have 0-15ms for 50M (millions docs), Tomcat, 8-CPU:
> http://www.tokenizer.org
> ==================================
> 
> - something obviously wrong in your case, 130ms is too high. Is it
dedicated
> server? Disk swapping? Etc.
> 
> 
> 
> > -----Original Message-----
> > From: Michael [mailto:solrcoder@gmail.com]
> > Sent: September-23-09 11:17 AM
> > To: solr-user@lucene.apache.org; yonik@lucidimagination.com
> > Subject: Re: Parallel requests to Tomcat
> >
> > I'm using a Solr 1.4 nightly from around July.  Is that recent enough to
> > have the improved reader implementation?
> > I'm not sure whether you'd call my operations IO heavy -- each query has
> so
> > many terms (~50) that even against a 45K document index a query takes
> 130ms,
> > but the entire index is in a ramfs.
> > - Michael
> >
> > On Tue, Sep 22, 2009 at 8:08 PM, Yonik Seeley
> > <yo...@lucidimagination.com>wrote:
> >
> > > What version of Solr are you using?
> > > Solr1.3 and Lucene 2.4 defaulted to an index reader implementation
> > > that had to synchronize, so search operations that are IO "heavy"
> > > can't proceed in parallel.  You shouldn't see this with 1.4
> > >
> > > -Yonik
> > > http://www.lucidimagination.com
> > >
> > >
> > >
> > > On Tue, Sep 22, 2009 at 4:03 PM, Michael <so...@gmail.com> wrote:
> > > > Hi,
> > > > I have a Solr+Tomcat installation on an 8 CPU Linux box, and I just
> tried
> > > > sending parallel requests to it and measuring response time.  I
would
> > > expect
> > > > that it could handle up to 8 parallel requests without significant
> > > slowdown
> > > > of any individual request.
> > > >
> > > > Instead, I found that Tomcat is serializing the requests.
> > > >
> > > > For example, the response time for each of 2 parallel requests is
> nearly
> > > 2
> > > > times that for a single request, and the time for each of 8 parallel
> > > > requests is about 4 times that of a single request.
> > > >
> > > > I am pretty sure this is a Tomcat issue, for when I started 8
> identical
> > > > instances of Solr+Tomcat on the machine (on 8 different ports), I
> could
> > > send
> > > > one request to each in parallel with only a 20% slowdown (compared
to
> > > 300%
> > > > in a single Tomcat.)
> > > >
> > > > I'm using the stock Tomcat download with minimal configuration
> changes,
> > > > except that I disabled all logging (in case the logger was blocking
> for
> > > each
> > > > request, serializing them.)  I'm giving 2G RAM to each JVM.
> > > >
> > > > Does anyone more familiar with Tomcat know what's wrong?  I can't
> imagine
> > > > that Tomcat really can't handle parallel requests.
> > > >
> > >
> 




RE: Parallel requests to Tomcat

Posted by Fuad Efendi <fu...@efendi.ca>.
I have 0-15ms for 50M (million) docs, Tomcat, 8-CPU:
http://www.tokenizer.org
==================================

- something is obviously wrong in your case; 130ms is too high. Is it a
dedicated server? Disk swapping? Etc.



> -----Original Message-----
> From: Michael [mailto:solrcoder@gmail.com]
> Sent: September-23-09 11:17 AM
> To: solr-user@lucene.apache.org; yonik@lucidimagination.com
> Subject: Re: Parallel requests to Tomcat
> 
> I'm using a Solr 1.4 nightly from around July.  Is that recent enough to
> have the improved reader implementation?
> I'm not sure whether you'd call my operations IO heavy -- each query has
so
> many terms (~50) that even against a 45K document index a query takes
130ms,
> but the entire index is in a ramfs.
> - Michael
> 
> On Tue, Sep 22, 2009 at 8:08 PM, Yonik Seeley
> <yo...@lucidimagination.com>wrote:
> 
> > What version of Solr are you using?
> > Solr1.3 and Lucene 2.4 defaulted to an index reader implementation
> > that had to synchronize, so search operations that are IO "heavy"
> > can't proceed in parallel.  You shouldn't see this with 1.4
> >
> > -Yonik
> > http://www.lucidimagination.com
> >
> >
> >
> > On Tue, Sep 22, 2009 at 4:03 PM, Michael <so...@gmail.com> wrote:
> > > Hi,
> > > I have a Solr+Tomcat installation on an 8 CPU Linux box, and I just
tried
> > > sending parallel requests to it and measuring response time.  I would
> > expect
> > > that it could handle up to 8 parallel requests without significant
> > slowdown
> > > of any individual request.
> > >
> > > Instead, I found that Tomcat is serializing the requests.
> > >
> > > For example, the response time for each of 2 parallel requests is
nearly
> > 2
> > > times that for a single request, and the time for each of 8 parallel
> > > requests is about 4 times that of a single request.
> > >
> > > I am pretty sure this is a Tomcat issue, for when I started 8
identical
> > > instances of Solr+Tomcat on the machine (on 8 different ports), I
could
> > send
> > > one request to each in parallel with only a 20% slowdown (compared to
> > 300%
> > > in a single Tomcat.)
> > >
> > > I'm using the stock Tomcat download with minimal configuration
changes,
> > > except that I disabled all logging (in case the logger was blocking
for
> > each
> > > request, serializing them.)  I'm giving 2G RAM to each JVM.
> > >
> > > Does anyone more familiar with Tomcat know what's wrong?  I can't
imagine
> > > that Tomcat really can't handle parallel requests.
> > >
> >



RE: Parallel requests to Tomcat

Posted by Fuad Efendi <fu...@efendi.ca>.
> I'm not sure whether you'd call my operations IO heavy -- each query has
so
> many terms (~50) that even against a 45K document index a query takes
130ms,
> but the entire index is in a ramfs.



The more terms, the longer it takes to find docset intersections (belonging to
each term); something in SOLR/Lucene is still synchronized...

Try to compare with smaller 1-term queries, different terms for parallel
requests...



Re: Parallel requests to Tomcat

Posted by Michael <so...@gmail.com>.
I'm using a Solr 1.4 nightly from around July.  Is that recent enough to
have the improved reader implementation?
I'm not sure whether you'd call my operations IO heavy -- each query has so
many terms (~50) that even against a 45K document index a query takes 130ms,
but the entire index is in a ramfs.
- Michael

On Tue, Sep 22, 2009 at 8:08 PM, Yonik Seeley <yo...@lucidimagination.com>wrote:

> What version of Solr are you using?
> Solr1.3 and Lucene 2.4 defaulted to an index reader implementation
> that had to synchronize, so search operations that are IO "heavy"
> can't proceed in parallel.  You shouldn't see this with 1.4
>
> -Yonik
> http://www.lucidimagination.com
>
>
>
> On Tue, Sep 22, 2009 at 4:03 PM, Michael <so...@gmail.com> wrote:
> > Hi,
> > I have a Solr+Tomcat installation on an 8 CPU Linux box, and I just tried
> > sending parallel requests to it and measuring response time.  I would
> expect
> > that it could handle up to 8 parallel requests without significant
> slowdown
> > of any individual request.
> >
> > Instead, I found that Tomcat is serializing the requests.
> >
> > For example, the response time for each of 2 parallel requests is nearly
> 2
> > times that for a single request, and the time for each of 8 parallel
> > requests is about 4 times that of a single request.
> >
> > I am pretty sure this is a Tomcat issue, for when I started 8 identical
> > instances of Solr+Tomcat on the machine (on 8 different ports), I could
> send
> > one request to each in parallel with only a 20% slowdown (compared to
> 300%
> > in a single Tomcat.)
> >
> > I'm using the stock Tomcat download with minimal configuration changes,
> > except that I disabled all logging (in case the logger was blocking for
> each
> > request, serializing them.)  I'm giving 2G RAM to each JVM.
> >
> > Does anyone more familiar with Tomcat know what's wrong?  I can't imagine
> > that Tomcat really can't handle parallel requests.
> >
>

Re: Parallel requests to Tomcat

Posted by Yonik Seeley <yo...@lucidimagination.com>.
What version of Solr are you using?
Solr1.3 and Lucene 2.4 defaulted to an index reader implementation
that had to synchronize, so search operations that are IO "heavy"
can't proceed in parallel.  You shouldn't see this with 1.4

-Yonik
http://www.lucidimagination.com



On Tue, Sep 22, 2009 at 4:03 PM, Michael <so...@gmail.com> wrote:
> Hi,
> I have a Solr+Tomcat installation on an 8 CPU Linux box, and I just tried
> sending parallel requests to it and measuring response time.  I would expect
> that it could handle up to 8 parallel requests without significant slowdown
> of any individual request.
>
> Instead, I found that Tomcat is serializing the requests.
>
> For example, the response time for each of 2 parallel requests is nearly 2
> times that for a single request, and the time for each of 8 parallel
> requests is about 4 times that of a single request.
>
> I am pretty sure this is a Tomcat issue, for when I started 8 identical
> instances of Solr+Tomcat on the machine (on 8 different ports), I could send
> one request to each in parallel with only a 20% slowdown (compared to 300%
> in a single Tomcat.)
>
> I'm using the stock Tomcat download with minimal configuration changes,
> except that I disabled all logging (in case the logger was blocking for each
> request, serializing them.)  I'm giving 2G RAM to each JVM.
>
> Does anyone more familiar with Tomcat know what's wrong?  I can't imagine
> that Tomcat really can't handle parallel requests.
>