Posted to java-user@lucene.apache.org by Peter Keegan <pe...@gmail.com> on 2006/02/22 20:52:21 UTC

Re: Throughput doesn't increase when using more concurrent threads

I am doing a performance comparison of Lucene on Linux vs Windows.

I have 2 identically configured servers (8 real CPUs x 3GHz Xeon
processors, 64GB RAM). One is running CentOS 4 Linux, the other is running
Windows Server 2003 Enterprise Edition x64. Both have 64-bit JVMs from Sun.
The Lucene server is using MMapDirectory. I'm running the JVM with
-Xmx16000M. Peak memory usage of the JVM is about 6GB on Linux and 7.8GB on
Windows.

I'm observing query rates of 330 queries/sec on the Wintel server, but only
200 qps on the Linux box. At first, I suspected a network bottleneck, but
when I 'short-circuited' Lucene, the query rates were identical.

I suspect that there are some things to be tuned in Linux, but I'm not sure
what. Any advice would be appreciated.

Peter
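
For readers who want to reproduce this kind of comparison, here is a minimal sketch of a throughput probe. It is not the query tester used in this thread. The class name ThroughputProbe, the method measureQps and the stand-in "query" are all hypothetical; a real test would replace the Runnable with an actual search call against the index.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical harness: counts how many tasks a fixed number of threads
// complete in a fixed time window and reports the rate per second.
class ThroughputProbe {

    /** Runs `query` in a loop on `threads` workers for about `millis` ms; returns tasks/sec. */
    static double measureQps(int threads, long millis, Runnable query) {
        final AtomicLong completed = new AtomicLong();
        final long start = System.nanoTime();
        final long deadline = start + TimeUnit.MILLISECONDS.toNanos(millis);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            pool.execute(() -> {
                // Each worker keeps issuing "queries" until the window closes.
                while (System.nanoTime() - deadline < 0) {
                    query.run();
                    completed.incrementAndGet();
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(millis + 5000, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        double seconds = (System.nanoTime() - start) / 1e9;
        return completed.get() / seconds;
    }

    public static void main(String[] args) {
        // Stand-in for a real search call (assumption): burns a little CPU.
        Runnable fakeQuery = () -> {
            double x = 0;
            for (int i = 0; i < 100_000; i++) x += Math.sqrt(i);
            if (x < 0) throw new IllegalStateException();
        };
        int cores = Runtime.getRuntime().availableProcessors();
        for (int t = 1; t <= cores; t *= 2) {
            System.out.printf("%d threads: %.0f queries/sec%n", t, measureQps(t, 1000, fakeQuery));
        }
    }
}
```

Running the same probe with the same workload on both machines, at several thread counts, shows whether the rate stops scaling at the same point on each OS.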



On 1/30/06, Peter Keegan <pe...@gmail.com> wrote:
>
> I cranked up the dial on my query tester and was able to get the rate up
> to 325 qps. Unfortunately, the machine died shortly thereafter (memory
> errors :-( ) Hopefully, it was just a coincidence. I haven't measured 64-bit
> indexing speed, yet.
>
> Peter
>
> On 1/29/06, Daniel Noll <da...@nuix.com.au> wrote:
> >
> > Peter Keegan wrote:
> > > I tried the AMD64-bit JVM from Sun and with MMapDirectory and I'm now
> > > getting 250 queries/sec and excellent cpu utilization (equal concurrency
> > > on all cpus)!! Yonik, thanks for the pointer to the 64-bit jvm. I wasn't
> > > aware of it.
> > >
> > Wow.  That's fast.
> >
> > Out of interest, does indexing time speed up much on 64-bit hardware?
> > I'm particularly interested in this side of things because for our own
> > application, any query response under half a second is good enough, but
> > the indexing side could always be faster. :-)
> >
> > Daniel
> >
> > --
> > Daniel Noll
> >
> > Nuix Australia Pty Ltd
> > Suite 79, 89 Jones St, Ultimo NSW 2007, Australia
> > Phone: (02) 9280 0699
> > Fax:   (02) 9212 6902
> >
> > This message is intended only for the named recipient. If you are not
> > the intended recipient you are notified that disclosing, copying,
> > distributing or taking any action in reliance on the contents of this
> > message or attachment is strictly prohibited.
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org
> >
> >
>

Re: Throughput doesn't increase when using more concurrent threads

Posted by Raghavendra Prabhu <rr...@gmail.com>.
Hi

Sorry for the trouble

I was sending my first mail to the group

and replied to this thread and then later on sent a direct mail.

I would like to apologise for the inconvenience caused.

Rgds
Prabhu



Re: Throughput doesn't increase when using more concurrent threads

Posted by Otis Gospodnetic <ot...@yahoo.com>.
Hi,

Please ask on the Nutch mailing list (I answered your question in general@ already).
Also, please don't steal other people's threads - it's considered impolite for obvious reasons.

Otis


----- Original Message ----
From: Raghavendra Prabhu <rr...@gmail.com>
To: java-user@lucene.apache.org
Sent: Thursday, February 23, 2006 11:10:11 AM
Subject: Re: Throughput doesn't increase when using more concurrent threads

Can nutch be made to use lucene query parser?

Rgds
Prabhu




Re: Throughput doesn't increase when using more concurrent threads

Posted by Raghavendra Prabhu <rr...@gmail.com>.
Can nutch be made to use lucene query parser?

Rgds
Prabhu



Re: Throughput doesn't increase when using more concurrent threads

Posted by Dan Armbrust <da...@gmail.com>.
I would give the IBM or Blackdown JVM a try on Linux - I've seen pretty
wide variance in their speed on different operations.

Sometimes better than Sun, sometimes worse - it depended on the task. (I
did some ad hoc tests at one point that showed Sun was faster for
indexing, but IBM was faster for querying - but that was quite a while ago.)

Dan


-- 
****************************
Daniel Armbrust
Biomedical Informatics
Mayo Clinic Rochester
daniel.armbrust(at)mayo.edu
http://informatics.mayo.edu/



Re: Throughput doesn't increase when using more concurrent threads

Posted by Peter Keegan <pe...@gmail.com>.
Hi Otis,

The Lucene server is actually CPU and network bound, as the index gets
memory mapped pretty quickly. There is little disk activity observed.

I was also able to run the server on a Sun box last night with 4 dual-core
Opterons (same Linux and JVM) and I'm observing query rates of 400 qps! Has
Linux been optimized to run on this hardware? I imagine that Sun's JVM has
been.

Peter


Re: Throughput doesn't increase when using more concurrent threads

Posted by Otis Gospodnetic <ot...@yahoo.com>.
Hi,

Some things that could be different:
- thread scheduling (shouldn't make too much of a difference though)

- disk IO schedulers: I would also play with these, if you can.  CentOS is based on Red Hat, I believe, and Red Hat (ext3, really) now has about 4 different IO schedulers that, according to articles I recently read, can have an impact on disk read/write performance.  These schedulers can be specified at mount time, I believe, and maybe at boot time (kernel line in Grub/LILO).

Otis
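
A small sketch for checking which IO scheduler a box is actually using. Assumptions: as far as I know the scheduler is a kernel block-layer setting (selected with the elevator= kernel boot parameter) rather than a filesystem mount option, and kernels that expose /sys/block/<device>/queue/scheduler list the available schedulers on one line with the active one in square brackets. The file may not exist on older kernels, in which case the program prints nothing for that device. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: reports the active IO scheduler per block device.
class IoSchedulerCheck {

    /** Returns the bracketed (active) entry of a scheduler line, or null if none is marked. */
    static String activeScheduler(String line) {
        int open = line.indexOf('[');
        int close = line.indexOf(']', open + 1);
        if (open < 0 || close < 0) return null;
        return line.substring(open + 1, close);
    }

    public static void main(String[] args) throws IOException {
        Path sysBlock = Paths.get("/sys/block");
        if (!Files.isDirectory(sysBlock)) {
            System.out.println("No /sys/block here; nothing to report.");
            return;
        }
        try (DirectoryStream<Path> devices = Files.newDirectoryStream(sysBlock)) {
            for (Path dev : devices) {
                Path file = dev.resolve("queue/scheduler");
                if (!Files.isReadable(file)) continue; // older kernel or virtual device
                String line = new String(Files.readAllBytes(file), StandardCharsets.UTF_8).trim();
                System.out.println(dev.getFileName() + ": " + activeScheduler(line) + "  (" + line + ")");
            }
        }
    }
}
```

Since the index is memory mapped and little disk activity is reported elsewhere in this thread, the scheduler may well not matter here; the check is mainly a way to rule it out.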




Re: Throughput doesn't increase when using more concurrent threads

Posted by Peter Keegan <pe...@gmail.com>.
Good question. 'Top' reports the JVM at 99.9% CPU, but the individual CPUs
(top/1) don't seem to add up to 99.9. This server is actually two 8-CPU
servers whose backplanes are cabled together, so there may be some issue
here.  The network load is heavy, but doesn't seem to be the bottleneck (on
the Opteron server, the network is very close to being the bottleneck).
There is an expected monitor lock on an ArrayBlockingQueue, but it isn't
starving that thread.
So, at this point, I'm suspicious about the hardware setup.

Peter
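
One way to look for lock contention from inside the JVM, instead of eyeballing stack dumps, is the standard java.lang.management API. The sketch below is only an illustration, not something that was run in this thread; the class name ThreadStateSnapshot and the method countByState are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.EnumMap;
import java.util.Map;

// Hypothetical helper: one-shot summary of what the JVM's threads are doing.
class ThreadStateSnapshot {

    /** Counts live threads by state (RUNNABLE, BLOCKED, WAITING, ...). */
    static Map<Thread.State, Integer> countByState() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        for (ThreadInfo info : mx.getThreadInfo(mx.getAllThreadIds())) {
            if (info == null) continue; // thread exited between the two calls
            counts.merge(info.getThreadState(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countByState());
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        for (ThreadInfo info : mx.getThreadInfo(mx.getAllThreadIds())) {
            // BLOCKED means the thread is waiting to enter a synchronized block or method.
            if (info != null && info.getThreadState() == Thread.State.BLOCKED) {
                System.out.println(info.getThreadName() + " blocked on " + info.getLockName());
            }
        }
    }
}
```

Sampling this periodically under load gives a rough picture: many BLOCKED threads on the same lock name point at monitor contention, while mostly RUNNABLE threads with CPUs still idle point elsewhere (hardware, network, scheduling). Threads parked on java.util.concurrent locks generally show up as WAITING rather than BLOCKED.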


Re: Throughput doesn't increase when using more concurrent threads

Posted by Doug Cutting <cu...@apache.org>.
Peter Keegan wrote:
> I did some additional testing with Chris's patch and mine (based on Doug's
> note) vs. no patch and found that all 3 produced the same throughput - about
> 330 qps - over a longer period.

Was CPU utilization 100%?  If not, where do you think the bottleneck now
is?  Network?  Or some other Java monitor contention?

Doug



Re: Throughput doesn't increase when using more concurrent threads

Posted by Peter Keegan <pe...@gmail.com>.
I did some additional testing with Chris's patch and mine (based on Doug's
note) vs. no patch and found that all 3 produced the same throughput - about
330 qps - over a longer period. So, there seems to be a point of diminishing
returns to adding more cpus. The dual core Opterons (8 cpu) still win
handily at 400 qps.

Peter
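
For context, the patches compared above follow the idea discussed in the quoted messages below: give each thread its own reader/stream so that document() no longer has to be synchronized. The following is a minimal sketch of that idea only, not the actual patch. PerThreadReader, StatefulReader and the fake "seek and read" logic are hypothetical stand-ins for Lucene's internal classes.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration of per-thread readers via ThreadLocal.
class PerThreadReader {

    /** Stand-in for a reader with mutable state, e.g. a stream with a seek position. */
    static class StatefulReader {
        private long position; // shared use of this field would need a lock

        String read(int docId) {
            position = docId * 8L;          // "seek"
            return "doc-" + (position / 8); // "read" at the current position
        }
    }

    static final AtomicInteger CREATED = new AtomicInteger();

    // Each thread lazily gets its own reader the first time it calls document().
    private static final ThreadLocal<StatefulReader> LOCAL = ThreadLocal.withInitial(() -> {
        CREATED.incrementAndGet();
        return new StatefulReader();
    });

    /** Not synchronized: a thread only ever touches its own reader. */
    static String document(int docId) {
        return LOCAL.get().read(docId);
    }

    public static void main(String[] args) throws InterruptedException {
        Thread[] workers = new Thread[4];
        for (int i = 0; i < workers.length; i++) {
            final int base = i * 1000;
            workers[i] = new Thread(() -> {
                for (int d = 0; d < 1000; d++) {
                    String doc = document(base + d);
                    if (!doc.equals("doc-" + (base + d))) throw new IllegalStateException(doc);
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) w.join();
        System.out.println("readers created: " + CREATED.get());
    }
}
```

The trade-off is one reader (and its underlying resources) per thread in exchange for removing the lock, which only pays off if that lock was the bottleneck in the first place.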


On 3/13/06, Peter Keegan <pe...@gmail.com> wrote:
>
> Chris,
> My apologies - this error was apparently caused by a file format mismatch
> (probably line endings).
> Thanks,
> Peter
>
>
> On 3/13/06, Peter Keegan <pe...@gmail.com> wrote:
> >
> > Chris,
> >
> > Should this patch work against the current code base? I'm getting this
> > error:
> >
> > D:\lucene-1.9>patch -b -p0 -i nio-lucene-1.9.patch
> > patching file src/java/org/apache/lucene/index/CompoundFileReader.java
> > patching file src/java/org/apache/lucene/index/FieldsReader.java
> > missing header for unified diff at line 45 of patch
> > can't find file to patch at input line 45
> > Perhaps you used the wrong -p or --strip option?
> > The text leading up to this was:
> > --------------------------
> > | +47,9 @@
> > |     fieldsStream = d.openInput(segment + ".fdt");
> > |     indexStream = d.openInput(segment + ".fdx");
> > |
> > |+    fstream = new ThreadStream(fieldsStream);
> > |+    istream = new ThreadStream(indexStream);
> > |+
> > |     size = (int)(indexStream.length() / 8);
> > |   }
> > |
> > --------------------------
> >
> > Thanks,
> > Peter
> >
> >
> >
> > On 3/10/06, Chris Lamprecht <cl...@gmail.com> wrote:
> > >
> > > Peter,
> > >
> > > I think this is similar to the patch in this bugzilla task:
> > >
> > > http://issues.apache.org/bugzilla/show_bug.cgi?id=35838
> > > the patch itself is
> > > http://issues.apache.org/bugzilla/attachment.cgi?id=15757
> > >
> > > (BTW does JIRA have a way to display the patch diffs?)
> > >
> > > The above patch also has a change to SegmentReader to avoid
> > > synchronization on isDeleted().  However, with that patch, you no
> > > longer have the guarantee that one thread will immediately see
> > > deletions by another thread.  This was fine for my purposes, and
> > > > resulted in a big performance boost when there were deleted documents,
> > > > but it may not be "correct" for others' needs.
> > > >
> > > > -chris
> > > > On 3/10/06, Peter Keegan <peterlkeegan@gmail.com> wrote:
> > > > > 3. Use the ThreadLocal's FieldReader in the document() method.
> > > >
> > > > > As I understand it, this means that the document method no longer needs to
> > > > > be synchronized, right?
> > > > >
> > > > > I've made these changes and it does appear to improve performance. Random
> > > > > snapshots of the stack traces show only an occasional lock in 'isDeleted'.
> > > > > Mostly, though, the threads are busy scoring and adding results to priority
> > > > > queues, which is great. I've included some sample stacks, below. I'll report
> > > > > the new query rates after it has run for at least overnight, and I'd be
> > > > > happy to submit these changes to the lucene committers, if interested.
> > > >
> > > > Peter
> > > >
> > > >
> > > > > Sample stack traces:
> > > > >
> > > > > "QueryThread group 1,#8" prio=1 tid=0x0000002ce48eeb80 nid=0x6b87 runnable [0x0000000043887000..0x0000000043887bb0]
> > > > >     at org.apache.lucene.search.FieldSortedHitQueue.lessThan(FieldSortedHitQueue.java:108)
> > > > >     at org.apache.lucene.util.PriorityQueue.insert(PriorityQueue.java:61)
> > > > >     at org.apache.lucene.search.FieldSortedHitQueue.insert(FieldSortedHitQueue.java:85)
> > > > >     at org.apache.lucene.search.FieldSortedHitQueue.insert(FieldSortedHitQueue.java:92)
> > > > >     at org.apache.lucene.search.TopFieldDocCollector.collect(TopFieldDocCollector.java:51)
> > > > >     at org.apache.lucene.search.TermScorer.score(TermScorer.java:75)
> > > > >     at org.apache.lucene.search.TermScorer.score(TermScorer.java:60)
> > > > >     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:132)
> > > > >     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:110)
> > > > >     at org.apache.lucene.search.MultiSearcher.search(MultiSearcher.java:225)
> > > > >     at org.apache.lucene.search.Hits.getMoreDocs(Hits.java:65)
> > > > >     at org.apache.lucene.search.Hits.<init>(Hits.java:52)
> > > > >     at org.apache.lucene.search.Searcher.search(Searcher.java:62)
> > > > >
> > > > > "QueryThread group 1,#5" prio=1 tid=0x0000002ce4d659f0 nid=0x6b84 runnable [0x0000000043584000..0x0000000043584d30]
> > > > >     at org.apache.lucene.search.TermScorer.score(TermScorer.java:75)
> > > > >     at org.apache.lucene.search.TermScorer.score(TermScorer.java:60)
> > > > >     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:132)
> > > > >     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:110)
> > > > >     at org.apache.lucene.search.MultiSearcher.search(MultiSearcher.java:225)
> > > > >     at org.apache.lucene.search.Hits.getMoreDocs(Hits.java:65)
> > > > >     at org.apache.lucene.search.Hits.<init>(Hits.java:52)
> > > > >     at org.apache.lucene.search.Searcher.search(Searcher.java:62)
> > > > >
> > > > > "QueryThread group 1,#4" prio=1 tid=0x0000002ce10afd50 nid=0x6b83 runnable [0x0000000043483000..0x0000000043483db0]
> > > > >     at org.apache.lucene.store.MMapDirectory$MMapIndexInput.readByte(MMapDirectory.java:46)
> > > > >     at org.apache.lucene.store.IndexInput.readVInt(IndexInput.java:56)
> > > > >     at org.apache.lucene.index.SegmentTermDocs.next(SegmentTermDocs.java:101)
> > > > >     at org.apache.lucene.index.SegmentTermDocs.skipTo(SegmentTermDocs.java:194)
> > > > >     at org.apache.lucene.search.TermScorer.skipTo(TermScorer.java:144)
> > > > >     at org.apache.lucene.search.ConjunctionScorer.doNext(ConjunctionScorer.java:56)
> > > > >     at org.apache.lucene.search.ConjunctionScorer.next(ConjunctionScorer.java:51)
> > > > >     at org.apache.lucene.search.BooleanScorer2.score(BooleanScorer2.java:290)
> > > > >     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:132)
> > > > >     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:110)
> > > > >     at org.apache.lucene.search.MultiSearcher.search(MultiSearcher.java:225)
> > > > >     at org.apache.lucene.search.Hits.getMoreDocs(Hits.java:65)
> > > > >     at org.apache.lucene.search.Hits.<init>(Hits.java:52)
> > > > >     at org.apache.lucene.search.Searcher.search(Searcher.java:62)
> > > > >
> > > > > "QueryThread group 1,#3" prio=1 tid=0x0000002ce48959f0 nid=0x6b82 runnable [0x0000000043382000..0x0000000043382e30]
> > > > >     at java.util.LinkedList.listIterator(LinkedList.java:523)
> > > > >     at java.util.AbstractList.listIterator(AbstractList.java:349)
> > > > >     at java.util.AbstractSequentialList.iterator(AbstractSequentialList.java:250)
> > > > >     at org.apache.lucene.search.ConjunctionScorer.score(ConjunctionScorer.java:80)
> > > > >     at org.apache.lucene.search.BooleanScorer2$2.score(BooleanScorer2.java:186)
> > > > >     at org.apache.lucene.search.BooleanScorer2.score(BooleanScorer2.java:327)
> > > > >     at org.apache.lucene.search.BooleanScorer2.score(BooleanScorer2.java:291)
> > > > >     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:132)
> > > > >     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:110)
> > > > >     at org.apache.lucene.search.MultiSearcher.search(MultiSearcher.java:225)
> > > > >     at org.apache.lucene.search.Hits.getMoreDocs(Hits.java:65)
> > > > >     at org.apache.lucene.search.Hits.<init>(Hits.java:52)
> > > > >     at org.apache.lucene.search.Searcher.search(Searcher.java:62)
> > > >
> > > >
> > > > On 3/7/06, Doug Cutting < cutting@apache.org> wrote:
> > > > >
> > > > > Peter Keegan wrote:
> > > > > > > I ran a query performance tester against 8-cpu and 16-cpu Xeon servers
> > > > > > > (16/32 cpu hyperthreaded) on Linux. Here are the results:
> > > > > > >
> > > > > > > 8-cpu:  275 qps
> > > > > > > 16-cpu: 305 qps
> > > > > > > (the dual-core Opteron servers are still faster)
> > > > > > >
> > > > > > > Here is the stack trace of 8 of the 16 query threads during the test:
> > > > > > >
> > > > > > >         at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:281)
> > > > > > >         - waiting to lock <0x0000002adf5b2110> (a org.apache.lucene.index.SegmentReader)
> > > > > > >         at org.apache.lucene.search.IndexSearcher.doc(IndexSearcher.java:83)
> > > > > > >         at org.apache.lucene.search.MultiSearcher.doc(MultiSearcher.java:146)
> > > > > > >         at org.apache.lucene.search.Hits.doc(Hits.java:103)
> > > > > > >
> > > > > > > SegmentReader.document is a synchronized method. I have one stored field
> > > > > > > (binary, uncompressed) with an average length of 0.5 KB. The retrieval of
> > > > > > > this stored field is within this synchronized code. Since I am using
> > > > > > > MMapDirectory, does this retrieval need to be synchronized?
> > > > > >
> > > > > > Yes, since in FieldReader the file positions must be synchronized.
> > > > > >
> > > > > > The way to avoid this would be to:
> > > > > >
> > > > > > 1. Add a clone() method to FieldReader that clones its two IndexInputs.
> > > > > > 2. Add a ThreadLocal to SegmentReader whose value is a cloned FieldReader.
> > > > > > 3. Use the ThreadLocal's FieldReader in the document() method.
> > > > >
> > > > > TermInfosReader has a similar optimization, using a ThreadLocal
> > > > > containing a SegmentTermEnum for each thread.
> > > > >
> > > > > Doug
> > > > >
> > > > >
> > > ---------------------------------------------------------------------
> > > > > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > > > > For additional commands, e-mail: java-user-help@lucene.apache.org
> > > > >
> > > > >
> > > >
> > > >
> > >
> > >
> > >
> >
>

Re: Throughput doesn't increase when using more concurrent threads

Posted by Peter Keegan <pe...@gmail.com>.
Chris,
My apologies - this error was apparently caused by a file format mismatch
(probably line endings).
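
For anyone hitting the same "missing header for unified diff" error: the thread does not say which file had the wrong line endings, so the following is only a minimal sketch of one way to rule the problem out, by rewriting the patch file (and, if necessary, the sources it targets) with LF line endings before running patch again. The class and method names are made up for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Lucene or of the patch tool.
class PatchLineEndings {

    /** Converts CRLF and lone CR line endings to LF. */
    static String toUnixLineEndings(String text) {
        return text.replace("\r\n", "\n").replace('\r', '\n');
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java PatchLineEndings <input> <output>");
            return;
        }
        Path in = Paths.get(args[0]);
        Path out = Paths.get(args[1]);
        // ISO-8859-1 maps every byte to exactly one char, so bytes other
        // than CR and LF pass through the round trip unchanged.
        String text = new String(Files.readAllBytes(in), StandardCharsets.ISO_8859_1);
        Files.write(out, toUnixLineEndings(text).getBytes(StandardCharsets.ISO_8859_1));
    }
}
```

The same conversion run in the other direction (LF to CRLF) may be what is needed if the checked-out sources use Windows line endings; either way the patch and its targets have to agree.
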
Thanks,
Peter

On 3/13/06, Peter Keegan <pe...@gmail.com> wrote:
>
> Chris,
>
> Should this patch work against the current code base? I'm getting this
> error:
>
> D:\lucene-1.9>patch -b -p0 -i nio-lucene-1.9.patch
> patching file src/java/org/apache/lucene/index/CompoundFileReader.java
> patching file src/java/org/apache/lucene/index/FieldsReader.java
> missing header for unified diff at line 45 of patch
> can't find file to patch at input line 45
> Perhaps you used the wrong -p or --strip option?
> The text leading up to this was:
> --------------------------
> | +47,9 @@
> |     fieldsStream = d.openInput(segment + ".fdt");
> |     indexStream = d.openInput(segment + ".fdx");
> |
> |+    fstream = new ThreadStream(fieldsStream);
> |+    istream = new ThreadStream(indexStream);
> |+
> |     size = (int)(indexStream.length() / 8);
> |   }
> |
> --------------------------
>
> Thanks,
> Peter
>
>
>
> On 3/10/06, Chris Lamprecht <cl...@gmail.com> wrote:
> >
> > Peter,
> >
> > I think this is similar to the patch in this bugzilla task:
> >
> > http://issues.apache.org/bugzilla/show_bug.cgi?id=35838
> > the patch itself is
> > http://issues.apache.org/bugzilla/attachment.cgi?id=15757
> >
> > (BTW does JIRA have a way to display the patch diffs?)
> >
> > The above patch also has a change to SegmentReader to avoid
> > synchronization on isDeleted().  However, with that patch, you no
> > longer have the guarantee that one thread will immediately see
> > deletions by another thread.  This was fine for my purposes, and
> > resulted in a big performance boost when there were deleted documents,
> > but it may not be "correct" for others' needs.
> >
> > -chris
> > On 3/10/06, Peter Keegan <pe...@gmail.com> wrote:
> > > > 3. Use the ThreadLocal's FieldReader in the document() method.
> > >
> > > As I understand it, this means that the document method no longer needs to
> > > be synchronized, right?
> > >
> > > I've made these changes and it does appear to improve performance. Random
> > > snapshots of the stack traces show only an occasional lock in 'isDeleted'.
> > > Mostly, though, the threads are busy scoring and adding results to priority
> > > queues, which is great. I've included some sample stacks, below. I'll report
> > > the new query rates after it has run for at least overnight, and I'd be
> > > happy to submit these changes to the lucene committers, if interested.
> > >
> > > Peter
> > >
> > >
> > > Sample stack traces:
> > >
> > > "QueryThread group 1,#8" prio=1 tid=0x0000002ce48eeb80 nid=0x6b87 runnable [0x0000000043887000..0x0000000043887bb0]
> > >     at org.apache.lucene.search.FieldSortedHitQueue.lessThan(FieldSortedHitQueue.java:108)
> > >     at org.apache.lucene.util.PriorityQueue.insert(PriorityQueue.java:61)
> > >     at org.apache.lucene.search.FieldSortedHitQueue.insert(FieldSortedHitQueue.java:85)
> > >     at org.apache.lucene.search.FieldSortedHitQueue.insert(FieldSortedHitQueue.java:92)
> > >     at org.apache.lucene.search.TopFieldDocCollector.collect(TopFieldDocCollector.java:51)
> > >     at org.apache.lucene.search.TermScorer.score(TermScorer.java:75)
> > >     at org.apache.lucene.search.TermScorer.score(TermScorer.java:60)
> > >     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:132)
> > >     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:110)
> > >     at org.apache.lucene.search.MultiSearcher.search(MultiSearcher.java:225)
> > >     at org.apache.lucene.search.Hits.getMoreDocs(Hits.java:65)
> > >     at org.apache.lucene.search.Hits.<init>(Hits.java:52)
> > >     at org.apache.lucene.search.Searcher.search(Searcher.java:62)
> > >
> > > "QueryThread group 1,#5" prio=1 tid=0x0000002ce4d659f0 nid=0x6b84 runnable [0x0000000043584000..0x0000000043584d30]
> > >     at org.apache.lucene.search.TermScorer.score(TermScorer.java:75)
> > >     at org.apache.lucene.search.TermScorer.score(TermScorer.java:60)
> > >     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:132)
> > >     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:110)
> > >     at org.apache.lucene.search.MultiSearcher.search(MultiSearcher.java:225)
> > >     at org.apache.lucene.search.Hits.getMoreDocs(Hits.java:65)
> > >     at org.apache.lucene.search.Hits.<init>(Hits.java:52)
> > >     at org.apache.lucene.search.Searcher.search(Searcher.java:62)
> > >
> > > "QueryThread group 1,#4" prio=1 tid=0x0000002ce10afd50 nid=0x6b83 runnable [0x0000000043483000..0x0000000043483db0]
> > >     at org.apache.lucene.store.MMapDirectory$MMapIndexInput.readByte(MMapDirectory.java:46)
> > >     at org.apache.lucene.store.IndexInput.readVInt(IndexInput.java:56)
> > >     at org.apache.lucene.index.SegmentTermDocs.next(SegmentTermDocs.java:101)
> > >     at org.apache.lucene.index.SegmentTermDocs.skipTo(SegmentTermDocs.java:194)
> > >     at org.apache.lucene.search.TermScorer.skipTo(TermScorer.java:144)
> > >     at org.apache.lucene.search.ConjunctionScorer.doNext(ConjunctionScorer.java:56)
> > >     at org.apache.lucene.search.ConjunctionScorer.next(ConjunctionScorer.java:51)
> > >     at org.apache.lucene.search.BooleanScorer2.score(BooleanScorer2.java:290)
> > >     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:132)
> > >     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:110)
> > >     at org.apache.lucene.search.MultiSearcher.search(MultiSearcher.java:225)
> > >     at org.apache.lucene.search.Hits.getMoreDocs(Hits.java:65)
> > >     at org.apache.lucene.search.Hits.<init>(Hits.java:52)
> > >     at org.apache.lucene.search.Searcher.search(Searcher.java:62)
> > >
> > > "QueryThread group 1,#3" prio=1 tid=0x0000002ce48959f0 nid=0x6b82 runnable [0x0000000043382000..0x0000000043382e30]
> > >     at java.util.LinkedList.listIterator(LinkedList.java:523)
> > >     at java.util.AbstractList.listIterator(AbstractList.java:349)
> > >     at java.util.AbstractSequentialList.iterator(AbstractSequentialList.java:250)
> > >     at org.apache.lucene.search.ConjunctionScorer.score(ConjunctionScorer.java:80)
> > >     at org.apache.lucene.search.BooleanScorer2$2.score(BooleanScorer2.java:186)
> > >     at org.apache.lucene.search.BooleanScorer2.score(BooleanScorer2.java:327)
> > >     at org.apache.lucene.search.BooleanScorer2.score(BooleanScorer2.java:291)
> > >     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:132)
> > >     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:110)
> > >     at org.apache.lucene.search.MultiSearcher.search(MultiSearcher.java:225)
> > >     at org.apache.lucene.search.Hits.getMoreDocs(Hits.java:65)
> > >     at org.apache.lucene.search.Hits.<init>(Hits.java:52)
> > >     at org.apache.lucene.search.Searcher.search(Searcher.java:62)
> > >
> > >
> > > On 3/7/06, Doug Cutting <cu...@apache.org> wrote:
> > > >
> > > > Peter Keegan wrote:
> > > > > I ran a query performance tester against 8-cpu and 16-cpu Xeon servers
> > > > > (16/32 cpu hyperthreaded) on Linux. Here are the results:
> > > > >
> > > > > 8-cpu:  275 qps
> > > > > 16-cpu: 305 qps
> > > > > (the dual-core Opteron servers are still faster)
> > > > >
> > > > > Here is the stack trace of 8 of the 16 query threads during the test:
> > > > >
> > > > >         at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:281)
> > > > >         - waiting to lock <0x0000002adf5b2110> (a org.apache.lucene.index.SegmentReader)
> > > > >         at org.apache.lucene.search.IndexSearcher.doc(IndexSearcher.java:83)
> > > > >         at org.apache.lucene.search.MultiSearcher.doc(MultiSearcher.java:146)
> > > > >         at org.apache.lucene.search.Hits.doc(Hits.java:103)
> > > > >
> > > > > SegmentReader.document is a synchronized method. I have one stored field
> > > > > (binary, uncompressed) with an average length of 0.5 KB. The retrieval of
> > > > > this stored field is within this synchronized code. Since I am using
> > > > > MMapDirectory, does this retrieval need to be synchronized?
> > > >
> > > > Yes, since in FieldReader the file positions must be synchronized.
> > > >
> > > > The way to avoid this would be to:
> > > >
> > > > 1. Add a clone() method to FieldReader that clones its two IndexInputs.
> > > > 2. Add a ThreadLocal to SegmentReader whose value is a cloned FieldReader.
> > > > 3. Use the ThreadLocal's FieldReader in the document() method.
> > > >
> > > > TermInfosReader has a similar optimization, using a ThreadLocal
> > > > containing a SegmentTermEnum for each thread.
> > > >
> > > > Doug
> > > >
> > > >
> > > >
> > > >
> > >
> > >
> >
> >
> >
>

Re: Throughput doesn't increase when using more concurrent threads

Posted by Peter Keegan <pe...@gmail.com>.
Chris,

Should this patch work against the current code base? I'm getting this
error:

D:\lucene-1.9>patch -b -p0 -i nio-lucene-1.9.patch
patching file src/java/org/apache/lucene/index/CompoundFileReader.java
patching file src/java/org/apache/lucene/index/FieldsReader.java
missing header for unified diff at line 45 of patch
can't find file to patch at input line 45
Perhaps you used the wrong -p or --strip option?
The text leading up to this was:
--------------------------
| +47,9 @@
|     fieldsStream = d.openInput(segment + ".fdt");
|     indexStream = d.openInput(segment + ".fdx");
|
|+    fstream = new ThreadStream(fieldsStream);
|+    istream = new ThreadStream(indexStream);
|+
|     size = (int)(indexStream.length() / 8);
|   }
|
--------------------------

Thanks,
Peter


On 3/10/06, Chris Lamprecht <cl...@gmail.com> wrote:
>
> Peter,
>
> I think this is similar to the patch in this bugzilla task:
>
> http://issues.apache.org/bugzilla/show_bug.cgi?id=35838
> the patch itself is
> http://issues.apache.org/bugzilla/attachment.cgi?id=15757
>
> (BTW does JIRA have a way to display the patch diffs?)
>
> The above patch also has a change to SegmentReader to avoid
> synchronization on isDeleted().  However, with that patch, you no
> longer have the guarantee that one thread will immediately see
> deletions by another thread.  This was fine for my purposes, and
> resulted in a big performance boost when there were deleted documents,
> but it may not be "correct" for others' needs.
>
> -chris
> On 3/10/06, Peter Keegan <pe...@gmail.com> wrote:
> > > 3. Use the ThreadLocal's FieldReader in the document() method.
> >
> > As I understand it, this means that the document method no longer needs to
> > be synchronized, right?
> >
> > I've made these changes and it does appear to improve performance. Random
> > snapshots of the stack traces show only an occasional lock in 'isDeleted'.
> > Mostly, though, the threads are busy scoring and adding results to priority
> > queues, which is great. I've included some sample stacks, below. I'll report
> > the new query rates after it has run for at least overnight, and I'd be
> > happy to submit these changes to the lucene committers, if interested.
> >
> > Peter
> >
> >
> > Sample stack traces:
> >
> > "QueryThread group 1,#8" prio=1 tid=0x0000002ce48eeb80 nid=0x6b87 runnable [0x0000000043887000..0x0000000043887bb0]
> >     at org.apache.lucene.search.FieldSortedHitQueue.lessThan(FieldSortedHitQueue.java:108)
> >     at org.apache.lucene.util.PriorityQueue.insert(PriorityQueue.java:61)
> >     at org.apache.lucene.search.FieldSortedHitQueue.insert(FieldSortedHitQueue.java:85)
> >     at org.apache.lucene.search.FieldSortedHitQueue.insert(FieldSortedHitQueue.java:92)
> >     at org.apache.lucene.search.TopFieldDocCollector.collect(TopFieldDocCollector.java:51)
> >     at org.apache.lucene.search.TermScorer.score(TermScorer.java:75)
> >     at org.apache.lucene.search.TermScorer.score(TermScorer.java:60)
> >     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:132)
> >     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:110)
> >     at org.apache.lucene.search.MultiSearcher.search(MultiSearcher.java:225)
> >     at org.apache.lucene.search.Hits.getMoreDocs(Hits.java:65)
> >     at org.apache.lucene.search.Hits.<init>(Hits.java:52)
> >     at org.apache.lucene.search.Searcher.search(Searcher.java:62)
> >
> > "QueryThread group 1,#5" prio=1 tid=0x0000002ce4d659f0 nid=0x6b84 runnable [0x0000000043584000..0x0000000043584d30]
> >     at org.apache.lucene.search.TermScorer.score(TermScorer.java:75)
> >     at org.apache.lucene.search.TermScorer.score(TermScorer.java:60)
> >     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:132)
> >     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:110)
> >     at org.apache.lucene.search.MultiSearcher.search(MultiSearcher.java:225)
> >     at org.apache.lucene.search.Hits.getMoreDocs(Hits.java:65)
> >     at org.apache.lucene.search.Hits.<init>(Hits.java:52)
> >     at org.apache.lucene.search.Searcher.search(Searcher.java:62)
> >
> > "QueryThread group 1,#4" prio=1 tid=0x0000002ce10afd50 nid=0x6b83 runnable [0x0000000043483000..0x0000000043483db0]
> >     at org.apache.lucene.store.MMapDirectory$MMapIndexInput.readByte(MMapDirectory.java:46)
> >     at org.apache.lucene.store.IndexInput.readVInt(IndexInput.java:56)
> >     at org.apache.lucene.index.SegmentTermDocs.next(SegmentTermDocs.java:101)
> >     at org.apache.lucene.index.SegmentTermDocs.skipTo(SegmentTermDocs.java:194)
> >     at org.apache.lucene.search.TermScorer.skipTo(TermScorer.java:144)
> >     at org.apache.lucene.search.ConjunctionScorer.doNext(ConjunctionScorer.java:56)
> >     at org.apache.lucene.search.ConjunctionScorer.next(ConjunctionScorer.java:51)
> >     at org.apache.lucene.search.BooleanScorer2.score(BooleanScorer2.java:290)
> >     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:132)
> >     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:110)
> >     at org.apache.lucene.search.MultiSearcher.search(MultiSearcher.java:225)
> >     at org.apache.lucene.search.Hits.getMoreDocs(Hits.java:65)
> >     at org.apache.lucene.search.Hits.<init>(Hits.java:52)
> >     at org.apache.lucene.search.Searcher.search(Searcher.java:62)
> >
> > "QueryThread group 1,#3" prio=1 tid=0x0000002ce48959f0 nid=0x6b82 runnable [0x0000000043382000..0x0000000043382e30]
> >     at java.util.LinkedList.listIterator(LinkedList.java:523)
> >     at java.util.AbstractList.listIterator(AbstractList.java:349)
> >     at java.util.AbstractSequentialList.iterator(AbstractSequentialList.java:250)
> >     at org.apache.lucene.search.ConjunctionScorer.score(ConjunctionScorer.java:80)
> >     at org.apache.lucene.search.BooleanScorer2$2.score(BooleanScorer2.java:186)
> >     at org.apache.lucene.search.BooleanScorer2.score(BooleanScorer2.java:327)
> >     at org.apache.lucene.search.BooleanScorer2.score(BooleanScorer2.java:291)
> >     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:132)
> >     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:110)
> >     at org.apache.lucene.search.MultiSearcher.search(MultiSearcher.java:225)
> >     at org.apache.lucene.search.Hits.getMoreDocs(Hits.java:65)
> >     at org.apache.lucene.search.Hits.<init>(Hits.java:52)
> >     at org.apache.lucene.search.Searcher.search(Searcher.java:62)
> >
> >
> > On 3/7/06, Doug Cutting <cu...@apache.org> wrote:
> > >
> > > Peter Keegan wrote:
> > > > I ran a query performance tester against 8-cpu and 16-cpu Xeon servers
> > > > (16/32 cpu hyperthreaded) on Linux. Here are the results:
> > > >
> > > > 8-cpu:  275 qps
> > > > 16-cpu: 305 qps
> > > > (the dual-core Opteron servers are still faster)
> > > >
> > > > Here is the stack trace of 8 of the 16 query threads during the test:
> > > >
> > > >         at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:281)
> > > >         - waiting to lock <0x0000002adf5b2110> (a org.apache.lucene.index.SegmentReader)
> > > >         at org.apache.lucene.search.IndexSearcher.doc(IndexSearcher.java:83)
> > > >         at org.apache.lucene.search.MultiSearcher.doc(MultiSearcher.java:146)
> > > >         at org.apache.lucene.search.Hits.doc(Hits.java:103)
> > > >
> > > > SegmentReader.document is a synchronized method. I have one stored field
> > > > (binary, uncompressed) with an average length of 0.5 KB. The retrieval of
> > > > this stored field is within this synchronized code. Since I am using
> > > > MMapDirectory, does this retrieval need to be synchronized?
> > >
> > > Yes, since in FieldReader the file positions must be synchronized.
> > >
> > > The way to avoid this would be to:
> > >
> > > 1. Add a clone() method to FieldReader that clones its two IndexInputs.
> > > 2. Add a ThreadLocal to SegmentReader whose value is a cloned FieldReader.
> > > 3. Use the ThreadLocal's FieldReader in the document() method.
> > >
> > > TermInfosReader has a similar optimization, using a ThreadLocal
> > > containing a SegmentTermEnum for each thread.
> > >
> > > Doug
> > >
> > >
> > >
> >
> >
>
>
>

Re: Throughput doesn't increase when using more concurrent threads

Posted by Chris Lamprecht <cl...@gmail.com>.
Peter,

I think this is similar to the patch in this bugzilla task:

http://issues.apache.org/bugzilla/show_bug.cgi?id=35838
the patch itself is http://issues.apache.org/bugzilla/attachment.cgi?id=15757

(BTW does JIRA have a way to display the patch diffs?)

The above patch also has a change to SegmentReader to avoid
synchronization on isDeleted().  However, with that patch, you no
longer have the guarantee that one thread will immediately see
deletions by another thread.  This was fine for my purposes, and
resulted in a big performance boost when there were deleted documents,
but it may not be "correct" for others' needs.
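
The trade-off described above can be illustrated with a simplified sketch. This is not the code from the Bugzilla patch: DeletedDocsSketch and its methods are hypothetical names, and the fixed-size byte array only stands in for whatever structure SegmentReader actually keeps. The locked read is ordered after any earlier locked delete(); the unlocked read never waits on the monitor, but under the Java memory model it may keep returning a stale "not deleted" answer for some time after another thread's delete().

```java
// Hypothetical stand-in for a segment's deleted-documents bitmap.
class DeletedDocsSketch {
    private final byte[] bits; // one bit per document, fixed size

    DeletedDocsSketch(int maxDoc) {
        bits = new byte[(maxDoc >> 3) + 1];
    }

    /** Marks a document as deleted; writers still take the lock. */
    synchronized void delete(int doc) {
        bits[doc >> 3] |= 1 << (doc & 7);
    }

    /** Locked read: sees every delete() that released the lock earlier,
     *  but all query threads contend on the same monitor. */
    synchronized boolean isDeletedLocked(int doc) {
        return get(doc);
    }

    /** Unlocked read: no contention, but no visibility guarantee either;
     *  a deletion made by another thread may be observed late. */
    boolean isDeletedUnlocked(int doc) {
        return get(doc);
    }

    private boolean get(int doc) {
        return (bits[doc >> 3] & (1 << (doc & 7))) != 0;
    }
}
```

Whether the late visibility matters depends on the application: a searcher that is reopened after each batch of deletions loses little, while code that deletes and immediately re-queries through the same reader from another thread could briefly see the deleted document.
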

-chris
On 3/10/06, Peter Keegan <pe...@gmail.com> wrote:
> > 3. Use the ThreadLocal's FieldReader in the document() method.
>
> As I understand it, this means that the document method no longer needs to
> be synchronized, right?
>
> I've made these changes and it does appear to improve performance. Random
> snapshots of the stack traces show only an occasional lock in 'isDeleted'.
> Mostly, though, the threads are busy scoring and adding results to priority
> queues, which is great. I've included some sample stacks, below. I'll report
> the new query rates after it has run for at least overnight, and I'd be
> happy to submit these changes to the lucene committers, if interested.
>
> Peter
>
>
> Sample stack traces:
>
> "QueryThread group 1,#8" prio=1 tid=0x0000002ce48eeb80 nid=0x6b87 runnable [0x0000000043887000..0x0000000043887bb0]
>     at org.apache.lucene.search.FieldSortedHitQueue.lessThan(FieldSortedHitQueue.java:108)
>     at org.apache.lucene.util.PriorityQueue.insert(PriorityQueue.java:61)
>     at org.apache.lucene.search.FieldSortedHitQueue.insert(FieldSortedHitQueue.java:85)
>     at org.apache.lucene.search.FieldSortedHitQueue.insert(FieldSortedHitQueue.java:92)
>     at org.apache.lucene.search.TopFieldDocCollector.collect(TopFieldDocCollector.java:51)
>     at org.apache.lucene.search.TermScorer.score(TermScorer.java:75)
>     at org.apache.lucene.search.TermScorer.score(TermScorer.java:60)
>     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:132)
>     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:110)
>     at org.apache.lucene.search.MultiSearcher.search(MultiSearcher.java:225)
>     at org.apache.lucene.search.Hits.getMoreDocs(Hits.java:65)
>     at org.apache.lucene.search.Hits.<init>(Hits.java:52)
>     at org.apache.lucene.search.Searcher.search(Searcher.java:62)
>
> "QueryThread group 1,#5" prio=1 tid=0x0000002ce4d659f0 nid=0x6b84 runnable [0x0000000043584000..0x0000000043584d30]
>     at org.apache.lucene.search.TermScorer.score(TermScorer.java:75)
>     at org.apache.lucene.search.TermScorer.score(TermScorer.java:60)
>     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:132)
>     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:110)
>     at org.apache.lucene.search.MultiSearcher.search(MultiSearcher.java:225)
>     at org.apache.lucene.search.Hits.getMoreDocs(Hits.java:65)
>     at org.apache.lucene.search.Hits.<init>(Hits.java:52)
>     at org.apache.lucene.search.Searcher.search(Searcher.java:62)
>
> "QueryThread group 1,#4" prio=1 tid=0x0000002ce10afd50 nid=0x6b83 runnable [0x0000000043483000..0x0000000043483db0]
>     at org.apache.lucene.store.MMapDirectory$MMapIndexInput.readByte(MMapDirectory.java:46)
>     at org.apache.lucene.store.IndexInput.readVInt(IndexInput.java:56)
>     at org.apache.lucene.index.SegmentTermDocs.next(SegmentTermDocs.java:101)
>     at org.apache.lucene.index.SegmentTermDocs.skipTo(SegmentTermDocs.java:194)
>     at org.apache.lucene.search.TermScorer.skipTo(TermScorer.java:144)
>     at org.apache.lucene.search.ConjunctionScorer.doNext(ConjunctionScorer.java:56)
>     at org.apache.lucene.search.ConjunctionScorer.next(ConjunctionScorer.java:51)
>     at org.apache.lucene.search.BooleanScorer2.score(BooleanScorer2.java:290)
>     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:132)
>     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:110)
>     at org.apache.lucene.search.MultiSearcher.search(MultiSearcher.java:225)
>     at org.apache.lucene.search.Hits.getMoreDocs(Hits.java:65)
>     at org.apache.lucene.search.Hits.<init>(Hits.java:52)
>     at org.apache.lucene.search.Searcher.search(Searcher.java:62)
>
> "QueryThread group 1,#3" prio=1 tid=0x0000002ce48959f0 nid=0x6b82 runnable [0x0000000043382000..0x0000000043382e30]
>     at java.util.LinkedList.listIterator(LinkedList.java:523)
>     at java.util.AbstractList.listIterator(AbstractList.java:349)
>     at java.util.AbstractSequentialList.iterator(AbstractSequentialList.java:250)
>     at org.apache.lucene.search.ConjunctionScorer.score(ConjunctionScorer.java:80)
>     at org.apache.lucene.search.BooleanScorer2$2.score(BooleanScorer2.java:186)
>     at org.apache.lucene.search.BooleanScorer2.score(BooleanScorer2.java:327)
>     at org.apache.lucene.search.BooleanScorer2.score(BooleanScorer2.java:291)
>     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:132)
>     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:110)
>     at org.apache.lucene.search.MultiSearcher.search(MultiSearcher.java:225)
>     at org.apache.lucene.search.Hits.getMoreDocs(Hits.java:65)
>     at org.apache.lucene.search.Hits.<init>(Hits.java:52)
>     at org.apache.lucene.search.Searcher.search(Searcher.java:62)
>
>
> On 3/7/06, Doug Cutting <cu...@apache.org> wrote:
> >
> > Peter Keegan wrote:
> > > I ran a query performance tester against 8-cpu and 16-cpu Xeon servers
> > > (16/32 cpu hyperthreaded). on Linux. Here are the results:
> > >
> > > 8-cpu:  275 qps
> > > 16-cpu: 305 qps
> > > (the dual-core Opteron servers are still faster)
> > >
> > > Here is the stack trace of 8 of the 16 query threads during the test:
> > >
> > >         at org.apache.lucene.index.SegmentReader.document(
> > SegmentReader.java
> > > :281)
> > >         - waiting to lock <0x0000002adf5b2110> (a
> > > org.apache.lucene.index.SegmentReader)
> > >         at org.apache.lucene.search.IndexSearcher.doc(IndexSearcher.java
> > :83)
> > >         at org.apache.lucene.search.MultiSearcher.doc(MultiSearcher.java
> > > :146)
> > >         at org.apache.lucene.search.Hits.doc(Hits.java:103)
> > >
> > > SegmentReader.document is a synchronized method. I have one stored field
> > > (binary, uncompressed) with and average length of 0.5Kb. The retrieval
> > of
> > > this stored field is within this synchronized code. Since I am using
> > > MMapDirectory, does this retrieval need to be synchronized?
> >
> > Yes, since in FieldReader the file positions must be synchronized.
> >
> > The way to avoid this would be to:
> >
> > 1. Add a clone() method to FieldReader that clones its two IndexInputs.
> > 2. Add a ThreadLocal to SegmentReader whose value is a cloned FieldReader.
> > 3. Use the ThreadLocal's FieldReader in the document() method.
> >
> > TermInfosReader has a similar optimization, using a ThreadLocal
> > containing a SegmentTermEnum for each thread.
> >
> > Doug
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org
> >
> >
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Throughput doesn't increase when using more concurrent threads

Posted by Peter Keegan <pe...@gmail.com>.
> 3. Use the ThreadLocal's FieldReader in the document() method.

As I understand it, this means that the document method no longer needs to
be synchronized, right?

I've made these changes and it does appear to improve performance. Random
snapshots of the stack traces show only an occasional lock in 'isDeleted'.
Mostly, though, the threads are busy scoring and adding results to priority
queues, which is great. I've included some sample stacks below. I'll report
the new query rates after the test has run at least overnight, and I'd be
happy to submit these changes to the Lucene committers, if interested.

Peter


Sample stack traces:

"QueryThread group 1,#8" prio=1 tid=0x0000002ce48eeb80 nid=0x6b87 runnable
[0x0000000043887000..0x0000000043887bb0]
    at org.apache.lucene.search.FieldSortedHitQueue.lessThan(FieldSortedHitQueue.java:108)
    at org.apache.lucene.util.PriorityQueue.insert(PriorityQueue.java:61)
    at org.apache.lucene.search.FieldSortedHitQueue.insert(FieldSortedHitQueue.java:85)
    at org.apache.lucene.search.FieldSortedHitQueue.insert(FieldSortedHitQueue.java:92)
    at org.apache.lucene.search.TopFieldDocCollector.collect(TopFieldDocCollector.java:51)
    at org.apache.lucene.search.TermScorer.score(TermScorer.java:75)
    at org.apache.lucene.search.TermScorer.score(TermScorer.java:60)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:132)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:110)
    at org.apache.lucene.search.MultiSearcher.search(MultiSearcher.java:225)
    at org.apache.lucene.search.Hits.getMoreDocs(Hits.java:65)
    at org.apache.lucene.search.Hits.<init>(Hits.java:52)
    at org.apache.lucene.search.Searcher.search(Searcher.java:62)

"QueryThread group 1,#5" prio=1 tid=0x0000002ce4d659f0 nid=0x6b84 runnable
[0x0000000043584000..0x0000000043584d30]
    at org.apache.lucene.search.TermScorer.score(TermScorer.java:75)
    at org.apache.lucene.search.TermScorer.score(TermScorer.java:60)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:132)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:110)
    at org.apache.lucene.search.MultiSearcher.search(MultiSearcher.java:225)
    at org.apache.lucene.search.Hits.getMoreDocs(Hits.java:65)
    at org.apache.lucene.search.Hits.<init>(Hits.java:52)
    at org.apache.lucene.search.Searcher.search(Searcher.java:62)

"QueryThread group 1,#4" prio=1 tid=0x0000002ce10afd50 nid=0x6b83 runnable
[0x0000000043483000..0x0000000043483db0]
    at org.apache.lucene.store.MMapDirectory$MMapIndexInput.readByte(MMapDirectory.java:46)
    at org.apache.lucene.store.IndexInput.readVInt(IndexInput.java:56)
    at org.apache.lucene.index.SegmentTermDocs.next(SegmentTermDocs.java:101)
    at org.apache.lucene.index.SegmentTermDocs.skipTo(SegmentTermDocs.java:194)
    at org.apache.lucene.search.TermScorer.skipTo(TermScorer.java:144)
    at org.apache.lucene.search.ConjunctionScorer.doNext(ConjunctionScorer.java:56)
    at org.apache.lucene.search.ConjunctionScorer.next(ConjunctionScorer.java:51)
    at org.apache.lucene.search.BooleanScorer2.score(BooleanScorer2.java:290)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:132)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:110)
    at org.apache.lucene.search.MultiSearcher.search(MultiSearcher.java:225)
    at org.apache.lucene.search.Hits.getMoreDocs(Hits.java:65)
    at org.apache.lucene.search.Hits.<init>(Hits.java:52)
    at org.apache.lucene.search.Searcher.search(Searcher.java:62)

"QueryThread group 1,#3" prio=1 tid=0x0000002ce48959f0 nid=0x6b82 runnable
[0x0000000043382000..0x0000000043382e30]
    at java.util.LinkedList.listIterator(LinkedList.java:523)
    at java.util.AbstractList.listIterator(AbstractList.java:349)
    at java.util.AbstractSequentialList.iterator(AbstractSequentialList.java:250)
    at org.apache.lucene.search.ConjunctionScorer.score(ConjunctionScorer.java:80)
    at org.apache.lucene.search.BooleanScorer2$2.score(BooleanScorer2.java:186)
    at org.apache.lucene.search.BooleanScorer2.score(BooleanScorer2.java:327)
    at org.apache.lucene.search.BooleanScorer2.score(BooleanScorer2.java:291)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:132)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:110)
    at org.apache.lucene.search.MultiSearcher.search(MultiSearcher.java:225)
    at org.apache.lucene.search.Hits.getMoreDocs(Hits.java:65)
    at org.apache.lucene.search.Hits.<init>(Hits.java:52)
    at org.apache.lucene.search.Searcher.search(Searcher.java:62)
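
Snapshots like the ones above can also be taken from inside the JVM rather than by sending it a signal. The following is only a minimal sketch using the standard Thread.getAllStackTraces() and Thread.getState() calls; the class and method names (ThreadStateSampler, blockedWhileLockHeld) are made up for illustration and are not part of Lucene or of the query tester discussed in this thread. It counts live threads by state, so a large BLOCKED count would point at lock contention of the "waiting to lock" kind.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical helper: summarizes what all live threads are doing right now.
class ThreadStateSampler {

    // Counts live threads by their current state (RUNNABLE, BLOCKED, ...).
    static Map<Thread.State, Integer> countByState() {
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            counts.merge(t.getState(), 1, Integer::sum);
        }
        return counts;
    }

    // Threads in BLOCKED state are waiting to enter a synchronized region.
    static int blockedThreads() {
        return countByState().getOrDefault(Thread.State.BLOCKED, 0);
    }

    // Demo: hold a lock, start contenders that need it, then sample.
    // Returns the number of BLOCKED threads seen while the lock was held.
    static int blockedWhileLockHeld(int contenders) {
        final Object lock = new Object();
        Thread[] threads = new Thread[contenders];
        int seen;
        synchronized (lock) {
            for (int i = 0; i < contenders; i++) {
                threads[i] = new Thread(() -> {
                    synchronized (lock) { /* nothing to do once the lock is acquired */ }
                });
                threads[i].start();
            }
            // Wait until every contender is parked on the monitor.
            for (Thread t : threads) {
                while (t.getState() != Thread.State.BLOCKED) {
                    Thread.yield();
                }
            }
            seen = blockedThreads();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println("states now: " + countByState());
        System.out.println("blocked while lock held: " + blockedWhileLockHeld(8));
    }
}
```

Sampling this periodically during a load test gives a rough picture of whether threads are mostly RUNNABLE (busy scoring) or mostly BLOCKED (waiting on a shared lock).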


On 3/7/06, Doug Cutting <cu...@apache.org> wrote:
>
> Peter Keegan wrote:
> > I ran a query performance tester against 8-cpu and 16-cpu Xeon servers
> > (16/32 cpu hyperthreaded). on Linux. Here are the results:
> >
> > 8-cpu:  275 qps
> > 16-cpu: 305 qps
> > (the dual-core Opteron servers are still faster)
> >
> > Here is the stack trace of 8 of the 16 query threads during the test:
> >
> >         at org.apache.lucene.index.SegmentReader.document(
> SegmentReader.java
> > :281)
> >         - waiting to lock <0x0000002adf5b2110> (a
> > org.apache.lucene.index.SegmentReader)
> >         at org.apache.lucene.search.IndexSearcher.doc(IndexSearcher.java
> :83)
> >         at org.apache.lucene.search.MultiSearcher.doc(MultiSearcher.java
> > :146)
> >         at org.apache.lucene.search.Hits.doc(Hits.java:103)
> >
> > SegmentReader.document is a synchronized method. I have one stored field
> > (binary, uncompressed) with and average length of 0.5Kb. The retrieval
> of
> > this stored field is within this synchronized code. Since I am using
> > MMapDirectory, does this retrieval need to be synchronized?
>
> Yes, since in FieldReader the file positions must be synchronized.
>
> The way to avoid this would be to:
>
> 1. Add a clone() method to FieldReader that clones its two IndexInputs.
> 2. Add a ThreadLocal to SegmentReader whose value is a cloned FieldReader.
> 3. Use the ThreadLocal's FieldReader in the document() method.
>
> TermInfosReader has a similar optimization, using a ThreadLocal
> containing a SegmentTermEnum for each thread.
>
> Doug
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

Re: Throughput doesn't increase when using more concurrent threads

Posted by Doug Cutting <cu...@apache.org>.
Peter Keegan wrote:
> I ran a query performance tester against 8-cpu and 16-cpu Xeon servers
> (16/32 CPUs with hyperthreading) on Linux. Here are the results:
> 
> 8-cpu:  275 qps
> 16-cpu: 305 qps
> (the dual-core Opteron servers are still faster)
> 
> Here is the stack trace of 8 of the 16 query threads during the test:
> 
>         at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:281)
>         - waiting to lock <0x0000002adf5b2110> (a org.apache.lucene.index.SegmentReader)
>         at org.apache.lucene.search.IndexSearcher.doc(IndexSearcher.java:83)
>         at org.apache.lucene.search.MultiSearcher.doc(MultiSearcher.java:146)
>         at org.apache.lucene.search.Hits.doc(Hits.java:103)
> 
> SegmentReader.document is a synchronized method. I have one stored field
> (binary, uncompressed) with an average length of 0.5 KB. The retrieval of
> this stored field is within this synchronized code. Since I am using
> MMapDirectory, does this retrieval need to be synchronized?

Yes, since in FieldReader the file positions must be synchronized.

The way to avoid this would be to:

1. Add a clone() method to FieldReader that clones its two IndexInputs.
2. Add a ThreadLocal to SegmentReader whose value is a cloned FieldReader.
3. Use the ThreadLocal's FieldReader in the document() method.

TermInfosReader has a similar optimization, using a ThreadLocal 
containing a SegmentTermEnum for each thread.
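
The three steps above can be sketched in plain Java. This is only an illustration of the pattern under stated assumptions, not the actual Lucene patch: PositionedReader and PerThreadReaderDemo are hypothetical stand-ins for the field reader and the segment reader, and the byte array stands in for the index files. The point is that each thread gets its own clone, with its own file position, so document() no longer needs to be synchronized.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a reader with a mutable file position.
// It is NOT Lucene's class; it only mimics "seek, then read" state.
class PositionedReader implements Cloneable {
    private final byte[] data; // shared, read-only backing store
    private int position;      // mutable: the reason sharing needs a lock

    PositionedReader(byte[] data) {
        this.data = data;
    }

    byte readAt(int offset) {
        position = offset;       // seek
        return data[position++]; // read one byte and advance
    }

    @Override
    public PositionedReader clone() {
        try {
            // Shallow copy: shares the data array, gets its own position.
            return (PositionedReader) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e);
        }
    }
}

// Hypothetical stand-in for the object whose document() was synchronized.
class PerThreadReaderDemo {
    private final AtomicInteger clonesCreated = new AtomicInteger();
    private final ThreadLocal<PositionedReader> perThread;

    PerThreadReaderDemo(PositionedReader prototype) {
        // Each thread lazily receives its own clone on first use.
        perThread = ThreadLocal.withInitial(() -> {
            clonesCreated.incrementAndGet();
            return prototype.clone();
        });
    }

    // No 'synchronized' here: each thread only touches its own clone.
    byte document(int n) {
        return perThread.get().readAt(n);
    }

    int clonesCreated() {
        return clonesCreated.get();
    }

    // Runs 'threads' workers that read every offset repeatedly;
    // returns the number of reads that returned a wrong value.
    static int runWorkers(PerThreadReaderDemo reader, byte[] expected, int threads) {
        AtomicInteger mismatches = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int round = 0; round < 1000; round++) {
                    for (int i = 0; i < expected.length; i++) {
                        if (reader.document(i) != expected[i]) {
                            mismatches.incrementAndGet();
                        }
                    }
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return mismatches.get();
    }

    public static void main(String[] args) {
        byte[] data = {10, 20, 30, 40};
        PerThreadReaderDemo reader = new PerThreadReaderDemo(new PositionedReader(data));
        int mismatches = runWorkers(reader, data, 4);
        System.out.println("mismatches=" + mismatches + ", clones=" + reader.clonesCreated());
    }
}
```

The trade-off is one extra clone per thread per reader, which is cheap as long as the clones share the underlying buffers and the number of query threads stays bounded.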

Doug



Re: Throughput doesn't increase when using more concurrent threads

Posted by Peter Keegan <pe...@gmail.com>.
I ran a query performance tester against 8-cpu and 16-cpu Xeon servers
(16/32 CPUs with hyperthreading) on Linux. Here are the results:

8-cpu:  275 qps
16-cpu: 305 qps
(the dual-core Opteron servers are still faster)
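
For context, going from 275 to 305 qps is roughly an 11% gain (305/275 ≈ 1.11) for twice the CPUs. The tester used for these numbers is not shown in this thread; the following is only a hedged sketch of how such a measurement might be structured. QpsHarness, runFixedCount and qps are hypothetical names, and the Runnable stands in for whatever issues one search.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical throughput harness: N threads each run a fixed number of queries.
class QpsHarness {

    // Runs 'query' queriesPerThread times on each of 'threads' threads
    // and returns how many invocations completed.
    static long runFixedCount(int threads, int queriesPerThread, Runnable query) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicLong completed = new AtomicLong();
        CountDownLatch done = new CountDownLatch(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                try {
                    for (int i = 0; i < queriesPerThread; i++) {
                        query.run();
                        completed.incrementAndGet();
                    }
                } finally {
                    done.countDown();
                }
            });
        }
        try {
            done.await(); // wait for every worker to finish
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        pool.shutdown();
        return completed.get();
    }

    // Queries per second for a completed count and an elapsed time in nanoseconds.
    static double qps(long completed, long elapsedNanos) {
        return completed * 1e9 / elapsedNanos;
    }

    public static void main(String[] args) {
        Runnable fakeQuery = () -> Math.sqrt(System.nanoTime()); // placeholder workload
        for (int threads : new int[] {1, 2, 4, 8, 16}) {
            long start = System.nanoTime();
            long completed = runFixedCount(threads, 100_000, fakeQuery);
            long elapsed = System.nanoTime() - start;
            System.out.printf("%2d threads: %.0f qps%n", threads, qps(completed, elapsed));
        }
    }
}
```

If throughput flattens as the thread count grows while the CPUs are not saturated, that is a hint to look for a shared lock rather than for more hardware.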

Here is the stack trace of 8 of the 16 query threads during the test:

        at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:281)
        - waiting to lock <0x0000002adf5b2110> (a org.apache.lucene.index.SegmentReader)
        at org.apache.lucene.search.IndexSearcher.doc(IndexSearcher.java:83)
        at org.apache.lucene.search.MultiSearcher.doc(MultiSearcher.java:146)
        at org.apache.lucene.search.Hits.doc(Hits.java:103)

SegmentReader.document is a synchronized method. I have one stored field
(binary, uncompressed) with an average length of 0.5 KB. The retrieval of
this stored field is within this synchronized code. Since I am using
MMapDirectory, does this retrieval need to be synchronized?

Peter

On 2/23/06, Peter Keegan <pe...@gmail.com> wrote:
>
> Yonik,
>
> We're investigating both approaches.
> Yes, the resources (and permutations) are dizzying!
>
> Peter
>
>
> On 2/23/06, Yonik Seeley < yseeley@gmail.com> wrote:
> >
> > Wow, some resources!
> > Would it be cheaper / more scalable to copy the index to multiple
> > boxes and loadbalance requests across them?
> >
> > -Yonik
> >
> > On 2/23/06, Peter Keegan <pe...@gmail.com> wrote:
> > > Since I seem to be cpu-bound right now, I'll be trying a 16-cpu system
> > next
> > > (32 with hyperthreading), on LinTel. I may give JRockit another go
> > around
> > > then.
> > >
> > > Thanks,
> > > Peter
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org
> >
> >
>

Re: Throughput doesn't increase when using more concurrent threads

Posted by Peter Keegan <pe...@gmail.com>.
Yonik,

We're investigating both approaches.
Yes, the resources (and permutations) are dizzying!

Peter

On 2/23/06, Yonik Seeley <ys...@gmail.com> wrote:
>
> Wow, some resources!
> Would it be cheaper / more scalable to copy the index to multiple
> boxes and loadbalance requests across them?
>
> -Yonik
>
> On 2/23/06, Peter Keegan <pe...@gmail.com> wrote:
> > Since I seem to be cpu-bound right now, I'll be trying a 16-cpu system
> next
> > (32 with hyperthreading), on LinTel. I may give JRockit another go
> around
> > then.
> >
> > Thanks,
> > Peter
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

Re: Throughput doesn't increase when using more concurrent threads

Posted by Yonik Seeley <ys...@gmail.com>.
Wow, some resources!
Would it be cheaper / more scalable to copy the index to multiple
boxes and loadbalance requests across them?

-Yonik

On 2/23/06, Peter Keegan <pe...@gmail.com> wrote:
> Since I seem to be cpu-bound right now, I'll be trying a 16-cpu system next
> (32 with hyperthreading), on LinTel. I may give JRockit another go around
> then.
>
> Thanks,
> Peter



Re: Throughput doesn't increase when using more concurrent threads

Posted by Peter Keegan <pe...@gmail.com>.
Chris,

I tried JRockit a while back on 8-cpu/windows and it was slower than Sun's.
Since I seem to be cpu-bound right now, I'll be trying a 16-cpu system next
(32 with hyperthreading), on LinTel. I may give JRockit another go around
then.

Thanks,
Peter

On 2/23/06, Chris Lamprecht <cl...@gmail.com> wrote:
>
> Peter,
> Have you given JRockit JVM a try?  I've seen it help throughput
> compared to Sun's JVM on a dual xeon/linux machine, especially with
> concurrency (up to 6 concurrent searches happening).  I'm curious to
> see if it makes a difference for you.
>
> -chris
>
> On 2/23/06, Peter Keegan <pe...@gmail.com> wrote:
> > We discovered that the kernel was only using 8 CPUs. After recompiling
> for
> > 16 (8+hyperthreads), it looks like the query rate will settle in around
> > 280-300 qps. Much better, although still quite a bit slower than the
> > opteron.
> >
> > Peter
> >
> >
> >
> >
> > On 2/22/06, Yonik Seeley <ys...@gmail.com> wrote:
> > >
> > > Hmmm, not sure what that could be.
> > > You could try using the default FSDir instead of MMapDir to see if the
> > > differences are there.
> > >
> > > Some things that could be different:
> > > - thread scheduling (shouldn't make too much of a difference though)
> > > - synchronization workings
> > > - page replacement policy... how to figure out what pages to swap in
> > > and which to swap out, esp of the memory mapped files.
> > >
> > > You could also try a profiler on both platforms to try and see where
> > > the difference is.
> > >
> > > -Yonik
> > >
> > > On 2/22/06, Peter Keegan <pe...@gmail.com> wrote:
> > > > I am doing a performance comparison of Lucene on Linux vs Windows.
> > > >
> > > > I have 2 identically configured servers (8-CPUs (real) x 3GHz Xeon
> > > > processors, 64GB RAM). One is running CentOS 4 Linux, the other is
> > > running
> > > > Windows server 2003 Enterprise Edition x64. Both have 64-bit JVMs
> from
> > > Sun.
> > > > The Lucene server is using MMapDirectory. I'm running the jvm with
> > > > > -Xmx16000M. Peak memory usage of the jvm on Linux is about 6GB and 7.8GB on
> > > > > windows.
> > > >
> > > > I'm observing query rates of 330 queries/sec on the Wintel server,
> but
> > > only
> > > > 200 qps on the Linux box. At first, I suspected a network
> bottleneck,
> > > but
> > > > when I 'short-circuited' Lucene, the query rates were identical.
> > > >
> > > > I suspect that there are some things to be tuned in Linux, but I'm
> not
> > > sure
> > > > what. Any advice would be appreciated.
> > > >
> > > > Peter
> > > >
> > > >
> > > >
> > > > On 1/30/06, Peter Keegan <pe...@gmail.com> wrote:
> > > > >
> > > > > I cranked up the dial on my query tester and was able to get the
> rate
> > > up
> > > > > to 325 qps. Unfortunately, the machine died shortly thereafter
> (memory
> > > > > errors :-( ) Hopefully, it was just a coincidence. I haven't
> measured
> > > 64-bit
> > > > > indexing speed, yet.
> > > > >
> > > > > Peter
> > > > >
> > > > > On 1/29/06, Daniel Noll <da...@nuix.com.au> wrote:
> > > > > >
> > > > > > Peter Keegan wrote:
> > > > > > > I tried the AMD64-bit JVM from Sun and with MMapDirectory and
> I'm
> > > now
> > > > > > > getting 250 queries/sec and excellent cpu utilization (equal
> > > > > > concurrency on
> > > > > > > all cpus)!! Yonik, thanks for the pointer to the 64-bit jvm. I
> > > wasn't
> > > > > > aware
> > > > > > > of it.
> > > > > > >
> > > > > > Wow.  That's fast.
> > > > > >
> > > > > > Out of interest, does indexing time speed up much on 64-bit
> > > hardware?
> > > > > > I'm particularly interested in this side of things because for
> our
> > > own
> > > > > > application, any query response under half a second is good
> enough,
> > > but
> > > > > > the indexing side could always be faster. :-)
> > > > > >
> > > > > > Daniel
> > > > > >
> > > > > > --
> > > > > > Daniel Noll
> > > > > >
> > > > > > Nuix Australia Pty Ltd
> > > > > > Suite 79, 89 Jones St, Ultimo NSW 2007, Australia
> > > > > > Phone: (02) 9280 0699
> > > > > > Fax:   (02) 9212 6902
> > > > > >
> > > > > > This message is intended only for the named recipient. If you
> are
> > > not
> > > > > > the intended recipient you are notified that disclosing,
> copying,
> > > > > > distributing or taking any action in reliance on the contents of
> > > this
> > > > > > message or attachment is strictly prohibited.
> > > > > >
> > > > > >
> > > > > >
> > > ---------------------------------------------------------------------
> > > > > > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > > > > > For additional commands, e-mail:
> java-user-help@lucene.apache.org
> > > > > >
> > > > > >
> > > > >
> > > >
> > > >
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > > For additional commands, e-mail: java-user-help@lucene.apache.org
> > >
> > >
> >
> >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

Re: Throughput doesn't increase when using more concurrent threads

Posted by Chris Lamprecht <cl...@gmail.com>.
Peter,
Have you given JRockit JVM a try?  I've seen it help throughput
compared to Sun's JVM on a dual xeon/linux machine, especially with
concurrency (up to 6 concurrent searches happening).  I'm curious to
see if it makes a difference for you.

-chris

On 2/23/06, Peter Keegan <pe...@gmail.com> wrote:
> We discovered that the kernel was only using 8 CPUs. After recompiling for
> 16 (8+hyperthreads), it looks like the query rate will settle in around
> 280-300 qps. Much better, although still quite a bit slower than the
> opteron.
>
> Peter
>
>
>
>
> On 2/22/06, Yonik Seeley <ys...@gmail.com> wrote:
> >
> > Hmmm, not sure what that could be.
> > You could try using the default FSDir instead of MMapDir to see if the
> > differences are there.
> >
> > Some things that could be different:
> > - thread scheduling (shouldn't make too much of a difference though)
> > - synchronization workings
> > - page replacement policy... how to figure out what pages to swap in
> > and which to swap out, esp of the memory mapped files.
> >
> > You could also try a profiler on both platforms to try and see where
> > the difference is.
> >
> > -Yonik
> >
> > On 2/22/06, Peter Keegan <pe...@gmail.com> wrote:
> > > I am doing a performance comparison of Lucene on Linux vs Windows.
> > >
> > > I have 2 identically configured servers (8-CPUs (real) x 3GHz Xeon
> > > processors, 64GB RAM). One is running CentOS 4 Linux, the other is
> > running
> > > Windows server 2003 Enterprise Edition x64. Both have 64-bit JVMs from
> > Sun.
> > > The Lucene server is using MMapDirectory. I'm running the jvm with
> > > -Xmx16000M. Peak memory usage of the jvm on Linux is about 6GB and 7.8GB on
> > > windows.
> > >
> > > I'm observing query rates of 330 queries/sec on the Wintel server, but
> > only
> > > 200 qps on the Linux box. At first, I suspected a network bottleneck,
> > but
> > > when I 'short-circuited' Lucene, the query rates were identical.
> > >
> > > I suspect that there are some things to be tuned in Linux, but I'm not
> > sure
> > > what. Any advice would be appreciated.
> > >
> > > Peter
> > >
> > >
> > >
> > > On 1/30/06, Peter Keegan <pe...@gmail.com> wrote:
> > > >
> > > > I cranked up the dial on my query tester and was able to get the rate
> > up
> > > > to 325 qps. Unfortunately, the machine died shortly thereafter (memory
> > > > errors :-( ) Hopefully, it was just a coincidence. I haven't measured
> > 64-bit
> > > > indexing speed, yet.
> > > >
> > > > Peter
> > > >
> > > > On 1/29/06, Daniel Noll <da...@nuix.com.au> wrote:
> > > > >
> > > > > Peter Keegan wrote:
> > > > > > I tried the AMD64-bit JVM from Sun and with MMapDirectory and I'm
> > now
> > > > > > getting 250 queries/sec and excellent cpu utilization (equal
> > > > > concurrency on
> > > > > > all cpus)!! Yonik, thanks for the pointer to the 64-bit jvm. I
> > wasn't
> > > > > aware
> > > > > > of it.
> > > > > >
> > > > > Wow.  That's fast.
> > > > >
> > > > > Out of interest, does indexing time speed up much on 64-bit
> > hardware?
> > > > > I'm particularly interested in this side of things because for our
> > own
> > > > > application, any query response under half a second is good enough,
> > but
> > > > > the indexing side could always be faster. :-)
> > > > >
> > > > > Daniel
> > > > >
> > > > > --
> > > > > Daniel Noll
> > > > >
> > > > > Nuix Australia Pty Ltd
> > > > > Suite 79, 89 Jones St, Ultimo NSW 2007, Australia
> > > > > Phone: (02) 9280 0699
> > > > > Fax:   (02) 9212 6902
> > > > >
> > > > > This message is intended only for the named recipient. If you are
> > not
> > > > > the intended recipient you are notified that disclosing, copying,
> > > > > distributing or taking any action in reliance on the contents of
> > this
> > > > > message or attachment is strictly prohibited.
> > > > >
> > > > >
> > > > >
> > ---------------------------------------------------------------------
> > > > > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > > > > For additional commands, e-mail: java-user-help@lucene.apache.org
> > > > >
> > > > >
> > > >
> > >
> > >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org
> >
> >
>
>



Re: Throughput doesn't increase when using more concurrent threads

Posted by Peter Keegan <pe...@gmail.com>.
We discovered that the kernel was only using 8 CPUs. After recompiling for
16 (8+hyperthreads), it looks like the query rate will settle in around
280-300 qps. Much better, although still quite a bit slower than the
Opteron.
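
A quick way to catch this kind of mismatch from the Java side is to ask the JVM how many processors it can see. The sketch below uses the standard Runtime.availableProcessors() call; the class name and the expected count of 16 are assumptions for illustration only.

```java
// Hypothetical startup check: does the JVM see as many CPUs as we expect?
class CpuVisibilityCheck {

    // Number of processors the JVM reports as available to it.
    static int visibleProcessors() {
        return Runtime.getRuntime().availableProcessors();
    }

    // True when the JVM sees at least the expected number of logical CPUs.
    static boolean seesAtLeast(int expectedLogicalCpus) {
        return visibleProcessors() >= expectedLogicalCpus;
    }

    public static void main(String[] args) {
        int expected = 16; // assumption: 8 physical CPUs plus hyperthreads
        int visible = visibleProcessors();
        System.out.println("JVM sees " + visible + " processors");
        if (!seesAtLeast(expected)) {
            System.out.println("Fewer than " + expected
                    + " visible; check the kernel and OS configuration.");
        }
    }
}
```

Logging this value when the search server starts makes it easier to spot a kernel or OS limit before interpreting throughput numbers.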

Peter




On 2/22/06, Yonik Seeley <ys...@gmail.com> wrote:
>
> Hmmm, not sure what that could be.
> You could try using the default FSDir instead of MMapDir to see if the
> differences are there.
>
> Some things that could be different:
> - thread scheduling (shouldn't make too much of a difference though)
> - synchronization workings
> - page replacement policy... how to figure out what pages to swap in
> and which to swap out, esp of the memory mapped files.
>
> You could also try a profiler on both platforms to try and see where
> the difference is.
>
> -Yonik
>
> On 2/22/06, Peter Keegan <pe...@gmail.com> wrote:
> > I am doing a performance comparison of Lucene on Linux vs Windows.
> >
> > I have 2 identically configured servers (8-CPUs (real) x 3GHz Xeon
> > processors, 64GB RAM). One is running CentOS 4 Linux, the other is
> running
> > Windows server 2003 Enterprise Edition x64. Both have 64-bit JVMs from
> Sun.
> > The Lucene server is using MMapDirectory. I'm running the jvm with
> > -Xmx16000M. Peak memory usage of the jvm on Linux is about 6GB and 7.8GB on
> > windows.
> >
> > I'm observing query rates of 330 queries/sec on the Wintel server, but
> only
> > 200 qps on the Linux box. At first, I suspected a network bottleneck,
> but
> > when I 'short-circuited' Lucene, the query rates were identical.
> >
> > I suspect that there are some things to be tuned in Linux, but I'm not
> sure
> > what. Any advice would be appreciated.
> >
> > Peter
> >
> >
> >
> > On 1/30/06, Peter Keegan <pe...@gmail.com> wrote:
> > >
> > > I cranked up the dial on my query tester and was able to get the rate
> up
> > > to 325 qps. Unfortunately, the machine died shortly thereafter (memory
> > > errors :-( ) Hopefully, it was just a coincidence. I haven't measured
> 64-bit
> > > indexing speed, yet.
> > >
> > > Peter
> > >
> > > On 1/29/06, Daniel Noll <da...@nuix.com.au> wrote:
> > > >
> > > > Peter Keegan wrote:
> > > > > I tried the AMD64-bit JVM from Sun and with MMapDirectory and I'm
> now
> > > > > getting 250 queries/sec and excellent cpu utilization (equal
> > > > concurrency on
> > > > > all cpus)!! Yonik, thanks for the pointer to the 64-bit jvm. I
> wasn't
> > > > aware
> > > > > of it.
> > > > >
> > > > Wow.  That's fast.
> > > >
> > > > Out of interest, does indexing time speed up much on 64-bit
> hardware?
> > > > I'm particularly interested in this side of things because for our
> own
> > > > application, any query response under half a second is good enough,
> but
> > > > the indexing side could always be faster. :-)
> > > >
> > > > Daniel
> > > >
> > > > --
> > > > Daniel Noll
> > > >
> > > > Nuix Australia Pty Ltd
> > > > Suite 79, 89 Jones St, Ultimo NSW 2007, Australia
> > > > Phone: (02) 9280 0699
> > > > Fax:   (02) 9212 6902
> > > >
> > > > This message is intended only for the named recipient. If you are
> not
> > > > the intended recipient you are notified that disclosing, copying,
> > > > distributing or taking any action in reliance on the contents of
> this
> > > > message or attachment is strictly prohibited.
> > > >
> > > >
> > > >
> ---------------------------------------------------------------------
> > > > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > > > For additional commands, e-mail: java-user-help@lucene.apache.org
> > > >
> > > >
> > >
> >
> >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

Re: Throughput doesn't increase when using more concurrent threads

Posted by Yonik Seeley <ys...@gmail.com>.
Hmmm, not sure what that could be.
You could try using the default FSDirectory instead of MMapDirectory to see
whether the difference is still there.

Some things that could be different:
- thread scheduling (shouldn't make too much of a difference though)
- synchronization workings
- page replacement policy: how the OS decides which pages to swap in and
  which to swap out, especially for the memory-mapped files.

You could also try a profiler on both platforms to try and see where
the difference is.

-Yonik

On 2/22/06, Peter Keegan <pe...@gmail.com> wrote:
> I am doing a performance comparison of Lucene on Linux vs Windows.
>
> I have 2 identically configured servers (8-CPUs (real) x 3GHz Xeon
> processors, 64GB RAM). One is running CentOS 4 Linux, the other is running
> Windows server 2003 Enterprise Edition x64. Both have 64-bit JVMs from Sun.
> The Lucene server is using MMapDirectory. I'm running the jvm with
> -Xmx16000M. Peak memory usage of the jvm on Linux is about 6GB and 7.8GB on
> windows.
>
> I'm observing query rates of 330 queries/sec on the Wintel server, but only
> 200 qps on the Linux box. At first, I suspected a network bottleneck, but
> when I 'short-circuited' Lucene, the query rates were identical.
>
> I suspect that there are some things to be tuned in Linux, but I'm not sure
> what. Any advice would be appreciated.
>
> Peter
>
>
>
> On 1/30/06, Peter Keegan <pe...@gmail.com> wrote:
> >
> > I cranked up the dial on my query tester and was able to get the rate up
> > to 325 qps. Unfortunately, the machine died shortly thereafter (memory
> > errors :-( ) Hopefully, it was just a coincidence. I haven't measured 64-bit
> > indexing speed, yet.
> >
> > Peter
> >
> > On 1/29/06, Daniel Noll <da...@nuix.com.au> wrote:
> > >
> > > Peter Keegan wrote:
> > > > I tried the AMD64-bit JVM from Sun and with MMapDirectory and I'm now
> > > > getting 250 queries/sec and excellent cpu utilization (equal
> > > concurrency on
> > > > all cpus)!! Yonik, thanks for the pointer to the 64-bit jvm. I wasn't
> > > aware
> > > > of it.
> > > >
> > > Wow.  That's fast.
> > >
> > > Out of interest, does indexing time speed up much on 64-bit hardware?
> > > I'm particularly interested in this side of things because for our own
> > > application, any query response under half a second is good enough, but
> > > the indexing side could always be faster. :-)
> > >
> > > Daniel
> > >
> > > --
> > > Daniel Noll
> > >
> > > Nuix Australia Pty Ltd
> > > Suite 79, 89 Jones St, Ultimo NSW 2007, Australia
> > > Phone: (02) 9280 0699
> > > Fax:   (02) 9212 6902
> > >
> > > This message is intended only for the named recipient. If you are not
> > > the intended recipient you are notified that disclosing, copying,
> > > distributing or taking any action in reliance on the contents of this
> > > message or attachment is strictly prohibited.
> > >
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > > For additional commands, e-mail: java-user-help@lucene.apache.org
> > >
> > >
> >
>
>
