Posted to java-user@lucene.apache.org by Peter Keegan <pe...@gmail.com> on 2006/02/22 20:52:21 UTC

Re: Throughput doesn't increase when using more concurrent threads

I am doing a performance comparison of Lucene on Linux vs Windows.

I have 2 identically configured servers (8 real CPUs, 3GHz Xeon
processors, 64GB RAM). One is running CentOS 4 Linux, the other is running
Windows Server 2003 Enterprise Edition x64. Both have 64-bit JVMs from Sun.
The Lucene server is using MMapDirectory. I'm running the JVM with
-Xmx16000M. Peak memory usage of the JVM is about 6GB on Linux and 7.8GB on
Windows.
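
Note for readers unfamiliar with the mechanism: MMapDirectory reads index files through memory-mapped buffers, which live outside the Java heap, so the -Xmx setting does not bound them. The following is a minimal sketch of plain java.nio mapping, not Lucene's implementation; the class name `MmapSketch`, its methods, and the temporary file are illustrative assumptions.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class MmapSketch {
    /** Maps the whole file read-only and sums its bytes through the mapping. */
    static long sumMapped(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // A single mapping is limited to Integer.MAX_VALUE bytes.
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            long sum = 0;
            while (buf.hasRemaining()) {
                sum += buf.get() & 0xFF;  // treat each byte as unsigned
            }
            return sum;
        }
    }

    /** Writes four bytes (1, 2, 3, 4) to a temporary file and sums them through a mapping. */
    static long demo() {
        try {
            Path tmp = Files.createTempFile("mmap-sketch", ".bin");
            tmp.toFile().deleteOnExit();
            Files.write(tmp, new byte[] {1, 2, 3, 4});
            return sumMapped(tmp);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Pages touched through the mapping are served from the operating system's page cache, which is why the process's resident memory can differ from what the heap settings suggest.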

I'm observing query rates of 330 queries/sec on the Wintel server, but only
200 qps on the Linux box. At first, I suspected a network bottleneck, but
when I 'short-circuited' Lucene, the query rates were identical.

I suspect that there are some things to be tuned in Linux, but I'm not sure
what. Any advice would be appreciated.

Peter
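
For anyone wanting to reproduce this kind of comparison, here is a rough sketch of a query-rate probe. It is not the query tester used above: the class `QpsProbe`, the method `measure`, and the stand-in workload are all hypothetical, and the `Runnable` is where a real search call would go.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicLong;

final class QpsProbe {
    /** Runs query from the given number of worker threads, perThread times each, and returns queries per second. */
    static double measure(int threads, int perThread, Runnable query) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicLong done = new AtomicLong();
        long start = System.nanoTime();
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                futures.add(pool.submit(() -> {
                    for (int i = 0; i < perThread; i++) {
                        query.run();
                        done.incrementAndGet();
                    }
                }));
            }
            for (Future<?> f : futures) {
                f.get();  // wait for every worker to finish
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        double seconds = (System.nanoTime() - start) / 1e9;
        return done.get() / seconds;
    }

    public static void main(String[] args) {
        // Stand-in for a real search call; replace with the actual query code.
        Runnable fakeQuery = () -> {
            long x = 0;
            for (int i = 0; i < 200_000; i++) x += i % 7;
            if (x == 42) System.out.print("");  // use the result so the loop is less likely to be discarded
        };
        for (int threads : new int[] {1, 2, 4, 8}) {
            System.out.printf("%d threads: %.0f qps%n", threads, measure(threads, 2_000, fakeQuery));
        }
    }
}
```

Running the same probe with the same workload on both machines, at several thread counts, makes it easier to see whether the gap appears at all concurrency levels or only as threads are added.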



On 1/30/06, Peter Keegan <pe...@gmail.com> wrote:
>
> I cranked up the dial on my query tester and was able to get the rate up
> to 325 qps. Unfortunately, the machine died shortly thereafter (memory
> errors :-( ) Hopefully, it was just a coincidence. I haven't measured 64-bit
> indexing speed, yet.
>
> Peter
>
> On 1/29/06, Daniel Noll <da...@nuix.com.au> wrote:
> >
> > Peter Keegan wrote:
> > > I tried the AMD64-bit JVM from Sun and with MMapDirectory and I'm now
> > > getting 250 queries/sec and excellent cpu utilization (equal concurrency
> > > on all cpus)!! Yonik, thanks for the pointer to the 64-bit jvm. I wasn't
> > > aware of it.
> > >
> > Wow.  That's fast.
> >
> > Out of interest, does indexing time speed up much on 64-bit hardware?
> > I'm particularly interested in this side of things because for our own
> > application, any query response under half a second is good enough, but
> > the indexing side could always be faster. :-)
> >
> > Daniel
> >
> > --
> > Daniel Noll
> >
> > Nuix Australia Pty Ltd
> > Suite 79, 89 Jones St, Ultimo NSW 2007, Australia
> > Phone: (02) 9280 0699
> > Fax:   (02) 9212 6902
>

Re: Throughput doesn't increase when using more concurrent threads

Posted by Raghavendra Prabhu <rr...@gmail.com>.

Hi,

Sorry for the trouble. This was my first mail to the group: I replied to
this thread and then later sent a direct mail.

I would like to apologise for the inconvenience caused.

Rgds
Prabhu


On 2/23/06, Otis Gospodnetic <ot...@yahoo.com> wrote:
>
> Hi,
>
> Please ask on the Nutch mailing list (I answered your question in general@
> already).
> Also, please don't steal other people's threads - it's considered impolite
> for obvious reasons.
>
> Otis
>
>
> ----- Original Message ----
> From: Raghavendra Prabhu <rr...@gmail.com>
> To: java-user@lucene.apache.org
> Sent: Thursday, February 23, 2006 11:10:11 AM
> Subject: Re: Throughput doesn't increase when using more concurrent
> threads
>
> Can Nutch be made to use the Lucene query parser?
>
> Rgds
> Prabhu
>
>
> On 2/23/06, Peter Keegan <pe...@gmail.com> wrote:
> >
> > Hi Otis,
> >
> > The Lucene server is actually CPU and network bound, as the index gets
> > memory mapped pretty quickly. There is little disk activity observed.
> >
> > I was also able to run the server on a Sun box last night with 4 dual-core
> > Opterons (same Linux and JVM) and I'm observing query rates of 400 qps!
> > Has Linux been optimized to run on this hardware? I imagine that Sun's JVM
> > has been.
> >
> > Peter
> >
> > On 2/22/06, Otis Gospodnetic <ot...@yahoo.com> wrote:
> > >
> > > Hi,
> > >
> > > Some things that could be different:
> > > - thread scheduling (shouldn't make too much of a difference though)
> > >
> > > - I would also play with disk IO schedulers, if you can.  CentOS is
> > > based on RedHat, I believe, and RedHat (ext3, really) now has about 4
> > > different IO schedulers that, according to articles I recently read, can
> > > have an impact on disk read/write performance.  These schedulers can be
> > > specified at mount time, I believe, and maybe at boot time (kernel line
> > > in Grub/LILO).
> > >
> > > Otis
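
A note on the IO scheduler suggestion quoted above: the scheduler is normally a per-block-device kernel setting rather than a filesystem mount option. 2.6 kernels accept an `elevator=` parameter on the kernel boot line, and kernels that expose `/sys/block/<dev>/queue/scheduler` list the available schedulers there with the active one in square brackets. The sketch below only reads and parses that line; the class name `IoSchedulerCheck`, the method `activeScheduler`, and the default device name `sda` are illustrative assumptions. Given the report of little disk activity, the effect on this workload may well be small.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class IoSchedulerCheck {
    /** Returns the bracketed (active) scheduler from a line such as "noop anticipatory deadline [cfq]", or null if none is marked. */
    static String activeScheduler(String line) {
        int open = line.indexOf('[');
        int close = line.indexOf(']', open + 1);
        if (open < 0 || close < 0) {
            return null;
        }
        return line.substring(open + 1, close);
    }

    public static void main(String[] args) throws IOException {
        // Device name is an assumption; pass the real one as the first argument.
        String dev = args.length > 0 ? args[0] : "sda";
        Path p = Paths.get("/sys/block", dev, "queue", "scheduler");
        if (Files.isReadable(p)) {
            String line = new String(Files.readAllBytes(p)).trim();
            System.out.println(dev + ": " + line + " -> active: " + activeScheduler(line));
        } else {
            System.out.println("No readable " + p + " on this system");
        }
    }
}
```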

Re: Throughput doesn't increase when using more concurrent threads

Posted by Otis Gospodnetic <ot...@yahoo.com>.

Hi,

Please ask on the Nutch mailing list (I answered your question in general@ already).
Also, please don't steal other people's threads - it's considered impolite for obvious reasons.

Otis



Re: Throughput doesn't increase when using more concurrent threads

Posted by Raghavendra Prabhu <rr...@gmail.com>.

Can Nutch be made to use the Lucene query parser?

Rgds
Prabhu

