Posted to java-user@lucene.apache.org by Peter Keegan <pe...@gmail.com> on 2006/01/25 19:50:32 UTC

Re: Throughput doesn't increase when using more concurrent threads

This is just FYI: in my stress tests on an 8-CPU box (that's 8 real CPUs),
the maximum throughput occurred with just 4 query threads. Query
throughput decreased with either fewer or more than 4 query threads. The
entire index was most likely in the file system cache, too. Periodic
snapshots of stack traces showed most threads blocked on the synchronization
in FSIndexInput.readInternal() when the thread count exceeded 4.
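
A measurement like the one above can be reproduced with a small harness that
runs the same task on N threads for a fixed time and reports completed tasks
per second. This is only a sketch: the class name ThreadScalingBench, the
measure() helper, and the lock-protected array copy standing in for a query
are all assumptions for illustration, not code from Lucene or from these
stress tests.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicLong;

public class ThreadScalingBench {

    /** Runs task on the given number of threads for about millis ms; returns tasks completed per second. */
    public static double measure(Runnable task, int threads, long millis) {
        AtomicLong completed = new AtomicLong();
        CountDownLatch start = new CountDownLatch(1);
        long[] deadlineNanos = new long[1]; // written before countDown(), read after await()
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < threads; i++) {
            Thread t = new Thread(() -> {
                try {
                    start.await(); // release all workers at the same moment
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                while (System.nanoTime() - deadlineNanos[0] < 0) {
                    task.run();
                    completed.incrementAndGet();
                }
            });
            t.start();
            workers.add(t);
        }
        long begin = System.nanoTime();
        deadlineNanos[0] = begin + millis * 1_000_000L;
        start.countDown();
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        double seconds = (System.nanoTime() - begin) / 1e9;
        return completed.get() / seconds;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for a query: a small copy guarded by one shared lock.
        final Object lock = new Object();
        final byte[] source = new byte[1024];
        Runnable task = () -> {
            byte[] dst = new byte[source.length];
            synchronized (lock) {
                System.arraycopy(source, 0, dst, 0, dst.length);
            }
        };
        for (int threads : new int[] {1, 2, 4, 8, 16}) {
            System.out.printf("%2d threads: %.0f tasks/sec%n", threads, measure(task, threads, 500));
        }
    }
}
```

To test a real index, the Runnable would wrap an actual search call; where the
throughput peak falls will depend on the hardware, the JVM, and the workload.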

Peter


On 11/22/05, Oren Shir <sh...@gmail.com> wrote:
>
> Hi,
>
> There are two synchronization points: on the stream and on the reader.
> Using different FSDirectory and IndexReader instances should solve this.
> I'll let you know once I code it. Right now I'm checking whether making my
> Documents store less data will move the bottleneck to some other place.
>
> Thanks again,
> Oren Shir
>
> On 11/21/05, Doug Cutting <cu...@apache.org> wrote:
> >
> > Jay Booth wrote:
> > > I had a similar problem with threading, the problem turned out to be
> > > that in the back end of the FSDirectory class I believe it was, there
> > > was a synchronized block on the actual RandomAccessFile resource when
> > > reading a block of data from it... high-concurrency situations caused
> > > threads to stack up in front of this synchronized block and our CPU
> > > time wound up being spent thrashing between blocked threads instead of
> > > doing anything useful.
> >
> > This is correct. In Lucene, multiple streams per file are created by
> > cloning, and all clones of an FSDirectory input stream share a
> > RandomAccessFile and must synchronize input from it. MMapDirectory does
> > not have this limitation. If your indexes are less than a few GB or you
> > are using 64-bit hardware, then MMapDirectory should work well for you.
> > Otherwise it would be simple to write an nio-based Directory that does
> > not use mmap and is also unsynchronized. Such a contribution would be
> > welcome.
> >
> > > Making multiple IndexSearchers and FSDirectories didn't help because
> > > in the back end, Lucene consults a singleton HashMap of some kind
> > > (don't remember implementation) that maintained a single FSDirectory
> > > for any given index being accessed from the JVM... multiple calls to
> > > FSDirectory.getDirectory actually return the same FSDirectory object
> > > with synchronization at the same point.
> >
> > This does not make sense to me. FSDirectory does keep a cache of
> > FSDirectory instances, but i/o should not be synchronized on these. One
> > should be able to open multiple input streams on the same file from an
> > FSDirectory. But this would not be a great solution, since file handle
> > limits would soon become a problem.
> >
> > Doug
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org
> >
> >
>
>
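
The bottleneck Doug Cutting describes in the quoted message (clones sharing one
RandomAccessFile and synchronizing reads on it) and the nio-based alternative
he suggests can be sketched with plain JDK classes. This is not Lucene code:
the class SharedFileReads and its methods are hypothetical names. It only
contrasts a seek-then-read under a lock with a positional FileChannel read
that takes the offset as an argument. Whether the positional read actually
removes contention depends on the JVM and operating system.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

public class SharedFileReads {

    /** Shared-handle pattern: seek and read must happen atomically, so callers serialize on the file. */
    public static int readSynchronized(RandomAccessFile file, long position, byte[] dst) {
        synchronized (file) {
            try {
                file.seek(position);
                return file.read(dst, 0, dst.length);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    /** Positional read: the offset travels with the call and the channel's own position is not modified. */
    public static int readPositional(FileChannel channel, long position, byte[] dst) {
        try {
            return channel.read(ByteBuffer.wrap(dst), position);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Writes a small temp file and checks that both strategies return the same bytes. */
    public static boolean bothStrategiesAgree() {
        try {
            Path tmp = Files.createTempFile("shared-reads", ".bin");
            try {
                byte[] data = new byte[256];
                for (int i = 0; i < data.length; i++) {
                    data[i] = (byte) i;
                }
                Files.write(tmp, data);
                try (RandomAccessFile raf = new RandomAccessFile(tmp.toFile(), "r");
                     FileChannel ch = FileChannel.open(tmp, StandardOpenOption.READ)) {
                    byte[] a = new byte[4];
                    byte[] b = new byte[4];
                    int readA = readSynchronized(raf, 100, a);
                    int readB = readPositional(ch, 100, b);
                    // A production reader would loop until the buffer is full; short reads are legal.
                    return readA == 4 && readB == 4 && Arrays.equals(a, b) && a[0] == 100;
                }
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("strategies agree: " + bothStrategiesAgree());
    }
}
```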

Re: Throughput doesn't increase when using more concurrent threads

Posted by Peter Keegan <pe...@gmail.com>.

Yes, it's hyperthreaded (16 cpus show up in task manager - the box is
running 2003). I plan to turn off hyperthreading to see if it has any
effect.

Peter


On 1/25/06, Yonik Seeley <ys...@gmail.com> wrote:
>
> On 1/25/06, Peter Keegan <pe...@gmail.com> wrote:
> > It's a 3GHz Intel box with Xeon processors, 64GB ram :)
>
> Nice!
>
> Xeon processors are normally hyperthreaded.  On a linux box, if you
> cat /proc/cpuinfo, you will see 8 processors for a 4 physical CPU
> system.  Are you positive you have 8 physical Xeon processors?
>
> -Yonik

Re: Throughput doesn't increase when using more concurrent threads

Posted by Yonik Seeley <ys...@gmail.com>.

On 1/25/06, Peter Keegan <pe...@gmail.com> wrote:
> It's a 3GHz Intel box with Xeon processors, 64GB ram :)

Nice!

Xeon processors are normally hyperthreaded.  On a linux box, if you
cat /proc/cpuinfo, you will see 8 processors for a 4 physical CPU
system.  Are you positive you have 8 physical Xeon processors?
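
From inside a JVM, the same caveat applies: the processor count Java reports
is the number of logical processors available to it, so hyperthreaded siblings
are counted individually. A minimal sketch follows; the class LogicalCpus and
the threadsPerCore parameter are hypothetical, and the physical-core figure is
an estimate based on an assumed ratio, not something Java detects.

```java
public class LogicalCpus {

    /** Logical processors visible to this JVM. */
    public static int logicalProcessors() {
        return Runtime.getRuntime().availableProcessors();
    }

    /** Rough physical-core estimate, assuming threadsPerCore hardware threads per core. */
    public static int estimatePhysical(int logical, int threadsPerCore) {
        if (threadsPerCore <= 0) {
            throw new IllegalArgumentException("threadsPerCore must be positive");
        }
        return Math.max(1, logical / threadsPerCore);
    }

    public static void main(String[] args) {
        int logical = logicalProcessors();
        // Assumption: 2 hardware threads per core, as with hyperthreading enabled.
        System.out.println("logical processors: " + logical);
        System.out.println("estimated physical cores: " + estimatePhysical(logical, 2));
    }
}
```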

-Yonik

Re: Throughput doesn't increase when using more concurrent threads

Posted by Peter Keegan <pe...@gmail.com>.

It's a 3GHz Intel box with Xeon processors, 64GB ram :)

Peter


On 1/25/06, Yonik Seeley <ys...@gmail.com> wrote:
>
> Thanks Peter, that's useful info.
>
> Just out of curiosity, what kind of box is this?  what CPUs?
>
> -Yonik
>
> On 1/25/06, Peter Keegan <pe...@gmail.com> wrote:
> > This is just  fyi - in my stress tests on a 8-cpu box (that's 8 real
> cpus),
> > the maximum throughput occurred with just 4 query threads. The query
> > throughput decreased with fewer than 4 or greater than 4 query threads.
> The
> > entire index was most likely in the file system cache, too. Periodic
> > snapshots of stack traces showed most threads blocked in the
> synchronization
> > in: FSIndexInput.readInternal(), when the thread count exceeded 4.
> >
> > Peter
> >
> >
> > On 11/22/05, Oren Shir <sh...@gmail.com> wrote:
> > >
> > > Hi,
> > >
> > > > There are two synchronization points: on the stream and on the reader.
> > > > Using different FSDirectory and IndexReader instances should solve
> > > > this. I'll let you know once I code it. Right now I'm checking whether
> > > > making my Documents store less data will move the bottleneck to some
> > > > other place.
> > >
> > > Thanks again,
> > > Oren Shir
> > >
> > > On 11/21/05, Doug Cutting <cu...@apache.org> wrote:
> > > >
> > > > Jay Booth wrote:
> > > > > I had a similar problem with threading, the problem turned out to
> be
> > > > that in
> > > > > the back end of the FSDirectory class I believe it was, there was
> a
> > > > > synchronized block on the actual RandomAccessFile resource when
> > > reading
> > > > a
> > > > > block of data from it... high-concurrency situations caused
> threads to
> > > > stack
> > > > > up in front of this synchronized block and our CPU time wound up
> being
> > > > spent
> > > > > thrashing between blocked threads instead of doing anything
> useful.
> > > >
> > > > This is correct. In Lucene, multiple streams per file are created by
> > > > cloning, and all clones of an FSDirectory input stream share a
> > > > RandomAccessFile and must synchronize input from it. MmapDirectory
> does
> > > > not have this limitation. If your indexes are less than a few GB or
> you
> > > > are using 64-bit hardware, then MmapDirectory should work well for
> you.
> > > > Otherwise it would be simple to write an nio-based Directory that
> does
> > > > not use mmap that is also unsynchronized. Such a contribution would
> be
> > > > welcome.
> > > >
> > > > > Making multiple IndexSearchers and FSDirectories didn't help
> because
> > > in
> > > > the
> > > > > back end, lucene consults a singleton HashMap of some kind (don't
> > > > remember
> > > > > implementation) that maintained a single FSDirectory for any given
> > > index
> > > > > being accessed from the JVM... multiple calls to
> > > > FSDirectory.getDirectory
> > > > > actually return the same FSDirectory object with synchronization
> at
> > > the
> > > > same
> > > > > point.
> > > >
> > > > This does not make sense to me. FSDirectory does keep a cache of
> > > > FSDirectory instances, but i/o should not be synchronized on these.
> One
> > > > should be able to open multiple input streams on the same file from
> an
> > > > FSDirectory. But this would not be a great solution, since file
> handle
> > > > limits would soon become a problem.
> > > >
> > > > Doug
> > > >
> > > >

Re: Throughput doesn't increase when using more concurrent threads

Posted by Yonik Seeley <ys...@gmail.com>.

Thanks Peter, that's useful info.

Just out of curiosity, what kind of box is this?  what CPUs?

-Yonik

On 1/25/06, Peter Keegan <pe...@gmail.com> wrote:
> This is just  fyi - in my stress tests on a 8-cpu box (that's 8 real cpus),
> the maximum throughput occurred with just 4 query threads. The query
> throughput decreased with fewer than 4 or greater than 4 query threads. The
> entire index was most likely in the file system cache, too. Periodic
> snapshots of stack traces showed most threads blocked in the synchronization
> in: FSIndexInput.readInternal(), when the thread count exceeded 4.
>
> Peter
>
>
> On 11/22/05, Oren Shir <sh...@gmail.com> wrote:
> >
> > Hi,
> >
> > > There are two synchronization points: on the stream and on the reader.
> > > Using different FSDirectory and IndexReader instances should solve this.
> > > I'll let you know once I code it. Right now I'm checking whether making
> > > my Documents store less data will move the bottleneck to some other
> > > place.
> >
> > Thanks again,
> > Oren Shir
> >
> > On 11/21/05, Doug Cutting <cu...@apache.org> wrote:
> > >
> > > Jay Booth wrote:
> > > > I had a similar problem with threading, the problem turned out to be
> > > that in
> > > > the back end of the FSDirectory class I believe it was, there was a
> > > > synchronized block on the actual RandomAccessFile resource when
> > reading
> > > a
> > > > block of data from it... high-concurrency situations caused threads to
> > > stack
> > > > up in front of this synchronized block and our CPU time wound up being
> > > spent
> > > > thrashing between blocked threads instead of doing anything useful.
> > >
> > > This is correct. In Lucene, multiple streams per file are created by
> > > cloning, and all clones of an FSDirectory input stream share a
> > > RandomAccessFile and must synchronize input from it. MmapDirectory does
> > > not have this limitation. If your indexes are less than a few GB or you
> > > are using 64-bit hardware, then MmapDirectory should work well for you.
> > > Otherwise it would be simple to write an nio-based Directory that does
> > > not use mmap that is also unsynchronized. Such a contribution would be
> > > welcome.
> > >
> > > > Making multiple IndexSearchers and FSDirectories didn't help because
> > in
> > > the
> > > > back end, lucene consults a singleton HashMap of some kind (don't
> > > remember
> > > > implementation) that maintained a single FSDirectory for any given
> > index
> > > > being accessed from the JVM... multiple calls to
> > > FSDirectory.getDirectory
> > > > actually return the same FSDirectory object with synchronization at
> > the
> > > same
> > > > point.
> > >
> > > This does not make sense to me. FSDirectory does keep a cache of
> > > FSDirectory instances, but i/o should not be synchronized on these. One
> > > should be able to open multiple input streams on the same file from an
> > > FSDirectory. But this would not be a great solution, since file handle
> > > limits would soon become a problem.
> > >
> > > Doug
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > > For additional commands, e-mail: java-user-help@lucene.apache.org

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

Re: Throughput doesn't increase when using more concurrent threads

Posted by Daniel Noll <da...@nuix.com.au>.

Yonik Seeley wrote:
> On 1/29/06, Daniel Noll <da...@nuix.com.au> wrote:
>   
>> Peter Keegan wrote:
>>     
>>> I'd love to try this, but I'm not aware of any 64-bit jvms for Windows on
>>> Intel. If you know of any, please let me know. Linux may be an option, too.
>>>
>>>       
>> Is this true about the 64-bit JVM not working on Intel?
>>     
>
> Go back and look at my response to the message you quoted :-)
> The short answer is yes, it will work on Intel.
>   
Ah.  Okay, sorry about that.  The only response I saw was the one about 
JRockit supporting it.
> Support is a different issue.  It may work, but it may or may not be a
> "supported" platform of the JVM vendor.
>   
True enough.  At the moment our most likely move is to support Sun's 
64-bit JVM on Windows, but not other vendors' JVMs (i.e., we'll support 
whatever JVMs we redistribute with our own app.)  Of course, this will 
only come once we claim to support 64-bit hardware... I'm sure there are 
many things still yet to be done there, such as making sure all our JNI 
libraries will compile properly for 64-bit Windows.

Daniel

-- 
Daniel Noll

Nuix Australia Pty Ltd
Suite 79, 89 Jones St, Ultimo NSW 2007, Australia
Phone: (02) 9280 0699
Fax:   (02) 9212 6902

This message is intended only for the named recipient. If you are not
the intended recipient you are notified that disclosing, copying,
distributing or taking any action in reliance on the contents of this
message or attachment is strictly prohibited.


Re: Throughput doesn't increase when using more concurrent threads

Posted by Yonik Seeley <ys...@gmail.com>.

On 1/29/06, Daniel Noll <da...@nuix.com.au> wrote:
> Peter Keegan wrote:
> > I'd love to try this, but I'm not aware of any 64-bit jvms for Windows on
> > Intel. If you know of any, please let me know. Linux may be an option, too.
> >
> Is this true about the 64-bit JVM not working on Intel?

Go back and look at my response to the message you quoted :-)
The short answer is yes, it will work on Intel.

> I was under the
> impression that it supported the AMD64 instruction set, and that Intel's
> 64-bit processors basically cloned AMD's instruction set.

Pretty much, but it's never that simple (and wasn't for 32 bit mode either)
http://en.wikipedia.org/wiki/EM64T#Differences_between_AMD64_and_EM64T

Now that they have both been out a while, compilers generally produce
code that works on both. Tricky things like JVMs and especially kernels
needed explicit support.

> I really hope this isn't the case, because it's going to be one hell of
> a caveat if we end up telling customers "yes, we support 64-bit AMD, but
> not 64-bit Intel."

Support is a different issue.  It may work, but it may or may not be a
"supported" platform of the JVM vendor.

-Yonik

Re: Throughput doesn't increase when using more concurrent threads

Posted by Daniel Noll <da...@nuix.com.au>.

Peter Keegan wrote:
> I'd love to try this, but I'm not aware of any 64-bit jvms for Windows on
> Intel. If you know of any, please let me know. Linux may be an option, too.
>   
Is this true about the 64-bit JVM not working on Intel?  I was under the 
impression that it supported the AMD64 instruction set, and that Intel's 
64-bit processors basically cloned AMD's instruction set.

I really hope this isn't the case, because it's going to be one hell of 
a caveat if we end up telling customers "yes, we support 64-bit AMD, but 
not 64-bit Intel."

Daniel

-- 
Daniel Noll

Nuix Australia Pty Ltd
Suite 79, 89 Jones St, Ultimo NSW 2007, Australia
Phone: (02) 9280 0699
Fax:   (02) 9212 6902

Re: Throughput doesn't increase when using more concurrent threads

Posted by Yonik Seeley <ys...@gmail.com>.

BEA JRockit supports both AMD64 and Intel's EM64T (basically renamed AMD64):
http://www.bea.com/framework.jsp?CNT=index.htm&FP=/content/products/jrockit/

and so does Sun's Java 1.5 for "Windows AMD64 Platform".
They advertise AMD64, presumably because that's what their servers
use, but it should work on Intel's x86_64 (EM64T) also.  The release
notes have the following:
"With the release, J2SE support for Windows 64-bit has progressed from
release candidate to final release. This version runs on AMD64/EM64T
64-bit mode machines with Windows Server 2003 x64 Editions."

Of course, if the platform is up to you, I'd choose Linux :-)

-Yonik

On 1/26/06, Peter Keegan <pe...@gmail.com> wrote:
> I'd love to try this, but I'm not aware of any 64-bit jvms for Windows on
> Intel. If you know of any, please let me know. Linux may be an option, too.
>
> btw, I'm getting a sustained rate of 135 queries/sec with 4 threads, which
> is pretty impressive. Another way around the concurrency limit is to run
> multiple jvms. The throughput of each is less, but the aggregate throughput
> is higher.
>
> Peter
>
>
> On 1/26/06, Yonik Seeley <ys...@gmail.com> wrote:
> >
> > Hmmm, can you run the 64 bit version of Windows (and hence a 64 bit JVM?)
> > We're running with heap sizes up to 8GB (RH Linux 64 bit, Opterons,
> > Sun Java 1.5)
> >
> > -Yonik
> >
> > On 1/26/06, Peter Keegan <pe...@gmail.com> wrote:
> > > Paul,
> > >
> > > I tried this but it ran out of memory trying to read the 500Mb .fdt
> > file. I
> > > tried various values for MAX_BBUF, but it still ran out of memory (I'm
> > using
> > > -Xmx1600M, which is the jvm's maximum value (v1.5))  I'll give
> > > NioFSDirectory a try.
> > >
> > > Thanks,
> > > Peter
> > >
> > >
> > > On 1/26/06, Paul Elschot <pa...@xs4all.nl> wrote:
> > > >
> > > > On Wednesday 25 January 2006 20:51, Peter Keegan wrote:
> > > > > The index is non-compound format and optimized. Yes, I did try
> > > > > MMapDirectory, but the index is too big - 3.5 GB (1.3GB is term
> > vectors)
> > > > >
> > > > > Peter
> > > > >
> > > > You could also give this a try:
> > > >
> > > > http://issues.apache.org/jira/browse/LUCENE-283
> > > >
> > > > Regards,
> > > > Paul Elschot
> > > >

Re: Throughput doesn't increase when using more concurrent threads

Posted by Peter Keegan <pe...@gmail.com>.

I'd love to try this, but I'm not aware of any 64-bit JVMs for Windows on
Intel. If you know of any, please let me know. Linux may be an option, too.

btw, I'm getting a sustained rate of 135 queries/sec with 4 threads, which
is pretty impressive. Another way around the concurrency limit is to run
multiple JVMs. The throughput of each is lower, but the aggregate throughput
is higher.

Peter


On 1/26/06, Yonik Seeley <ys...@gmail.com> wrote:
>
> Hmmm, can you run the 64 bit version of Windows (and hence a 64 bit JVM?)
> We're running with heap sizes up to 8GB (RH Linux 64 bit, Opterons,
> Sun Java 1.5)
>
> -Yonik
>
> On 1/26/06, Peter Keegan <pe...@gmail.com> wrote:
> > Paul,
> >
> > I tried this but it ran out of memory trying to read the 500Mb .fdt
> file. I
> > tried various values for MAX_BBUF, but it still ran out of memory (I'm
> using
> > -Xmx1600M, which is the jvm's maximum value (v1.5))  I'll give
> > NioFSDirectory a try.
> >
> > Thanks,
> > Peter
> >
> >
> > On 1/26/06, Paul Elschot <pa...@xs4all.nl> wrote:
> > >
> > > On Wednesday 25 January 2006 20:51, Peter Keegan wrote:
> > > > The index is non-compound format and optimized. Yes, I did try
> > > > MMapDirectory, but the index is too big - 3.5 GB (1.3GB is term
> vectors)
> > > >
> > > > Peter
> > > >
> > > You could also give this a try:
> > >
> > > http://issues.apache.org/jira/browse/LUCENE-283
> > >
> > > Regards,
> > > Paul Elschot
> > >

Re: Throughput doesn't increase when using more concurrent threads

Posted by Yonik Seeley <ys...@gmail.com>.

Hmmm, can you run the 64-bit version of Windows (and hence a 64-bit JVM)?
We're running with heap sizes up to 8GB (RH Linux 64-bit, Opterons,
Sun Java 1.5)

-Yonik

On 1/26/06, Peter Keegan <pe...@gmail.com> wrote:
> Paul,
>
> I tried this but it ran out of memory trying to read the 500Mb .fdt file. I
> tried various values for MAX_BBUF, but it still ran out of memory (I'm using
> -Xmx1600M, which is the jvm's maximum value (v1.5))  I'll give
> NioFSDirectory a try.
>
> Thanks,
> Peter
>
>
> On 1/26/06, Paul Elschot <pa...@xs4all.nl> wrote:
> >
> > On Wednesday 25 January 2006 20:51, Peter Keegan wrote:
> > > The index is non-compound format and optimized. Yes, I did try
> > > MMapDirectory, but the index is too big - 3.5 GB (1.3GB is term vectors)
> > >
> > > Peter
> > >
> > You could also give this a try:
> >
> > http://issues.apache.org/jira/browse/LUCENE-283
> >
> > Regards,
> > Paul Elschot
> >

Re: Throughput doesn't increase when using more concurrent threads

Posted by Peter Keegan <pe...@gmail.com>.

Paul,

I tried this but it ran out of memory trying to read the 500 MB .fdt file. I
tried various values for MAX_BBUF, but it still ran out of memory (I'm using
-Xmx1600M, which is the JVM's maximum value (v1.5)). I'll give
NioFSDirectory a try.
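
For context on why a memory-mapped read can fail even with a large -Xmx:
mapped file regions live outside the Java heap but still consume the process's
virtual address space, so on a 32-bit JVM a bigger heap leaves less room for
mappings. Mapping a file as several smaller buffers can be sketched with the
JDK alone. The class ChunkedMmap and the maxChunkBytes parameter are
hypothetical; the exact semantics of MAX_BBUF in the LUCENE-283 patch are not
shown in this thread, so this is only an illustration of the general idea, and
chunking does not reduce the total address space needed.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

public class ChunkedMmap {

    /** Maps the file read-only as consecutive chunks of at most maxChunkBytes each. */
    public static List<MappedByteBuffer> mapInChunks(Path file, int maxChunkBytes) {
        if (maxChunkBytes <= 0) {
            throw new IllegalArgumentException("maxChunkBytes must be positive");
        }
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            long size = channel.size();
            List<MappedByteBuffer> chunks = new ArrayList<>();
            for (long pos = 0; pos < size; pos += maxChunkBytes) {
                long length = Math.min(maxChunkBytes, size - pos);
                chunks.add(channel.map(FileChannel.MapMode.READ_ONLY, pos, length));
            }
            return chunks; // mappings stay valid after the channel is closed
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the byte at an absolute file offset by locating its chunk. */
    public static byte byteAt(List<MappedByteBuffer> chunks, int maxChunkBytes, long offset) {
        int chunk = (int) (offset / maxChunkBytes);
        int within = (int) (offset % maxChunkBytes);
        return chunks.get(chunk).get(within); // absolute get, safe for concurrent readers
    }

    /** Maps a 10-byte temp file in 4-byte chunks and checks the chunk count and one byte. */
    public static boolean roundTrip() {
        try {
            Path tmp = Files.createTempFile("chunked-mmap", ".bin");
            tmp.toFile().deleteOnExit(); // mapped files cannot always be deleted immediately
            Files.write(tmp, new byte[] {0, 1, 2, 3, 4, 5, 6, 7, 8, 9});
            List<MappedByteBuffer> chunks = mapInChunks(tmp, 4);
            return chunks.size() == 3 && byteAt(chunks, 4, 9) == 9;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("round trip ok: " + roundTrip());
    }
}
```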

Thanks,
Peter


On 1/26/06, Paul Elschot <pa...@xs4all.nl> wrote:
>
> On Wednesday 25 January 2006 20:51, Peter Keegan wrote:
> > The index is non-compound format and optimized. Yes, I did try
> > MMapDirectory, but the index is too big - 3.5 GB (1.3GB is term vectors)
> >
> > Peter
> >
> You could also give this a try:
>
> http://issues.apache.org/jira/browse/LUCENE-283
>
> Regards,
> Paul Elschot
>

Re: Throughput doesn't increase when using more concurrent threads

Posted by Paul Elschot <pa...@xs4all.nl>.

On Wednesday 25 January 2006 20:51, Peter Keegan wrote:
> The index is non-compound format and optimized. Yes, I did try
> MMapDirectory, but the index is too big - 3.5 GB (1.3GB is term vectors)
> 
> Peter
> 
You could also give this a try:

http://issues.apache.org/jira/browse/LUCENE-283

Regards,
Paul Elschot

Re: Throughput doesn't increase when using more concurrent threads

Posted by Peter Keegan <pe...@gmail.com>.

The index is non-compound format and optimized. Yes, I did try
MMapDirectory, but the index is too big: 3.5 GB (1.3 GB of which is term
vectors).

Peter

On 1/25/06, Doug Cutting <cu...@apache.org> wrote:
>
> Peter Keegan wrote:
> > This is just  fyi - in my stress tests on a 8-cpu box (that's 8 real
> cpus),
> > the maximum throughput occurred with just 4 query threads. The query
> > throughput decreased with fewer than 4 or greater than 4 query threads.
> The
> > entire index was most likely in the file system cache, too. Periodic
> > snapshots of stack traces showed most threads blocked in the
> synchronization
> > in: FSIndexInput.readInternal(), when the thread count exceeded 4.
>
> Was this with a compound or non-compound format index?  The non-compound
> should fare slightly better, since there are more file handles per
> index.  Did you try using MMapDirectory?  This should have no i/o
> concurrency limits, but, on 32-bit systems, only works with indexes less
> than a few GB.
>
> Doug
>

Re: Throughput doesn't increase when using more concurrent threads

Posted by Doug Cutting <cu...@apache.org>.

Peter Keegan wrote:
> This is just  fyi - in my stress tests on a 8-cpu box (that's 8 real cpus),
> the maximum throughput occurred with just 4 query threads. The query
> throughput decreased with fewer than 4 or greater than 4 query threads. The
> entire index was most likely in the file system cache, too. Periodic
> snapshots of stack traces showed most threads blocked in the synchronization
> in: FSIndexInput.readInternal(), when the thread count exceeded 4.

Was this with a compound or non-compound format index?  The non-compound 
should fare slightly better, since there are more file handles per 
index.  Did you try using MMapDirectory?  This should have no i/o 
concurrency limits, but, on 32-bit systems, only works with indexes less 
than a few GB.
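
The rule of thumb above (memory mapping is fine on 64-bit systems, but on
32-bit systems only for indexes under a few GB) can be turned into a quick
pre-flight check. This is a sketch under stated assumptions: the class
MmapFeasibility is hypothetical, the "os.arch contains 64" test is a heuristic
rather than a reliable detector, and the 1 GB limit used in main is an
arbitrary placeholder, since the thread gives no exact figure.

```java
import java.io.File;

public class MmapFeasibility {

    /** Heuristic: treats the JVM as 64-bit when the os.arch value contains "64" (e.g. amd64, x86_64). */
    public static boolean is64BitJvm(String osArch) {
        return osArch != null && osArch.contains("64");
    }

    /** Sums the sizes of the regular files directly inside dir (index files are assumed not to be nested). */
    public static long directorySize(File dir) {
        long total = 0;
        File[] files = dir.listFiles();
        if (files != null) {
            for (File f : files) {
                if (f.isFile()) {
                    total += f.length();
                }
            }
        }
        return total;
    }

    /** On 64-bit, always true; on 32-bit, true only when the index is below limit32Bytes. */
    public static boolean mmapLooksFeasible(long indexBytes, boolean is64Bit, long limit32Bytes) {
        return is64Bit || indexBytes < limit32Bytes;
    }

    public static void main(String[] args) {
        File indexDir = new File(args.length > 0 ? args[0] : ".");
        boolean is64 = is64BitJvm(System.getProperty("os.arch"));
        long size = directorySize(indexDir);
        long assumedLimit = 1L << 30; // placeholder threshold, not a documented limit
        System.out.println("index bytes: " + size);
        System.out.println("64-bit JVM (heuristic): " + is64);
        System.out.println("mmap looks feasible: " + mmapLooksFeasible(size, is64, assumedLimit));
    }
}
```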

Doug
