Posted to java-user@lucene.apache.org by Paul Taylor <pa...@fastmail.fm> on 2009/03/27 12:07:13 UTC

Unable to improve performance

Hi

I am trying to run performance tests against Lucene, and am surprised
by the results.

I have a test that creates a queue of queries and a number of threads.
The threads run concurrently, each getting the next available query,
running it against the index and taking the top hits. The index is 2GB
in size and was originally created from a database table of about
7 million rows.
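
For readers who want to reproduce this kind of test, the harness described above might be sketched roughly as follows. This is not the code used in the test: the class name QueryQueueBenchmark, the run method and the search callback are hypothetical, and the callback stands in for whatever actually executes a query against the index and returns a hit count.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.ToIntFunction;

/** Hypothetical harness: worker threads drain a shared queue of query strings. */
class QueryQueueBenchmark {

    /** Outcome of one benchmark run. */
    static final class Result {
        final int queriesRun;
        final long totalHits;
        final long elapsedNanos;

        Result(int queriesRun, long totalHits, long elapsedNanos) {
            this.queriesRun = queriesRun;
            this.totalHits = totalHits;
            this.elapsedNanos = elapsedNanos;
        }

        double queriesPerSecond() {
            return queriesRun / (elapsedNanos / 1e9);
        }
    }

    /** Runs every query once; 'search' is an assumed stand-in for the real index search. */
    static Result run(List<String> queries, int threadCount, ToIntFunction<String> search) {
        Queue<String> pending = new ConcurrentLinkedQueue<>(queries);
        AtomicInteger done = new AtomicInteger();
        AtomicLong totalHits = new AtomicLong();
        Thread[] workers = new Thread[threadCount];

        long start = System.nanoTime();
        for (int i = 0; i < threadCount; i++) {
            workers[i] = new Thread(() -> {
                String query;
                // poll() returns null once the queue is empty, which ends the worker
                while ((query = pending.poll()) != null) {
                    totalHits.addAndGet(search.applyAsInt(query));
                    done.incrementAndGet();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return new Result(done.get(), totalHits.get(), System.nanoTime() - start);
    }

    public static void main(String[] args) {
        List<String> queries = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) {
            queries.add("query" + i);
        }
        // Dummy search: a real test would query the index here
        Result result = run(queries, 30, String::length);
        System.out.println(result.queriesRun + " queries, "
                + result.queriesPerSecond() + " queries/second");
    }
}
```

Timing the whole run rather than individual queries matches the figures quoted in this thread, which are total times for 10,000 queries.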

I ran the test a number of times with 30 threads and a max memory of
3500MB. I was processing 10,000 queries in about 43 seconds (233
queries/second). The index was stored on a solid state drive in
a MacBook Pro (2.66 GHz Intel Core 2 Duo, 4GB DDR). I don't really have a
view on whether this is a good result or not, but I was keen to try a few
other things to see if I could improve performance further. All my
efforts have had minimal effect.

I tried creating a RAMDirectory based on the file index; once the index
had been created (4 min 20 seconds) it again took
I copied the index to a slower external conventional hard drive and it
still took 43 seconds.

Reducing/increasing the memory allocated and the number of threads had 
minimal impact.

The main thing I'm surprised about is that I was expecting a massive
difference from holding the index in memory instead of on disk.

thanks Paul



---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Unable to improve performance

Posted by Michael McCandless <lu...@mikemccandless.com>.
Also, see here for other ideas that may help:

    http://wiki.apache.org/lucene-java/ImproveSearchingSpeed

I just updated that page with readOnly IndexReader & NIOFSDirectory.

Mike




Re: Unable to improve performance

Posted by Toke Eskildsen <te...@statsbiblioteket.dk>.
On Fri, 2009-03-27 at 12:07 +0100, Paul Taylor wrote:

[2GB index, 7 million documents(?)]

> I ran the test a number of times with 30 threads and a max memory of
> 3500MB. I was processing 10,000 queries in about 43 seconds (233
> queries/second). The index was stored on a solid state drive in
> a MacBook Pro (2.66 GHz Intel Core 2 Duo, 4GB DDR). I don't really have a
> view on whether this is a good result or not, but I was keen to try a few
> other things to see if I could improve performance further. All my
> efforts have had minimal effect.

You might want to try reducing the number of threads all the way down to
3 or 4 and queue pending searches instead, but I doubt it will change
much - as far as I know, the SSDs in MacBooks are quite okay with regard
to read-latency and with such a small index the system will probably
cache most of it anyway.
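
A small fixed pool with a queue of pending searches, as suggested above, could be sketched with java.util.concurrent along these lines. The names SmallPoolSearch and searchAll are made up for illustration, and the search callback is an assumed placeholder for the real query code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.ToIntFunction;

/** Hypothetical sketch: a few worker threads, with all other searches waiting in the pool's queue. */
class SmallPoolSearch {

    /** Returns one hit count per query, in the same order as the input list. */
    static List<Integer> searchAll(List<String> queries, int threads, ToIntFunction<String> search) {
        // The fixed pool keeps at most 'threads' searches running; the rest queue up
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (String query : queries) {
                tasks.add(() -> search.applyAsInt(query));
            }
            List<Integer> hits = new ArrayList<>();
            // invokeAll blocks until every task has finished and keeps task order
            for (Future<Integer> future : pool.invokeAll(tasks)) {
                hits.add(future.get());
            }
            return hits;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> queries = new ArrayList<>();
        queries.add("first query");
        queries.add("second query");
        // Dummy search: a real test would query the index here
        System.out.println(searchAll(queries, 4, String::length));
    }
}
```

On a dual-core machine, a pool of 3 or 4 threads should keep both cores busy without the scheduling overhead of 30 runnable threads, though as noted it may not change the numbers much.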

I can see elsewhere that you have upped the speed to 466 q/sec by
switching to NIOFSDirectory, so my guess is that you're now CPU and
memory speed bound.

You could try the freeware tool VisualVM, which profiles running Java
applications. It is extremely easy to use (just run it and select your
application from a list) and it will show you where the CPU time is
used. Of course, if you're just using simple query analysis with
Lucene-supplied Analyzers, there's probably not much you can do about it.
On the other hand, it might show you that you're spending a lot of time
generating queries or doing similar work outside Lucene.

> The main thing I'm surprised about is that I was expecting a massive
> difference from holding the index in memory instead of on disk.

Solid state drives (and disk cache) rule. Our experiments show very
little performance increase going from SSD to RAMDirectory.




Re: Unable to improve performance

Posted by Michael McCandless <lu...@mikemccandless.com>.
Alas, it's new as of 2.4.  Can you upgrade?

Mike

On Fri, Mar 27, 2009 at 11:55 AM,  <sp...@gmx.eu> wrote:
>> > How can I open it "readonly"?
>>
>> See the javadocs for IndexReader.
>
> I did it already for 2.3 - cannot find readonly


Re: Unable to improve performance

Posted by Simon Willnauer <si...@googlemail.com>.
The readOnly option was introduced in 2.4.
From the javadoc: "...as of 2.4, it's possible to open a read-only
IndexReader using one of the static open methods that accepts the
boolean readOnly parameter."

http://lucene.apache.org/java/2_4_1/api/core/org/apache/lucene/index/IndexReader.html#open(org.apache.lucene.store.Directory,%20boolean)

simon

On Fri, Mar 27, 2009 at 4:55 PM,  <sp...@gmx.eu> wrote:
>> > How can I open it "readonly"?
>>
>> See the javadocs for IndexReader.
>
> I did it already for 2.3 - cannot find readonly


RE: Unable to improve performance

Posted by sp...@gmx.eu.
> > How can I open it "readonly"?
> 
> See the javadocs for IndexReader.

I did it already for 2.3 - cannot find readonly




Re: Unable to improve performance

Posted by Ian Lea <ia...@gmail.com>.
>> Are you opening your IndexReader with readOnly=true?  If not, you're
>> likely hitting contention on the "isDeleted" method.
>
> How can I open it "readonly"?

See the javadocs for IndexReader.

--
Ian.



RE: Unable to improve performance

Posted by sp...@gmx.eu.
> Are you opening your IndexReader with readOnly=true?  If not, you're
> likely hitting contention on the "isDeleted" method.

How can I open it "readonly"?




Re: Unable to improve performance

Posted by Paul Taylor <pa...@fastmail.fm>.
Michael McCandless wrote:
> Are you opening your IndexReader with readOnly=true?  If not, you're
> likely hitting contention on the "isDeleted" method.
>
> When you run with a "normal" directory, either on a traditional hard
> drive or SSD device, do you use NIOFSDirectory?  That removes
> contention, but it only works on non-Windows platforms due to a
> long-standing bug in Sun's JRE.
>   
It was a long lunch. Actually, I'm just creating an IndexSearcher directly
on a file,

i.e. Searcher searcher = new IndexSearcher(indexDir + "/track_index");

I was struggling to see how to create an NIOFSDirectory until I realised
I needed Lucene 2.9, which I've done as follows:

Searcher searcher = new IndexSearcher(IndexReader.open(new
NIOFSDirectory(new File(indexDir + "/track_index"), null), true));

Anyway, the end result is that the test run time has been reduced from
43 seconds to 23 seconds, so a pretty good result (although I don't really
understand why the RAMDirectory method didn't perform at least this well,
because it would have no file I/O contention).
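
Since the times quoted are totals for the 10,000-query run, the throughput before and after can be worked out by simple division. The helper below is only an illustrative calculation (the class and method names are made up); it gives about 233 queries/second for the 43-second run and about 435 queries/second for the 23-second run, a speedup of roughly 1.87x.

```java
/** Illustrative arithmetic only: converts a total run time into queries per second. */
class Throughput {

    static double queriesPerSecond(int queries, double seconds) {
        return queries / seconds;
    }

    public static void main(String[] args) {
        double before = queriesPerSecond(10_000, 43.0); // original FSDirectory run
        double after = queriesPerSecond(10_000, 23.0);  // read-only reader on NIOFSDirectory
        System.out.println("before:  " + Math.round(before) + " queries/second");
        System.out.println("after:   " + Math.round(after) + " queries/second");
        System.out.println("speedup: " + (after / before));
    }
}
```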

thanks a lot Paul




Re: Unable to improve performance

Posted by Michael McCandless <lu...@mikemccandless.com>.
Are you opening your IndexReader with readOnly=true?  If not, you're
likely hitting contention on the "isDeleted" method.
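
To illustrate why a read-only reader can avoid this contention, here is a toy model in plain Java. It is not Lucene's code: the classes WritableReader and ReadOnlyReader are invented for the example, and the assumption being illustrated is that a reader which may still apply deletions has to guard its deleted-docs bits with a lock, while a read-only reader can check an unchanging snapshot without one.

```java
import java.util.BitSet;

/** Toy model (not Lucene code) of locked versus lock-free deleted-document checks. */
class DeletedDocsDemo {

    /** A reader that can still delete documents: every check must take the lock. */
    static final class WritableReader {
        private final BitSet deleted = new BitSet();

        synchronized void delete(int doc) {
            deleted.set(doc);
        }

        // With many search threads, they all queue up on this monitor
        synchronized boolean isDeleted(int doc) {
            return deleted.get(doc);
        }
    }

    /** A read-only reader: the bits are copied once and never modified again. */
    static final class ReadOnlyReader {
        private final BitSet deleted;

        ReadOnlyReader(BitSet snapshot) {
            this.deleted = (BitSet) snapshot.clone();
        }

        // No lock needed because the snapshot never changes after construction
        boolean isDeleted(int doc) {
            return deleted.get(doc);
        }
    }

    public static void main(String[] args) {
        WritableReader writable = new WritableReader();
        writable.delete(3);
        System.out.println("writable, doc 3 deleted: " + writable.isDeleted(3));

        BitSet bits = new BitSet();
        bits.set(7);
        ReadOnlyReader readOnly = new ReadOnlyReader(bits);
        System.out.println("read-only, doc 7 deleted: " + readOnly.isDeleted(7));
    }
}
```

Every hit collected during a search has to pass a check like this, so with 30 threads on one lock the cost can add up.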

When you run with a "normal" directory, either on a traditional hard
drive or SSD device, do you use NIOFSDirectory?  That removes
contention, but it only works on non-Windows platforms due to a
long-standing bug in Sun's JRE.
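
The general idea behind an NIO-based directory can be shown with plain java.nio, independent of Lucene. The sketch below is an illustration under that assumption, not NIOFSDirectory's implementation, and the class and method names are hypothetical. It uses FileChannel.read(ByteBuffer, long), a positional read that does not move a shared file pointer, so several threads can read different parts of one open file without synchronizing on a seek-then-read pair.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.atomic.AtomicBoolean;

/** Illustration only: concurrent positional reads on one shared FileChannel. */
class PositionalReadDemo {

    /** Reads 'length' bytes starting at 'position' without using the channel's own position. */
    static byte[] readAt(FileChannel channel, long position, int length) {
        ByteBuffer buffer = ByteBuffer.allocate(length);
        try {
            while (buffer.hasRemaining()) {
                int n = channel.read(buffer, position + buffer.position());
                if (n < 0) {
                    throw new IOException("unexpected end of file");
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.array();
    }

    /** Writes a temp file of per-thread slices, then has each thread read and verify its own slice. */
    static boolean concurrentReadsMatch(int threadCount) {
        final int sliceSize = 1024;
        byte[] data = new byte[threadCount * sliceSize];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) (i / sliceSize); // each slice is filled with its own index
        }
        try {
            Path file = Files.createTempFile("positional-read", ".bin");
            try {
                Files.write(file, data);
                try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
                    AtomicBoolean ok = new AtomicBoolean(true);
                    Thread[] readers = new Thread[threadCount];
                    for (int t = 0; t < threadCount; t++) {
                        final int slice = t;
                        readers[t] = new Thread(() -> {
                            try {
                                byte[] got = readAt(channel, (long) slice * sliceSize, sliceSize);
                                for (byte b : got) {
                                    if (b != (byte) slice) {
                                        ok.set(false);
                                    }
                                }
                            } catch (RuntimeException e) {
                                ok.set(false);
                            }
                        });
                        readers[t].start();
                    }
                    for (Thread reader : readers) {
                        reader.join();
                    }
                    return ok.get();
                }
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("all slices read correctly: " + concurrentReadsMatch(4));
    }
}
```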

Likely the OS is caching stuff in RAM, anyway, so you don't see much
improvement when you explicitly load into a RAMDir.

Mike

