Posted to java-user@lucene.apache.org by Otis Gospodnetic <ot...@yahoo.com> on 2006/04/26 18:34:08 UTC

Lucene search benchmark/stress test tool

Hi,

I'm about to write a little command-line Lucene search benchmark tool.  I'm interested in benchmarking search performance and the ability to specify concurrency level (# of parallel search threads) and response timing, so I can calculate min, max, average, and mean times.  Something like 'ab' (Apache Benchmark) tool, but for Lucene.

Has anyone already written something like this?

Thanks,
Otis




---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Lucene search benchmark/stress test tool

Posted by Otis Gospodnetic <ot...@yahoo.com>.
Marvin,
I wrote my Lucene search benchmarker, but will have to check with my employer about contributing it to Lucene.  It's rather simple: I used ThreadPoolExecutor from the Java 1.5 concurrency package to execute N parallel search requests, measured the elapsed time for each request, and then, once all searches were done, calculated min/max/median/percentile/etc.

Otis
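
The approach described above could be sketched roughly as follows.  This is
an illustration only, not the tool mentioned in this message: the class name
SearchBenchmarkSketch, the run() and percentile() helpers, and the
placeholder workload are all assumptions, and a real run would replace the
placeholder with an actual Lucene search call.  Percentiles here use the
nearest-rank method, which is one choice among several.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; names and structure are assumptions, not the tool from this thread.
final class SearchBenchmarkSketch {

    // Runs every search on a pool of `threads` workers and returns the
    // per-request elapsed times in nanoseconds, sorted ascending.
    static long[] run(List<Runnable> searches, int threads) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                threads, threads, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>());
        try {
            List<Future<Long>> pending = new ArrayList<>();
            for (Runnable search : searches) {
                Callable<Long> timed = () -> {
                    long start = System.nanoTime();
                    search.run();
                    return System.nanoTime() - start;
                };
                pending.add(pool.submit(timed));
            }
            long[] elapsed = new long[pending.size()];
            for (int i = 0; i < elapsed.length; i++) {
                elapsed[i] = pending.get(i).get(); // blocks until request i is done
            }
            Arrays.sort(elapsed);
            return elapsed;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("benchmark run failed", e);
        } finally {
            pool.shutdown();
        }
    }

    // Nearest-rank percentile over an ascending array; p is in [0, 100].
    static long percentile(long[] sorted, double p) {
        if (sorted.length == 0) throw new IllegalArgumentException("no samples");
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        List<Runnable> searches = new ArrayList<>();
        for (int i = 0; i < 200; i++) {
            searches.add(() -> {
                // Placeholder workload; a real benchmark would run a Lucene search here.
                double x = 0;
                for (int k = 1; k < 50_000; k++) x += Math.sqrt(k);
                if (x < 0) throw new IllegalStateException();
            });
        }
        long[] t = run(searches, 4); // 4 = concurrency level
        System.out.printf("min=%dns median=%dns p95=%dns max=%dns%n",
                t[0], percentile(t, 50), percentile(t, 95), t[t.length - 1]);
    }
}
```

Timing inside the worker measures only the search itself, so time spent
waiting in the queue is excluded.  Whether that is the right number depends
on what the benchmark is meant to show.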

----- Original Message ----
From: Marvin Humphrey <ma...@rectangular.com>
To: java-user@lucene.apache.org
Sent: Sunday, April 30, 2006 8:28:20 PM
Subject: Re: Lucene search benchmark/stress test tool


On Apr 26, 2006, at 9:34 AM, Otis Gospodnetic wrote:

> I'm about to write a little command-line Lucene search benchmark  
> tool.  I'm interested in benchmarking search performance and the  
> ability to specify concurrency level (# of parallel search threads)  
> and response timing, so I can calculate min, max, average, and mean  
> times.  Something like 'ab' (Apache Benchmark) tool, but for Lucene.
>
> Has anyone already written something like this?

I'm about to.  The predecessor to the indexing benchmarker tests I  
recently published results for was enormously helpful while  
streamlining the indexing process.  Now that I'm considering  
modifications to search logic and file format which may have a  
substantial impact on search-time performance, I'll need a search  
benchmarker to complement the indexing benchmarker.  I'll be writing  
both a Perl/KinoSearch and a Java Lucene version, and they will use
the Reuters corpus.

Where are you at with your app?

Marvin Humphrey
Rectangular Research
http://www.rectangular.com/


Re: Lucene search benchmark/stress test tool

Posted by Marvin Humphrey <ma...@rectangular.com>.
On Apr 26, 2006, at 9:34 AM, Otis Gospodnetic wrote:

> I'm about to write a little command-line Lucene search benchmark  
> tool.  I'm interested in benchmarking search performance and the  
> ability to specify concurrency level (# of parallel search threads)  
> and response timing, so I can calculate min, max, average, and mean  
> times.  Something like 'ab' (Apache Benchmark) tool, but for Lucene.
>
> Has anyone already written something like this?

I'm about to.  The predecessor to the indexing benchmarker tests I  
recently published results for was enormously helpful while  
streamlining the indexing process.  Now that I'm considering  
modifications to search logic and file format which may have a  
substantial impact on search-time performance, I'll need a search  
benchmarker to complement the indexing benchmarker.  I'll be writing  
both a Perl/KinoSearch and a Java Lucene version, and they will use
the Reuters corpus.

Where are you at with your app?

Marvin Humphrey
Rectangular Research
http://www.rectangular.com/




Re: Lucene search benchmark/stress test tool

Posted by Doug Cutting <cu...@apache.org>.
Sunil Kumar PK wrote:
> Is there any way to merge the weight calculation for index 1 and its search
> into a single RPC, instead of performing the two functions in separate steps?

To score correctly, weights from all indexes must be created before any 
can be searched.  This is to compute a global IDF used in all searches.

Doug
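
The point about a global IDF can be illustrated with a small sketch.  It
assumes the idf formula used by Lucene's DefaultSimilarity,
log(numDocs / (docFreq + 1)) + 1; the GlobalIdfSketch and IndexStats names
and the sample numbers are hypothetical and do not come from Lucene.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of why per-index statistics must be combined before scoring.
final class GlobalIdfSketch {

    // What one index would report for a single term (assumed shape).
    static final class IndexStats {
        final int docFreq; // documents in this index containing the term
        final int maxDoc;  // total documents in this index
        IndexStats(int docFreq, int maxDoc) {
            this.docFreq = docFreq;
            this.maxDoc = maxDoc;
        }
    }

    // idf as assumed above: log(numDocs / (docFreq + 1)) + 1
    static double idf(int docFreq, int numDocs) {
        return Math.log(numDocs / (double) (docFreq + 1)) + 1.0;
    }

    // Sums docFreq and maxDoc over all indexes, then computes one shared idf.
    static double globalIdf(List<IndexStats> indexes) {
        int docFreq = 0;
        int numDocs = 0;
        for (IndexStats s : indexes) {
            docFreq += s.docFreq;
            numDocs += s.maxDoc;
        }
        return idf(docFreq, numDocs);
    }

    public static void main(String[] args) {
        IndexStats a = new IndexStats(9, 1000);   // term is rare in index A
        IndexStats b = new IndexStats(499, 1000); // term is common in index B
        System.out.println("local idf A = " + idf(a.docFreq, a.maxDoc));
        System.out.println("local idf B = " + idf(b.docFreq, b.maxDoc));
        System.out.println("global idf  = " + globalIdf(Arrays.asList(a, b)));
    }
}
```

With purely local statistics the same term would be weighted very
differently in A and B, so scores from the two indexes would not be
comparable when merged.  The shared value is only known once every index has
reported its docFreq.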



Re: Lucene search benchmark/stress test tool

Posted by Sunil Kumar PK <pk...@gmail.com>.
Hi,

I have added some timing code to the Lucene 1.9 source to benchmark the
performance of RemoteParallelMultiSearcher.

I recorded the time taken by the 'searchables[i].docFreq(term)' call (in
MultiSearcher.java) on both the client and the server, and likewise for the
'searchable.search' call (in ParallelMultiSearcher.java).  I also recorded
the total time taken to get the Hits object.

I tested several complex boolean queries and took the average time for each
query.  While doing this I got stuck on a few questions, which are listed
below.

As I understand it, the Lucene remote parallel multi-searcher search
procedure first computes the weight for the query in each index sequentially
(one by one, e.g. the "query weight" of index1 first and then that of
index2), then searches each index one by one and merges the results.

Is there any way to merge the weight calculation for index 1 and its search
into a single RPC, instead of performing the two functions in separate steps?

My other question is about the "docFreq (Term term)" method in
RemoteParallelMultiSearcher: why is it not parallelized?  Is there a
specific reason for that?


Regards

Sunil
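
For reference, a parallel docFreq along the lines asked about above might
look like the sketch below.  It is not Lucene code: the DocFreqSource
interface and the ParallelDocFreqSketch class are hypothetical stand-ins for
the remote searchables, and the sketch only shows that the per-index counts
can be fetched concurrently and then summed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch; not part of Lucene.
final class ParallelDocFreqSketch {

    // Assumed stand-in for one (possibly remote) index.
    interface DocFreqSource {
        int docFreq(String term);
    }

    // Asks every source for its docFreq concurrently and returns the sum.
    static int docFreq(List<DocFreqSource> sources, String term) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, sources.size()));
        try {
            List<Callable<Integer>> calls = new ArrayList<>();
            for (DocFreqSource source : sources) {
                calls.add(() -> source.docFreq(term));
            }
            int total = 0;
            // invokeAll returns once every call has completed.
            for (Future<Integer> result : pool.invokeAll(calls)) {
                total += result.get();
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("docFreq lookup failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<DocFreqSource> sources = new ArrayList<>();
        sources.add(term -> 12); // pretend index 1 has 12 matching documents
        sources.add(term -> 30); // pretend index 2 has 30 matching documents
        System.out.println(docFreq(sources, "lucene"));
    }
}
```

With remote indexes the total wait would then be bounded by the slowest
index rather than the sum of all round trips, at the cost of one thread per
index for the duration of the call.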


On 4/26/06, Otis Gospodnetic <ot...@yahoo.com> wrote:
>
> Hi,
>
> I'm about to write a little command-line Lucene search benchmark
> tool.  I'm interested in benchmarking search performance and the ability to
> specify concurrency level (# of parallel search threads) and response
> timing, so I can calculate min, max, average, and mean times.  Something
> like 'ab' (Apache Benchmark) tool, but for Lucene.
>
> Has anyone already written something like this?
>
> Thanks,
> Otis