Posted to java-user@lucene.apache.org by Paul Querna <ch...@force-elite.com> on 2005/03/27 01:07:18 UTC

Performance of MultiSearcher?

Hello,

I am working on using Lucene-based indexes for the ASF's mod_mbox. 
Current versions of mod_mbox support MIME, and I am trying to add full 
text searching. (Then we can completely remove Eyebrowse.)

Currently I am hacking around with the C++ (CLucene) implementation, 
but I intend to migrate to Lucene4c shortly.

I was structuring this as one Lucene index per mailing list.  To search 
all mailing lists, I was planning on using a MultiSearcher.

Currently, the ASF public mail archives use about 17 Gigs, uncompressed, 
in the raw mbox format.

There are also about 300 mailing lists in the public archives.

Can a MultiSearcher quickly search 300 different indexes?  I am 
thinking that it cannot.  300 separate indexes is a lot of files to 
scan, even if Lucene is fast.  Any experience from other users would be 
helpful.

Would it be better to have a single main index for all of the lists, and 
include the list name as a keyed field?

I suspect most searches would be restricted to one or two lists, but I 
would like good performance if I wanted to search all of the ASF lists.
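
To make the single-index idea concrete, here is a minimal sketch of 
what I mean.  It is a toy in-memory model written in Java, not Lucene 
or CLucene code: the class name CombinedIndexSketch, its methods, and 
the crude tokenizer are all made up for illustration.  Each posting 
carries the name of the list the message came from, so one lookup can 
serve both an "all lists" search and a search restricted to one or two 
lists.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Toy model of one combined index for all lists (hypothetical, not the Lucene API). */
class CombinedIndexSketch {

    /** One posting: a message id plus the list that message belongs to. */
    static final class Posting {
        final int docId;
        final String listName;

        Posting(int docId, String listName) {
            this.docId = docId;
            this.listName = listName;
        }
    }

    // term -> postings, kept in insertion order (ascending message id)
    private final Map<String, List<Posting>> postings = new HashMap<>();
    private int nextDocId = 0;

    /** Indexes one message. The list name is kept verbatim; the body is split into lowercase terms. */
    int add(String listName, String body) {
        int docId = nextDocId++;
        for (String term : body.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (term.isEmpty()) {
                continue; // split() yields an empty first token if the body starts with punctuation
            }
            List<Posting> forTerm = postings.computeIfAbsent(term, k -> new ArrayList<>());
            // Record each message at most once per term.
            if (forTerm.isEmpty() || forTerm.get(forTerm.size() - 1).docId != docId) {
                forTerm.add(new Posting(docId, listName));
            }
        }
        return docId;
    }

    /** Returns ids of messages containing the term. An empty set of lists means "search every list". */
    List<Integer> search(String term, Set<String> lists) {
        List<Posting> forTerm =
                postings.getOrDefault(term.toLowerCase(Locale.ROOT), Collections.<Posting>emptyList());
        List<Integer> hits = new ArrayList<>();
        for (Posting p : forTerm) {
            if (lists.isEmpty() || lists.contains(p.listName)) {
                hits.add(p.docId);
            }
        }
        return hits;
    }
}
```

With this layout a restricted search is just an extra check on the list 
name, and an all-lists search never has to open more than one index.  
In Lucene itself I assume the list name would be stored as an 
untokenized field and the restriction expressed as a required term on 
that field, but I have not tried that yet.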

Ideas/Comments?  Anyone willing to help me write some C :) ?

Thanks,

-Paul

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org