Posted to dev@httpd.apache.org by Paul Querna <ch...@force-elite.com> on 2006/11/12 10:04:16 UTC

[PATCH] clucene search for mod_mbox

This is a work-in-progress patch, integrating CLucene[1] as a full-text
search engine for mod_mbox.

For each directory full of mbox files, there is a .mbox_search_index
containing the CLucene index.  This index is created by mod-mbox-util,
when it is called with the -s argument.  This indexing is done
separately from the main mailing list cache updating.
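
As a rough illustration of the layout described above (not code from the
patch), the sketch below finds directories that hold mbox files but have
no .mbox_search_index yet, i.e. ones that would still need an indexing
run. The helper names and the ".mbox" file suffix are assumptions made
only for this example.

```python
import os

INDEX_NAME = ".mbox_search_index"  # name taken from the description above


def index_path(mbox_dir):
    """Return the expected location of the search index for one mbox directory."""
    return os.path.join(mbox_dir, INDEX_NAME)


def dirs_missing_index(root, suffix=".mbox"):
    """List directories under root that hold mbox files but have no index yet.

    The ".mbox" suffix is an assumption for illustration only.
    """
    missing = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Never descend into an index directory itself.
        if INDEX_NAME in dirnames:
            dirnames.remove(INDEX_NAME)
        has_mbox = any(name.endswith(suffix) for name in filenames)
        if has_mbox and not os.path.exists(index_path(dirpath)):
            missing.append(dirpath)
    return sorted(missing)
```

Because the index lives beside the mbox files, a check like this needs
nothing beyond the filesystem; it does not look inside the index.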

Once indexed, searching the entire httpd archives takes only a couple
of milliseconds on my MacBook Pro.

TODOs:
- Split the patch into consumable bits/commits
- Make a proper Search Engine Result Page (SERP)
- Make Ajaxy Search on the left pane when reading a specific month
- Add OpenSearch support
- Clean up the search indexer to use a unified utility function for
  fetching the MIME-decoded version of an email.
- Lots of code clean up

[1] - http://clucene.sourceforge.net/index.php/Main_Page

Re: [PATCH] clucene search for mod_mbox

Posted by Justin Erenkrantz <ju...@erenkrantz.com>.
On 11/12/06, Paul Querna <ch...@force-elite.com> wrote:
> This is a work-in-progress patch, integrating CLucene[1] as a full-text
> search engine for mod_mbox.

+1 (concept).  Since lucene4c isn't going anywhere soon, it makes
sense to look at other integration avenues.

Bonus points if you remove the silly automake crap while you're at it.
 =)  -- justin