You are viewing a plain text version of this content. The canonical link for it is here.
Posted to java-user@lucene.apache.org by Antony Bowesman <ad...@teamware.com> on 2006/10/13 09:47:00 UTC

Email and attachments

Hi,

I am a newbie with Lucene and I am working out the best way to index email data.

An earlier poster talked about index attachments with two alternatives: 
However, there is a third alternative:

Each message/attachment is indexed as a separate Document with the email header 
data included in all Documents.  The drawback of this approach seems to be that 
it is not possible to make AND searches between two body parts in different 
documents directly in Lucene (or is it?).

One advantage of this approach is that it is then possible to use a different 
Analyzer for each Document, which is useful when the attachments contain data in 
different languages.

If combining all attachments to a single body field, it's only possible to use 
the index or Document analyzer.

Has anyone used this type of approach and does it work?

Regards
Antony



---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org