Posted to java-user@lucene.apache.org by Helmut Jarausch <ja...@igpm.rwth-aachen.de> on 2007/12/10 17:39:22 UTC

content depending Analyzing

Hi,

I'm new to Lucene. I've seen similar questions to mine
but didn't get an answer to my question:

I'd like to index books from our library.
Among other fields there are

LANG  which contains a code specifying the language
      the book is written in

TOC   the table of contents

When indexing I have to specify an Analyzer (esp. for
the TOC field) but that depends on the LANG field.
As far as I understood from the LiA book, an Analyzer
implements a 'tokenStream(String fieldName, Reader reader)' method.
But for me that's too late. When tokenizing the TOC
field I would need access to the LANG field to decide
how to tokenize.

Is there a solution to this problem?

Many thanks for your help,

Helmut Jarausch

Lehrstuhl fuer Numerische Mathematik
RWTH - Aachen University
D 52056 Aachen, Germany

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: content depending Analyzing

Posted by Daniel Naber <lu...@danielnaber.de>.
On Monday, 10 December 2007, Helmut Jarausch wrote:

> an Analyzer
> implements a 'tokenStream(String fieldName, Reader reader)' method.
> But for me that's too late. When tokenizing the TOC
> field I would need access to the LANG field to decide
> how to tokenize.

IndexWriter has an addDocument() variant that takes an analyzer in addition
to the document. If you always use that variant, the analyzer passed to the
IndexWriter constructor will never be used. This way you can build your
Document object, look at its LANG value, and pass the appropriate analyzer
for that document.

I hope you are aware that you need to use the same analyzer for searching. 
This is a bit difficult with multiple analyzers, unless you ask the users 
what language they want to search in.
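
To make the idea concrete, here is a minimal sketch of a language-to-analyzer
lookup that is shared by the indexing side and the search side. It does not
use the Lucene classes: TextAnalyzer, StopWordAnalyzer, AnalyzerRegistry and
the tiny stop word lists are hypothetical stand-ins made up for this example.
In real code the value returned by forLanguage() would be a Lucene Analyzer,
handed to the addDocument() variant mentioned above when indexing and to the
query parser when searching.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for a Lucene Analyzer: turns field text into tokens. */
interface TextAnalyzer {
    List<String> tokenize(String text);
}

/** Lowercases, splits on non-letters, and drops the configured stop words. */
class StopWordAnalyzer implements TextAnalyzer {
    private final Set<String> stopWords;

    StopWordAnalyzer(String... stopWords) {
        this.stopWords = new HashSet<>(Arrays.asList(stopWords));
    }

    @Override
    public List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
            if (!raw.isEmpty() && !stopWords.contains(raw)) {
                tokens.add(raw);
            }
        }
        return tokens;
    }
}

/** Maps LANG codes to analyzers; unknown or missing codes get the fallback. */
class AnalyzerRegistry {
    private final Map<String, TextAnalyzer> byLang = new HashMap<>();
    private final TextAnalyzer fallback;

    AnalyzerRegistry(TextAnalyzer fallback) {
        this.fallback = fallback;
    }

    void register(String langCode, TextAnalyzer analyzer) {
        byLang.put(langCode.toLowerCase(Locale.ROOT), analyzer);
    }

    TextAnalyzer forLanguage(String langCode) {
        if (langCode == null) {
            return fallback;
        }
        return byLang.getOrDefault(langCode.toLowerCase(Locale.ROOT), fallback);
    }
}

class LanguageAwareIndexing {
    /** Illustrative setup only; the stop word lists are deliberately tiny. */
    static AnalyzerRegistry defaultRegistry() {
        AnalyzerRegistry registry = new AnalyzerRegistry(new StopWordAnalyzer());
        registry.register("en", new StopWordAnalyzer("the", "and", "a", "of"));
        registry.register("de", new StopWordAnalyzer("der", "die", "das", "und"));
        return registry;
    }

    public static void main(String[] args) {
        AnalyzerRegistry registry = defaultRegistry();

        // Indexing: read LANG from the book record first, then analyze TOC.
        Map<String, String> book = new HashMap<>();
        book.put("LANG", "de");
        book.put("TOC", "Die Grundlagen und der Anhang");
        TextAnalyzer indexAnalyzer = registry.forLanguage(book.get("LANG"));
        System.out.println(indexAnalyzer.tokenize(book.get("TOC")));

        // Searching: the user states the language, so the same analyzer is used.
        String userLanguage = "de";
        System.out.println(registry.forLanguage(userLanguage).tokenize("der Anhang"));
    }
}
```

Because both sides go through the same registry, a query in a given language
is tokenized exactly like the TOC text that was indexed for that language.
Storing LANG as its own indexed field would additionally let a search be
restricted to books in the language the user picked.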

Regards
 Daniel

-- 
http://www.danielnaber.de
