Posted to java-user@lucene.apache.org by Helmut Jarausch <ja...@igpm.rwth-aachen.de> on 2007/12/10 17:39:22 UTC
content depending Analyzing
Hi,
I'm new to Lucene. I've seen similar questions to mine
but didn't get an answer to my question:
I'd like to index books from our library.
Among other fields there are
LANG which contains a code specifying the language
the book is written in
TOC the table of contents
When indexing I have to specify an Analyzer (esp. for
the TOC field) but that depends on the LANG field.
As far as I understood from the LiA book, an Analyzer
implements 'tokenStream(String fieldName, Reader reader)'.
But for me that's too late. When tokenizing the TOC
field I would need access to the LANG field to decide
how to tokenize.
Is there a solution to this problem?
Many thanks for your help,
Helmut Jarausch
Lehrstuhl fuer Numerische Mathematik
RWTH - Aachen University
D 52056 Aachen, Germany
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
Re: content depending Analyzing
Posted by Daniel Naber <lu...@danielnaber.de>.
On Monday, 10 December 2007, Helmut Jarausch wrote:
> an Analyzer
> implements 'tokenStream(String fieldName, Reader reader)'.
> But for me that's too late. When tokenizing the TOC
> field I would need access to the LANG field to decide
> how to tokenize.
IndexWriter has an addDocument() overload that also takes an analyzer. If
you always use that overload, the analyzer passed to the IndexWriter
constructor will never be used. This way you can build your Document object,
read its LANG value, and pass the appropriate analyzer for each document.
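A minimal sketch of the lookup side of this idea: a small registry that maps a
LANG code to the analyzer to use for that document. The class name
LanguageAnalyzers and its methods register() and forLanguage() are hypothetical
and not part of Lucene. The type parameter A stands in for Lucene's Analyzer
class so that the sketch compiles without Lucene on the classpath.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/**
 * Hypothetical helper (not a Lucene class): maps a LANG code to an analyzer.
 * A would be org.apache.lucene.analysis.Analyzer in a real indexer.
 */
final class LanguageAnalyzers<A> {
    private final Map<String, A> byLanguage = new HashMap<>();
    private final A fallback;

    /** @param fallback analyzer used when the LANG code is missing or unknown */
    LanguageAnalyzers(A fallback) {
        this.fallback = fallback;
    }

    /** Registers an analyzer for a language code; returns this for chaining. */
    LanguageAnalyzers<A> register(String langCode, A analyzer) {
        byLanguage.put(normalize(langCode), analyzer);
        return this;
    }

    /** Returns the analyzer for the given LANG value, or the fallback. */
    A forLanguage(String langCode) {
        if (langCode == null) {
            return fallback;
        }
        A analyzer = byLanguage.get(normalize(langCode));
        return analyzer != null ? analyzer : fallback;
    }

    // Codes are compared case-insensitively and without surrounding blanks.
    private static String normalize(String code) {
        return code.trim().toLowerCase(Locale.ROOT);
    }
}
```

With Lucene available, the indexing loop would then look up the analyzer from
the book's LANG value and hand it to the call mentioned above, roughly
writer.addDocument(doc, analyzers.forLanguage(lang)). Which concrete analyzer
is registered for which code is up to the application.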
I hope you are aware that you need to use the same analyzer for searching.
This is a bit difficult with multiple analyzers, unless you ask the users
what language they want to search in.
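To illustrate why the analyzers have to match, here is a toy sketch that does
not use Lucene at all. The two "analyzers" are plain functions invented for
this example (one lowercases tokens, one keeps them as typed), and matches()
is a hypothetical stand-in for an exact term lookup in the index.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collectors;

final class AnalyzerMismatchDemo {
    // Toy analyzer: split on whitespace and lowercase every token.
    static final Function<String, List<String>> LOWERCASING =
        text -> Arrays.stream(text.trim().split("\\s+"))
                      .map(token -> token.toLowerCase(Locale.ROOT))
                      .collect(Collectors.toList());

    // Toy analyzer: split on whitespace and keep tokens exactly as typed.
    static final Function<String, List<String>> CASE_PRESERVING =
        text -> Arrays.asList(text.trim().split("\\s+"));

    /** True if every query term is found among the indexed terms. */
    static boolean matches(String indexedText,
                           Function<String, List<String>> indexAnalyzer,
                           String query,
                           Function<String, List<String>> queryAnalyzer) {
        Set<String> indexedTerms = new HashSet<>(indexAnalyzer.apply(indexedText));
        return indexedTerms.containsAll(queryAnalyzer.apply(query));
    }
}
```

Indexing "Numerische Mathematik" with LOWERCASING stores the terms
"numerische" and "mathematik". A query for "Mathematik" finds the document when
it is also analyzed with LOWERCASING, but not when it is analyzed with
CASE_PRESERVING, because the term "Mathematik" was never stored. The same
effect applies to language-specific stemming or stop-word handling.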
Regards
Daniel
--
http://www.danielnaber.de