Posted to dev@lucene.apache.org by Egor Pahomov <ep...@griddynamics.com> on 2012/03/21 12:49:00 UTC

No way to pass fieldName to analysers through TokenizerChain.getStream()

    I have different stop-word dictionaries per field, but all these fields
are captured by a single dynamic field, i.e. a single field type, i.e. a
single analyser.

    It seems I need an improved TokenFilter that is aware of the name of the
field it is analyzing. Currently fieldName is passed into
TokenizerChain.getStream(), but it is not used there. How can I pass
fieldName to the token filters?

    I'm thinking of adding a new method, TokenStream create(String field,
TokenStream input), to the TokenFilterFactory interface, and implementing it
in BaseTokenFilterFactory by calling the single-argument create(TokenStream
input). After that I'd be able to pass fieldName to each TokenFilterFactory
in TokenizerChain.getStream(String fieldName, Reader reader). As an
alternative, I could introduce a FieldAwareTokenFilterFactory interface with
a two-argument create() and use "instanceof" in TokenizerChain.getStream().
    Is this a good solution for my problem?
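
    To make the second option concrete, here is a rough sketch of the
"instanceof" dispatch. It is only an illustration under stated assumptions:
TokenStream and TokenFilterFactory below are simplified stand-ins, not the
real Lucene/Solr classes, getStream() takes a String instead of a Reader,
and PerFieldStopFilterFactory and TokenizerChainSketch are hypothetical
names made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Simplified stand-in for a token stream (NOT the real Lucene TokenStream).
interface TokenStream {
    List<String> tokens();
}

// Simplified stand-in for the existing single-argument factory contract.
interface TokenFilterFactory {
    TokenStream create(TokenStream input);
}

// Hypothetical interface from the proposal: a factory that also sees the field name.
interface FieldAwareTokenFilterFactory extends TokenFilterFactory {
    TokenStream create(String fieldName, TokenStream input);
}

// Hypothetical factory that picks a stop-word set based on the field name.
class PerFieldStopFilterFactory implements FieldAwareTokenFilterFactory {
    private final Map<String, Set<String>> stopWordsByField;

    PerFieldStopFilterFactory(Map<String, Set<String>> stopWordsByField) {
        this.stopWordsByField = new HashMap<>(stopWordsByField);
    }

    // Without a field name there is no dictionary to choose, so pass tokens through.
    @Override
    public TokenStream create(TokenStream input) {
        return input;
    }

    @Override
    public TokenStream create(String fieldName, TokenStream input) {
        Set<String> stops =
            stopWordsByField.getOrDefault(fieldName, Collections.emptySet());
        List<String> kept = new ArrayList<>();
        for (String token : input.tokens()) {
            if (!stops.contains(token)) {
                kept.add(token);
            }
        }
        return () -> kept;
    }
}

// Sketch of the chain: forwards fieldName only to factories that ask for it.
class TokenizerChainSketch {
    private final List<TokenFilterFactory> filters;

    TokenizerChainSketch(List<TokenFilterFactory> filters) {
        this.filters = filters;
    }

    TokenStream getStream(String fieldName, String text) {
        // Whitespace splitting stands in for the real tokenizer.
        List<String> initial = Arrays.asList(text.trim().split("\\s+"));
        TokenStream ts = () -> initial;
        for (TokenFilterFactory filter : filters) {
            ts = (filter instanceof FieldAwareTokenFilterFactory)
                ? ((FieldAwareTokenFilterFactory) filter).create(fieldName, ts)
                : filter.create(ts);
        }
        return ts;
    }
}
```

    With this shape, existing single-argument factories keep working
unchanged, and only the factories that opt into the new interface receive
the field name. The first option (a two-argument create() on
TokenFilterFactory itself, delegating to the one-argument version in the
base class) would avoid the "instanceof" check at the cost of changing the
existing interface.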

    Egor