You are viewing a plain text version of this content. The canonical link for it is here.
Posted to java-user@lucene.apache.org by Polina Litvak <pl...@casebank.com> on 2004/07/09 16:53:47 UTC
NullAnalyzer still tokenizes fields
I tried to create my own analyzer so it returns fields as they are
(without any tokenizing done), using code posted on lucene-user a short
while a go:
private static class NullAnalyzer
extends Analyzer
{
public TokenStream tokenStream(String fieldName, Reader reader)
{
return new CharTokenizer(reader)
{
protected boolean isTokenChar(char c)
{
return true;
}
};
}
}
After testing this analyzer I found out that fields I pass to it still
get tokenized.
E.g. I have a field with value ABCD-EF. When passing it through the
analyzer, the only characters that end up in isTokenChar() are "A, B, C,
D, E, F". So looks like "-" gets filtered out before it even gets to
isTokenChar().
Did anyone encounter this problem ?
Any help will be greatly appreciated!
Thanks,
Polina