Posted to java-user@lucene.apache.org by Owen Densmore <ow...@backspaces.net> on 2005/02/03 15:26:30 UTC
Right way to make analyzer
Is this the right way to make a Porter-stemming analyzer using the standard
tokenizer? I'm not sure about the order of the filters.
Owen
import java.io.Reader;
import org.apache.lucene.analysis.*;
import org.apache.lucene.analysis.standard.StandardFilter;
import org.apache.lucene.analysis.standard.StandardTokenizer;

class MyAnalyzer extends Analyzer {
    public TokenStream tokenStream(String fieldName, Reader reader) {
        return new PorterStemFilter(
            new StopFilter(
                new LowerCaseFilter(
                    new StandardFilter(
                        new StandardTokenizer(reader))),
                StopAnalyzer.ENGLISH_STOP_WORDS));
    }
}
---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org
Re: Right way to make analyzer
Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Feb 3, 2005, at 9:26 AM, Owen Densmore wrote:
> Is this the right way to make a Porter-stemming analyzer using the standard
> tokenizer? I'm not sure about the order of the filters.
>
> Owen
>
> class MyAnalyzer extends Analyzer {
>     public TokenStream tokenStream(String fieldName, Reader reader) {
>         return new PorterStemFilter(
>             new StopFilter(
>                 new LowerCaseFilter(
>                     new StandardFilter(
>                         new StandardTokenizer(reader))),
>                 StopAnalyzer.ENGLISH_STOP_WORDS));
>     }
> }
Yes, that is correct.
Analysis starts with a tokenizer, whose output is chained to the first
filter, and that filter's output to the next, and so on.
As you start tinkering with custom analysis, I strongly recommend using a
little bit of code to see how your analyzer processes some sample text.
The Lucene Intro article I wrote for java.net has some code you can
borrow for this, as does the source code for Lucene in Action. Luke, a
tool I also highly recommend, has this capability as well.
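As a minimal sketch of the kind of inspection code Erik describes (not his article's exact code), the following prints each token the analyzer emits, assuming the Lucene 1.4-era API in which TokenStream.next() returns a Token (or null at the end) and Token.termText() yields its text:

```java
import java.io.StringReader;
import org.apache.lucene.analysis.Token;
import org.apache.lucene.analysis.TokenStream;

public class AnalyzerDemo {
    public static void main(String[] args) throws Exception {
        // MyAnalyzer is the custom analyzer from the original post.
        MyAnalyzer analyzer = new MyAnalyzer();
        // The field name is arbitrary here; MyAnalyzer ignores it.
        TokenStream stream = analyzer.tokenStream("contents",
                new StringReader("The Quick Brown Foxes Jumped"));
        // Walk the stream, printing one stemmed, lowercased,
        // stop-filtered token per line.
        for (Token t = stream.next(); t != null; t = stream.next()) {
            System.out.println(t.termText());
        }
    }
}
```

Running this should make the filter order visible: "The" is dropped by the stop filter, and the remaining words come out lowercased and stemmed (e.g. "foxes" becomes "fox").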
Erik