Posted to java-user@lucene.apache.org by Harpreet S Walia <ha...@sansuisoftware.com> on 2002/06/11 07:27:51 UTC
How does simple analyser work
Hi,
Are there any resources available that explain how the simple analyser processes the data given to it?
What I want to know is: given a set of words, what exact rules are applied to tokenize and index them, and how can I customize those rules?
My requirement is that the words be broken only at spaces and not at any other character. I understand that this can be done by writing a parser in JavaCC, but is there a simpler way of achieving this?
I would really appreciate the help.
Thanks and regards
Harpreet
Re: How does simple analyser work
Posted by Otis Gospodnetic <ot...@yahoo.com>.
--- Harpreet S Walia <ha...@sansuisoftware.com> wrote:
> Hi,
>
> Are there any resources available that explain how the simple
> analyser processes the data given to it?
> What I want to know is: given a set of words, what exact rules are
> applied to tokenize and index them, and how can I customize those
> rules?
>
> My requirement is that the words be broken only at spaces and not at
> any other character. I understand that this can be done by writing
> a parser in JavaCC, but is there a simpler way of achieving this?
Actually, this can be done by writing your own custom Analyzer.
Have a look at these source files in the Lucene distribution:
./org/apache/lucene/analysis/standard/StandardAnalyzer.java
./org/apache/lucene/analysis/Analyzer.java
./org/apache/lucene/analysis/de/GermanAnalyzer.java
./org/apache/lucene/analysis/SimpleAnalyzer.java
./org/apache/lucene/analysis/StopAnalyzer.java
./org/apache/lucene/analysis/WhitespaceAnalyzer.java
The last one, WhitespaceAnalyzer, may be what you are looking for: it breaks text into tokens at whitespace only.
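To make the difference concrete, here is a minimal standalone sketch in plain Java. It does not use Lucene at all; the class and method names (TokenizeSketch, splitOnWhitespace, splitOnNonLettersLowercased) are made up for illustration. It only imitates the two tokenization rules as I understand them: SimpleAnalyzer breaks text at every non-letter and lowercases it, while WhitespaceAnalyzer breaks text at whitespace and leaves everything else alone. Check the actual Lucene sources listed above for the real behaviour.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: plain-Java imitation of two tokenization rules.
// None of these names come from Lucene.
class TokenizeSketch {

    // Whitespace rule: a token is a maximal run of non-whitespace
    // characters. Case, digits and punctuation are kept as they are.
    static List<String> splitOnWhitespace(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isWhitespace(c)) {
                flush(current, tokens);
            } else {
                current.append(c);
            }
        }
        flush(current, tokens);
        return tokens;
    }

    // Letter rule: a token is a maximal run of letters, lowercased.
    // Digits, hyphens, underscores and other symbols all act as breaks.
    static List<String> splitOnNonLettersLowercased(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetter(c)) {
                current.append(Character.toLowerCase(c));
            } else {
                flush(current, tokens);
            }
        }
        flush(current, tokens);
        return tokens;
    }

    // Emits the pending token, if any, and resets the buffer.
    private static void flush(StringBuilder current, List<String> tokens) {
        if (current.length() > 0) {
            tokens.add(current.toString());
            current.setLength(0);
        }
    }

    public static void main(String[] args) {
        String text = "Wi-Fi router_v2 costs $40";
        System.out.println(splitOnWhitespace(text));           // [Wi-Fi, router_v2, costs, $40]
        System.out.println(splitOnNonLettersLowercased(text)); // [wi, fi, router, v, costs]
    }
}
```

With the whitespace rule, terms such as "Wi-Fi" and "$40" survive intact, which matches the requirement of breaking words only at spaces. Note that this rule also preserves case, so queries would have to match case exactly unless a lowercasing step is added.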
Otis